Parsers
  • 13 Dec 2022

parser_type

usage: required
The parser_type tells the Tag.bio system how to instantiate a data loading function. There are 40+ options for parser_type. The most commonly used are categorical and numeric, which share their designation with the two primary data types represented within the Tag.bio system.

{
  "parser_type": "ptpt", 
  ...
}


table_alias

usage: required
The table_alias attribute tells the system how to assign this parser to a table object as a data loading function for its source data table.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  ...
}


collection

usage: required
The collection attribute renames the data processed by a parser to something more useful and human-readable. Legible names are best practice; if no collection is specified, the column name is used as the collection name.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1"
}
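The fallback described above can be sketched in a few lines of Python. This is only an illustration of the naming rule, not the Tag.bio implementation: the helper collection_name and the table and column names are hypothetical, and the sketch assumes collection is a plain string rather than a nested parser.

```python
def collection_name(parser: dict) -> str:
    """Return the explicit collection name, or fall back to the column name."""
    # Assumption: "collection" is a plain string here, not a nested parser.
    name = parser.get("collection")
    return name if name else parser["column"]


# Hypothetical parser definitions; the table and column names are made up.
named = {
    "parser_type": "categorical",
    "table_alias": "patients",
    "column": "dx_cd",
    "collection": "Diagnosis",
}
unnamed = {
    "parser_type": "categorical",
    "table_alias": "patients",
    "column": "dx_cd",
}

print(collection_name(named))    # Diagnosis
print(collection_name(unnamed))  # dx_cd
```

The second definition shows why an explicit collection is recommended: without it, users would see the raw column name.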



nesting a parser

An inner parser can be given as the value of the collection or variable attribute. The collection or variable names are then taken from the values in a different column.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": {
    "parser_type": "ptpt",
    "column": "ccc1"
  }
}

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "variable": {
    "parser_type": "ptpt",
    "column": "ccc2"
  }
}
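As a rough mental model of nesting, the Python sketch below resolves a name per row: a string is used as-is, while a nested parser reads the name from another column of the same row. The function resolve_name, the table labs, and its columns are hypothetical, and the real system may resolve nested parsers differently.

```python
def resolve_name(spec, row: dict) -> str:
    """Resolve a collection or variable name for one source row."""
    if isinstance(spec, str):
        # A plain string is used as the name for every row.
        return spec
    # Assumption: a nested parser takes the name from its own "column".
    return str(row[spec["column"]])


# Hypothetical source rows and parser definition.
rows = [
    {"measurement": "glucose", "panel": "metabolic", "value": "5.4"},
    {"measurement": "sodium", "panel": "electrolytes", "value": "140"},
]
parser = {
    "parser_type": "numeric",
    "table_alias": "labs",
    "column": "value",
    "collection": {"parser_type": "categorical", "column": "panel"},
}

names = [resolve_name(parser["collection"], row) for row in rows]
print(names)  # ['metabolic', 'electrolytes']
```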


where

usage: optional
The where attribute uses a conditional parser to determine which rows should be processed by the parent parser.

Rows that fail to evaluate as true will be ignored.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "where": {
    "parser_type": "categorical-match",
    "column": "city",
    "operator": "=",
    "value": "San Francisco"
  },
  ...
}
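The filtering effect of where can be modeled with a small Python sketch. It assumes that a categorical-match condition with the "=" operator is a simple equality test on the named column. The helpers matches and rows_to_process and the sample rows are hypothetical, and other operators are deliberately left out because the example above does not show them.

```python
def matches(where: dict, row: dict) -> bool:
    """Evaluate the conditional parser against one row."""
    # Only the condition shown in the example is modeled here.
    if where["parser_type"] != "categorical-match" or where["operator"] != "=":
        raise ValueError("this sketch only models categorical-match with '='")
    return row.get(where["column"]) == where["value"]


def rows_to_process(parser: dict, rows: list) -> list:
    """Keep only the rows the parent parser should process."""
    where = parser.get("where")
    if where is None:
        # No where attribute: every row is processed.
        return list(rows)
    return [row for row in rows if matches(where, row)]


# Hypothetical source rows; the last one has no city at all.
rows = [
    {"id": 1, "city": "San Francisco"},
    {"id": 2, "city": "Oakland"},
    {"id": 3},
]
parser = {
    "parser_type": "categorical",
    "table_alias": "visits",
    "where": {
        "parser_type": "categorical-match",
        "column": "city",
        "operator": "=",
        "value": "San Francisco",
    },
}

kept = rows_to_process(parser, rows)
print([row["id"] for row in kept])  # [1]
```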


null_indicators

usage: optional
The null_indicators attribute lists the values that, when found in the source data, are treated as null.

Values in the array can be categorical or numeric.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "null_indicators": [
    "iiii",
    ####
  ]
}


null_value

usage: optional
The null_value attribute replaces any null values with a categorical value, a numeric value, or the result of a nested parser that you specify.

{
  "parser_type": "tttt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "null_indicators": [
    "iiii",
    ####
  ],
  "null_value": "unavailable"
}
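How null_indicators and null_value work together can be pictured with the Python sketch below. It is a simplified model under stated assumptions: indicators are compared by exact equality, a missing value (None) is also treated as null, and null_value is a plain value rather than a nested parser. The helper clean_value and the sample data are hypothetical.

```python
def clean_value(raw, parser: dict):
    """Map null indicators to the configured null_value, else keep the value."""
    indicators = parser.get("null_indicators", [])
    # Assumption: exact equality decides whether a value is an indicator.
    if raw is None or raw in indicators:
        return parser.get("null_value")
    return raw


# Hypothetical parser definition and source values.
parser = {
    "parser_type": "categorical",
    "table_alias": "patients",
    "column": "city",
    "collection": "City",
    "null_indicators": ["NA", -999],
    "null_value": "unavailable",
}
source_values = ["Boston", "NA", -999, 12]

cleaned = [clean_value(value, parser) for value in source_values]
print(cleaned)  # ['Boston', 'unavailable', 'unavailable', 12]
```

If null_value is omitted, this sketch returns None for null values; what the real system does in that case is not stated here.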


annotation

usage: optional
The annotation attribute specifies an array of inner parsers that parse other columns to annotate the variables generated by the parent parser.

{
  "parser_type": "tttt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "annotation": [
    {
      "parser_type": "tttt",
      "column": "ccc2"
    }
  ]
}


groups

usage: optional
When groups is specified, all collections and variables produced by the parser are accessible only to the listed user groups.

{
  "parser_type": "tttt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "groups": [
    "admin"
  ]
}
