Parsers

parser_type

usage: required
The parser_type attribute tells the Tag.bio system how to instantiate a data loading function. There are more than 40 options for parser_type. The most commonly used are categorical and numeric, which share their names with the two primary data types in the Tag.bio system.

{
  "parser_type": "ptpt", 
  ...
}

table_alias

usage: required
The table_alias attribute tells the system how to assign this parser to a table object as a data loading function for its source data table.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  ...
}

collection

usage: required
The collection attribute renames the data processed by a parser to something more useful and human-readable. Creating legible names is best practice; if no collection is specified, the column name is used as the collection name.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1"
}
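
The fallback described above can be pictured with a small Python sketch. This is an illustration only, not Tag.bio code: the function name resolve_collection_name and the sample values are hypothetical, and only a plain string collection is modeled.

```python
from typing import Optional


def resolve_collection_name(column: str, collection: Optional[str] = None) -> str:
    """Return the collection name, falling back to the column name.

    Hypothetical helper: models only the documented fallback for a
    string-valued collection, not nested parsers.
    """
    if collection is None:
        return column
    return collection


# An explicit collection wins; otherwise the column name is reused.
print(resolve_collection_name("pt_age", "Patient Age"))
print(resolve_collection_name("pt_age"))
```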

nesting a parser

An inner parser can be specified as the value of the collection or variable attribute. The collection or variable names are then taken from the values in a different column.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": {
    "parser_type": "ptpt",
    "column": "ccc1"
  }
}

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "variable": {
    "parser_type": "ptpt",
    "column": "ccc2"
  }
}
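
As a rough mental model of the first form (a nested parser as collection), the Python sketch below groups the values of one column under names read from another column. It is an assumption-laden illustration, not the system's implementation: the function name, the row layout (a list of dicts), and the column names are all hypothetical.

```python
from collections import defaultdict


def group_by_name_column(rows, column, name_column):
    """Group values of `column` under names taken from `name_column`.

    Hypothetical helper: each distinct value in `name_column` becomes
    one collection name.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[name_column]].append(row[column])
    return dict(grouped)


# Sample rows with made-up column names.
rows = [
    {"measurement": 5.1, "assay": "glucose"},
    {"measurement": 7.3, "assay": "insulin"},
    {"measurement": 4.8, "assay": "glucose"},
]

collections = group_by_name_column(rows, "measurement", "assay")
print(collections)
```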

where

usage: optional
The where attribute uses a conditional parser to determine which rows the parent parser processes.

Rows that fail to evaluate as true will be ignored.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "where": {
    "parser_type": "categorical-match",
    "column": "city",
    "operator": "=",
    "value": "San Francisco"
  },
  ...
}
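
The effect of the where condition above can be sketched in Python as a simple row filter. This is a minimal illustration under stated assumptions, not the Tag.bio implementation: the helper names are hypothetical, rows are modeled as dicts, and only the "=" operator from the example is handled.

```python
def row_matches(row, where):
    """Evaluate a categorical-match style condition against one row.

    Hypothetical helper: only the "=" operator is modeled.
    """
    if where["operator"] != "=":
        raise ValueError("only '=' is modeled in this sketch")
    return row.get(where["column"]) == where["value"]


def filter_rows(rows, where):
    """Keep rows that evaluate as true; all other rows are ignored."""
    return [row for row in rows if row_matches(row, where)]


where = {
    "parser_type": "categorical-match",
    "column": "city",
    "operator": "=",
    "value": "San Francisco",
}

# Sample rows are made up for illustration.
rows = [
    {"city": "San Francisco", "visits": 3},
    {"city": "Oakland", "visits": 5},
    {"visits": 1},
]

kept = filter_rows(rows, where)
print(kept)
```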

null_indicators

usage: optional
The null_indicators attribute specifies which values, if found in the source data, are treated as null.

Values in the array can be categorical or numeric.

{
  "parser_type": "ptpt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "null_indicators": [
    "iiii",
    ####
  ]
}

null_value

usage: optional
The null_value attribute replaces any null values with a categorical value, a numeric value, or the output of a nested parser that you specify.

{
  "parser_type": "tttt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "null_indicators": [
    "iiii",
    ####
  ],
  "null_value": "unavailable"
}
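
How null_indicators and null_value interact can be sketched in Python as follows. This is only an illustration of the described behavior, not system code: the function name is hypothetical, the indicators "NA" and -999 are made-up sample values, and nested parsers as null_value are not modeled.

```python
def apply_null_handling(value, null_indicators, null_value=None):
    """Replace a value with `null_value` when it counts as null.

    Hypothetical helper: a value is treated as null when it is missing
    or appears in `null_indicators` (categorical or numeric entries).
    """
    if value is None or value in null_indicators:
        return null_value
    return value


null_indicators = ["NA", -999]  # sample indicators, not from the source
raw_values = ["12", "NA", -999, "7"]

cleaned = [
    apply_null_handling(v, null_indicators, "unavailable") for v in raw_values
]
print(cleaned)
```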

annotation

usage: optional
The annotation attribute specifies an array of inner parsers that parse other columns to annotate the variables generated by the parent parser.

{
  "parser_type": "tttt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "annotation": {
    "parser_type": "tttt",
    "column": "ccc2"
  }
}

groups

usage: optional
When groups is specified, all collections and variables produced by the parser are accessible only to the authorized user groups.

{
  "parser_type": "tttt",
  "table_alias": "tata",
  "column": "cccc",
  "collection": "ccc1",
  "groups": [
    "admin"
  ]
}
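
One way to picture the access rule is the Python sketch below. It rests on two assumptions that the text above does not confirm: that a parser without groups is unrestricted, and that membership in any one listed group is enough. The function name and group names other than "admin" are hypothetical.

```python
def is_accessible(parser_groups, user_groups):
    """Return True when a user may see the parser's collections and variables.

    Assumptions (not confirmed by the source): no groups means
    unrestricted, and membership in any listed group grants access.
    """
    if not parser_groups:
        return True
    return bool(set(parser_groups) & set(user_groups))


print(is_accessible(["admin"], ["admin", "analyst"]))
print(is_accessible(["admin"], ["analyst"]))
```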