---
title: "config.json"
slug: "config-json"
updated: 2022-07-27T15:19:26Z
published: 2022-07-27T15:19:26Z
---

# config.json

The **config** file is the focal point for loading and modeling data in a Data Product.

The **config** is a single text file, formatted as a JSON object, that instructs the Data Product how to load data from one or more delimited files (or SQL databases).

## Naming convention

The **config** file is typically named `config.json` and is located at `config/config.json` within the Data Product directory.

## Overall schema

The **config** file schema has three primary attributes: the *entity_table* object, the *other_tables* array, and the *parsers* array. Optional attributes, such as *data_dictionary*, are also available.

Here's the simplest version of a **config** file. This version assumes that all **table** objects register their own **parsers**.

```
{
  // A table object, or a string reference to a file 
  // containing a table object (best practice)
  //
  "entity_table": {...}, 
  
  // An array of table objects and / or string references 
  // to files containing table objects (best practice)
  //
  "other_tables": [...]
}
```

Here's another variant where the **config** file registers **parsers**:

```
{
  "entity_table": {...}, 
  "other_tables": [...], 
  
  // An array of parsers and / or string references
  // to files containing parsers (best practice)
  //
  "parsers": [...]
}
```
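
As a rough illustration of the schema above, the following Python sketch checks that the top-level attributes of a parsed **config** have the JSON types described on this page (the individual attributes are detailed in the sections below). It is not part of the Data Product tooling: `check_config_shape` and `EXPECTED_TYPES` are hypothetical names, and the sketch assumes the file is plain JSON, since Python's `json` module rejects the `//` comments and `...` placeholders used in the examples here.

```python
import json

# Expected JSON types for the attributes described on this page.
# This mapping is an illustration drawn from the text, not an official schema.
EXPECTED_TYPES = {
    "entity_table": (dict, str),     # table object, or path to a file with one
    "other_tables": (list,),         # array of table objects and/or paths
    "parsers": (list, str),          # array of parsers/paths, or the string "auto"
    "data_dictionary": (dict, str),  # options object, or output file path
}


def check_config_shape(config):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    for name, allowed in EXPECTED_TYPES.items():
        if name in config and not isinstance(config[name], allowed):
            kinds = " or ".join(t.__name__ for t in allowed)
            problems.append(
                f"{name}: expected {kinds}, got {type(config[name]).__name__}"
            )
    # Each other_tables entry is an embedded object or a string reference.
    tables = config.get("other_tables")
    if isinstance(tables, list):
        for i, entry in enumerate(tables):
            if not isinstance(entry, (dict, str)):
                problems.append(f"other_tables[{i}]: expected dict or str")
    # The only string value this page documents for parsers is "auto".
    if isinstance(config.get("parsers"), str) and config["parsers"] != "auto":
        problems.append('parsers: the only documented string value is "auto"')
    return problems


example = json.loads("""
{
  "entity_table": "config/tables/table_eeee.json",
  "other_tables": ["config/tables/table_oooo.json"],
  "parsers": "auto"
}
""")
print(check_config_shape(example))  # prints []
```

The check only inspects attributes that are present, because this page does not state which attributes are strictly required.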

## Attribute - entity_table

The value for the *entity_table* attribute is a **table** object, or a string reference to a file containing a **table** object (best practice). The **table** object must be in *entity_table* form.

```
{
  // An embedded table object, entity_table form
  //
  "entity_table": {...},
  
  ...
}
```

```
{
  // A string reference to a file containing a table object, 
  // entity_table form (best practice)
  //
  "entity_table": "config/tables/table_eeee.json",
  
  ...
}
```

See the [Data Configuration - Table objects](https://tagbio-developer-docs.document360.io/docs/table-objects) page for detailed information about **table** objects and the *entity_table* form.

## Attribute - other_tables

The value for the *other_tables* attribute is an array of **table** objects and / or string references to files containing **table** objects (best practice). The **table** objects listed must be in *other_tables* form.

```
{
  "entity_table": {...}, 
  
  "other_tables": [
  
    // An embedded table object, other_tables form
    //
    {...}, 

    // A string reference to a file containing a table object,
    // other_tables form (best practice)
    //
    "config/tables/table_oooo.json",
    
    ...
  ]
}
```

See the [Data Configuration - Table objects](https://tagbio-developer-docs.document360.io/docs/table-objects) page for detailed information about **table** objects and the *other_tables* form.

## Attribute - parsers

The *parsers* attribute is an array of **parsers** and / or string references to files containing **parsers** (best practice).

Each **parser** in the array loads a **variable** or a **collection** of **variables** into the Data Product, from either the *entity_table* or one of the *other_tables*.

To autodetect and load all columns from all tables, without transformation or renaming, set the value of *parsers* to `"auto"`:

```
{
  "entity_table": {...}, 
  "other_tables": [...],
  
  // Will autogenerate parser functions for all tables
  //
  "parsers": "auto" 
}
```

A mature Data Product typically does not rely on auto-generated **parsers**, because customized **parsers** are more flexible and powerful. Customized **parsers** can be registered here in the **config** file, or from within each **table** object.

```
{
  "entity_table": {...}, 
  "other_tables": [...],
  
  "parsers": [
  
    // A path to a file with parsers 
    // for entity_table eeee (best practice)
    //
    "config/parsers/parsers_eeee.json",

    // An embedded parser
    // for other_table oooo
    //
    {
      "parser_type": "categorical",
      "table_alias": "oooo",
      "column": "xxx1"
    },
    
    ...
  ]
}
```

See the [Data Configuration - Parsers](https://tagbio-developer-docs.document360.io/docs/parsers) page for detailed information about all the **parser** options.
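
Because *entity_table*, *other_tables*, and *parsers* can all hold string references to other files, it can help to confirm that every referenced file exists before loading. The Python sketch below is one way to do that. The helper names `referenced_files` and `missing_files` are hypothetical, and the sketch assumes that references are resolved relative to the Data Product directory, which this page suggests but does not state outright.

```python
import tempfile
from pathlib import Path


def referenced_files(config):
    """Collect the string references used in a parsed config, in order."""
    refs = []
    entity = config.get("entity_table")
    if isinstance(entity, str):
        refs.append(entity)
    for key in ("other_tables", "parsers"):
        value = config.get(key)
        # Skip embedded objects and the "auto" string; keep only path strings.
        if isinstance(value, list):
            refs.extend(item for item in value if isinstance(item, str))
    return refs


def missing_files(config, product_root):
    """Return the references that do not exist under product_root."""
    root = Path(product_root)
    return [ref for ref in referenced_files(config) if not (root / ref).is_file()]


config = {
    "entity_table": "config/tables/table_eeee.json",
    # {} stands in for an embedded table object.
    "other_tables": [{}, "config/tables/table_oooo.json"],
    "parsers": ["config/parsers/parsers_eeee.json"],
}

with tempfile.TemporaryDirectory() as root:
    # Create only the entity table file, so the other two references are missing.
    table = Path(root, "config/tables/table_eeee.json")
    table.parent.mkdir(parents=True)
    table.write_text("{}")
    missing = missing_files(config, root)

print(missing)
# prints ['config/tables/table_oooo.json', 'config/parsers/parsers_eeee.json']
```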

## Attribute - data_dictionary

It's possible to have the **config** file automatically produce a data dictionary after loading data.

The value for the *data_dictionary* attribute is typically a file path to a .tsv file, e.g. `data_dictionary.tsv`. That file will contain a tab-delimited overview of all **collections** and **variables** created in the Data Product after the **config** file is processed.

```
{
  "entity_table": {...}, 
  "other_tables": [...],
  
  // The file that will contain the data_dictionary output
  //
  "data_dictionary": "data_dictionary.tsv"
}
```

**Be careful with this option when working with sensitive or protected data.** Make sure that the *data_dictionary* file is either written outside the repo or listed in *.gitignore*. Alternatively, consider using the object form of *data_dictionary*, shown below, whose *redact* attribute can prevent sensitive data from leaking into the file.

The value for *data_dictionary* can also be an object. In that case, you can specify additional useful attributes.

```
{
  "entity_table": {...}, 
  "other_tables": [...],
  
  "data_dictionary": {
  
    // The file that will contain the data_dictionary output
    //
    "file": "data_dictionary.tsv",
  
    // An array of collection names which should not have 
    // their variable names (e.g. patient IDs) written to data_dictionary
    //
    "redact": [
      "collection1",
      
      ...
    ],
  
    // This will limit the number of variables listed for each collection
    //
    "variable_limit": 10
  }
}
```
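
Since *data_dictionary* accepts either a file path or an object, a script that inspects a **config** (for example, a check that warns when no collections are redacted) may want to treat both forms uniformly. The Python sketch below shows one way to do so. `normalize_data_dictionary` and `has_redaction` are hypothetical helpers, and the sketch assumes that the string form behaves like an object with only *file* set.

```python
def normalize_data_dictionary(value):
    """Return the object form of a data_dictionary value as a new dict."""
    if isinstance(value, str):
        # Assumption: a bare path behaves like an object with only "file" set.
        return {"file": value}
    if isinstance(value, dict):
        return dict(value)
    raise TypeError("data_dictionary must be a file path or an object")


def has_redaction(value):
    """True when at least one collection is listed under "redact"."""
    return bool(normalize_data_dictionary(value).get("redact"))


print(has_redaction("data_dictionary.tsv"))  # prints False
print(has_redaction({"file": "data_dictionary.tsv", "redact": ["collection1"]}))  # prints True
```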

## Other attributes

```
{
  "entity_table": {...}, 
  "other_tables": [...], 
  "parsers": [...], 
  "data_dictionary": "data_dictionary.tsv",
  
  // These are attributes for a table object that can 
  // be set here and applied globally for all tables
  //
  "lines": ####,
  "entities": ####,
  "random": 0.###,
  "seed": ####,
  
  // These are attributes for a parser function that can 
  // be set here and applied globally for all parsers
  //
  "null_indicators": [...]
}
```
