How to make a data table from scratch or a template

Allie Cliffe

Workspace tables, with rows and columns for organizing data, are much like a spreadsheet built into the data page. So it's no surprise that you can use a spreadsheet editor to create and upload a file - a "load file" - to generate a new table in your workspace. The load file is in "tab separated values" or "tab delimited text" format and is called a TSV file in Terra. 

Step 1. Make a TSV file in a spreadsheet editor

Open your favorite spreadsheet editor and follow the formatting examples shown below to make an entity table or set table.

Click the table type below for templates, examples, and required formatting.

  • Download a template load file (sample.tsv)  here.

    What's in an entity table?

    When creating a table, you can use whatever name ("entity") you wish for your table, as long as the first column header in your spreadsheet follows the format entity:your-name_id. Entity tables keep track of data - historically input data for a workflow, like samples, participants, specimens, or files. A minimal sample table includes an ID column and a column for metadata or data files (e.g. FASTQ, BAM, CRAM, or whatever format your data are in).

    Creating a "sample" entity table

    The following shows an example data table that will have the name "sample" when uploaded to the Terra Data page. Each table row represents a different sample. As with all tables, the first column contains entity IDs. The second column shows example cloud paths to BAM files in a Google bucket. 

    Sample TSV in a spreadsheet

    entity:sample_id BAM
    participant1-blood gs://your-bucket-name/blood_sample_P1.bam
    participant1-spit gs://your-bucket-name/spit_sample_P1.bam
    participant2-blood gs://your-bucket-name/blood_sample_P2.bam
    participant2-spit gs://your-bucket-name/spit_sample_P2.bam

    Formatting requirements

    Parts in red (e.g. "entity:sample_id") must be entered exactly as shown! You can replace "sample" with the entity that defines your table.

    In the rows, you'll use your own sample IDs (e.g. "your-participant1-blood") and the complete paths of the data files.

    Sample table in Terra

    template-sample-in-Terra_Screen_shot.png
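If you'd rather generate the load file with a script than type it into a spreadsheet, the sketch below builds the same "sample" table shown above using only the Python standard library. The function name build_entity_tsv and the shape of the input rows are made up for illustration; only the entity:sample_id header format and the example IDs and paths come from this article.

```python
import csv
import io


def build_entity_tsv(entity_name, rows, columns):
    """Return TSV text for an entity table.

    entity_name: the table name, e.g. "sample"
    rows: list of dicts, each with an "id" key plus one key per column
    columns: ordered list of the non-ID column names, e.g. ["BAM"]
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # The first header cell must follow the pattern entity:<name>_id.
    writer.writerow([f"entity:{entity_name}_id"] + list(columns))
    for row in rows:
        writer.writerow([row["id"]] + [row[column] for column in columns])
    return buffer.getvalue()


# Example rows taken from the spreadsheet above.
samples = [
    {"id": "participant1-blood", "BAM": "gs://your-bucket-name/blood_sample_P1.bam"},
    {"id": "participant1-spit", "BAM": "gs://your-bucket-name/spit_sample_P1.bam"},
    {"id": "participant2-blood", "BAM": "gs://your-bucket-name/blood_sample_P2.bam"},
    {"id": "participant2-spit", "BAM": "gs://your-bucket-name/spit_sample_P2.bam"},
]

sample_tsv = build_entity_tsv("sample", samples, ["BAM"])
print(sample_tsv, end="")
```

To save the result, write sample_tsv to a file with a name of your choice, then upload that file as described in Step 3.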

  • Why use an entity_set table?

    A single data table might hold a mix of different samples you want to analyze. Samples may differ in species, developmental stages, sequencing methods, or other criteria. If you only want to run a workflow analysis on a subset of the samples in your data table, you might want to consider making a set table.

    A set table allows you to organize and save sets of samples for (repeat) downstream analysis and keep track of data files that are generated for a sample subset. Set tables always refer to entities in an entity table (e.g. a sample_set table references samples in a sample table; a specimen_set table references specimens in a specimen table), meaning a set table can only be created after you've made and uploaded the entity table it references. The example below shows you how to make a sample_set table, assuming you already have a sample table.

    Option 1: Generate a set in Terra

    1. Select the rows to be included in the set.

    2. Click Edit (pencil icon) at left above the table. 

    3. Choose Save selection as set from the menu.
    Make-set-table-in-Terra_Screen_shot.png

    Option 2: Create a sample_set table in a spreadsheet

    Download a template TSV file (membership.tsv) here.

    The first column is the unique ID for each set and the second column is the sample_id of the sample in that set (from the sample table).

    There is a row for every member of a set. In the example below, the sample_set table contains two sets: spit (contains the samples participant1-spit and participant2-spit) and blood (contains the samples participant1-blood and participant2-blood).

    Sample_set TSV in a spreadsheet editor

    membership:sample_set_id sample
    spit participant1-spit
    spit participant2-spit
    blood participant1-blood
    blood participant2-blood

    Formatting requirements

    Parts in red (e.g. "membership:sample_set_id") must be entered exactly as shown! You can replace "sample" with the name of the table you are making sets of.

    You can customize the set IDs with your own values.

    Note that you must have a corresponding entity table (a sample table, in the example above) in the workspace. It contains links to the input data files for the samples in the set.

    Sample_set table in Terra

    The samples in each set are listed in the samples column, separated by a comma.
    template-sample_set-table-in-Terra_Screen_shot.png

    Other ways to create sets

    In addition to manually creating set tables, you can create a set table on the fly when you're setting up a workflow analysis. Learn more in When to use a set table for a workflow.

    For hands-on practice using set tables for workflow input, see the Data Tables QuickStart Part 3 and Part 4.
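The membership layout described in Option 2 (one row per member of a set) is easy to produce from a simple mapping of set IDs to sample IDs. Here is a minimal sketch using the Python standard library; the function name build_membership_tsv and the dictionary input are illustrative assumptions, while the header format and the example sets come from this article.

```python
import csv
import io


def build_membership_tsv(entity_name, sets):
    """Return TSV text for a set table.

    entity_name: name of the table the sets refer to, e.g. "sample"
    sets: dict mapping each set ID to a list of member entity IDs
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header pattern: membership:<name>_set_id, then the entity name.
    writer.writerow([f"membership:{entity_name}_set_id", entity_name])
    for set_id, members in sets.items():
        # One row for every member of the set.
        for member_id in members:
            writer.writerow([set_id, member_id])
    return buffer.getvalue()


sample_sets = {
    "spit": ["participant1-spit", "participant2-spit"],
    "blood": ["participant1-blood", "participant2-blood"],
}

membership_tsv = build_membership_tsv("sample", sample_sets)
print(membership_tsv, end="")
```

Every member ID you list here has to match an ID in the first column of the corresponding entity table.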

Step 2. Save the file as tab-delimited text or tab-separated values

A load file has to be in "tab-separated values" or "tab-delimited text" format (Terra recognizes both). 

Your editor may give you a warning, but we assure you, it's fine! 
Data-QuickStart_Part2_Save-as-Tab-delimited-text.png 
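If your spreadsheet editor can only export comma-separated values (CSV), one possible workaround is to convert the export to tab-separated text yourself. The sketch below does that with the Python standard library; the function name csv_to_tsv is made up for illustration, and the approach assumes none of your cell values contain tab characters.

```python
import csv
import io


def csv_to_tsv(csv_text):
    """Convert comma-separated text to tab-separated text."""
    reader = csv.reader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.writer(output, delimiter="\t", lineterminator="\n")
    for row in reader:
        if row:  # skip completely blank lines
            writer.writerow(row)
    return output.getvalue()


csv_export = (
    "entity:sample_id,BAM\r\n"
    "participant1-blood,gs://your-bucket-name/blood_sample_P1.bam\r\n"
)
tsv_text = csv_to_tsv(csv_export)
print(tsv_text, end="")
```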

What's the name of the table in your workspace?

It's worth noting that Terra ignores the actual file name; it's the "root entity" (in the first column header) that determines the name of the table in your workspace.
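To check which table name a load file will produce before you upload it, you can read the name off the first column header. The sketch below is a local helper, not part of Terra; the function names are hypothetical, and it only handles the two header prefixes shown in this article (entity: and membership:).

```python
def table_name_from_header(first_header_cell):
    """Derive the table name from a first column header.

    "entity:sample_id" gives "sample";
    "membership:sample_set_id" gives "sample_set".
    """
    prefix, separator, id_column = first_header_cell.strip().partition(":")
    if separator != ":" or prefix not in ("entity", "membership"):
        raise ValueError(f"unexpected header prefix in {first_header_cell!r}")
    if not id_column.endswith("_id") or id_column == "_id":
        raise ValueError(f"header must end in <name>_id: {first_header_cell!r}")
    return id_column[: -len("_id")]


def table_name_from_tsv(tsv_text):
    """Read the first header cell of TSV text and derive the table name."""
    first_line = tsv_text.split("\n", 1)[0]
    first_cell = first_line.split("\t", 1)[0]
    return table_name_from_header(first_cell)


print(table_name_from_header("entity:sample_id"))
print(table_name_from_header("membership:sample_set_id"))
```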

Step 3. Upload TSV to workspace

3.1. Click on the "+" Import Data button at the top of the left TABLES column (highlighted in the orange rectangle in the screenshot below).
Add-data-table_Upload-TSV_Screen_shot.png

3.2. Select Upload TSV from the menu (circled in orange).

3.3. Drag or click to select your TSV file.

Import-data-table-TSV_Import-table-popup_Screen_shot.png

3.4. Click the Start import job button at the bottom right (will turn blue when you select a file).

You'll see a Data import in progress banner at the top right of your workspace.
Add-data-table_TSV-import-in-progress_Screen_shot.png

3.5. When the import is complete, refresh the page to see the data table.
Add-data-table-TSV_Data-imported-successfully_Screen_shot.png

You should see your data right away. Click on the link (table name) to expand the table.

The subject table in the example above looks like this:
Add-data-table_Subject-table_Screen_shot.png

Uploading and deleting set tables

Upload order: entities first, then sets

Because set tables reference entities in an entity table, you must upload the entity table first. For example, a sample_set table references a sample table. If you try to upload the sample_set table before the sample table, Terra will give you an error message.
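Before uploading a set table, it can help to confirm locally that every member it lists actually appears in the entity table. The sketch below is one way to do that with the Python standard library. It is a pre-upload sanity check written for this example, not a description of how Terra validates files, and the function names are hypothetical.

```python
import csv
import io


def read_tsv(tsv_text):
    """Parse TSV text into a list of rows (each row is a list of cells)."""
    return list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))


def missing_members(entity_tsv, membership_tsv):
    """Return member IDs in the set table that the entity table lacks."""
    entity_rows = read_tsv(entity_tsv)
    membership_rows = read_tsv(membership_tsv)
    # Entity IDs live in the first column, below the header row.
    known_ids = {row[0] for row in entity_rows[1:] if row}
    missing = []
    for row in membership_rows[1:]:
        # Member IDs live in the second column of the set table.
        if len(row) > 1 and row[1] not in known_ids and row[1] not in missing:
            missing.append(row[1])
    return missing


entity_tsv = (
    "entity:sample_id\tBAM\n"
    "participant1-spit\tgs://your-bucket-name/spit_sample_P1.bam\n"
    "participant2-spit\tgs://your-bucket-name/spit_sample_P2.bam\n"
)
membership_tsv = (
    "membership:sample_set_id\tsample\n"
    "spit\tparticipant1-spit\n"
    "spit\tparticipant2-spit\n"
    "blood\tparticipant1-blood\n"
)

not_found = missing_members(entity_tsv, membership_tsv)
print(not_found)
```

An empty list means every set member has a matching row in the entity table.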

Deleting entity tables deletes sets that reference the entities

Similarly, if you delete an entity table, Terra will automatically delete a set table that references it. In the example above, deleting the sample table will automatically delete the sample_set table.

Special tables: Pair (tumor-normal analysis)

If you're analyzing cancer data, you're probably familiar with tumor-normal pairs where a given participant has a sample from tumor tissue and normal tissue. To facilitate this type of analysis, Terra has predefined associations for participant, sample, and pair data tables. If you upload these tables in order, and specify that Terra should associate them, Terra will automatically link the tables together for use in workflows for somatic analysis.

Learn more in Adding pair tables to a workspace for tumor-normal analysis.

Download a template TSV file (pair.tsv) here.

Making tables programmatically 

You can automate the process of making and modifying tables using a special API called FISS. Learn how to avoid creating data tables manually in a spreadsheet in Managing data and automating workflows with the FISS API.
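As a rough illustration of what automation can look like, the sketch below orders a batch of load files so that entity tables are uploaded before set tables, then hands each one to an upload function. The uploader argument is a hypothetical stand-in: this example does not call FISS itself, and you would need to consult the FISS documentation for the actual upload call and its arguments. Here a plain list records the calls so the example runs on its own.

```python
def first_header_cell(tsv_text):
    """Return the first cell of the header row of TSV text."""
    return tsv_text.split("\n", 1)[0].split("\t", 1)[0].strip()


def upload_tables_in_order(tables, uploader):
    """Upload TSV tables, entity tables first, then set tables.

    tables: list of TSV strings
    uploader: hypothetical callable that takes one TSV string and
        uploads it (for example, a wrapper around a FISS call)
    Returns the first header cells in the order they were uploaded.
    """
    def sort_key(tsv_text):
        # Set tables (membership:...) sort after everything else.
        return 1 if first_header_cell(tsv_text).startswith("membership:") else 0

    uploaded = []
    # sorted() is stable, so tables of the same kind keep their order.
    for tsv_text in sorted(tables, key=sort_key):
        uploader(tsv_text)
        uploaded.append(first_header_cell(tsv_text))
    return uploaded


set_table = "membership:sample_set_id\tsample\nspit\tparticipant1-spit\n"
entity_table = (
    "entity:sample_id\tBAM\n"
    "participant1-spit\tgs://your-bucket-name/spit_sample_P1.bam\n"
)

recorded = []  # stands in for a real upload call
upload_order = upload_tables_in_order([set_table, entity_table], recorded.append)
print(upload_order)
```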

Next steps

If you already have a data table in your workspace, you can modify it to meet your analysis needs. Learn more in Modifying and editing a data table.

Maybe you're ready to perform an analysis but you need some workspace-level metadata like reference files. Read Creating Workspace Data tables to learn how to make a Workspace Data table that can be used in downstream WDL workflows.
