Workspace tables are much like a spreadsheet built into the data page. So it's no surprise that you can use a spreadsheet editor to create and upload a file to generate a new table in your workspace. This "load file" is in "tab separated values" or "tab delimited text" format and is called a TSV file in Terra.
Step 1. Make a TSV file in a spreadsheet editor
Open your favorite spreadsheet editor and follow the formatting examples shown below to make an entity table or set table.
Click the table type below for templates, examples, and required formatting.
Download a template load file (sample.tsv) here.
What's in an entity table?
Entity tables keep track of data: historically, input data for a workflow, such as samples, participants, specimens, or files. The minimum sample table includes an ID column and a column for metadata or data files (e.g., FASTQ, BAM, or CRAM, whatever form your data take).
When creating a table, you can use whatever name you wish for your table. Note that Terra takes the table name from the first column header, not from the file name.
Creating a "sample" entity table
The following shows an example data table that will have the name "sample" when uploaded to the Terra Data page. Each table row represents a different sample. As with all Terra tables, the first column contains entity IDs. The second column shows example cloud paths to BAM files in a Google bucket.
Sample TSV in a spreadsheet
sample_id             BAM
participant1-blood    gs://your-bucket-name/blood_sample_P1.bam
participant1-spit     gs://your-bucket-name/spit_sample_P1.bam
participant2-blood    gs://your-bucket-name/blood_sample_P2.bam
participant2-spit     gs://your-bucket-name/spit_sample_P2.bam
Formatting requirements
The `_id` suffix in the first column header is optional. Note that Terra will append `_id` to the end of the first column header when importing a TSV.
In the rows, use your own sample IDs (e.g., "your-participant1-blood") and the complete paths of the data files.
Sample table in Terra
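If you would rather generate the load file with a script than in a spreadsheet editor, the sample table above can be sketched with Python's standard csv module. The helper name build_entity_tsv is hypothetical, and the IDs and bucket paths are the placeholder values from the example, not real data.

```python
import csv
import io

# Placeholder rows copied from the example table; replace with your own
# sample IDs and complete cloud paths.
SAMPLES = [
    ("participant1-blood", "gs://your-bucket-name/blood_sample_P1.bam"),
    ("participant1-spit", "gs://your-bucket-name/spit_sample_P1.bam"),
    ("participant2-blood", "gs://your-bucket-name/blood_sample_P2.bam"),
    ("participant2-spit", "gs://your-bucket-name/spit_sample_P2.bam"),
]


def build_entity_tsv(entity_type, data_column, rows):
    """Return tab-separated text whose first column header is '<entity_type>_id'."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([f"{entity_type}_id", data_column])  # header row
    writer.writerows(rows)  # one row per entity
    return buffer.getvalue()


sample_tsv = build_entity_tsv("sample", "BAM", SAMPLES)
print(sample_tsv, end="")

# To save the text for upload (the file name itself does not matter to Terra):
# with open("sample.tsv", "w", newline="") as handle:
#     handle.write(sample_tsv)
```

Building the text in memory first makes it easy to inspect the header row before writing anything to disk.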
Why use an entity_set table?
A single data table might hold a mix of different samples you want to analyze. Samples may differ in species, developmental stages, sequencing methods, or other criteria. If you only want to run a workflow analysis on a subset of the samples in your data table, consider making a set table.
A set table allows you to organize and save sets of samples for (repeat) downstream analysis and keep track of data files that are generated for a sample subset. Set tables always refer to entities in an entity table (e.g., a sample_set table references samples in a sample table; a specimen_set table references specimens in a specimen table). Therefore, a set table can only be created after you've made and uploaded the entity table it references.
The example below shows you how to make a sample_set table, assuming you already have a sample table.
Option 1: Generate a set in Terra
1. Select the rows from the entity table to be included in the set.
2. Click Edit (pencil icon) at left above the table.
3. Choose Save selection as set from the menu.
Option 2: Create a sample_set table in a spreadsheet
Download a template TSV file (membership.tsv) here.
The first column is the unique ID for each set and the second column is the sample_id of the sample in that set (from the sample table).
There is a row for every member of a set. In the example below, the sample_set table contains two sets: spit (contains the samples participant1-spit and participant2-spit) and blood (contains the samples participant1-blood and participant2-blood).
Sample_set TSV in a spreadsheet editor
membership:sample_set_id    sample
spit                        participant1-spit
spit                        participant2-spit
blood                       participant1-blood
blood                       participant2-blood
Formatting requirements
The first column header ("membership:sample_set_id") must be entered exactly as shown, except that you can replace "sample" with the name of the table for which you're making sets.
You can customize the set IDs with your own values.
Note: You must have a corresponding entity table (a sample table, in the example above) in the workspace. It contains links to the input data files from the samples in the set.
Sample_set table in Terra
The samples in each set are listed in the samples column, separated by a comma.
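The one-row-per-member layout described above can also be produced from a mapping of set IDs to members. This is a minimal sketch using only the Python standard library; the function name build_membership_tsv and the SETS dictionary are illustrative assumptions, with values taken from the example.

```python
import csv
import io

# Set ID -> sample IDs in that set, taken from the example above.
SETS = {
    "spit": ["participant1-spit", "participant2-spit"],
    "blood": ["participant1-blood", "participant2-blood"],
}


def build_membership_tsv(entity_type, sets):
    """Return membership TSV text with one row per (set, member) pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header follows the pattern shown in the example: membership:<entity>_set_id
    writer.writerow([f"membership:{entity_type}_set_id", entity_type])
    for set_id, members in sets.items():
        for member in members:
            writer.writerow([set_id, member])
    return buffer.getvalue()


membership_tsv = build_membership_tsv("sample", SETS)
print(membership_tsv, end="")
```

Passing a different entity_type (for example "specimen") changes both the header and the second column name, matching the note that "sample" can be replaced with the name of the table you are making sets for.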
Other ways to create sets
In addition to manually creating set tables, you can create a set table on the fly when you're setting up a workflow analysis. Learn more in When to use a set table for a workflow.
For hands-on practice using set tables for workflow input, see the Data Tables QuickStart Part 3 and Part 4.
Step 2. Save the file as tab-delimited text or tab-separated values
A load file has to be in "tab-separated values" or "tab-delimited text" format (Terra recognizes both).
Your editor may give you a warning, but we assure you, it's fine!
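If you want to confirm that the saved file really is tab-delimited before uploading, a quick local check along these lines may help. It is only a sketch: check_tsv is a hypothetical helper, and the two conditions it tests (at least two tab-separated header columns, and the same number of columns in every row) are sanity checks chosen for this illustration, not Terra's own validation rules.

```python
import csv
import io


def check_tsv(text):
    """Return a list of problems found in tab-delimited text (empty if none)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    if len(header) < 2:
        # A single header column often means the file was saved comma-separated.
        problems.append("header has fewer than two tab-separated columns")
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"line {number}: expected {len(header)} columns, found {len(row)}"
            )
    return problems


# Usage with a file on disk:
# with open("sample.tsv", newline="") as handle:
#     print(check_tsv(handle.read()))
```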
What's the name of the table in your workspace?
It's worth noting that Terra ignores the actual file name; it's the "root entity" (in the first column header) that determines the table name in the data table. For example, if you save your table using the name table1.txt but the table's first column is named entity:bam_id, the table will be named bam in Terra.
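The naming rule can be illustrated with a small function that derives the table name from a first column header. This is a sketch of the behavior as described in this article (an optional "entity:" or "membership:" prefix and an optional `_id` suffix around the root entity), not Terra's actual implementation; table_name_from_header is a hypothetical name.

```python
def table_name_from_header(first_header):
    """Derive the table name from the first column header of a load file."""
    name = first_header
    # Drop the optional type prefix used in the examples in this article.
    for prefix in ("entity:", "membership:"):
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    # Drop the optional "_id" suffix.
    if name.endswith("_id"):
        name = name[: -len("_id")]
    return name


for header in ("entity:bam_id", "sample_id", "membership:sample_set_id"):
    print(header, "->", table_name_from_header(header))
```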
Step 3. Upload TSV to workspace
3.1. Click on the Import Data button at the top of the left TABLES column (highlighted in the orange rectangle in the screenshot below).
3.2. Select the file import tab.
3.3. Drag or click to select your TSV file (circled in orange).
3.4. Click the Start import job button at the bottom right (will turn blue when you select a file).
When the upload is complete, you'll see the new data table listed in the left-hand panel of the Data tab. Click on the table's name to expand the table.
The participant table in the example above looks like this once imported into Terra:
Uploading and deleting set tables
Upload order: entities first, then sets
Because set tables reference entities in an entity table, you must upload the entity table first. For example, a sample_set table references a sample table. If you try to upload the sample_set table before the sample table, Terra will give you an error message.
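Before uploading a set table, you can check locally that every member it lists appears in the entity load file. The sketch below compares the two TSV texts; missing_members is a hypothetical helper, and it assumes the layouts shown earlier (entity IDs in the first column of the entity file, member IDs in the second column of the membership file).

```python
import csv
import io


def missing_members(entity_tsv, membership_tsv):
    """Return member IDs in the membership TSV that are absent from the entity TSV."""
    entity_rows = list(csv.reader(io.StringIO(entity_tsv), delimiter="\t"))
    known_ids = {row[0] for row in entity_rows[1:] if row}  # skip header row

    membership_rows = list(csv.reader(io.StringIO(membership_tsv), delimiter="\t"))
    referenced = {row[1] for row in membership_rows[1:] if len(row) > 1}

    return sorted(referenced - known_ids)


entity_text = "sample_id\tBAM\nparticipant1-spit\tgs://your-bucket-name/spit_sample_P1.bam\n"
membership_text = (
    "membership:sample_set_id\tsample\n"
    "spit\tparticipant1-spit\n"
    "spit\tparticipant2-spit\n"
)
print(missing_members(entity_text, membership_text))
```

An empty list means every set member has a matching row in the entity file.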
Deleting entity tables deletes sets that reference the entities
Similarly, if you delete an entity table, Terra will automatically delete any set table that references it. In the example above, deleting the sample table will automatically delete the sample_set table.
Special tables: Pair (tumor-normal analysis)
If you're analyzing cancer data, you're familiar with tumor-normal pairs where a given participant has a sample from tumor tissue and one from normal tissue. To facilitate this type of analysis, Terra has predefined associations for participant, sample, and pair data tables. If you upload these tables in order, and specify that Terra should associate them, Terra will automatically link the tables together for use in workflows for somatic analysis.
Learn more in Adding pair tables to a workspace for tumor-normal analysis.
Download a template TSV file (pair.tsv) here.
Making tables programmatically
You can automate the process of making and modifying tables using a special API called FISS. Learn how to do this in How to manage data with the FISS API.
Next steps
If you already have a data table in your workspace, you can modify it to meet your analysis needs. Learn more in How to modify and edit data tables.
Maybe you're ready to perform an analysis but you need some workspace-level metadata like reference files. Read Creating Workspace Data tables to learn how to make a Workspace Data table that can be used in downstream WDL workflows.