Workspace tables, with rows and columns for organizing data, are much like a spreadsheet built into the data page. So it's no surprise that you can use a spreadsheet editor to create and upload a file - a "load file" - to generate a new table in your workspace. The load file is in "tab separated values" or "tab delimited text" format and is called a TSV file in Terra.
Step 1. Make a TSV file in a spreadsheet editor
Open your favorite spreadsheet editor and follow the formatting examples shown below to make an entity table or set table.
Click the table type below for templates, examples, and required formatting.
Download a template load file (sample.tsv) here.
What's in an entity table?
When creating a table, you can use whatever name ("entity") you wish for your table, as long as the first column header in your spreadsheet follows the format entity:your-name_id. Entity tables keep track of data - historically input data for a workflow, like samples, participants, specimens, or files. The minimum sample table includes an ID column and a column for metadata or data files (e.g., FASTQ, BAM, CRAM, etc. - whatever form your data take).
Creating a "sample" entity table
The following shows an example data table that will have the name "sample" when uploaded to the Terra Data page. Each table row represents a different sample. As with all tables, the first column contains entity IDs. The second column shows example cloud paths to BAM files in a Google bucket.
Sample TSV in a spreadsheet
entity:sample_id      BAM
participant1-blood    gs://your-bucket-name/blood_sample_P1.bam
participant1-spit     gs://your-bucket-name/spit_sample_P1.bam
participant2-blood    gs://your-bucket-name/blood_sample_P2.bam
participant2-spit     gs://your-bucket-name/spit_sample_P2.bam
Formatting requirements
Parts in red (i.e., "entity:sample_id") must be entered exactly as shown! You can replace "sample" with the entity that defines your table.
In the rows, you'll use your own sample IDs (e.g., "your-participant1-blood") and the complete paths of the data files.
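If you would rather generate the load file with a script than a spreadsheet editor, the layout above can be sketched in Python. This is a minimal sketch, not an official Terra tool: the helper name build_entity_tsv is hypothetical, and the bucket paths are the placeholder values from the example above.

```python
import csv
import io


def build_entity_tsv(entity, column_names, rows):
    """Return tab-separated text for an entity table (hypothetical helper).

    entity: table name, e.g. "sample"
    column_names: headers for the columns after the ID column
    rows: list of (entity_id, [values...]) tuples
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # The first header must follow the entity:<name>_id format.
    writer.writerow([f"entity:{entity}_id", *column_names])
    for entity_id, values in rows:
        writer.writerow([entity_id, *values])
    return buffer.getvalue()


# Placeholder rows taken from the example table above.
sample_rows = [
    ("participant1-blood", ["gs://your-bucket-name/blood_sample_P1.bam"]),
    ("participant1-spit", ["gs://your-bucket-name/spit_sample_P1.bam"]),
    ("participant2-blood", ["gs://your-bucket-name/blood_sample_P2.bam"]),
    ("participant2-spit", ["gs://your-bucket-name/spit_sample_P2.bam"]),
]

tsv_text = build_entity_tsv("sample", ["BAM"], sample_rows)
print(tsv_text, end="")
```

Writing tsv_text to a file gives you a load file you can upload in Step 3.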
Sample table in Terra
Why use an entity_set table?
A single data table might hold a mix of different samples you want to analyze. Samples may differ in species, developmental stages, sequencing methods, or other criteria. If you only want to run a workflow analysis on a subset of the samples in your data table, consider making a set table.
A set table allows you to organize and save sets of samples for (repeat) downstream analysis and keep track of data files that are generated for a sample subset. Set tables always refer to entities in an entity table (e.g., a sample_set table references samples in a sample table; a specimen_set table references specimens in a specimen table), meaning a set table can only be created after you've made and uploaded the entity table it references. The example below shows you how to make a sample_set table, assuming you have a sample table.
Option 1: Generate a set in Terra
1. Select the rows to be included in the set.
2. Click Edit (pencil icon) at left above the table.
3. Choose Save selection as set from the menu.
Option 2: Create a sample_set table in a spreadsheet
Download a template TSV file (membership.tsv) here.
The first column is the unique ID for each set and the second column is the sample_id of the sample in that set (from the sample table).
There is a row for every member of a set. In the example below, the sample_set table contains two sets: spit (contains the samples participant1-spit and participant2-spit) and blood (contains the samples participant1-blood and participant2-blood).
Sample_set TSV in a spreadsheet editor
membership:sample_set_id    sample
spit                        participant1-spit
spit                        participant2-spit
blood                       participant1-blood
blood                       participant2-blood
Formatting requirements
Parts in red (i.e., "membership:sample_set_id") must be entered exactly as shown! You can replace "sample" with the name of the table for which you're making sets.
You can customize the set IDs with your own values.
Note: You must have a corresponding entity table (i.e., a sample table, in the example above) in the workspace. It contains links to the input data files from the samples in the set.
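The membership layout (one row per set member) can also be produced with a short script. The following is a sketch under stated assumptions: build_membership_tsv is a hypothetical helper, and the set and sample IDs are the ones from the example above.

```python
import csv
import io


def build_membership_tsv(entity, sets):
    """Return tab-separated text for a set table (hypothetical helper).

    entity: name of the referenced entity table, e.g. "sample"
    sets: dict mapping each set ID to a list of member entity IDs
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # First header: membership:<entity>_set_id; second header: the entity name.
    writer.writerow([f"membership:{entity}_set_id", entity])
    for set_id, members in sets.items():
        for member in members:
            # One row for every member of a set.
            writer.writerow([set_id, member])
    return buffer.getvalue()


example_sets = {
    "spit": ["participant1-spit", "participant2-spit"],
    "blood": ["participant1-blood", "participant2-blood"],
}

membership_text = build_membership_tsv("sample", example_sets)
print(membership_text, end="")
```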
Sample_set table in Terra
The samples in each set are listed in the samples column, separated by a comma.
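To see how the per-member rows of the load file relate to the one-row-per-set view, here is an illustrative sketch that groups membership rows by set ID and joins the members with commas. It only mimics the display described above; it is not Terra's own code, and group_members is a hypothetical name.

```python
from collections import defaultdict


def group_members(membership_rows):
    """Collapse (set_id, member_id) rows into one comma-separated entry per set."""
    grouped = defaultdict(list)
    for set_id, member_id in membership_rows:
        grouped[set_id].append(member_id)
    return {set_id: ", ".join(members) for set_id, members in grouped.items()}


rows = [
    ("spit", "participant1-spit"),
    ("spit", "participant2-spit"),
    ("blood", "participant1-blood"),
    ("blood", "participant2-blood"),
]

sets_view = group_members(rows)
for set_id, samples in sets_view.items():
    print(set_id, samples, sep="\t")
```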
Other ways to create sets
In addition to manually creating set tables, you can create a set table on the fly when you're setting up a workflow analysis. Learn more in When to use a set table for a workflow.
For hands-on practice using set tables for workflow input, see the Data Tables QuickStart Part 3 and Part 4.
Step 2. Save file as tab delimited text or tab-separated value
A load file has to be in "tab-separated values" or "tab-delimited text" format (Terra recognizes both).
Your editor may give you a warning, but we assure you, it's fine!
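If your data is already in a comma-separated (CSV) file, one way to get tab-separated text without a spreadsheet editor is a small conversion script. This is a sketch that assumes a plain CSV input; csv_to_tsv is a hypothetical helper, and the sample row is a placeholder.

```python
import csv
import io


def csv_to_tsv(csv_text):
    """Convert comma-separated text to tab-separated text."""
    reader = csv.reader(io.StringIO(csv_text))
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return buffer.getvalue()


csv_text = (
    "entity:sample_id,BAM\n"
    "participant1-blood,gs://your-bucket-name/blood_sample_P1.bam\n"
)
converted = csv_to_tsv(csv_text)
print(converted, end="")
```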
What's the name of the table in your workspace?
It's worth noting that Terra ignores the actual file name; it's the "root entity" (in the first column header) that determines the table name in the data table.
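The naming rule can be illustrated with a short check you might run on a load file before uploading. The sketch below follows the header formats described in this article (entity:name_id and membership:name_set_id); it is an assumption-based illustration, not Terra's actual parser, and table_name_from_header is a hypothetical name.

```python
def table_name_from_header(first_header):
    """Derive the table name from the first column header of a load file."""
    prefix, separator, rest = first_header.partition(":")
    if not separator or prefix not in ("entity", "membership"):
        raise ValueError(f"unexpected header prefix in {first_header!r}")
    if not rest.endswith("_id") or rest == "_id":
        raise ValueError(f"header must end in '_id': {first_header!r}")
    # Drop the trailing "_id" to get the table name.
    return rest[: -len("_id")]


print(table_name_from_header("entity:sample_id"))
print(table_name_from_header("membership:sample_set_id"))
```

Under these assumptions, a file named anything.tsv whose first header is entity:sample_id would still produce a table named sample.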
Step 3. Upload TSV to workspace
3.1. Click on the Import Data button at the top of the left TABLES column (highlighted in the orange rectangle in the screenshot below).
3.2. Select Upload TSV from the menu (circled in orange).
3.3. Drag or click to select your TSV file.
3.4. Click the Start import job button at the bottom right (will turn blue when you select a file).
You'll see a Data import in progress banner at the top right of your workspace.
3.5. When the import is complete, the banner will update to a Data imported successfully banner.
Refresh the page to see the data table.
You should see your data right away. Click on the link (table name) to expand the table.
The participant table in the example above looks like this:
Uploading and deleting set tables
Upload order: entities first, then sets
Because set tables reference entities in an entity table, you must upload the entity table first. For example, a sample_set table references a sample table. If you try to upload the sample_set table before the sample table, Terra will give you an error message.
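Before uploading, you can check locally that every member listed in a set load file has a matching ID in the entity load file. This is a minimal sketch with hypothetical helper names (read_rows, missing_members); it only compares the two files and does not contact Terra.

```python
import csv
import io


def read_rows(tsv_text):
    """Parse tab-separated text into a list of rows, skipping blank lines."""
    return [row for row in csv.reader(io.StringIO(tsv_text), delimiter="\t") if row]


def missing_members(entity_tsv, membership_tsv):
    """Return set members that have no matching ID in the entity table."""
    entity_ids = {row[0] for row in read_rows(entity_tsv)[1:]}
    member_ids = {row[1] for row in read_rows(membership_tsv)[1:]}
    return sorted(member_ids - entity_ids)


entity_tsv = (
    "entity:sample_id\tBAM\n"
    "participant1-spit\tgs://your-bucket-name/spit_sample_P1.bam\n"
)
membership_tsv = (
    "membership:sample_set_id\tsample\n"
    "spit\tparticipant1-spit\n"
    "spit\tparticipant2-spit\n"
)

unmatched = missing_members(entity_tsv, membership_tsv)
print(unmatched)
```

An empty list means every set member refers to an ID in the entity load file.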
Deleting entity tables deletes sets that reference the entities
Similarly, if you delete an entity table, Terra will automatically delete a set table that references it. In the example above, deleting the sample table will automatically delete the sample_set table.
Special tables: Pair (tumor-normal analysis)
If you're analyzing cancer data, you're familiar with tumor-normal pairs where a given participant has a sample from tumor tissue and normal tissue. To facilitate this type of analysis, Terra has predefined associations for participant, sample, and pair data tables. If you upload these tables in order, and specify that Terra should associate them, Terra will automatically link the tables together for use in workflows for somatic analysis.
Learn more in Adding pair tables to a workspace for tumor-normal analysis.
Download a template TSV file (pair.tsv) here.
Making tables programmatically
You can automate the process of making and modifying tables using a special API called FISS. Learn how to avoid creating data tables manually in a spreadsheet in Managing data and automating workflows with the FISS API.
If you already have a data table in your workspace, you can modify it to meet your analysis needs. Learn more in Modifying and editing a data table.
Maybe you're ready to perform an analysis but you need some workspace-level metadata like reference files. Read Creating Workspace Data tables to learn how to make a Workspace Data table that can be used in downstream WDL workflows.