Whether you're creating a workspace from scratch or using a copy of another workspace, there are multiple options for adding a data table to the workspace so you can organize your data.
Options for adding data tables to a workspace
The Terra Data page has dedicated sections for different kinds of data tables; this overview goes over how to add tables to the Table section, which is typically dedicated to the data you want to analyze or transform in downstream analysis (input data) and its associated metadata.
Read the options below and then see step-by-step instructions (below) to generate a table from scratch.
Import from another workspace
You can copy data in a table from another workspace to your workspace by selecting the rows of data you want, clicking on the three vertical dots at the top right, and choosing "Export to workspace".
A note about cost and access when importing a table from another workspace
Exporting a data table to another workspace, or downloading the TSV metadata, does not incur any egress costs, because you are not moving or copying the data files referenced in the table. You are only copying the metadata and references to the new workspace.
For readers of your workspace to access those files they will need read access to the bucket or workspace where the data are actually stored.
Import from Gen3, the Data Library, or other external servers
For external data resources directly connected to Terra, you'll be able to browse, select the data subset you want, and export to your workspace.
External repositories each have their own data browser and their own way of exporting data files. When you export data from the Data Library or an external source such as the Gen3 platform (shown in the video below), the data may show up as multiple tables of predefined entities.
Add a table by making and uploading a "load file"
Maybe there's no workspace with data in a table to copy, or you want to include a table for data you've just uploaded to your workspace bucket. You can create a table from scratch by generating a "load file" in a spreadsheet editor (outside of Terra) and uploading it by clicking on the blue + icon at the top of the Data page.
Learn how to create a TSV file from a template (you can find template TSV files and instructions here)
How to make a data table from scratch or a template
(in a spreadsheet editor)
Workspace tables, with rows and columns for organizing data, are much like a spreadsheet built into the data page. So it's no surprise that you can use a spreadsheet editor to create and upload a file - a "load file" - to generate a new table in your workspace.
What's in a load file?
Each row is a unique entity (participant, sample, etc.).
Each column is a distinct attribute of that entity (e.g. the participant's sex, age, or height; the sample's BAM, FASTA, etc.).
Step 1. Make a table "load file" (TSV) in a spreadsheet editor
Open your favorite spreadsheet editor and then follow the formatting examples shown below to make an entity table or set table.
Click the table type below for templates, examples, and required formatting
Download a template load file (sample.tsv) here.
What's in an entity table?
When creating a table, you can use whatever name ("entity") you wish for your table, as long as the first column header in your spreadsheet follows the format entity:your-name_id. Entity tables keep track of data - historically input data for a workflow, like samples, participants, specimens, or files. A minimal sample table includes an ID column and a column for metadata or data files (e.g. FASTQ, BAM, CRAM, etc. - whatever form your data are in).
Example: Creating a "sample" entity table
The following shows an example data table that will have the name "sample" when uploaded to the Terra Data page. Each table row represents a different sample. As with all tables, the first column contains entity IDs. The second column shows example cloud paths to BAM files in a Google bucket.
entity:sample_id	BAM
participant1-blood	gs://your-bucket-name/blood_sample_P1.bam
participant1-spit	gs://your-bucket-name/spit_sample_P1.bam
participant2-blood	gs://your-bucket-name/blood_sample_P2.bam
participant2-spit	gs://your-bucket-name/spit_sample_P2.bam
Formatting requirements
Parts in red (e.g. "entity:sample_id") must be entered exactly as shown! You can replace "sample" with the entity that defines your table.
In the rows, you'll use your own sample IDs (e.g. "your-participant1-blood") and the complete paths of the data files.
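If you would rather generate the load file with a script than type it into a spreadsheet, the formatting rules above can be sketched in Python as follows. This is a minimal illustration using only the standard library; the helper name build_entity_load_file and its arguments are hypothetical, and the bucket paths are the placeholder values from the example table.

```python
import csv
import io


def build_entity_load_file(entity, rows, columns):
    """Return tab-separated text for an entity table (hypothetical helper).

    entity  -- table name, e.g. "sample"
    rows    -- list of (entity_id, {column_name: value}) tuples
    columns -- attribute column names, in the order they should appear
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # The first header cell must follow the entity:<name>_id pattern.
    writer.writerow([f"entity:{entity}_id", *columns])
    for entity_id, attributes in rows:
        # Missing attributes are written as empty cells.
        writer.writerow([entity_id, *(attributes.get(c, "") for c in columns)])
    return buffer.getvalue()


samples = [
    ("participant1-blood", {"BAM": "gs://your-bucket-name/blood_sample_P1.bam"}),
    ("participant1-spit", {"BAM": "gs://your-bucket-name/spit_sample_P1.bam"}),
]
sample_tsv = build_entity_load_file("sample", samples, ["BAM"])

# Save the text to a file that can then be uploaded on the Data page.
with open("sample.tsv", "w", newline="") as handle:
    handle.write(sample_tsv)
```

Passing a different entity name (for example "specimen") changes only the first header cell, which is the part that has to match the required pattern.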
Example: sample table in Terra
Download a template load file (membership.tsv) here.
What's in an entity_set table?
A single data table might hold a mix of different samples you want to analyze. Samples may differ in species, developmental stages, sequencing methods, or other criteria. If you only want to run a workflow analysis on a subset of the samples in your data table, you might want to consider making a set table.
A set table allows you to organize and save sets of samples for (repeat) downstream analysis and keep track of data files that are generated for a sample subset. Set tables always refer to entities in an entity table (i.e. a sample_set table references samples in a sample table; a specimen_set table references specimens in a specimen table), meaning a set table can only be created after you've made and uploaded the entity table it references. The example below shows you how to make a sample_set table assuming you have a sample table.
Example: Creating a sample_set table in a spreadsheet
The first column is the unique ID for each set and the second column is the sample_id (from the sample table).
There is a row for every member of a set. In the example below, the sample_set table references the samples in the "sample" entity table.
membership:sample_set_id	sample
spit	participant1-spit
spit	participant2-spit
blood	participant1-blood
blood	participant2-blood
Formatting requirements
Parts in red (e.g. "membership:sample_set_id") must be entered exactly as shown! You can replace "sample" with the entity your set groups together.
You can customize the set IDs with your own values.
Note that you must have a corresponding entity table (the sample table, in the example above) in the workspace that contains links to the input data files from the samples in the set.
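The one-row-per-member layout described above can also be produced from a simple mapping of set IDs to member IDs. The sketch below assumes you keep that mapping in a Python dictionary; the helper name build_membership_load_file is hypothetical, and the IDs are the ones from the example table.

```python
import csv
import io


def build_membership_load_file(entity, sets):
    """Return tab-separated text for a set table (hypothetical helper).

    entity -- name of the entity table the set refers to, e.g. "sample"
    sets   -- dict mapping each set ID to a list of member entity IDs
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header: membership:<entity>_set_id, then the entity name.
    writer.writerow([f"membership:{entity}_set_id", entity])
    for set_id, members in sets.items():
        for member in members:
            # One row for every member of every set.
            writer.writerow([set_id, member])
    return buffer.getvalue()


sample_sets = {
    "spit": ["participant1-spit", "participant2-spit"],
    "blood": ["participant1-blood", "participant2-blood"],
}
membership_tsv = build_membership_load_file("sample", sample_sets)
```

Here membership_tsv holds a header line followed by four rows, two for each set, matching the example table.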
Example: sample_set table in Terra
To find the samples in each set, click on the link in the "samples" column (see highlighted box for the two samples in the "blood" set).
Other ways to create sets
In addition to manually creating set tables, you can create a set table on the fly when you're setting up a workflow analysis. Learn more in When to use a set table for a workflow.
For hands-on practice using set tables for workflow input, see the Data Tables QuickStart Part 3 and Part 4.
Step 2. Save file in "tab delimited text" or "tab-separated value" (tsv) format
A load file has to be in "tab-separated values" or "tab-delimited text" format (Terra recognizes both).
Your editor may give you a warning, but we assure you, it's fine!
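If your spreadsheet editor only exports comma-separated (CSV) files, one possible workaround is to convert the export to tab-separated text yourself. The following is a minimal sketch using Python's standard csv module; the function name csv_to_tsv is hypothetical, and it assumes that none of your cell values contain tab characters.

```python
import csv
import io


def csv_to_tsv(csv_text):
    """Convert comma-separated text to tab-separated text (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Copy every row unchanged; only the delimiter differs.
    writer.writerows(reader)
    return out.getvalue()


csv_export = (
    "entity:sample_id,BAM\n"
    "participant1-blood,gs://your-bucket-name/blood_sample_P1.bam\n"
)
tsv_text = csv_to_tsv(csv_export)
```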
What will be the name of the table in your workspace?
It's worth noting that Terra ignores the actual file name; it's the "root entity" (in the first column header) that determines the table name in the data table.
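To make the naming rule concrete, here is a small sketch of how a table name can be read off the first column header. It only illustrates the convention described in this article and is not Terra's own implementation; the function name table_name_from_header is hypothetical, and it handles only the two header prefixes shown here (entity: and membership:).

```python
def table_name_from_header(first_header):
    """Derive the table name from a load file's first column header.

    Illustrative only: covers the "entity:" and "membership:" prefixes
    used in this article.
    """
    prefix, separator, column = first_header.partition(":")
    if not separator or prefix not in ("entity", "membership"):
        raise ValueError(f"unexpected header prefix in {first_header!r}")
    if not column.endswith("_id") or column == "_id":
        raise ValueError(f"header must end in <name>_id: {first_header!r}")
    # Drop the trailing "_id" to get the table name.
    return column[: -len("_id")]


entity_table = table_name_from_header("entity:sample_id")
set_table = table_name_from_header("membership:sample_set_id")
```

With the headers from the examples above, entity_table is "sample" and set_table is "sample_set", regardless of what the TSV files themselves are called.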
Step 3. Upload to the Data page in your workspace
3.1. Click on the "+" sign in the blue circle at the top of the left TABLES column (in the orange rectangle in the screenshot below).
3.2. Drag or click to select your TSV file.
3.3. Click the "upload" button.
Once you've uploaded a load file to your workspace, you should see your data right away. Click on the link (table name) to expand the table. A sample table with only one sample and one column of metadata (links to FASTQ files) would look like this:
Uploading and deleting set tables
When you create a set table, it references entities in an entity table. For example, a sample_set table references a sample table. This means you must upload the entity table first. Otherwise, Terra will give you an error message. Similarly, if you have a set table uploaded and delete your entity table, Terra will automatically delete the set table that references it. In the example above, deleting our sample table will automatically delete our sample_set table.
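Before uploading, you may want to check locally that every member listed in a set load file also appears in the entity load file. The sketch below is one way to do that with the standard library; the function name missing_members is hypothetical, and the check is a local convenience, not a reproduction of Terra's own validation.

```python
import csv
import io


def missing_members(entity_tsv, membership_tsv):
    """Return member IDs used in the set table but absent from the entity table."""
    entity_rows = list(csv.reader(io.StringIO(entity_tsv), delimiter="\t"))
    # Skip the header row; the first column holds the entity IDs.
    known_ids = {row[0] for row in entity_rows[1:] if row}
    membership_rows = list(csv.reader(io.StringIO(membership_tsv), delimiter="\t"))
    # Skip the header row; the second column holds the member IDs.
    referenced = {row[1] for row in membership_rows[1:] if len(row) > 1}
    return sorted(referenced - known_ids)


entity_tsv = (
    "entity:sample_id\tBAM\n"
    "participant1-blood\tgs://your-bucket-name/blood_sample_P1.bam\n"
)
membership_tsv = (
    "membership:sample_set_id\tsample\n"
    "blood\tparticipant1-blood\n"
    "blood\tparticipant2-blood\n"
)
unresolved = missing_members(entity_tsv, membership_tsv)
```

In this example unresolved contains "participant2-blood", because that sample is listed in the set but has no row in the sample table. An empty list means every set member has a matching entity row.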
Special tables: Pair for tumor-normal analysis
If you're analyzing cancer data, you're probably familiar with tumor-normal pairs, where a given participant has one sample from tumor tissue and one from normal tissue. To facilitate this type of analysis, Terra has predefined associations for participant, sample, and pair data tables. If you upload these tables in the correct order and specify that Terra should associate them, Terra will automatically link the tables together for use in downstream WDL workflows for somatic analysis.
Learn more in Adding pair tables to a workspace for tumor-normal analysis.
Programmatically making tables
You can automate the process of making and modifying tables using a special API called FISS. Learn how to avoid creating data tables manually in a spreadsheet in Managing data and automating workflows with the FISS API.
If you already have a data table in your workspace, you can modify it to meet your analysis needs. Learn more in Modifying and editing a data table.
Maybe you're ready to perform an analysis but you need some workspace-level metadata like reference files. Read Creating Workspace Data tables to learn how to make a Workspace Data table that can be used in downstream WDL workflows.