Learn how to link entity tables in Terra to analyze paired tumor and normal samples.
Overview: Pair tables for tumor-normal samples
Although Terra has two main types of data tables (entity and set), it can also associate predefined participant, sample, and pair data tables. These are ordinary entity tables, but when they follow the appropriate naming convention, Terra can link them to facilitate analyses of paired tumor-normal samples taken from the same patient. Workflows that require pairs of tumor and normal samples (such as Mutect2) accept data in pair tables by default.
How does a pair table work?
A pair table specifies the case and control samples for a particular participant (HCC1143 in the example below).
For Terra to link a pair table to a participant and its samples in downstream workflows, you need two additional tables:
- A participant table that lists the participants.
- A sample table (shown below) that lists the genomic samples (for tumor and for normal) for each participant.
The pair table will reference the participant_ids used in the participant table and the sample_ids used in the sample table.
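The cross-references described above can be checked before uploading. The following is a minimal sketch, not part of Terra: the function name `find_missing_references` and its inputs are hypothetical, and it assumes the pair table has already been parsed into dictionaries keyed by the column names used in the pair table example.

```python
def find_missing_references(pair_rows, participant_ids, sample_ids):
    """Return a list of messages for pair rows that point at unknown IDs.

    pair_rows: list of dicts keyed by the pair table column names.
    participant_ids / sample_ids: sets of IDs present in the other two tables.
    (Hypothetical helper for illustration; Terra does not provide it.)
    """
    problems = []
    for row in pair_rows:
        pair_id = row["entity:pair_id"]
        participant = row["participant"]
        if participant not in participant_ids:
            problems.append(f"{pair_id}: unknown participant {participant}")
        # Both sample columns must refer to rows in the sample table.
        for column in ("case_sample", "control_sample"):
            sample = row[column]
            if sample not in sample_ids:
                problems.append(f"{pair_id}: unknown {column} {sample}")
    return problems


pairs = [{
    "entity:pair_id": "HCC1143-2020",
    "case_sample": "SM-74P4M",
    "control_sample": "SM-74NEG",
    "participant": "HCC1143",
}]
ok = find_missing_references(pairs, {"HCC1143"}, {"SM-74P4M", "SM-74NEG"})
missing = find_missing_references(pairs, {"HCC1143"}, {"SM-74P4M"})
```

An empty list means every pair row resolves to an existing participant and to existing samples.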
Creating a pair table
To create a pair table from scratch, follow the example shown below.
Example: pair.tsv in a spreadsheet
entity:pair_id | case_sample | control_sample | participant |
HCC1143-2020 | SM-74P4M | SM-74NEG | HCC1143 |
Formatting requirements
- The header entries ("entity:pair_id", "case_sample", "control_sample", and "participant", shown above) must be typed exactly as shown.
- Replace the values in the example row with your own pair, sample, and participant IDs.
- The sample and participant IDs must match IDs that already exist in the "sample" and "participant" tables.
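If you prefer to generate the file rather than type it in a spreadsheet, the formatting requirements above can be sketched in Python. This is an illustrative sketch only: the helper name `build_pair_tsv` is hypothetical, and it assumes the file is tab-separated, as the ".tsv" extension suggests.

```python
import csv
import io

# Header names copied from the example above; they must match exactly.
PAIR_HEADER = ["entity:pair_id", "case_sample", "control_sample", "participant"]


def build_pair_tsv(rows):
    """Return tab-separated text for a pair table.

    rows: list of (pair_id, case_sample, control_sample, participant) tuples.
    (Hypothetical helper for illustration.)
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(PAIR_HEADER)
    for row in rows:
        if len(row) != len(PAIR_HEADER):
            raise ValueError(f"expected {len(PAIR_HEADER)} values, got {len(row)}")
        writer.writerow(row)
    return buffer.getvalue()


pair_tsv = build_pair_tsv([("HCC1143-2020", "SM-74P4M", "SM-74NEG", "HCC1143")])

# To save the table for upload:
# with open("pair.tsv", "w", newline="") as handle:
#     handle.write(pair_tsv)
```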
Uploading pair and associated tables
For Terra to link participant, sample, and pair tables, they must be uploaded in the appropriate order.
Required upload order
1. Participant
2. Sample
3. Pair
To associate the tables, upload them in this order and select “Create participant, sample, pair associations” in the pop-up window that appears when you upload each file.
If you upload the tables out of order, or without selecting that checkbox, Terra reports a table loading error. Workflows will also fail, because they read from the pair table but cannot resolve the sample table that the pair table refers to.
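When preparing several files at once, the required order can be derived from each file's first header cell. The sketch below is a hypothetical helper, not a Terra feature. It assumes the participant and sample tables use "entity:participant_id" and "entity:sample_id" as their first header, by analogy with the "entity:pair_id" header in the pair table.

```python
# Order required for Terra to link the tables.
UPLOAD_ORDER = ["participant", "sample", "pair"]


def entity_type(header_line):
    """Extract the table type from a header such as 'entity:pair_id'."""
    first_cell = header_line.split("\t")[0].strip()
    prefix, suffix = "entity:", "_id"
    if not (first_cell.startswith(prefix) and first_cell.endswith(suffix)):
        raise ValueError(f"unexpected first header cell: {first_cell!r}")
    return first_cell[len(prefix):-len(suffix)]


def sort_for_upload(tables):
    """Return file names sorted into the required upload order.

    tables: dict mapping file name -> TSV text.
    Raises ValueError for a table type other than participant, sample, or pair.
    """
    def position(name):
        header = tables[name].splitlines()[0]
        return UPLOAD_ORDER.index(entity_type(header))

    return sorted(tables, key=position)


tables = {
    "pair.tsv": "entity:pair_id\tcase_sample\tcontrol_sample\tparticipant\n",
    "sample.tsv": "entity:sample_id\tparticipant\n",
    "participant.tsv": "entity:participant_id\n",
}
upload_order = sort_for_upload(tables)
```

The sorted list only tells you which file to upload first; the association checkbox still has to be selected in the upload pop-up.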
What to expect
When you've successfully uploaded a pair table, you will see a new table in the Data tab that looks something like this: