Pair tables for tumor-normal samples
Terra has two main types of data tables (entity and set), but it can also make a special association between the predefined participant, sample, and pair tables for the analysis of tumor-normal samples. These are ordinary entity tables. When they follow the appropriate naming convention and are uploaded together, Terra links them to facilitate the analysis of paired tumor and normal samples taken from the same patient. Somatic workflows that require pairs of tumor and normal samples accept data in pair tables by default.
How does a pair table work?
A pair table is used to specify control and case samples for a particular participant (HCC1141 below).
The pair table should reference the participant_ids used in the participant table and the sample_ids used in the sample table. To create a pair table from scratch, follow the example shown below.
Example: pair.tsv in a spreadsheet
- The header entries in red (i.e. "entity:pair_id", "case_sample", "control_sample" and "participant" - shown above) must be typed exactly as shown
- Customize with your own pair, sample, and participant IDs
- Remember that the sample and participant IDs must match IDs in the "sample" and "participant" tables
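The same file can also be generated with a short script instead of a spreadsheet. The following is a minimal sketch, not an official Terra tool: the four header names are the ones listed above, while the function name `build_pair_tsv` and every ID in the example row are hypothetical placeholders. It also assumes the usual tumor-normal reading, in which the case sample is the tumor and the control sample is the normal.

```python
import csv
import io

# Header names required for a pair table (taken from the list above).
PAIR_HEADER = ["entity:pair_id", "case_sample", "control_sample", "participant"]


def build_pair_tsv(rows):
    """Return pair-table text: one header line plus one tab-separated line per pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(PAIR_HEADER)
    for row in rows:
        # Look up each value by header name so a missing field raises KeyError early.
        writer.writerow([row[name] for name in PAIR_HEADER])
    return buffer.getvalue()


# Hypothetical IDs: replace them with IDs from your own sample and participant tables.
pair_tsv_text = build_pair_tsv([
    {
        "entity:pair_id": "patient1-pair",
        "case_sample": "patient1-tumor",      # assumed: case = tumor sample
        "control_sample": "patient1-normal",  # assumed: control = normal sample
        "participant": "patient1",
    }
])
print(pair_tsv_text)
```

Saving `pair_tsv_text` to a file named pair.tsv gives a table with the same layout as the spreadsheet example.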
Example: Pair table in Terra
Uploading pair and associated tables
For Terra to link participant, sample, and pair tables, you must upload them in the correct order: the participant table first, then the sample table, then the pair table. You must also select “Create participant, sample, pair associations” at the prompt in the pop-up window that appears when you upload the files.
Uploading pair tables example: Mutect2
The Mutect2 workflow is a good example of when to use pair tables. Its input table is a pair table, but the fields in the pair table reference data stored in the sample table, which in turn is associated with participant IDs from the participant table.
Order to upload (Mutect2)
- Participant tsv
- Sample tsv
- Pair tsv
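If the files are prepared by a script, the upload order can be derived from each file's first header cell. The sketch below is illustrative only. It assumes the participant and sample tables start with "entity:participant_id" and "entity:sample_id", by analogy with "entity:pair_id". The names `UPLOAD_RANK`, `entity_type`, and `upload_order`, and the file names, are hypothetical.

```python
# Upload rank for each table type: participant first, then sample, then pair.
UPLOAD_RANK = {"participant": 0, "sample": 1, "pair": 2}


def entity_type(header_line):
    """Extract the table type from the first header cell, e.g. 'entity:pair_id' -> 'pair'."""
    first_cell = header_line.split("\t")[0]
    prefix, _, id_column = first_cell.partition(":")
    if prefix != "entity" or not id_column.endswith("_id"):
        raise ValueError(f"unexpected first header cell: {first_cell!r}")
    return id_column[: -len("_id")]


def upload_order(header_by_file):
    """Return the file names sorted into participant, sample, pair order."""
    return sorted(
        header_by_file,
        key=lambda name: UPLOAD_RANK[entity_type(header_by_file[name])],
    )


# Hypothetical file names mapped to their header lines.
headers = {
    "pair.tsv": "entity:pair_id\tcase_sample\tcontrol_sample\tparticipant",
    "participant.tsv": "entity:participant_id",
    "sample.tsv": "entity:sample_id\tparticipant",
}
ordered_files = upload_order(headers)
print(ordered_files)  # ['participant.tsv', 'sample.tsv', 'pair.tsv']
```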
If you upload the tables out of order, or without selecting “Create participant, sample, pair associations”, the workspace will give you a table loading error. The workflow will also fail, because it reads from the pair table but has no knowledge of the sample table that the pair table is supposed to reference.
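A quick local check before uploading can catch pair rows that point at IDs missing from the other tables. This is a sketch under stated assumptions, not part of Terra: the function `find_unresolved_references` and all IDs are hypothetical, and the pair rows are assumed to be dictionaries keyed by the pair-table header names.

```python
def find_unresolved_references(pair_rows, sample_ids, participant_ids):
    """Return (pair_id, column, value) for every reference missing from the other tables."""
    problems = []
    for row in pair_rows:
        pair_id = row["entity:pair_id"]
        # Both sample columns must name an ID that exists in the sample table.
        for column in ("case_sample", "control_sample"):
            if row[column] not in sample_ids:
                problems.append((pair_id, column, row[column]))
        # The participant column must name an ID from the participant table.
        if row["participant"] not in participant_ids:
            problems.append((pair_id, "participant", row["participant"]))
    return problems


# Hypothetical data: the control sample is absent from the sample table.
pair_rows = [
    {
        "entity:pair_id": "patient1-pair",
        "case_sample": "patient1-tumor",
        "control_sample": "patient1-normal",
        "participant": "patient1",
    }
]
problems = find_unresolved_references(
    pair_rows,
    sample_ids={"patient1-tumor"},
    participant_ids={"patient1"},
)
print(problems)  # [('patient1-pair', 'control_sample', 'patient1-normal')]
```

An empty list means every case sample, control sample, and participant named in the pair table exists in the tables it refers to.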