Adding pair tables for tumor-normal analysis

Liz Kiernan

Learn how to link entity tables in Terra for the analysis of paired tumor and normal samples.

Overview: Pair tables for tumor-normal samples

Although Terra has two main types of data tables (entity and set), it can make a special association of predefined participant, sample, and pair data tables for the analysis of tumor-normal samples. These are ordinary entity tables, but when you use them together with the appropriate naming convention, Terra can link them to facilitate the analysis of paired tumor-normal samples taken from the same patient. Somatic workflows that require pairs of tumor and normal samples accept data in pair tables by default.

How does a pair table work?

A pair table specifies the control and case samples for a particular participant (HCC1143 below).
Screenshot of expanded pair table with the case-sample (SM-74P4M) and control-sample (SM-74NEG) in the row with pair ID HCC-143 circled

For Terra to link a pair table to a participant and its samples in downstream workflows, you need to create both a participant table that lists the participants and a sample table (shown below) that lists the genomic samples (for tumor and for normal) for each participant.

Screenshot of expanded sample table with samples SM-74P4M (case sample from above) and SM-74NEG (control sample from above) highlighted.

The pair table should reference the participant_ids used in the participant table and the sample_ids used in the sample table. To create a pair table from scratch, follow the example shown below.

Example: pair.tsv in a spreadsheet

entity:pair_id  case_sample  control_sample  participant
HCC1143-2020    SM-74P4M     SM-74NEG        HCC1143

Formatting requirements

  • The header entries ("entity:pair_id", "case_sample", "control_sample", and "participant", shown above) must be typed exactly as shown.
  • Customize with your own pair, sample, and participant IDs.
  • Remember the "sample" and "participant" IDs are taken from the "sample" and "participant" tables!
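As a minimal sketch, a pair.tsv with these headers could be generated in Python rather than typed into a spreadsheet. The function name build_pair_tsv is hypothetical and not part of Terra; the header names and the example row are the ones from the example above.

```python
import csv
import io

# Header names copied from the example above; they must appear exactly as shown.
PAIR_HEADER = ["entity:pair_id", "case_sample", "control_sample", "participant"]


def build_pair_tsv(rows):
    """Return tab-separated text for a pair table (hypothetical helper).

    `rows` is an iterable of (pair_id, case_sample, control_sample, participant).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(PAIR_HEADER)
    for row in rows:
        # Every row needs one value per header column.
        if len(row) != len(PAIR_HEADER):
            raise ValueError(f"expected {len(PAIR_HEADER)} fields, got {len(row)}: {row!r}")
        writer.writerow(row)
    return buffer.getvalue()


pair_text = build_pair_tsv([("HCC1143-2020", "SM-74P4M", "SM-74NEG", "HCC1143")])
print(pair_text)
```

Saving pair_text to a file named pair.tsv gives a file you can upload like any other table.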

Example: Pair table in Terra

Screenshot of table from above in a Terra workspace

Uploading pair and associated tables

For Terra to link participant, sample, and pair tables, they must be uploaded in the appropriate order.

Required upload order

  1. Participant
  2. Sample
  3. Pair

To associate the tables, upload them in the correct order and select “Create participant, sample, and pair associations” in the pop-up window that appears when you upload the files.
Screenshot of Import data table with 'Create participant, sample, and pair associations' checkbox (selected) highlighted

Uploading pair tables example: Mutect2

Running the Mutect2 workflow is a good example of when to use pair tables. The workflow takes a pair table as input, but the fields in the pair table reference data stored in the sample table, and those samples are in turn associated with participant IDs from the participant table.

Order to upload (Mutect2)

  1. participant.tsv
  2. sample.tsv
  3. pair.tsv
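If you prepare several files at once, a small script can put them in the required order before you upload them. The sketch below is only an illustration: the function names are hypothetical, and it assumes the participant and sample files begin with the headers "entity:participant_id" and "entity:sample_id", by analogy with the "entity:pair_id" header of the pair table.

```python
# Required upload order from the list above.
UPLOAD_ORDER = ("participant", "sample", "pair")


def entity_type_from_header(tsv_text):
    """Return the entity type named in the first header cell.

    For example, 'entity:pair_id' gives 'pair'. The 'entity:<type>_id'
    pattern for participant and sample files is an assumption.
    """
    first_line = tsv_text.split("\n", 1)[0].rstrip("\r")
    first_cell = first_line.split("\t")[0]
    prefix, suffix = "entity:", "_id"
    if not (first_cell.startswith(prefix) and first_cell.endswith(suffix)):
        raise ValueError(f"unexpected first header cell: {first_cell!r}")
    return first_cell[len(prefix):-len(suffix)]


def sort_for_upload(tables):
    """Given {file name: TSV text}, return the file names in upload order."""
    types = {name: entity_type_from_header(text) for name, text in tables.items()}
    unknown = sorted(set(types.values()) - set(UPLOAD_ORDER))
    if unknown:
        raise ValueError(f"not a participant, sample, or pair table: {unknown}")
    return sorted(types, key=lambda name: UPLOAD_ORDER.index(types[name]))


example_tables = {
    "pair.tsv": "entity:pair_id\tcase_sample\tcontrol_sample\tparticipant\n",
    "participant.tsv": "entity:participant_id\n",
    "sample.tsv": "entity:sample_id\tparticipant\n",
}
print(sort_for_upload(example_tables))
```

The script only decides the order; the upload itself, including the association checkbox, still happens in Terra.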

Screenshot of the pair table in a Terra workspace

If you upload the tables out of order or without selecting the “Create participant, sample, and pair associations” checkbox, the workspace will give you a table loading error. The workflow will also fail, because it reads from the pair table but cannot resolve the sample table entries that the pair table references.
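Before uploading, you may want to confirm that every ID in the pair table actually exists in the other two tables. The following is a rough sketch of such a check, not a Terra feature: the function names are hypothetical, and the "entity:participant_id" and "entity:sample_id" column names are assumed by analogy with "entity:pair_id".

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def find_broken_references(participant_tsv, sample_tsv, pair_tsv):
    """Return messages for pair-table IDs that are missing from the other tables."""
    # Assumed ID column names for the participant and sample tables.
    participants = {row["entity:participant_id"] for row in read_tsv(participant_tsv)}
    samples = {row["entity:sample_id"] for row in read_tsv(sample_tsv)}
    problems = []
    for row in read_tsv(pair_tsv):
        pair_id = row["entity:pair_id"]
        for column in ("case_sample", "control_sample"):
            if row[column] not in samples:
                problems.append(f"{pair_id}: {column} {row[column]!r} is not in the sample table")
        if row["participant"] not in participants:
            problems.append(f"{pair_id}: participant {row['participant']!r} is not in the participant table")
    return problems


participant_text = "entity:participant_id\nHCC1143\n"
sample_text = "entity:sample_id\tparticipant\nSM-74P4M\tHCC1143\nSM-74NEG\tHCC1143\n"
pair_text = (
    "entity:pair_id\tcase_sample\tcontrol_sample\tparticipant\n"
    "HCC1143-2020\tSM-74P4M\tSM-74NEG\tHCC1143\n"
)
print(find_broken_references(participant_text, sample_text, pair_text))
```

An empty list means every reference in the pair table resolves; any message in the list points at a row to fix before uploading.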

