Broad Public Reference Data

Jason Cerrato

The Broad Institute provides a number of reference data files, hosted by Google and available for anyone to use. These include standard human reference files (hg19, hg38) as well as mouse, rat, monkey, bacterial, and viral reference data, and more. You can find these files in the gcp-public-data--broad-references Google bucket.
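If you'd like to browse the bucket from a script, the following is a minimal sketch that builds a request URL for the Google Cloud Storage JSON API object-listing endpoint. The `hg38/` prefix and the helper names `listing_url` and `list_names` are assumptions made for illustration, and the fetch step assumes the bucket allows anonymous listing.

```python
import json
import urllib.parse
import urllib.request

BUCKET = "gcp-public-data--broad-references"  # bucket name from this article


def listing_url(bucket, prefix="", delimiter="/"):
    """Build a Cloud Storage JSON API URL that lists objects under a prefix."""
    query = urllib.parse.urlencode({"prefix": prefix, "delimiter": delimiter})
    return f"https://storage.googleapis.com/storage/v1/b/{bucket}/o?{query}"


def list_names(bucket, prefix=""):
    """Fetch one page of results and return (folder prefixes, object names).

    Needs network access; assumes anonymous listing is permitted.
    """
    with urllib.request.urlopen(listing_url(bucket, prefix)) as resp:
        page = json.load(resp)
    folders = page.get("prefixes", [])
    names = [item["name"] for item in page.get("items", [])]
    return folders, names


if __name__ == "__main__":
    # "hg38/" is an assumed folder name, used here only as an example.
    print(listing_url(BUCKET, "hg38/"))
```

Calling `list_names(BUCKET, "hg38/")` would return the first page of folders and files under that prefix, if the prefix exists and the bucket can be listed without credentials.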

Overview: Reference data tables

There are several standard reference files that you can add to the Reference data table in your workspace Data page. These reference files are hosted by the Broad (so you don't have to pay for storage!).

Using Broad Public Reference Files

When you run a workflow, you can use these files as inputs in the configuration form by typing workspace.referenceData_ in the attribute column for the reference file and selecting the file from the dropdown (scroll down for step-by-step instructions).

How to add reference data to a workspace

1. Navigate to the Data page.

2. Click the Import Data button at the top left.

3. Select Add Reference Data from the menu.

4. Choose the reference to add (such as b37 Human or hg38) from the dropdown and click OK.

Expand the Reference Data section (left column) to find the reference data. 

Video: Installing b37Human reference data

Screen capture of adding reference files to the workspace: clicking the Import Data button on the Data page, selecting Add Reference Data from the dropdown, choosing b37 Human from the dropdown in the popup window, and clicking the blue OK button.

How to use reference files in a workflow analysis

Once you've added a reference file to the Reference data table in your workspace, you can use it as a workflow input variable by calling the file in the Attributes field of the workflow Input configuration form. 

1. Go to the attribute field for the reference file.

2. Start typing workspace.referenceData_.

3. Select the correct reference file from the dropdown list. Hint: Check the name of the variable to help select the right one. 
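The selection steps above can be sketched in code. This is only an illustration of the idea: the attribute names in `ATTRIBUTES` are hypothetical stand-ins for what a workspace might contain, and `suggest` is a made-up helper, not something the platform provides. It filters by the `workspace.referenceData_` prefix, then lists the entries whose names contain the input variable's name first.

```python
PREFIX = "workspace.referenceData_"  # format given in this article

# Hypothetical attribute names, for illustration only.
ATTRIBUTES = [
    "workspace.referenceData_hg38_ref_fasta",
    "workspace.referenceData_hg38_ref_dict",
    "workspace.referenceData_b37Human_ref_fasta",
    "workspace.sample_manifest",
]


def suggest(typed, variable_name, attributes):
    """Return attributes that start with what was typed, listing those
    whose name contains the input variable's name first."""
    matches = [a for a in attributes if a.startswith(typed)]
    # sorted() is stable, so entries with the same key keep their order.
    return sorted(matches, key=lambda a: variable_name not in a)


if __name__ == "__main__":
    for name in suggest(PREFIX, "ref_fasta", ATTRIBUTES):
        print(name)
```

With these sample names, the two `ref_fasta` entries are listed ahead of the `ref_dict` entry, which mirrors the hint about checking the variable name before choosing.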

Video: Specifying the reference file in the workflow configuration form

Screen capture video of specifying the reference data file in the workflow configuration form by clicking into the ref_fasta attribute field, typing workspace.referenceData, and selecting the file hg38_fasta from the dropdown menu. Expanding the Reference Data section in the left column exposes the hg38 fasta file.
