The Broad Institute provides a number of reference data files, hosted by Google and available for anyone to use. Reference data include standard human reference files (hg19, hg38) as well as mouse, rat, monkey, bacterial, and viral reference data, and more. You can find these files in the gcp-public-data--broad-references Google bucket.
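If you want to browse the bucket outside the workspace interface, the following is a minimal Python sketch that uses only the standard library and the public Cloud Storage JSON API. It assumes the bucket allows anonymous listing and that you have network access. The function names `listing_url` and `list_bucket` are made up for this example, and the `hg38/` folder used in the usage note is an assumption about the bucket layout.

```python
import json
import urllib.parse
import urllib.request

BUCKET = "gcp-public-data--broad-references"  # bucket name taken from the text above


def listing_url(bucket, prefix="", delimiter="/"):
    """Build a Cloud Storage JSON API URL that lists objects in a bucket."""
    query = urllib.parse.urlencode({"prefix": prefix, "delimiter": delimiter})
    return f"https://storage.googleapis.com/storage/v1/b/{bucket}/o?{query}"


def list_bucket(bucket, prefix=""):
    """Return (folders, object names) found directly under prefix.

    Needs network access and reads only the first page of results.
    """
    with urllib.request.urlopen(listing_url(bucket, prefix)) as response:
        payload = json.load(response)
    folders = payload.get("prefixes", [])  # sub-folders under the prefix
    names = [item["name"] for item in payload.get("items", [])]  # files under the prefix
    return folders, names
```

For example, `list_bucket(BUCKET, prefix="hg38/")` would return the folders and files directly under that assumed folder, and `list_bucket(BUCKET)` would return the top-level folders.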
Overview: Reference data tables
There are several standard reference files that you can add to the Reference data table in your workspace Data page. These reference files are hosted by the Broad (so you don't have to pay for storage!).
Using Broad Public Reference Files
When you run a workflow, you can use these files as inputs in the configuration form by entering an attribute that begins with workspace.referenceData_ in the reference file attribute column (scroll down for step-by-step instructions).
How to add reference data to a workspace
1. Navigate to the Data page.
2. Click the Import Data button at the top left.
3. Select Add Reference Data from the menu.
4. Choose the reference to add (such as b37Human or hg38) from the dropdown and click OK.
5. Expand the Reference Data section in the left column to see the reference you added.
Video: Installing b37Human reference data
How to use reference files in a workflow analysis
Once you've added a reference file to the Reference data table in your workspace, you can use it as a workflow input variable by calling the file in the Attributes field of the workflow Input configuration form.
1. Go to the attribute field for the reference file.
2. Start typing workspace.referenceData_ in the field.
3. Select the correct reference file from the dropdown list. Hint: Check the name of the variable to help select the right one.
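The way typing a prefix narrows the dropdown can be modelled with a short Python sketch. This illustrates the idea only and is not how the workspace interface is implemented. The function `reference_suggestions` and every attribute name in the sample list are hypothetical; only the `workspace.referenceData_` prefix comes from the steps above.

```python
REFERENCE_PREFIX = "workspace.referenceData_"  # prefix quoted in the steps above


def reference_suggestions(attributes, typed=REFERENCE_PREFIX):
    """Return the attributes that start with the typed text, sorted by name."""
    return sorted(name for name in attributes if name.startswith(typed))


# Hypothetical attribute names, invented for illustration only.
workspace_attributes = [
    "workspace.referenceData_EXAMPLE_ref_fasta",
    "workspace.referenceData_EXAMPLE_ref_dict",
    "workspace.some_other_attribute",
    "this.sample_id",
]

# Print only the attributes that match the reference-data prefix.
for name in reference_suggestions(workspace_attributes):
    print(name)
```

Only the two names that begin with the prefix are printed, which mirrors why checking the end of each suggested name against the workflow variable helps you pick the right file.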