AnVIL on Azure: Data Submitters' guide

Allie Cliffe

If you're interested in using Terra on Azure, please email terra-enterprise@broadinstitute.org.

This overview helps AnVIL data submitters who previously used the Terra Data Repository (TDR) on GCP get started staging and uploading data to TDR on Azure. For step-by-step staging instructions, see How to stage data in your AnVIL deposit workspace.

Prerequisites

This document assumes you have already registered your study data with AnVIL and defined the data model for your dataset. See Step 1: Register Study/Obtain Approvals and Step 2: Set Up a Data Model for detailed instructions. 

Process overview and requirements

You’ll first stage data in a dedicated Terra on Azure workspace. Then the AnVIL team will ingest the data files and tables into TDR. 

What’s the same

  • Log in with the same Terra ID (can be a Google or GSuite account)
  • Stage data in a deposit workspace

What’s different in Azure

  • View the workspace cloud storage directory in the workspace itself (not in the GCP console)
  • Azure-specific directory structure for CSVs and data files
  • Workspace storage identified with a SAS URL (updated every 8 hours)
  • Upload large files with AzCopy (versus gsutil)
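
Because the workspace SAS URL is refreshed every 8 hours, it can help to check that the URL you copied is still valid before starting a long AzCopy upload. The following is a minimal Python sketch, not an official AnVIL tool. It assumes the SAS URL carries the standard `se` (signed expiry) query parameter, and the helper names, storage account, container, and `dest_subdir` value are all hypothetical placeholders. Follow the Azure-specific directory structure given in the staging instructions rather than the example path.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit, urlunsplit


def sas_expiry(sas_url: str) -> datetime:
    """Return the expiry time stored in the SAS token's `se` parameter."""
    query = parse_qs(urlsplit(sas_url).query)
    if "se" not in query:
        raise ValueError("URL has no 'se' (signed expiry) parameter")
    # SAS expiry values are ISO 8601 UTC timestamps, e.g. 2030-01-01T08:00:00Z.
    expiry = datetime.fromisoformat(query["se"][0].replace("Z", "+00:00"))
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry


def expires_soon(sas_url: str, now: datetime, margin: timedelta) -> bool:
    """True if the SAS URL expires within `margin` of `now`."""
    return sas_expiry(sas_url) - now <= margin


def azcopy_upload_command(local_path: str, sas_url: str, dest_subdir: str = "") -> list:
    """Build (but do not run) an `azcopy copy` command for the upload.

    `dest_subdir` is a hypothetical folder name; it is inserted into the
    URL path so that the SAS token in the query string is left untouched.
    """
    parts = urlsplit(sas_url)
    path = parts.path.rstrip("/")
    if dest_subdir:
        path = f"{path}/{dest_subdir.strip('/')}"
    dest = urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
    return ["azcopy", "copy", local_path, dest, "--recursive"]
```

To use it, paste the SAS URL from your workspace, call `expires_soon(url, datetime.now(timezone.utc), timedelta(hours=1))`, and get a fresh URL from the workspace if it returns `True`. The list returned by `azcopy_upload_command` can be passed to `subprocess.run` once AzCopy is installed.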

For additional data submission support, reach out to the AnVIL Support team at anvil-data@broadinstitute.org.

AnVIL provides data submitters with a submission workspace where you will stage data for ingestion (large data files and CSV files for each dataset table).

As the data submitter, you’re expected to abide by the following guidelines:

  • Only upload data from the current approved data submission.
  • Use a separate workspace to run any compute or analysis on this data unless you have prior approval from the AnVIL program. The WRITER role technically lets you run computations in this workspace, but you are not permitted to do so without approval.
  • Don’t copy or move primary data from this workspace without prior approval from the AnVIL program.

Next steps: Accessing the data

Note that once the data is ingested, you will be able to access it in TDR for analysis. Please do NOT clone this workspace for long-term use. This workspace will be deleted once your submission is complete.

Step-by-step instructions

Ready to submit data to AnVIL? See the resources below. 

 
