AnVIL on GCP: Data Submitters' guide

Allie Cliffe

An overview of the data submission process to help AnVIL Data Submitters (GCP) get started staging and uploading data to TDR.

Prerequisites

This document assumes you have already registered your study data with AnVIL and defined the data model for your dataset. If your project is new and has not yet been approved, complete the AnVIL Onboarding Application first.

Process overview and requirements

For additional data submission support, contact the AnVIL Support team at anvil-data@broadinstitute.org.

AnVIL provides data submitters with a submission workspace where you stage data for ingestion. Staged data includes large data files (such as omics and image files) and TSV files for each dataset table.
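As a rough illustration of the table TSV files mentioned above, the following Python sketch serializes one table's rows as tab-separated text with a header row. The function name `table_to_tsv`, the column names, and the `gs://example-bucket` path are hypothetical placeholders, not AnVIL requirements; the actual tables and columns come from the data model you defined for your dataset.

```python
import csv
import io


def table_to_tsv(columns, rows):
    """Serialize a list of row dicts as TSV text with a header row.

    `columns` fixes the column order; a row containing a key that is
    not listed in `columns` raises ValueError (DictWriter's default).
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical table: the column names and file path are placeholders only.
columns = ["sample_id", "donor_id", "file_path"]
rows = [
    {
        "sample_id": "S1",
        "donor_id": "D1",
        "file_path": "gs://example-bucket/S1.cram",
    },
]

tsv_text = table_to_tsv(columns, rows)
```

Writing `tsv_text` to a file named after its table (one file per table) would give you the kind of file you stage alongside the large data files.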

As the data submitter, you’re expected to abide by the following guidelines:

Only upload data from the current approved data submission.

You must have prior approval from the AnVIL program to run any compute or analysis in AnVIL-owned workspaces, including the submission workspace. Cloning the submission workspace is not allowed, because clones may not have the enhanced monitoring and logging required for controlled-access data in a workspace.

Don’t copy or move primary data from this workspace without prior approval from the AnVIL program. 

Ready to submit data to AnVIL? Follow step-by-step instructions in the links below

Next steps: Accessing the data

Once the data is ingested, you'll be able to access it from TDR (for updates, for example) and via the AnVIL Data Explorer for analysis. You will also retain access to the data present in the submission workspace in a read-only capacity with Requester Pays enabled on the workspace bucket.

For more information on finding and using AnVIL data, see Terra Support articles for AnVIL researchers.  

Additional data model resources
