First steps for submitting data to AnVIL/TDR: register your study and obtain approvals from the appropriate governing body.
Summary
AnVIL strives to balance two goals: making data as widely and freely available as possible, and safeguarding the rights and privacy of subjects who participate in NIH-sponsored research.
This includes protecting the confidential and proprietary genomic and phenotypic data of individual human participants. See information about the Genomic Data Sharing policy here.
Before submitting data to the Terra Data Repository (TDR), make sure your data are compatible with AnVIL by completing the AnVIL Dataset Onboarding Application form and registering your data with the appropriate NCBI registration system.
How to register your study and obtain approval
Steps to complete
- Obtain approval
- Register study
- Apply to AnVIL
For details, see step-by-step instructions on the AnVIL portal.
A note about Functional Equivalence (FE)
To maximize the value of AnVIL-hosted data and minimize batch effects in cross-project analyses (Regier et al., 2018), the CCDG and TOPMed consortia have defined a functional equivalence (FE) standard for the alignment and processing of whole-genome sequencing (WGS) data. AnVIL strongly encourages submitting FE-compliant genome and exome sequencing data aligned to GRCh38. (See the CCDG pipeline standard.)
FE is important for downstream joint calling across datasets, but it is difficult to prove. There is no easy way for AnVIL to validate, or for the submitter to prove, that submitted data were aligned and processed with an FE pipeline.
If you are unsure whether your data are functionally equivalent, the AnVIL ingestion team may reach out to you to review your dataset before submission.
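Although FE compliance cannot be proven from a file alone, a submitter can at least check which reference build and which tools an alignment file's header records before talking to the ingestion team. The following is a minimal sketch, not an AnVIL tool: it assumes you have already dumped the header to text (for example with `samtools view -H`), and the function names `parse_header` and `looks_like_grch38` are hypothetical. It only compares the recorded length of chromosome 1 with the GRCh38 value, so a passing result suggests the GRCh38 build was used and says nothing about the rest of the pipeline.

```python
# Length of chromosome 1 in the GRCh38 primary assembly (GRCh37 is 249,250,621).
GRCH38_CHR1_LENGTH = 248_956_422


def parse_header(header_text):
    """Collect contig lengths (@SQ) and program records (@PG) from SAM header text."""
    contigs = {}
    programs = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        # Header fields after the record type look like TAG:VALUE.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if fields[0] == "@SQ" and "SN" in tags and "LN" in tags:
            contigs[tags["SN"]] = int(tags["LN"])
        elif fields[0] == "@PG":
            name = tags.get("PN", tags.get("ID", "?"))
            programs.append((name, tags.get("VN", "?")))
    return contigs, programs


def looks_like_grch38(contigs):
    """Heuristic only: compare the chr1 length with the GRCh38 value."""
    length = contigs.get("chr1", contigs.get("1"))
    return length == GRCH38_CHR1_LENGTH


if __name__ == "__main__":
    # Illustrative header text; a real header would list every contig.
    example = (
        "@HD\tVN:1.6\tSO:coordinate\n"
        "@SQ\tSN:chr1\tLN:248956422\n"
        "@PG\tID:bwa\tPN:bwa\tVN:0.7.15\tCL:bwa mem -K 100000000 -Y ref.fa\n"
    )
    contigs, programs = parse_header(example)
    print("Reference looks like GRCh38:", looks_like_grch38(contigs))
    for name, version in programs:
        print("Program recorded in header:", name, version)
```

The program records listed by this sketch can then be compared by hand against the tools and versions named in the CCDG pipeline standard; the sketch itself makes no judgment about them.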