You'll discover diverse biomedical datasets with hundreds of thousands of subjects in the cloud. But are all these data useful? The Terra Data Catalog can help you find and analyze the data you need for your study quickly and easily.
Terra's integrated Data Catalog is designed to make it easier to find and analyze relevant data:
- Quickly search and filter datasets hosted by Terra.
- Once you find interesting data, export it directly to your Terra workspace to use in a workflow or interactive analysis.
To access the Data Catalog, go to the main navigation menu > Library > Datasets and toggle New Catalog ON.
Streamlining the exploratory process
- Search and filter on dataset metadata rather than on dataset-specific research fields
- Access and search datasets that reside in different systems (currently Data Repo and workspaces - and eventually external repositories) in one place
- Request access to controlled data (coming soon)
- A similar experience regardless of dataset, consortium, or source
How to search and filter datasets
Search targets dataset metadata rather than dataset-specific research fields. In this example, searching for "leukemia" surfaces one dataset. Filter categories include:
- Access type
- Data use policy
- Data modality
- File type
How to filter
The oval beside each filter includes the number of datasets with that filter. Click on the oval to filter (the oval will turn green).
All relevant datasets are listed in the All datasets column, with some basic information:
dataset name | consortium | number of subjects | data modality | last updated
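As a rough illustration of what the number in each oval represents, the sketch below counts how many datasets carry each value of a filter category. The records, field names, and the `filter_counts` helper are hypothetical and are not part of Terra; the catalog computes these counts for you.

```python
from collections import Counter

# Hypothetical dataset records; names, fields, and values are illustrative only.
datasets = [
    {"name": "Dataset A", "data_modality": "Epigenomic"},
    {"name": "Dataset B", "data_modality": "Proteomic"},
    {"name": "Dataset C", "data_modality": "Epigenomic"},
]

def filter_counts(records, category):
    """Count how many records carry each value of the given category."""
    return Counter(r[category] for r in records if category in r)

counts = filter_counts(datasets, "data_modality")
for value, n in sorted(counts.items()):
    print(f"{value}: {n}")
```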
Filtering datasets example
When using multiple filters
Note: Filters can use either "and" or "or" logic.
Selecting filters across categories enforces “and” logic (datasets must satisfy every condition):
If you select “Granted” as the “Access type” and “ClinVar Annotations” as the consortium, you will see only the ClinVar datasets that you have permission to access.
Selecting filters within a category enforces “or” logic (datasets must satisfy at least one condition):
If you select the disease types “Adrenal carcinoma” and “Bladder”, the results will include every dataset tagged with either disease type.
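The combination rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how the Data Catalog is implemented: the dataset records, the field names (`access_type`, `consortium`, `disease`), the "Controlled" and "Other Consortium" values, and the `matches` and `apply_filters` helpers are all hypothetical.

```python
# Hypothetical dataset records; names, fields, and values are illustrative only.
DATASETS = [
    {"name": "Dataset A", "access_type": "Granted",
     "consortium": "ClinVar Annotations", "disease": "Adrenal carcinoma"},
    {"name": "Dataset B", "access_type": "Controlled",
     "consortium": "ClinVar Annotations", "disease": "Bladder"},
    {"name": "Dataset C", "access_type": "Granted",
     "consortium": "Other Consortium", "disease": "Bladder"},
    {"name": "Dataset D", "access_type": "Granted",
     "consortium": "Other Consortium", "disease": "Leukemia"},
]

def matches(dataset, selected):
    """Return True when the dataset passes every selected category.

    Across categories the checks are combined with "and" (all),
    while within one category any selected value is enough ("or").
    """
    return all(
        dataset.get(category) in values
        for category, values in selected.items()
        if values  # a category with nothing selected does not restrict results
    )

def apply_filters(datasets, selected):
    """Keep only the datasets that satisfy the selected filters."""
    return [d for d in datasets if matches(d, selected)]

# "and" across categories: Granted access AND the ClinVar Annotations consortium.
across = apply_filters(DATASETS, {
    "access_type": {"Granted"},
    "consortium": {"ClinVar Annotations"},
})
print([d["name"] for d in across])

# "or" within a category: Adrenal carcinoma OR Bladder.
within = apply_filters(DATASETS, {
    "disease": {"Adrenal carcinoma", "Bladder"},
})
print([d["name"] for d in within])
```

Representing each category's selections as a set keeps the two rules separate: membership in the set gives the "or" within a category, and `all(...)` over the categories gives the "and" across them.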
How to explore datasets quickly
Clicking on the dataset name will surface the following.
- Dataset overview
Includes access type, donor size, sample size, data modality, data type, and file counts. It also lists contact information and data contributors, as well as the cloud infrastructure and region where the primary data files are stored.
- Data preview
Lets you drill down into the specifics of data included in the dataset as well as request access to controlled data.
- Export to Terra
You can export the data to a new or existing workspace for analysis.
1. Screenshot of TARGET Acute Myeloid Leukemia (AML) Project in the Terra Data Catalog
2. Screenshot of dataset preview example (participant table)
3. Screenshot of export to Terra destination example
What to expect
Data will be delivered as one or more tables in the workspace data page.