This article is a primer on using the Data Uploader feature on Terra. The Data Uploader lets you upload files and their associated metadata together, directly in a Terra workspace. If the primary key column of your metadata matches the name of a file (for example, you have a BAM file called sample1.bam and a row in your metadata file called sample1), the metadata is updated with the gs:// path to that file in the workspace bucket. This matters because the data table needs the gs:// path if you're using the data model to run your WDL.
Why use Data Uploader?
If you did the same thing through the workspace directly - by uploading the files to the workspace bucket and the metadata to the data tables in the Data Tab - you would need a way to get all of the bucket URLs (the "gs://..." filepaths) that point to your files so that you could add each path to the appropriate row in the metadata table. If you use Data Uploader, Terra wires this up for you, and you don't have to worry about programmatically generating the list of bucket URLs.
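To make the manual alternative concrete, here is a minimal Python sketch of the bookkeeping you would otherwise do yourself: match each metadata row to a bucket URL by file name and add the path as a new column. The function names, the column names sample_id and bam_path, and the bucket name are hypothetical. The matching rule (row ID equals the file name without its extension) is an assumption based on the sample1.bam example above, not a description of how Terra implements the feature.

```python
import posixpath


def file_stem(url):
    """Return the file name from a gs:// URL without its last extension."""
    name = posixpath.basename(url)
    stem, _extension = posixpath.splitext(name)
    return stem


def attach_gs_paths(rows, bucket_urls, id_column="sample_id", path_column="bam_path"):
    """Return copies of the rows, adding a bucket URL wherever the ID matches.

    Assumed rule: a row matches a URL when the row's ID equals the file stem,
    so "sample1" matches "gs://.../sample1.bam". Rows with no match are
    returned unchanged.
    """
    # Index the URLs by stem so each row needs only one dictionary lookup.
    by_stem = {file_stem(url): url for url in bucket_urls}
    updated = []
    for row in rows:
        new_row = dict(row)
        url = by_stem.get(row[id_column])
        if url is not None:
            new_row[path_column] = url
        updated.append(new_row)
    return updated


# Hypothetical metadata rows and bucket listing.
rows = [{"sample_id": "sample1"}, {"sample_id": "sample2"}]
urls = ["gs://example-bucket/sample1.bam"]
print(attach_gs_paths(rows, urls))
```

In this example only sample1 gains a bam_path value, because no file in the listing matches sample2.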
How do I use Data Uploader?
Currently, the only way to get to the Data Uploader is to use this URL: https://app.terra.bio/#upload. Follow the four steps below to upload data. Once you do, you'll be able to see the data in the Data Tab of the target workspace.
Step 1. Select target workspace
When you first arrive at https://app.terra.bio/#upload you'll see a workspace selection screen that lets you search the workspaces available to you by workspace name, tag, or billing project. Find the workspace to which you'd like to add data, and click on it to select it.
1.1. Go to https://app.terra.bio/#upload
1.2. Search for your desired target workspace and select it
Step 2. Create/select a collection
Once you've selected the target workspace, you'll be prompted to either create a "collection" or select an existing one. Collections are a way to organize your data submissions, for instance if you are adding data for different organisms, different experimental methodologies, or different sequencing technologies to the same workspace. If the structure of the metadata for a new set of files is the same as one that's part of an existing collection, you could just add to that collection. If the metadata structure is new to that workspace, you can create a new collection. In the example shown in the screenshot below, the workspace doesn't have any existing collections to choose from, so we would create a new one.
2.1. Click "create a new collection" (or select an existing collection, if you have any)
2.2. Name your collection and click "CREATE COLLECTION"
Step 3. Upload files
Once you have a collection selected, if you scroll down you'll see an area prompting you to upload your files.
3.1. Upload your files, either by dragging and dropping them onto the indicated part of the page or by clicking the blue plus button at the bottom right to browse the files on your computer.
3.2. Once you've begun the upload, you'll see a pop-up window showing the progress of the upload, along with an option to abort the upload.
3.3. Once the upload is complete, you'll need to click "NEXT >" to the right of where it says "UPLOAD YOUR DATA FILES"
Step 4. Upload metadata
The final step is to upload your metadata. If you've never constructed metadata to use with Terra's data model, review this article to understand how to create such metadata from scratch.
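As a rough illustration, the sketch below builds a small tab-separated metadata table using Python's standard csv module. It assumes the usual Terra load-file convention in which the first column header has the form entity:<table>_id (here, entity:sample_id). The helper name, the organism column, and the values are hypothetical placeholders.

```python
import csv
import io


def build_metadata_tsv(table_name, rows, columns):
    """Return TSV text whose first column holds the row IDs.

    rows maps each row ID to a dict of column values; missing values are
    written as empty strings. The "entity:<table>_id" header is an assumed
    convention for Terra data tables.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([f"entity:{table_name}_id"] + list(columns))
    for row_id, values in rows.items():
        writer.writerow([row_id] + [values.get(column, "") for column in columns])
    return buffer.getvalue()


# Hypothetical samples; the IDs are chosen to match files named
# sample1.bam and sample2.bam.
tsv_text = build_metadata_tsv(
    "sample",
    {"sample1": {"organism": "human"}, "sample2": {"organism": "mouse"}},
    ["organism"],
)
print(tsv_text)
```

Saving the returned text to a file with a .tsv extension gives you something you can use in the metadata upload step.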
4.1. After you click "NEXT >" in the data file upload step, you'll see a similar prompt to upload your metadata. Again, you'll have the option to either drag-and-drop, or click the blue upload button to select files from your local machine. Note that Data Uploader will only accept .TSV or .TXT files.
4.2. To complete this process, once the metadata file has completed its upload, just click "CREATE TABLE" (or "UPDATE TABLE").
Once this is done, you'll be able to see the data on the target workspace's Data page - the metadata will be in one of the data tables, and the data files themselves will be in your workspace bucket, in a directory called "uploads", inside a folder with the same name as your collection. You can see these files by navigating to the "Files" section of the workspace's Data page.
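Given that layout, the expected location of an uploaded file can be written out as in the Python sketch below. It is based only on the description above: the helper name, bucket name, and collection name are hypothetical, and the exact path Terra produces (for example, how it treats spaces or special characters in a collection name) may differ, so treat the "Files" section of the workspace as authoritative.

```python
def expected_gs_path(bucket_name, collection_name, file_name):
    """Compose gs://<bucket>/uploads/<collection>/<file> (assumed layout)."""
    return f"gs://{bucket_name}/uploads/{collection_name}/{file_name}"


# Hypothetical bucket and collection names.
print(expected_gs_path("example-bucket", "my-collection", "sample1.bam"))
# prints: gs://example-bucket/uploads/my-collection/sample1.bam
```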