This document describes ways Broad users can retrieve data from Terra. Please refer to the documents for setting up a Google App Account (setting up a Google account with a non-Gmail address) and Terra Account if you have not yet registered on Terra. Your Broad Project Manager will notify you when your data is ready for download and will provide you with the name of the Terra workspace and backing Google bucket where you can find the data.
- Moving data from Terra to an on-premises location using the Terra interface
- Moving files from a Broad server to Terra using gsutil in a terminal
- DataShuttle is no longer supported
- File validation/checksum generation
- Engage BITS to help upload large amounts of data
- gcloud authorization error
- gsutil errors on Unix
Moving data from Terra to an on-premises location using the Terra interface
Recommended for small numbers of files (1 to 10)
Follow the instructions in Moving data to/from a workspace Google bucket, in the section labeled Upload/download through the Terra UI. If this is a Broad-owned Data Delivery workspace, all egress charges are covered by the Genomics Platform. If you have any concerns, please discuss them with your Project Manager.
Moving files from a Broad server to Terra using gsutil in a terminal
We recommend this option for all transfers, but it is especially well suited to large files or thousands of files.
First, initialize gsutil (moving data from a Broad server only)
Use the commands below on your laptop terminal:
ssh login
- Enter your password
use UGER
ish
use Google-Cloud-SDK
Note: You may see out-of-date messages; you can ignore them.
Before uploading/downloading data using gsutil, you can use the ls command to look at the buckets you have access to:
gsutil ls
to see all of the Cloud Storage buckets under your default project ID
gsutil ls -p [project name]
to list buckets for a specific project
Upload commands
gsutil cp [local file path] [bucket URL]
(You must be an Owner or Writer of the workspace to upload.)
The bucket URL is the path to your file or folder in Google Cloud Storage. It will have the format gs://[bucket name] or, for folders within a bucket, gs://[bucket name]/[folder name]
For example, to upload a file "Example.bam" into the folder "gene_files" in a bucket:
gsutil cp /Users/Documents/Example.bam gs://WorkspaceBucket/gene_files
Download commands
gsutil cp [bucket URL]/[file name] [local file path]
Make sure to leave a space between the bucket URL and the local file path:
gsutil cp gs://WorkspaceBucket/gene_files/Example.bam /Users/Documents
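For transfers involving whole folders or thousands of files, the single-file commands above can be wrapped in a small helper. The following is a minimal sketch, not an officially supported tool: the function names copy_tree and is_bucket_url are hypothetical, and it assumes gsutil is already initialized in your shell. It relies on two standard gsutil options, -m (run transfers in parallel) and cp -r (copy a folder recursively).

```shell
# Sketch only: copy_tree and is_bucket_url are hypothetical helper names.

# Return success if the argument looks like a bucket URL (gs://...).
is_bucket_url() {
  case "$1" in
    gs://*) return 0 ;;
    *) return 1 ;;
  esac
}

# copy_tree SRC DEST
# Recursively copy a folder to or from a bucket, in parallel.
# At least one of SRC or DEST must be a gs:// URL.
copy_tree() {
  if ! is_bucket_url "$1" && ! is_bucket_url "$2"; then
    echo "error: neither path is a gs:// bucket URL" >&2
    return 1
  fi
  # -m parallelizes the transfer; -r recurses into subfolders.
  gsutil -m cp -r "$1" "$2"
}

# Example calls (bucket and folder names are the placeholders used above):
#   copy_tree /Users/Documents/gene_files gs://WorkspaceBucket
#   copy_tree gs://WorkspaceBucket/gene_files /Users/Documents
```

The check on the two paths is only a guard against typos such as a missing gs:// prefix; gsutil itself performs the actual transfer and checksum validation.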
DataShuttle is no longer supported
Please note that DataShuttle is not actively supported. As such, we are unable to troubleshoot any bugs or error messages associated with its use.
Contact BITS for help with moving large amounts of data in and out of the cloud (Broad Institute community members only)
If you are a member of the Broad Institute community, BITS is a great resource to help migrate large amounts of on-prem data to the cloud in a cost-effective way! You can read more about their support offerings here.
File validation / checksum generation
Per Google: At the end of every upload or download the gsutil cp command validates that the checksum it computes for the source file/object matches the checksum the service computes. If the checksums do not match, gsutil will delete the corrupted object and print a warning message. This very rarely happens, but if it does, please contact email@example.com.
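If you want to double-check a file yourself after a transfer, one option is to compare the MD5 of your local copy with the MD5 that Cloud Storage reports for the object. The sketch below is an illustration under stated assumptions, not part of gsutil: the function names are hypothetical, and it assumes a Linux system with GNU coreutils (md5sum, base64, od). The remote value is the base64 string shown on the "Hash (md5)" line of gsutil ls -L [bucket URL]/[file name]; some objects (for example, composite objects) report only a CRC32C and no MD5, in which case this check does not apply.

```shell
# Sketch only: function names are hypothetical; assumes GNU coreutils.

# Print the hex MD5 of a local file.
local_md5_hex() {
  md5sum "$1" | cut -d ' ' -f 1
}

# Convert a base64-encoded MD5 (as shown by `gsutil ls -L`) to hex.
b64_to_hex() {
  printf '%s' "$1" | base64 -d | od -An -tx1 | tr -d ' \n'
}

# md5_matches LOCAL_FILE REMOTE_MD5_BASE64
# Succeeds if the local file's MD5 equals the remote MD5.
md5_matches() {
  [ "$(local_md5_hex "$1")" = "$(b64_to_hex "$2")" ]
}

# Example call (paste the "Hash (md5)" value reported for your object):
#   md5_matches /Users/Documents/Example.bam "[Hash (md5) value]" && echo "checksums match"
```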
gcloud authorization error
You may have trouble accessing your Terra workspaces if you have authorized your Google Cloud SDK (gcloud) installation with a Google Account that is not registered in Terra and added to your workspace. You can verify which Google Account you’ve authorized with gcloud by running the following command:
gcloud auth list
- If the Google ID returned matches the one on your Terra workspace, you should be able to access your workspace. If you still cannot, please contact your Project Manager.
- If the Google ID returned does not match the one on your Terra workspace, run the following command to specify the correct account:
gcloud auth login [Google account]
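The comparison described above can also be scripted. This is a minimal sketch with hypothetical function names; it assumes gcloud is on your PATH and uses the standard --filter and --format options of gcloud auth list to print only the active account.

```shell
# Sketch only: function names are hypothetical.

# Print the account gcloud is currently authorized with.
active_gcloud_account() {
  gcloud auth list --filter=status:ACTIVE --format="value(account)"
}

# check_gcloud_account EXPECTED_ACCOUNT
# Succeeds if the active gcloud account is the one registered on your Terra workspace.
check_gcloud_account() {
  active="$(active_gcloud_account)"
  if [ "$active" = "$1" ]; then
    echo "OK: gcloud is using $active"
  else
    echo "Mismatch: gcloud is using '${active:-no account}', expected '$1'" >&2
    echo "To switch, run: gcloud auth login $1" >&2
    return 1
  fi
}

# Example call:
#   check_gcloud_account "[Google account]"
```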
gsutil errors on Unix
When working on a Unix system without a desktop browser (for example, a remote server), you will need to tell gcloud not to try to launch a browser. It then prints a URL that you can paste into the browser on your own computer.
To tell gcloud not to launch a browser, use the command:
gcloud auth login --no-launch-browser