How to move data to/from a Google bucket

Allie Hajian

Learn how to upload data to, or download data from, your workspace bucket or an external Google bucket. The best approach depends on how many files you have and how large they are, whether you're moving data to or from local storage, and how comfortable you are with command-line tools.

Need to transfer files from one GCP bucket to another? To move files from one GCP bucket to another, preserving the file structure if specified, see this WDL (on GitHub).

This is useful if, for example, you want to copy files from a Terra workspace to an external GCS bucket. You can import the WDL into a Terra workspace and run it as a workflow.

Local storage <-> workspace Bucket (small numbers of small files)

When can you upload/download data in Terra? You can move files through the Terra website rather than the command line. This is the familiar kind of transfer where you upload or download a file through your web browser.

Restrictions on how you can use this feature
- Only for transfers between workspace storage and local storage (e.g., laptop).
- Recommended only for small numbers of small data files.

  • To upload files:

    1. Click on the folder icon on the right-hand pane of any tab in your workspace.

    Screenshot of the folder icon used to browse workspace files.

    2. Click on the upload button with a cloud icon in the upper left corner.

    3. Use the finder window to select the file(s) to upload.

  • To download files from the file browser:

    1. Click on the folder icon on the right-hand pane of any tab in your workspace.

    Screenshot of the folder icon used to browse workspace files.

    2. Find the file you want to download (you may have to navigate through the folder structure on the left-hand side of the screen to access the file you want).
    Screenshot of a folder in the workspace file explorer for an example workspace.

    3. Click on the file to download. This will open a pop-up window with instructions for downloading the data in multiple ways, and the cost of the download.

  • To download files from a data table:

    1. Start from the workspace's Data page.

    2. Click on the table with the data file to download on the left side of the screen. The example below is for the sample table.

    Any files available for download will be shown as a link in the sample row.
    Screenshot of sample table with two rows. In the first column is the unique sample ID (NA12878 and template_sample). In the second cram column is a clickable link to download the file.

    3. Click on a file link to open a pop-up window describing the size of the file, instructions for downloading the data in multiple ways, and the cost of the download. For example, in the screenshot below the file could be downloaded via a terminal command or by clicking a button.
    Screenshot of a pop-up window that appears when clicking on a file linked from a data table. The window lists the file's name, a preview of its contents, and the cost of downloading it. In addition, there are two ways to download the file: a button that says 'Download for <$0.01' and a copyable terminal command.

    4. Click on the “Download for [cost to download your file]” button to initiate the download. Note: This button starts the download immediately. You won't get another opportunity to verify before the download starts. However, you can cancel the download at any time during the process. 

    5. Repeat for any additional files you would like to download.

Transfer using gcloud storage (large files and/or large numbers)

When to use gcloud storage

  • Works well for all size transfers
  • Ideal for large file sizes or 1000s of files
  • Can be used for transfers between local storage and a Google bucket, between a workspace virtual machine (VM) or persistent disk and a Google bucket, and between Google buckets (external and workspace)

Diagram of three locations for data: local storage, workspace cloud storage (Google bucket) and Cloud Environment (VM disk or persistent disk). An arrow between the Cloud Environment and workspace storage shows that you can use workspace tools (cloud environment terminal) to move or copy files between the cloud environment and workspace storage.

What is the gcloud storage CLI? It is a Python application that lets you access Cloud Storage from the command line in a terminal.

The terminal you use can be run on your local machine (local instance) or built into the workspace Cloud Environment (workspace instance).

gcloud storage in a terminal - Step-by-step instructions

Step 1. Open gcloud storage in a terminal

You can run a terminal locally or in your workspace. Which one you use depends on where your data are located.

Which terminal instance should you use?

Moving data to or from the Cloud Environment VM/PD?
- Use the workspace terminal instance.

Moving data to or from local storage?
- Use a local terminal instance.

Google bucket to Google bucket transfer?
- You can use either instance.

  • Use for moving data to/from a cloud environment

    1.1. Start a Cloud Environment if one is not already running, as this is the virtual machine the terminal runs on.

    1.2. Scroll to the right of your workspace page to see these icons, which will lead you to one of the best-kept secrets of Terra: a command-line interface. Click on the (>_) icon to open what resembles a UNIX terminal.

    Screenshot of right sidebar with (from the top) the cloud environment rate, the cloud environment lightning logo, and the terminal logo

    Opening the terminal from an RStudio Cloud Environment
    If you're running a Cloud Environment with RStudio, you won't see the terminal icon on the right-hand panel, as shown above. Instead, follow the instructions in Using the terminal in RStudio to open the terminal.

    1.3. From here, you can perform command-line tasks including gcloud storage cp.

  • Use for moving data to/from local storage

    1.1. Follow Google’s instructions to install the Google Cloud SDK, which includes the gcloud storage CLI.

    1.2. Open a Google Cloud SDK shell and run gcloud init to authenticate. You will be asked to sign into your Google Cloud account and select your Google Cloud project.

    1.3. Set a default project name using gcloud config set project MY_PROJECT.

    1.4. Verify gcloud storage installation. To do this, run gcloud storage ls to see all of the Cloud Storage buckets you have access to.

    List the buckets for a specific project
    Run gcloud storage ls --project=PROJECT_ID to list buckets for a specific project. You will need to have owner access to the project to run this command.
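
For the local terminal, steps 1.3 and 1.4 can be collected into a small helper. This is only a sketch: the function name configure_gcloud_project is hypothetical, MY_PROJECT is a placeholder for your own project ID, and it assumes you have already run gcloud init once to sign in.

```shell
# Sketch only: MY_PROJECT in the usage line is a placeholder for your own project ID.
# Wrapping the steps in a function means nothing runs until you call it.
configure_gcloud_project() {
  local project_id="$1"

  # Stop early with a readable message if the Google Cloud SDK is not on the PATH.
  if ! command -v gcloud >/dev/null 2>&1; then
    echo "gcloud not found: install the Google Cloud SDK first" >&2
    return 1
  fi

  # Step 1.3: set the default project for later commands.
  gcloud config set project "$project_id" || return 1

  # Step 1.4: list the buckets you can access to confirm the installation works.
  gcloud storage ls
}

# Usage (after running `gcloud init` once to sign in):
#   configure_gcloud_project MY_PROJECT
```

If the final listing prints your bucket URLs, the installation and authentication are working.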

Step 2. Run gcloud storage commands 

Once in a terminal (either on your local machine or in a Terra workspace), you can copy data from one place to another using the cp command, which takes the source path followed by the destination path.


For example, to copy a file from one location in the workspace bucket to a folder called 'favorites' in the same bucket, your command would look something like this:

gcloud storage cp gs://fc-3dfd2d6a-d382-4c2b-b593-39651709b7bf/myFile.txt gs://fc-3dfd2d6a-d382-4c2b-b593-39651709b7bf/favorites

Finding the full path to workspace bucket

In Terra, you can find the full path to the workspace bucket in the Cloud Information box on the right-hand side of the workspace's Dashboard tab. Copy this path by clicking the clipboard icon on the right side of the path.

Screenshot showing the Cloud Information section of an example workspace's dashboard. An orange rectangle highlights the full path to the workspace's Google bucket.

You can find the full path to an individual file in the workspace by clicking on the clipboard icon to the right of the file's name in the files section of the workspace's data tab.

Additional details on the gcloud storage cp command can be found in the Google gcloud documentation.

You must be an Owner or Writer to upload to a Google bucket, including the workspace bucket!

  • To generate a manifest when uploading, use the -L option.

  • To copy the file "Example.bam" from an external bucket "gs://My_GCP_bucket" into the "gene_files" folder in a workspace bucket "gs://fc-7ac2cfe6-4ac5-4a00-add1-c9b3c84a36b7", use the command

    gcloud storage cp gs://My_GCP_bucket/Example.bam gs://fc-7ac2cfe6-4ac5-4a00-add1-c9b3c84a36b7/gene_files
  • To download data from a bucket, put the bucket URL first and the local file path second:

    gcloud storage cp [bucket URL]/[file name] [local file path]

    Make sure to leave a space between the bucket URL and the file path. For example:

    gcloud storage cp gs://WorkspaceBucket/GeneFiles/example.bam /Users/Documents

    Note that operating systems specify local file paths differently. For example, on a Windows system the local path in the example above might be Users\Documents.

    To download data from a bucket that is enabled with requester-pays, add the --billing-project flag to the command, set to the ID of the Google project that will be charged for the transfer.

    To learn more about accessing files from a requester-pays enabled Google bucket, see the Google requester pays docs.

  • If you're downloading folders, you'll need to use the -R flag to copy the folder and its contents:

    gcloud storage cp -R gs://EXAMPLE_BUCKET/FOLDER_1 LOCAL_FILE_PATH

    The cp command automatically runs parallel (multi-threaded/multi-processing) copies as needed. The --recursive flag is the long form of -R and also copies subdirectories. For example, to copy a local directory named top-level-dir, including its subdirectories, to a bucket, you can use:

    gcloud storage cp --recursive top-level-dir gs://EXAMPLE_BUCKET/FOLDER_1

    More gcloud instructions for working with large data can be found here, and an explanation of -m (the parallel-transfer option of the older gsutil tool) can be found here.
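
The folder download and the requester-pays case can be combined in one small helper. This is a sketch under stated assumptions rather than an official recipe: the function name download_folder is hypothetical, the bucket, folder, and project names are placeholders, and --billing-project is used here as the way to name the project that pays for a requester-pays transfer.

```shell
# Sketch only: the bucket, folder, and project names in the usage lines are placeholders.
download_folder() {
  local source_url="$1"            # e.g. gs://EXAMPLE_BUCKET/FOLDER_1
  local destination="$2"           # local directory to copy into
  local billing_project="${3:-}"   # optional: only needed for requester-pays buckets

  if [ -n "$billing_project" ]; then
    # Requester-pays: name the project that will be charged for the transfer.
    gcloud storage cp -R --billing-project="$billing_project" "$source_url" "$destination"
  else
    # Ordinary bucket: a plain recursive copy is enough.
    gcloud storage cp -R "$source_url" "$destination"
  fi
}

# Usage:
#   download_folder gs://EXAMPLE_BUCKET/FOLDER_1 LOCAL_FILE_PATH
#   download_folder gs://EXAMPLE_BUCKET/FOLDER_1 LOCAL_FILE_PATH MY_PROJECT
```

Keeping the billing project optional means the same helper works for workspace buckets and for external requester-pays buckets.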

File validation / checksum generation

At the end of every upload or download, the gcloud storage cp command validates that the checksum it computes for the source file/object matches the checksum the service computes. If the checksums do not match, gcloud storage will delete the corrupted object and print a warning message. You can learn more about this from Google's documentation. This very rarely happens, but if it does, please contact
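
If you want to double-check a transfer yourself, you can compute a checksum locally and compare it with the one reported for the object in the bucket. The sketch below is not how gcloud storage performs its own validation; it assumes the openssl and base64 tools are installed, the function name local_md5_base64 is hypothetical, and the file and bucket names in the usage lines are placeholders. Cloud Storage reports MD5 hashes in base64, so the helper prints the local MD5 in the same encoding.

```shell
# Print the MD5 of a local file in base64, the encoding Cloud Storage uses
# when it reports an object's MD5 hash.
local_md5_base64() {
  openssl md5 -binary "$1" | base64
}

# Usage: compare the two values by eye (placeholder file and bucket names).
#   local_md5_base64 example.bam
#   gcloud storage hash gs://EXAMPLE_BUCKET/example.bam
```

Some objects, such as those uploaded in several parts and then composed, may only carry a CRC32C checksum, in which case there is no MD5 value to compare against.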


The following are the most common errors users encounter when moving data using gcloud storage. If you experience a different error, please note the error in the comments of this article and contact Frontline Support by clicking on Contact Us under Support in the main Terra menu.

Screenshot of the menu used to contact Frontline Support. From the main Terra menu (three horizontal lines at the top left of any Terra window), expand the Support section and click on Contact Us, which is highlighted with an orange rectangle. 

  • You may have trouble accessing your Terra workspaces if you authorized your Google Cloud SDK installation with a Google Account that is not registered in Terra and granted access to your workspace. You can verify which Google Account you’ve authorized with gcloud by running the following command: gcloud auth list.

    1. If the Google ID returned matches the one on your Terra workspace, you should be able to access your workspace. If you still can't, please contact your Project Manager.

    2. If the Google ID returned does not match the one on your Terra workspace, run the following command to specify the correct account:
      gcloud auth login GOOGLE_ACCOUNT

  • When working on a Unix system, you need to tell gcloud not to try to start a browser. Once you do that, you should receive a URL you can paste into your desktop browser.

    To tell the system not to start a browser, use the command gcloud auth login --no-launch-browser
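
The account check described above can be scripted. This is a minimal sketch: the function name check_gcloud_account is hypothetical and the address in the usage line is a placeholder for the Google account you use in Terra. It asks gcloud for the active account only and compares it with the one you expect.

```shell
# Sketch only: pass the Google account you use in Terra as the first argument.
check_gcloud_account() {
  local expected="$1"
  local active

  # Ask gcloud for the currently active account only.
  active="$(gcloud auth list --filter=status:ACTIVE --format='value(account)')"

  if [ "$active" = "$expected" ]; then
    echo "gcloud is using $active"
  else
    echo "gcloud is using '${active}', expected '${expected}'" >&2
    echo "To switch accounts, run: gcloud auth login ${expected}" >&2
    return 1
  fi
}

# Usage:
#   check_gcloud_account you@example.org
```

On a machine without a browser, add --no-launch-browser to the suggested login command.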



  • Comment author
    Peter van Galen

    In the section "Step 1. Open gsutil in a terminal," if you have already set up an RStudio VM, there is no terminal shortcut as depicted in the "Workspace terminal instance" tab of this article. Instead, you can use the Terminal tab next to the Console tab in the RStudio interface.

  • Comment author
    Leyla Tarhan

    Thanks for catching that, Peter! I've updated the instructions in this article accordingly.
