Make your own project workspace

Allie Hajian

Cloning a workspace makes a copy of the workspace under your own billing project. The clone is completely independent of the original: you are the owner and sole user until you choose to share your copy with someone else. You can experiment with code and data in your copy without the risk of affecting a group workspace. Cloning is useful if you want to run computations in a workspace where you only have read access. For example, cloning a Featured Workspace that covers the research tasks you want to do is one way to jump into using Terra without writing your own scripts or notebooks.

For information about collaborating in the same workspace by sharing it, see How to share a workspace.

Step-by-step guide to cloning a workspace

See the video and step-by-step guide below to learn how to clone a workspace in Terra.

Step 1. Locate the three vertical dots icon

  • From within the workspace

    Click on the three vertical dots on the top right of your workspace Dashboard page.

    Clone-workspaces_Dashboard_Screen_shot.png

  • From the Workspaces page

    1.1. To access the full list of Terra workspaces available to you, click on View Workspaces from the Terra homepage app.terra.bio.
    S17d_May31_2019.png

    1.2. Click on the three vertical dots at the lower right of the card of the workspace you want to clone.
    Clone-workspaces_Workspace-card_Screen_shot.png

Step 2. Complete the clone form

2.1. In the popup menu, choose Clone.
S17b_May31_2019.png

2.2. Type in your new workspace name, choose a billing project and a Workspace bucket location (see Customizing where your data are stored and analyzed, or US Multiregional versus Regional bucket tradeoffs), and enter an Authorization Domain if needed (see Managing data privacy and access with Authorization Domains).

Clone-workspace_Screen_shot.png

2.3. Then click the Clone workspace button at the bottom of the form. 

You should see the newly cloned workspace in the "My Workspaces" section of the Workspaces page immediately.
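The same form fields can also be assembled programmatically. The snippet below is a minimal sketch, not Terra's official client: the helper names `build_clone_request` and `clone_path` are hypothetical, and the endpoint path and JSON field names (`namespace`, `name`, `attributes`, `bucketLocation`, `authorizationDomain`) are assumptions based on the FireCloud workspace API that should be checked against the current API documentation. It only builds the request and makes no network call.

```python
from typing import Optional
from urllib.parse import quote


def build_clone_request(new_name: str,
                        billing_project: str,
                        bucket_location: Optional[str] = None,
                        auth_domain: Optional[str] = None) -> dict:
    """Collect the clone-form fields into a JSON-ready dict (assumed field names)."""
    if not new_name.strip():
        raise ValueError("workspace name is required")
    if not billing_project.strip():
        raise ValueError("billing project is required")

    body = {
        "namespace": billing_project,  # billing project that will own the clone
        "name": new_name,              # name of the new workspace
        "attributes": {},              # no extra workspace attributes in this sketch
    }
    if bucket_location:
        body["bucketLocation"] = bucket_location  # assumed field name
    if auth_domain:
        # Assumed shape: a list of group references
        body["authorizationDomain"] = [{"membersGroupName": auth_domain}]
    return body


def clone_path(source_project: str, source_name: str) -> str:
    """Build the (assumed) URL path for cloning the source workspace."""
    return "/api/workspaces/{}/{}/clone".format(
        quote(source_project, safe=""), quote(source_name, safe="")
    )


if __name__ == "__main__":
    # Placeholder names for illustration only
    print(clone_path("source-billing-project", "source-workspace"))
    print(build_clone_request("my-copy", "my-billing-project"))
```

Keeping the payload construction separate from the HTTP call makes it easy to inspect what would be sent before cloning anything.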

Troubleshooting tip

If you get an error message that includes the phrase "Precondition failed" (see screenshot below), you have likely exceeded your Google project quota. To learn more about Google project quotas and how to request more, see Google Cloud quotas: What are they and how do you request more.

Project-quota-error-message-when-creating-too-many-workspaces_Screen_shot.png
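If you script workspace creation, the same check can be applied to the error text. This is a minimal sketch: the function name `diagnose_clone_error` is hypothetical, and it assumes the error message is available to you as a plain string.

```python
def diagnose_clone_error(message: str) -> str:
    """Map a clone error message to the likely cause described in the tip above."""
    # Case-insensitive match on the phrase from the troubleshooting tip
    if "precondition failed" in message.lower():
        return ("Likely cause: Google project quota exceeded. "
                "Consider requesting more quota.")
    return "No known cause matched. Read the full error message."


if __name__ == "__main__":
    print(diagnose_clone_error("Error: Precondition failed"))
```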

Building workspaces using the Terra Library

Terra has three libraries that can help when you are building a project workspace. To access the libraries, click the main menu icon (three horizontal lines) at the top left of any page and open the "Library" submenu.  

S51e_Workspaces_libraries_Screen_Shot.png

  • Terra hosts both open- and controlled-access datasets. Some datasets have built-in functionality ("Data Explorers") for browsing the data. Explorers also let you use selection criteria to create custom subsets (cohorts) of data. Note that the number of available datasets, and the amount and type of data available in each, are growing, including public-access datasets.

    Accessing controlled data from the Terra Data Library

    Registering for Terra does not automatically mean you can access all the hosted data. To access restricted data, you must be added to an Access policy for that resource. When you try to view data you don't have access to, you'll be prompted to request access.

    G15a_May13_2019.gif

    Terra Library datasets

    Note that since the Data Library is always expanding, there may be additional datasets available on Terra. 

    AnVIL - CCDG, CMG, GTEx, eMERGE

    The Genotype-Tissue Expression (GTEx) Program established a data resource and tissue bank to study the relationship between genetic variation and gene expression in multiple human tissues.

    STAGE - Trans-Omics for Precision Medicine (TOPMed)

    Sponsored by the National Institutes of Health's National Heart, Lung, and Blood Institute (NHLBI), TOPMed is a program to generate scientific resources to enhance our understanding of fundamental biological processes that underlie heart, lung, blood, and sleep disorders (HLBS).

    Nurses' Health Study (NHS)

    The Nurses' Health Study and Nurses' Health Study II are among the largest investigations into the risk factors for major chronic diseases in women.

    Human Cell Atlas (HCA)

    The Human Cell Atlas is made up of comprehensive reference maps of all human cells — the fundamental units of life — as a basis for understanding fundamental human biological processes and diagnosing, monitoring, and treating disease.

    The Encyclopedia Of DNA Elements (ENCODE)

    The ENCODE project aims to delineate all functional elements encoded in the human genome. To this end, ENCODE has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification.

    TCGA, TARGET

    The Cancer Genome Atlas (TCGA) is a dataset comprising more than two petabytes of genomic data, produced in a collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI).

    Using the Data Explorer

    If you have permission to view a dataset, clicking browse data will open a data explorer, where you can create your own subset of data (cohort) by choosing criteria relevant to your research. You can then export the subset to your workspace. It will appear as a Cohort table in the Data tab. 

    G15b_May13_2019.gif

    In the clip above, we create a unique cohort by filtering for patients with a few types of cancers, and then limiting the cohort by gender. Clicking ExportSend reveals an import data screen where you can select the workspace to send your custom cohort:

    S9_Jan29_2019.png

    For step-by-step instructions on how to use custom cohorts with SQL and BigQuery, see Accessing and analyzing custom cohorts with Data Explorer.
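To make the idea of a cohort concrete, the sketch below mimics the kind of selection shown in the Data Explorer clip: keep participants with certain cancer types, then limit by gender. It is only an illustration in plain Python. The function name `select_cohort`, the field names `cancer_type` and `gender`, and the sample records are all hypothetical and do not reflect the schema of any Terra dataset.

```python
from typing import Dict, Iterable, List


def select_cohort(participants: Iterable[Dict[str, str]],
                  cancer_types: Iterable[str],
                  gender: str) -> List[Dict[str, str]]:
    """Return participants matching any of the cancer types and the given gender."""
    wanted = {c.lower() for c in cancer_types}
    return [
        p for p in participants
        if p["cancer_type"].lower() in wanted
        and p["gender"].lower() == gender.lower()
    ]


if __name__ == "__main__":
    # Made-up records for illustration only
    records = [
        {"id": "p1", "cancer_type": "Breast", "gender": "female"},
        {"id": "p2", "cancer_type": "Lung", "gender": "male"},
        {"id": "p3", "cancer_type": "Lung", "gender": "female"},
    ]
    cohort = select_cohort(records, ["lung", "breast"], "female")
    print([p["id"] for p in cohort])
```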

  • One of the best ways to get started in Terra is to explore the curated Showcase Workspaces in the Library (access from the dropdown in the main navigation menu at the top left). These workspaces span a variety of use cases to show how peers are designing similar experiments. They are standardized for completeness and ease of use.

    Classes of Showcase workspaces

    • Tutorial workspaces 
    • Specific analysis tools (e.g. WDLs, Jupyter Notebooks, Hail, Bioconductor)
    • Experimental strategies (e.g. GWAS, Exome analysis, RNA-seq)
    • Scientific domains (Cancer, infectious diseases, single-cell, immunology)

    They're great as templates or to help reproduce instructive results and learn established methodologies - and you don't have to be logged into Terra to see them (though you will have to log in to make your own copy to work in!).

    Using Showcase workspaces as templates

    You should find enough detail in the workspace description to enable you to analyze the included sample data. Cost and time estimates give you the confidence to run on your own data, if you want.

    Featured-Workspaces_Screen_shot.png

    Note that you can get to the Featured Workspaces page using the navigation menu at the top left of any screen in Terra.

  • The Code and Workflows Library contains GATK Best Practices workflows and links to both Dockstore and the Broad Methods Repository (look in the right column). You'll find workflow components to run individually or to string together in these workflow repositories. Note that all workflows in Terra use Workflow Description Language (WDL).

    S24_May31_2019.png
