This article summarizes the cloud components you'll use in Terra and how working in the cloud differs from working locally.
This is a living document. Check back here to see the current state of Terra on Azure.
Terra on Azure is a public preview release intended to allow users early access to tools and resources on Terra. Your candid feedback will help us improve the Terra experience as we develop and roll out additional functionality.
Overview: Terra on Azure
Terra is a cloud-native platform for storing and analyzing biomedical data whose mission is “to help accelerate research by integrating data, analysis tools, and built-in security components to deliver frictionless research flows from data to results.” This release of Terra uses Microsoft Azure’s cloud infrastructure for data analyses and storage.
Project data and tools - together in a Terra workspace
Whether you're interested in running pipelines, a statistical analysis, or visualizing your data, you can access and manage the tools and data you need in a Terra workspace dedicated to your project.
A workspace functions like a (very powerful) desktop computer, except that the working parts are all in the cloud and you operate it from your browser.
Browser-based and cloud-native
- Streamline your work by consolidating resources
- Access data stored in different cloud locations in a single analysis
- Collaborate seamlessly with built-in security and access controls
Preview disclaimers
Since this is a preview environment, features may change without notice. Also, note that we cannot guarantee you will not lose data.
The vision for Terra on Azure
Terra on Azure offers several major functional upgrades for Terra.
Refactoring existing features:
- Improves performance and scalability
- Makes it easier to integrate new analysis capabilities (such as upcoming support for additional workflow languages)
- Gives you, the user, maximum control of where all of your data in Terra is stored
Toward a unified Terra experience
Our vision is to iterate and improve these upgrades based on user feedback, starting with Terra on Azure Preview.
Once we validate that these changes meet current user needs and open opportunities for new user communities, we hope to implement many of these changes (as feasible) in Terra on Google.
Current costs of using Terra on Azure
Working in the cloud in Terra has infrastructure and resource costs, outlined below. Terra passes Azure cloud resource charges to the user's subscription with no additional markup.
Infrastructure cloud costs (per Terra Instance - i.e., Terra Billing project)
For maximum control over where your data is stored, increased scalability (ability to store large amounts of data with no effect on performance), and the flexibility to integrate additional analysis apps, we’ve transitioned some infrastructure from Terra-owned to user-owned.
- When you (or an IT Admin or collaborator) create a Terra Billing Project, Terra launches a distinct Terra Instance with Azure infrastructure resources costing about $5 per day and shared across all workspaces in the billing project.
- New workspaces have an additional fixed cost of about $5 per day for resources that power data tables. Additional charges will apply based on storage and compute usage within the workspace.
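The fixed costs above can be turned into a rough estimate. The following is a minimal sketch, assuming the approximate figures quoted in this article (about $5 per day per Terra Billing project and about $5 per day per workspace). The function and constant names are hypothetical and are not part of any Terra or Azure API. The estimate excludes storage and compute usage, and actual charges may differ.

```python
# Approximate fixed rates quoted in this article, in USD per day.
# These are assumptions for illustration; actual Azure charges may differ.
INSTANCE_COST_PER_DAY = 5.0   # per Terra Billing project (Terra Instance)
WORKSPACE_COST_PER_DAY = 5.0  # per workspace (resources that power data tables)


def estimate_fixed_cost(num_workspaces: int, days: int) -> float:
    """Rough fixed infrastructure cost for one billing project.

    Excludes variable storage and compute charges.
    """
    if num_workspaces < 0 or days < 0:
        raise ValueError("num_workspaces and days must be non-negative")
    daily_cost = INSTANCE_COST_PER_DAY + num_workspaces * WORKSPACE_COST_PER_DAY
    return daily_cost * days


if __name__ == "__main__":
    # One billing project with three workspaces, kept for 30 days.
    print(estimate_fixed_cost(num_workspaces=3, days=30))  # prints 600.0
```

Under these assumptions, a billing project with no workspaces still accrues the instance cost every day, which is why the fixed total grows with time even when no analyses are running.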
One of our top priorities is to drive these costs down while increasing performance and usability.
These infrastructure cloud costs accrue as long as you have a Terra Billing project/workspace
We have not yet released support for deleting a Terra Billing project once created. If you want to pause resources on your billing project to reduce costs after you start working in Terra on Azure Preview, please reach out to firstname.lastname@example.org for assistance.
How costs scale
Broadly, costs scale with the resources you use.
Minimum Base Cost = Terra Instance + multiple Workspaces (depending on the services used)
The Terra Instance powers the entire infrastructure (base cost = up to 5 Workspaces with multiple services - data tables, notebooks, and workflows). The base cost will accommodate more "smaller" workspaces (with fewer services running) - or fewer "larger" workspaces (with many services running).
The base cost increases as more workspaces are added.
Note that this cost model differs from that of Terra on Google (see Overview: Terra costs and Billing - GCP). We expect the platform functionality and cost models to align as we develop multi-cloud Terra.
Variable cloud costs (per Terra workspace)
Adding data to storage and running analyses will incur additional fees to cover the cloud resources used in the workspace. These costs are calculated following Azure’s pricing (see pricing in Overview: Costs and billing in Terra on Azure). Terra passes these costs along to users without any markup.
Workspace data tables
Data tables help store and organize data in an integrated, spreadsheet-like format. Primary data - including clinical data, demographics, or phenotypic data - can all be stored in data tables. Data tables can also keep links to genomic data files in cloud storage (workspace or external).
Data tables are hosted in a private relational database set up when you create a workspace. This makes data tables more scalable and gives you complete control over where (what geographic location) your data lives in Azure.
Who can see data tables?
All workspace collaborators can interact with data tables in Terra on Azure workspaces (for workspaces created after May 17, 2023). This means you can share tables with your collaborators and actively modify those tables together.
Data tables are copied to workspace clones
As of July 31, 2023, workspace clones include the data tables from the original workspace. Once the Workspace Data Service (WDS) is running in the clone, you will see the tables on the Data page.
You will need a new Terra Billing project
To take advantage of this feature, you will need to set up a new Terra Billing project following the step-by-step instructions - step 2.3 in Setting up team billing in Terra on Azure (admins).
Terra on Azure includes access to JupyterLab supported by Azure Data Science Virtual Machines (DSVM). This offering includes flexible VM and disk size configuration options, the option to use GPUs, detachable Persistent Disk storage, and a convenient file syncing service that automatically syncs your notebook (.ipynb) files with your workspace blob storage.
Select from four pre-configured cloud compute profiles and specify the Persistent Disk size in the Azure Cloud Environment setup pane. Cost estimates for the configuration will be displayed in the blue bar at the top.
WDL workflows with Cromwell
Terra on Azure includes three COVID-19-related workflows in every workspace (as of February 22, 2023), and additional curated workflows (as of March 2023). The workflows are automatically included when you launch Cromwell in the workspace.
Bring your own workflow
You can import workflows with a GitHub link or directly from Dockstore.
Workspace collaboration & sharing
Terra on Azure workspaces support multiple users with owner, writer, and reader permissions.
What is shared by all collaborators?
As of the first release, dashboard content and notebooks are visible to all collaborators.
Data tables are shared collaboratively among all users with access to the workspace and included in workspace clones.
What are single-user features?
Workflows are a single-user-per-workspace experience for now. Only the workspace creator can see and use Cromwell for launching workflows. Upcoming releases will focus on providing a collaborative analysis experience.
What is copied (when cloning a workspace)?
When you clone a workspace, the dashboard contents, data tables, and notebook files are copied into the new workspace, along with several pre-configured, curated workflows. Each workspace has its own blob storage, but files stored there are not copied to the cloned workspace.
Ready to get started using Terra on Azure? Follow the three steps below.
1. Set up an account on Terra
Register for a Terra account (see How to set up an account in Terra on Azure).
2. Set up billing
Finance admins/users with access to an existing Azure subscription must set up cloud billing and link it to a Terra Billing Project following step-by-step instructions in Setting up billing in Terra on Azure (admins).
3. Explore a tutorial workspace
Featured workspaces let you try out the platform with pre-configured sample data, analysis tools, and documentation to guide you.
- Bulk and single cell RNA Seq Analysis with Bioconductor workspace (JupyterLab-based)
- COVID-19-Surveillance tutorial workspace (workflows) and step-by-step guide.
Note that you will need to make your own clone, as Featured Workspaces are "read-only"!