Step-by-step instructions to run the Integrative Genomics Viewer (IGV) in a notebook in Terra on Azure. See the IGV in Terra on Azure Featured Workspace.
Integrative Genomics Viewer: Overview
The Integrative Genomics Viewer (IGV) is a widely used visualization tool for exploring genomic data. It supports a range of genomic data types, including aligned sequence reads, mutations, gene expression, and genomic annotations. With IGV, you can zoom and explore genomic data at levels of detail ranging from the whole genome down to a single base pair. IGV can load data from both cloud-based and local sources. To learn more about IGV, please refer to Robinson, J., Thorvaldsdóttir, H., Winckler, W. et al. Integrative genomics viewer. Nat Biotechnol 29, 24–26 (2011).
Using IGV in Terra on Azure
You can use IGV in Terra on Azure by running a Jupyter notebook. Jupyter notebooks are a common tool for interactive genomic data analysis.
IGV in Terra (Featured Workspace)
The instructions below walk through running IGV in a notebook using the IGV in Terra on Azure Featured Workspace. The Featured Workspace demonstrates the basics of using JupyterLab and IGV in Terra on Microsoft Azure with data from the Microsoft Genomics Data Lake.
Step 1: Clone the Featured Workspace
To create compute resources and run notebooks in Terra, you need edit and execute permissions in a workspace. By cloning this read-only Featured Workspace, you create an identical workspace where you have owner permissions. All compute costs are charged to the Terra billing project you assign during workspace creation.
Step-by-step instructions
1.1. Click the action icon (three vertical dots) in the upper right-hand corner of the workspace.
1.2. Select Clone from the drop-down menu.
1.3. Give your new workspace a unique name (Note: It may help to write down the name so you can find the workspace later).
1.4. Choose an Azure billing project.
1.5. Click the Clone Workspace button.
What to expect
Terra will navigate to your brand new workspace where you can get to work!
Step 2: Launch a JupyterLab Cloud Environment
2.1. Navigate to the Analyses tab in your workspace.
2.2. Click on the notebook file IGV on Terra on Microsoft Azure.ipynb.
2.3. Select the Open button, which will prompt you to start an Azure Cloud Environment. This is a virtual machine that powers your interactive experience.
2.4. Under Cloud Compute Profile, select the Standard_DS2_v2 profile (2 CPUs, 7 GB of memory) and click Create.
VM creation may take 10-11 minutes to complete.
Step 3: Run the IGV notebook
3.1. Click into the first code cell.
3.2. Click the Play button, or press Shift+Enter, to execute each cell in the notebook one at a time.
How long will it take to run? Going through the notebook will take approximately 5 minutes and cost less than a dollar.
Notebook outline
1. Download sample data from the Azure Genomics Data Lake.
2. Install igv-jupyter and igv-notebook libraries.
Coloring options for variant (VCF) tracks.
Sample submission for .bam files
3. Optionally, save data from your VM to your workspace storage container for longer-term storage.
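To give a sense of what the visualization cells involve, here is a minimal sketch of an IGV browser configuration with a variant (VCF) track that sets genotype colors and an alignment (.bam) track. It is not the Featured Workspace notebook's own code: the helper names `build_igv_config` and `show_in_notebook`, the example.org URLs, the locus, and the color values are placeholders chosen for illustration, and the sketch assumes a bgzipped, tabix-indexed VCF and an indexed BAM.

```python
def build_igv_config(genome, locus, vcf_url, bam_url):
    """Return an igv.js-style browser configuration as a plain dict."""
    variant_track = {
        "name": "Variants",
        "type": "variant",
        "format": "vcf",
        "url": vcf_url,
        "indexURL": vcf_url + ".tbi",      # assumes a tabix index next to the VCF
        "displayMode": "EXPANDED",
        # Genotype colors (illustrative values, not the notebook's settings)
        "homvarColor": "rgb(17,248,254)",  # homozygous variant calls
        "hetvarColor": "rgb(34,12,253)",   # heterozygous calls
        "homrefColor": "rgb(200,200,200)", # homozygous reference calls
    }
    alignment_track = {
        "name": "Alignments",
        "type": "alignment",
        "format": "bam",
        "url": bam_url,
        "indexURL": bam_url + ".bai",      # assumes a BAM index next to the BAM
    }
    return {
        "genome": genome,
        "locus": locus,
        "tracks": [variant_track, alignment_track],
    }


def show_in_notebook(config):
    """Render the browser in a Jupyter cell (requires the igv-notebook package)."""
    import igv_notebook  # imported here so the config helper works without it

    igv_notebook.init()
    return igv_notebook.Browser(config)


# Placeholder inputs: replace with files downloaded in the notebook.
config = build_igv_config(
    genome="hg38",
    locus="chr1:1,000,000-1,001,000",
    vcf_url="https://example.org/sample.vcf.gz",
    bam_url="https://example.org/sample.bam",
)
```

In a notebook cell, calling `show_in_notebook(config)` would then display the browser. Keeping the configuration in a plain dictionary makes it easy to inspect or adjust track options before rendering.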
Step 4: Delete the JupyterLab Virtual Machine
Once you complete the tutorial notebook, you'll want to delete the virtual machine to avoid incurring any more cloud compute costs. You can keep all data generated during your notebook session in the Persistent Disk, or save it in the workspace blob storage by running the example code in the notebook before deleting your environment.
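As a rough illustration of saving a file to workspace blob storage before deleting the environment, the sketch below builds a destination URL inside a storage container and the matching `azcopy copy` command. The notebook's own example code is the authoritative version. This sketch assumes you already have a container URL with a SAS token and that `azcopy` is available on the VM. The helper names, the account and container names, and the token are hypothetical placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_destination(container_sas_url, blob_name):
    """Insert a blob path into a container URL while keeping the SAS query string."""
    parts = urlsplit(container_sas_url)
    path = parts.path.rstrip("/") + "/" + blob_name.lstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def azcopy_command(local_path, container_sas_url, blob_name):
    """Build the argument list for copying one local file into the container."""
    return ["azcopy", "copy", local_path, blob_destination(container_sas_url, blob_name)]


# Placeholder container URL and token, for illustration only.
example_url = "https://exampleaccount.blob.core.windows.net/sc-example?sv=2021&sig=PLACEHOLDER"
cmd = azcopy_command("results/calls.vcf.gz", example_url, "igv-tutorial/calls.vcf.gz")

# To run the copy from a notebook cell, something like:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems with the `&` characters in the SAS token.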
4.1. Click the Jupyter icon in the right side panel.
4.2. In the Cloud Environment Details pane, select Settings.
4.3. In the bottom-left corner of the Azure Cloud Environment window, click Delete Environment.