This workspace uses JupyterLab in Terra on Azure to explore common Bioconductor packages for performing bulk RNA-seq differential expression analyses and for manipulating single-cell RNA-seq data.
Before proceeding with this tutorial, you will need to register on Terra and set up your Azure billing project.
Step 1: Set up your workspace
1.1. Go to app.terra.bio.
1.2. Log in with your Microsoft ID to use Terra on Azure.
1.3. From the welcome screen, navigate to Workspaces.
1.4. Select the Bulk and Single-Cell RNASeq Analysis with Bioconductor workspace from the Featured Workspaces list.
1.5. This Featured Workspace is read-only. To make your own copy of the workspace for completing the tutorial, go to the top right corner and select the circle with three vertical dots.
1.6. Select "clone" from the drop-down menu.
What to expect
Note: In the early phase of Terra on Azure Public Preview, only the dashboard and any available notebooks will be cloned.
Now you have your own copy of the workspace to explore! You'll see it in Your Workspaces (note that you will be the owner).
Step 2: Set up and launch the Jupyter app VM
To run a Jupyter analysis, you will need to set up the virtual machine that runs the Jupyter app. See How to customize and launch Jupyter Lab for step-by-step instructions.
2.1. From the Analyses tab in your workspace, click the link to the “edgeR” Jupyter notebook.
2.2. Once selected, you will be prompted to start an Azure Cloud Environment.
VM configuration
Under Cloud Compute Profile, select the Standard_DS2_v2, 2 CPU(s), 7 GB profile.
The environment may take 3-5 minutes to spin up.
Be careful of runaway costs
You will pay for the Jupyter VM as long as it is running! There is currently no autopause for notebooks running on Terra on Azure.
If you need to step away from your analysis, don’t forget to pause the cloud environment to keep from accruing costs.
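To get a feel for how quickly an unpaused VM adds up, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name and the hourly rate are hypothetical placeholders, not Azure or Terra pricing, so substitute the rate shown for your own compute profile.

```python
def estimate_vm_cost(hours_running: float, hourly_rate_usd: float) -> float:
    """Return the compute cost of a VM that runs for the given number of hours."""
    if hours_running < 0 or hourly_rate_usd < 0:
        raise ValueError("hours and rate must be non-negative")
    return hours_running * hourly_rate_usd


# ASSUMPTION: placeholder rate for illustration only, not a quoted Azure price.
HYPOTHETICAL_RATE_USD = 0.15

# A two-hour working session that is paused afterwards.
paused_after_session = estimate_vm_cost(2, HYPOTHETICAL_RATE_USD)

# The same session, but the VM is left running for 64 more hours over a weekend.
left_running = estimate_vm_cost(2 + 64, HYPOTHETICAL_RATE_USD)

print(f"Paused after the session: ${paused_after_session:.2f}")
print(f"Left running all weekend: ${left_running:.2f}")
```

Because there is no autopause, the second figure grows linearly with every hour the environment is left on.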
Step 3: Run the edgeR notebook
What does the notebook do?
The "edgeR" notebook uses the Bioconductor edgeR package to analyze synthetic bulk RNA-seq gene count data. The sample data included in the analysis are read counts from a published study, obtained through the edgeR R package.
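For readers new to gene count data, the shape of the input can be illustrated with a toy example. The sketch below is written in Python rather than R, uses made-up counts, and is not the notebook's code; the function names are hypothetical. It shows a genes-by-samples table of read counts, the per-sample library sizes, and the common counts-per-million scaling (count divided by library size, times one million).

```python
# Toy read counts: one row per gene, one column per sample (values are made up).
toy_counts = {
    "geneA": [10, 20],
    "geneB": [90, 180],
}


def library_sizes(counts):
    """Return the total read count of each sample (column sums)."""
    return [sum(column) for column in zip(*counts.values())]


def counts_per_million(counts):
    """Scale each count by its sample's library size, per million reads."""
    sizes = library_sizes(counts)
    if any(size == 0 for size in sizes):
        raise ValueError("every sample needs at least one read")
    return {
        gene: [count * 1e6 / size for count, size in zip(row, sizes)]
        for gene, row in counts.items()
    }


toy_sizes = library_sizes(toy_counts)
toy_cpm = counts_per_million(toy_counts)
print("Library sizes:", toy_sizes)
print("Counts per million:", toy_cpm)
```

In this toy table the second sample was sequenced twice as deeply, so the raw counts differ between samples while the scaled values agree.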
3.1. Once the Azure cloud environment has been created, you can pause it, modify it using ‘Settings’, or open it in JupyterLab.
3.2. Click Open to launch JupyterLab with the .ipynb notebooks from the cloned workspace.
3.3. To start your analysis, open the edgeR.ipynb Jupyter notebook from the panel on the left.
3.4. Verify that the notebook is using the right kernel by checking the top right corner and bottom left corner of the browser page. It should be R.
3.5. To change the kernel for the current notebook, click the kernel name in the bottom left corner. This opens a window with a dropdown list for selecting the kernel for the ‘edgeR.ipynb’ notebook.
3.6. Run the analysis by following the instructions in the notebook.
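The kernel check in step 3.4 can also be done outside the JupyterLab interface, because a notebook file is JSON that records its kernel under `metadata.kernelspec`. The following is a minimal Python sketch, assuming Python is available wherever you run it; the helper names are hypothetical, and the demo notebook is a stand-in written to a temporary folder, not the workspace's actual file.

```python
import json
import tempfile
from pathlib import Path


def notebook_kernel(notebook_path):
    """Return the (kernel name, language) recorded in a notebook's metadata."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    spec = notebook.get("metadata", {}).get("kernelspec", {})
    return spec.get("name"), spec.get("language")


def uses_r_kernel(notebook_path):
    """Return True if the notebook's recorded kernel language is R."""
    _, language = notebook_kernel(notebook_path)
    return (language or "").lower() == "r"


# Stand-in notebook for the demo; the kernelspec values are illustrative.
demo_notebook = {
    "cells": [],
    "metadata": {
        "kernelspec": {"name": "ir", "display_name": "R", "language": "R"}
    },
    "nbformat": 4,
    "nbformat_minor": 5,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "demo.ipynb"
    demo_path.write_text(json.dumps(demo_notebook), encoding="utf-8")
    demo_kernel = notebook_kernel(demo_path)
    demo_is_r = uses_r_kernel(demo_path)

print("Kernel:", demo_kernel, "| uses R:", demo_is_r)
```

To check a real notebook, pass its path instead, for example `uses_r_kernel("edgeR.ipynb")` from the folder that contains the file. The recorded kernel is only what the notebook was last saved with; the kernel shown in JupyterLab remains the authoritative check.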
Step 4: Run the Single Cell Experiment notebook
4.1. Repeat steps 3.1-3.6 with the next R-based notebook, Single Cell Experiment.