The Bioconductor Docker image is one of the current base images integrated into Terra. This guide will introduce you to the Terra Bioconductor image and how to use it with a Jupyter Notebook.
- Introduction to Bioconductor and the Terra Bioconductor Image
- Accessing the Bioconductor Image in Terra
- Uploading Additional Bioconductor Packages
- Additional Docker Resources
Introduction to Bioconductor and the Terra Bioconductor Image
Bioconductor is a suite of open source tools, primarily written as R packages, designed for the statistical analysis of high-throughput genomic data. The terra-jupyter-bioconductor image is an extension of the terra-jupyter-r image that contains preloaded Bioconductor packages. You can find a list of all packages and software dependencies in the terra-docker GitHub repository (https://github.com/DataBiosphere/terra-docker/tree/master/terra-jupyter-bioconductor).
The terra-jupyter-bioconductor image comes preloaded with 10 commonly used Bioconductor packages:
- SingleCellExperiment: a package for single-cell analysis that defines an S4 class, a lightweight container for single-cell genomics data. The class can also store dimensionality reduction results and alternative features such as spike-in transcripts or antibody tags.
- GenomicFeatures: a suite of tools for manipulating and tracking transcript-related annotations. It lets you download the genomic locations of transcripts, exons, and coding sequences (CDS) from the UCSC Genome Browser or BioMart.
- GenomicAlignments: containers for storing and manipulating short genomic alignments.
- ShortRead: tools for manipulating and assessing the quality of FASTQ files.
- DESeq2: an RNA-seq analysis package that tests differential gene expression using a negative binomial generalized linear model.
- AnnotationHub: a web resource that allows you to search genomic files from other common web resources (UCSC, Ensembl, etc.).
- ExperimentHub: a web resource that allows you to search curated experiments, publications, etc.
- ensembldb: a package that fetches transcript-centric annotations from Ensembl.
- scRNAseq: a package containing gene count data from a collection of public scRNA-seq datasets.
- scran: a collection of functions for common single-cell analyses.
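As a rough sketch, the list above can be checked from a notebook cell once the image is running. The helper name is_installed is hypothetical (it is not part of Terra or Bioconductor), and the package names are taken from the list above; the check only looks for each package on disk and does not load it.

```r
# The ten packages the article lists as preloaded in the image.
preloaded <- c(
  "SingleCellExperiment", "GenomicFeatures", "GenomicAlignments",
  "ShortRead", "DESeq2", "AnnotationHub", "ExperimentHub",
  "ensembldb", "scRNAseq", "scran"
)

# is_installed() is a hypothetical helper written for this example.
# system.file() returns "" when a package is not installed, so nzchar()
# gives TRUE for installed packages without loading them.
is_installed <- function(pkgs) {
  vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
}

status <- is_installed(preloaded)
print(status)

# Report anything that is missing, if applicable.
missing_pkgs <- names(status)[!status]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}
```

If any package is reported as missing, the notebook is probably not running on the Bioconductor image.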
Accessing the Terra Bioconductor Image
To use Bioconductor in a Jupyter Notebook, you first need to set your Terra workspace cloud environment to the Bioconductor Docker image. If you are interested in looking inside the Dockerfile (or building your own custom Dockerfile), you can read more details in the terra-docker GitHub repository.
To install the Bioconductor environment, simply click on the Cloud Environment widget in the top right of your workspace screen, select the Bioconductor image from the cloud environment drop-down menu, and click “create” (or “replace”).
After setting your cloud environment, launch your R Jupyter Notebook or create a new one using the instructions in this article. Once the notebook has launched (that is, once you can enter Edit Mode), you can run a quick sanity check by opening the R help page of a Bioconductor function with the “?” syntax, for example ?GenomicAlignments::readGAlignments.
Alternatively, try loading a package that should now be available, such as GenomicAlignments, by running library(GenomicAlignments).
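A minimal sketch of this sanity check, written so that it reports a message instead of failing if the package is absent. The variable names pkg and available are illustrative only.

```r
pkg <- "GenomicAlignments"

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
available <- requireNamespace(pkg, quietly = TRUE)

if (available) {
  # Attach the package and report which version the image provides.
  library(pkg, character.only = TRUE)
  message(pkg, " version ", as.character(packageVersion(pkg)), " loaded")
} else {
  message(pkg, " is not available; check that the cloud environment ",
          "is set to the Bioconductor image")
}

# In a separate cell, the "?" syntax opens a help page, for example:
# ?GenomicAlignments::readGAlignments
```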
Uploading Additional Bioconductor Packages
If your research requires additional Bioconductor packages, you can install them from your R Jupyter Notebook with Bioconductor’s BiocManager package, which comes pre-installed in the Terra Bioconductor image. Use the command BiocManager::install().
Example: Installing the ‘edgeR’ Bioconductor package
- From the Terra Notebooks tab, navigate to your Jupyter Notebook or create a new Notebook with the language set to R.
- In a code cell of the Jupyter Notebook, type BiocManager::install("edgeR") and run the cell.
- To check that the package installed correctly, run library("edgeR") in a code cell. If no error message appears, the package was installed successfully.
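The steps above can be wrapped in a small helper that skips the download when the package is already present. This is only a sketch: install_if_missing is a hypothetical name, and the update = FALSE and ask = FALSE arguments are one possible choice that keeps BiocManager::install() from prompting or upgrading other packages.

```r
# install_if_missing() is a hypothetical helper written for this example.
# It calls the installer only when the package cannot be found on disk and
# returns TRUE if an installation was attempted, FALSE otherwise.
install_if_missing <- function(pkg, installer = BiocManager::install) {
  if (nzchar(system.file(package = pkg))) {
    return(FALSE)  # already installed, nothing to do
  }
  # update = FALSE: do not upgrade other installed packages.
  # ask = FALSE: do not stop for an interactive prompt.
  installer(pkg, update = FALSE, ask = FALSE)
  TRUE
}
```

In a notebook cell, running install_if_missing("edgeR") followed by library("edgeR") would then cover both the installation and the check described above.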
Additional Docker Resources
You can read more about Docker and customizing Docker images in the following articles:
- Creating safe and secure images: https://support.terra.bio/hc/en-us/articles/360034669811
- Custom cloud environments for Jupyter Notebooks: https://support.terra.bio/hc/en-us/articles/360037143432
- Working with project-specific environments: https://support.terra.bio/hc/en-us/articles/360037269472