Terra offers multiple ways to interact with and manipulate data directly, with more on the way! Currently, Terra contains an integrated Jupyter Notebooks environment, a popular open-source platform that makes sharing and collaboration a breeze. This article covers the following topics:
1. Jupyter notebooks
Jupyter Notebooks are an increasingly popular platform for sharing and interacting with analyses. The files on this shareable, web-based platform (called notebooks) contain a list of code cells, interspersed with rich text commentary produced by Markdown cells. Markdown is a text formatting language, and the ability to document with Markdown cells is a key feature in Jupyter's use as a collaborative tool.
In Terra, Jupyter Notebooks are especially convenient because of how easy it is to share them. Find a notebook in a public workspace you'd like to play around with? Clone yourself a copy! Need to develop/refine an analysis in tandem with distant collaborators? Just invite them to your workspace, and work directly in the same notebook with them!
2. Creating/copying/opening a notebook
To start working on a notebook, use one of the five options below:
2.1. Click on a notebook in one of your workspaces
Be very careful when choosing this type of interaction in a shared workspace. If someone has shared a master version of a notebook with you, changes you make to the code will be saved in the master copy.
2.2. Clone a workspace that contains a notebook you are interested in
Cloning a workspace will automatically create copies of all of the notebooks contained within that workspace, allowing users to quickly duplicate code and tools in a secure sandbox for their own use.
2.3. Clone an individual notebook to your own workspace
This is especially useful for public workspaces that contain a large number of sample notebooks, of which only a few may be useful to any given project.
2.4. Create a new notebook from scratch
2.5. Upload a ready-made notebook
This is nearly as simple as creating a new notebook. You only have to make sure that what you are uploading is in JSON file format, saved with the
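Because an uploaded notebook has to be valid JSON, a quick local check before uploading can catch a damaged file. The sketch below is an illustration only and is not part of Terra: the helper name `looks_like_notebook` is hypothetical, and it assumes the file follows the standard Jupyter notebook format (version 4), which stores the top-level keys `cells`, `metadata`, `nbformat`, and `nbformat_minor`.

```python
import json
from pathlib import Path

# Top-level keys used by the Jupyter notebook format, version 4 (an assumption
# about the file being checked, not something Terra itself requires you to run).
REQUIRED_KEYS = {"cells", "metadata", "nbformat", "nbformat_minor"}


def looks_like_notebook(path):
    """Return True if the file parses as JSON and has the expected top-level keys."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # OSError: file missing or unreadable; ValueError: not valid JSON or not UTF-8.
        return False
    return isinstance(data, dict) and REQUIRED_KEYS.issubset(data)
```

The function returns `True` or `False` for a given path and never modifies the file, so it is safe to run on a notebook before uploading it.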
3. Running a cell
A notebook is broken down into individual cells. Each cell has a "cell type" (Code/Markdown/Raw NBConvert) that determines how the cluster will interpret the instructions in the cell.
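In the underlying notebook file, the cell type is recorded on each cell. The sketch below assumes the standard Jupyter notebook format (version 4), in which every cell is a JSON object whose `cell_type` is `code`, `markdown`, or `raw` (the last one corresponds to Raw NBConvert in the menu). The sample cells are made up for illustration.

```python
from collections import Counter

# Hypothetical cells, shaped the way the Jupyter notebook format (v4) stores them.
cells = [
    {"cell_type": "markdown", "metadata": {}, "source": ["# My analysis"]},
    {"cell_type": "code", "metadata": {}, "source": ["x = 1 + 1"],
     "execution_count": None, "outputs": []},
    {"cell_type": "raw", "metadata": {}, "source": ["passed through unchanged"]},
]

# Tally how many cells of each type the notebook contains.
type_counts = Counter(cell["cell_type"] for cell in cells)
print(dict(type_counts))
```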
The cell type can be seen or changed either in the Cell>Cell Type drop down menu, or on the right of the toolbar of function shortcuts located just below the menu bar:
To run a cell, click the 'Run' button in the shortcut toolbar, press Shift+Enter, or select the appropriate command from the Cell dropdown menu:
To edit the content of a cell, simply double click on the cell you wish to edit.
3.1. Markdown Cells
The clip below shows an example of editing a Markdown-type cell. Markdown is a lightweight plain text formatting language. The syntax controls things like bold/italic text, header size, section enumeration, etc. Running a Markdown cell will cause the cluster to interpret only the Markdown-based syntax in the cell. Markdown cells do not interpret the code of the selected kernel, so running a Markdown-type cell that contains, for example, Python code will simply reproduce that code as plain text.
The clip below shows two examples of Markdown syntax:
- Italicizing using a single asterisk on either side (*italics* => italics)
- Modifying text size using hash symbols (#) at the start of a line
A more comprehensive listing of Markdown syntax can be found in this helpful Markdown Cheatsheet
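To make the two rules above concrete, here is a toy sketch in Python of what a renderer does with them. It is not how Jupyter renders Markdown; the function name `render_toy_markdown` is hypothetical, and it handles only leading `#` headings and single-asterisk italics, producing the equivalent HTML tags.

```python
import re


def render_toy_markdown(line):
    """Convert one line using two Markdown rules: '#' headings and *italics*."""
    # *text* becomes emphasized (italic) text.
    line = re.sub(r"\*([^*]+)\*", r"<em>\1</em>", line)
    # One to six leading '#' characters set the heading level;
    # more '#' characters mean a smaller heading.
    match = re.match(r"(#{1,6})\s+(.*)", line)
    if match:
        level = len(match.group(1))
        return f"<h{level}>{match.group(2)}</h{level}>"
    return line


print(render_toy_markdown("# Big title"))        # <h1>Big title</h1>
print(render_toy_markdown("### Smaller title"))  # <h3>Smaller title</h3>
print(render_toy_markdown("*italics*"))          # <em>italics</em>
```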
3.2. Code Cells
When the cell type is set to "code", running that cell will cause the cluster to interpret the contents of the cell using the interpreter of the selected kernel (e.g. Python, R). If the code in the cell does not match the kernel's language, the cluster will return an error. If the code is correct but specifies no outputs, the code will run and the result of the computation will be stored in the cluster (at least until the kernel is restarted). Even when no output is specified, the user can still tell whether the cell was successfully executed by checking the square brackets [ ] to the left of each cell.
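As an illustration, assuming a Python kernel, the two snippets below stand in for two separate code cells (the variable name `total` is arbitrary). The first only assigns a value, so it displays nothing, yet the value stays in memory for later cells.

```python
# Cell 1: an assignment produces no output, but `total` is now stored in the kernel.
total = sum(range(10))

# Cell 2: a later cell can still use the stored value; print() makes it visible.
print(total)  # prints 45
```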
When a user launches a notebook for the first time, the square brackets are empty [ ], indicating that these cells have not yet been run during this cluster session. Running a cell will cause this indicator to change: an asterisk inside the brackets [*] indicates the cell is running. After a given cell has been executed, the asterisk will be replaced by an integer representing the number of times the cluster has executed any cell since the kernel last started. There is nothing to stop you from executing the same cell multiple times, as shown below.
If you clear the outputs by going to the dropdown menu Cell>All Outputs and selecting "Clear", the integers in the brackets will all be replaced by empty brackets again, but the integer count will only reset to zero if you restart the kernel.
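The counter behavior described above can be sketched with a small toy model. This is not Jupyter's implementation; the class `ToyKernel` and its method names are hypothetical and exist only to mimic the label shown to the left of each cell.

```python
class ToyKernel:
    """Toy model of the execution-count label shown to the left of code cells."""

    def __init__(self):
        self.count = 0    # cells executed since the kernel started
        self.labels = {}  # cell name -> integer currently shown next to it

    def run(self, cell):
        # While a cell runs the notebook shows [*]; afterwards it shows the new count.
        self.count += 1
        self.labels[cell] = self.count

    def clear_outputs(self):
        # Clearing outputs empties every label but leaves the counter untouched.
        self.labels.clear()

    def restart(self):
        # Restarting the kernel resets the counter to zero.
        self.count = 0

    def label(self, cell):
        n = self.labels.get(cell)
        return "[ ]" if n is None else f"[{n}]"


kernel = ToyKernel()
kernel.run("A")
kernel.run("B")
kernel.run("A")  # running the same cell again just advances the counter
print(kernel.label("A"), kernel.label("B"))  # [3] [2]

kernel.clear_outputs()
kernel.run("B")  # the count continues from where it left off
print(kernel.label("A"), kernel.label("B"))  # [ ] [4]

kernel.restart()
kernel.run("A")  # after a restart, counting begins again
print(kernel.label("A"))  # [1]
```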