Executing jobs on the cloud can be scary if you don't know how much you're spending. Read on for an overview of how to estimate the cost of an interactive analysis in Terra.
Cloud Environment (i.e., interactive analysis) costs
When creating a Cloud Environment to run interactive analyses like Jupyter Notebooks, Galaxy, or RStudio, Terra gives a breakdown of the total costs (running cloud compute, paused cloud compute, and persistent disk) in the configuration form.
Cloud Environment cost caveats
Your Cloud Environment costs are calculated per hour and accrue whenever your virtual machine (VM) is running, regardless of whether you are running calculations. There is also a cost for your detachable persistent disk (PD) and a minimal cost associated with paused Cloud Environments.
Default configuration versus Spark cluster costs
The hourly cost for a high-powered Spark cluster (at left - $1.42/hour) is almost 24 times more than for a low-powered default VM (at right - $0.06/hour).
Make sure to choose the resources you need appropriately to balance cost and speed. There's no need to use an expensive Spark configuration for analysis that will run fine on a lower-powered machine.
How to estimate your Jupyter Notebook or RStudio costs
Use %%time to get the elapsed time of running a cell (or group of cells) of code.
You can use the "time" command to benchmark the total time for a notebook analysis. Multiplying this time (in hours) by the cost per hour gives a reasonable estimate of the total cost of running the notebook, assuming you don't leave the Cloud Environment running for long periods between executing code cells.
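The multiplication described above can be sketched in Python. This is a minimal illustration, not a Terra feature: the helper name estimate_cost is hypothetical, time.perf_counter stands in for the %%time magic so the snippet also runs outside a notebook, and the $0.06/hour rate is the default VM figure quoted earlier, which you should replace with the rate shown in your own configuration form.

```python
import time


def estimate_cost(elapsed_seconds, hourly_rate_usd):
    """Estimate the cost in USD of a run, given its duration and an hourly rate.

    Hypothetical helper for illustration; it only converts seconds to hours
    and multiplies by the rate.
    """
    return elapsed_seconds / 3600.0 * hourly_rate_usd


# Time a stand-in workload; in a notebook, this is the analysis you would time.
start = time.perf_counter()
total = sum(i * i for i in range(100_000))
elapsed = time.perf_counter() - start

# Assumed rate: the default VM cost quoted in this article. Substitute your own.
HOURLY_RATE_USD = 0.06
cost = estimate_cost(elapsed, HOURLY_RATE_USD)
print(f"Elapsed: {elapsed:.2f} s, estimated cost: ${cost:.6f}")
```

The estimate covers only the time spent executing code. Any time the Cloud Environment sits running between cells is billed at the same hourly rate and should be added separately.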
For more details and recommendations about how to customize your Cloud Environment, see Your interactive analysis VM (Cloud Environment).
How to estimate your Galaxy analysis costs
Choose your Cloud Environment
The cost of your Galaxy instance is determined by the values you pick at launch in Terra (i.e., compute size, disk size, and number of nodes), regardless of the number of jobs running in Galaxy at any given moment.
Once you know that cost, you can estimate how much your Galaxy analysis will cost.
Multiply the Cloud Environment costs ($/hour) by the number of hours you have the instance open.
Costs also differ between the active and stopped states: compute costs accrue only while the instance is running, whereas disk costs continue to accrue while the instance is stopped, for as long as the disk has not been deleted.
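As a rough sketch of how the compute and disk components combine, the Python function below adds a compute charge for the running hours to a disk charge for every hour the disk exists. The function name, parameter names, and all rates in the usage example are hypothetical placeholders rather than Terra prices, and the model assumes the disk is billed at a flat hourly rate whether or not the instance is running.

```python
def estimate_galaxy_cost(running_hours, compute_rate_usd, disk_rate_usd, stopped_hours=0.0):
    """Estimate the total cost in USD of a Galaxy instance.

    Hypothetical model for illustration:
      - compute is billed only while the instance is running;
      - disk is billed for every hour it exists (running or stopped).
    """
    compute_cost = running_hours * compute_rate_usd
    disk_cost = (running_hours + stopped_hours) * disk_rate_usd
    return compute_cost + disk_cost


# Placeholder figures: 10 hours running, 14 hours stopped with the disk kept,
# at assumed rates of $0.50/hour for compute and $0.02/hour for disk.
total = estimate_galaxy_cost(
    running_hours=10,
    compute_rate_usd=0.50,
    disk_rate_usd=0.02,
    stopped_hours=14,
)
print(f"Estimated total: ${total:.2f}")
```

Take the actual rates from the cost breakdown in the configuration form when you launch the instance.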
Galaxy instances continue to charge until deleted
Depending on the resources you request, these charges can add up quickly! Don't forget to delete your Galaxy instance when you are done running your analysis!