If you remember the days before Google Docs, when you could work for hours on a paper only to have it vanish if your computer shut down and you hadn't saved it, you know firsthand the pain of losing work you thought was safe. Notebooks are wonderful for interactive data analysis, but there are a few quirks that can trip you up if you're not careful!
This article describes how to avoid a potential pitfall when working in a Jupyter notebook:
What and when you need to save so you don't lose parts of your analysis unintentionally
Note: For a deeper dive into the back end of a Terra notebook and to understand why notebooks have these characteristics, see this article about key notebook components or this article about key notebook operations.
How to not lose output files
The key thing to understand is that files generated by the notebook are not automatically saved in the workspace. In particular, you will lose output data generated in a notebook if you delete or reconfigure a cluster without explicitly saving your output to the workspace bucket.
You will not lose your data if you pause (stop) a cluster, since the cluster goes away but the runtime environment disk does not. In fact, when you re-open your notebook, the cluster is created more quickly because the disk does not need to be recreated. As an added bonus, you do not need to reinstall your software.
To avoid losing your data, make sure to explicitly save your outputs to the workspace bucket. You can find step-by-step instructions on how to do this from within the notebook in this article.
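As a rough illustration of what "explicitly saving" can look like, here is a minimal Python sketch that copies a local output file to the bucket with `gsutil cp`. It rests on assumptions not stated in this article: that `gsutil` is available in the notebook environment, and that the bucket URL is exposed in an environment variable named `WORKSPACE_BUCKET`. The helper names and the `notebook-outputs` folder are hypothetical, chosen only for this example; the linked instructions remain the authoritative steps.

```python
import os
import subprocess


def bucket_destination(bucket, local_path, prefix="notebook-outputs"):
    """Return the gs:// path that a local file would be copied to."""
    filename = os.path.basename(local_path)
    return f"{bucket.rstrip('/')}/{prefix.strip('/')}/{filename}"


def build_copy_command(local_path, bucket, prefix="notebook-outputs"):
    """Build (but do not run) the gsutil command that copies one file."""
    destination = bucket_destination(bucket, local_path, prefix)
    return ["gsutil", "cp", local_path, destination]


def save_to_bucket(local_path, bucket=None, prefix="notebook-outputs"):
    """Copy a local output file to the workspace bucket."""
    # Assumption: the bucket URL (gs://...) is stored in WORKSPACE_BUCKET.
    bucket = bucket or os.environ["WORKSPACE_BUCKET"]
    # check=True raises an error if the copy fails, so a failed save is not silent.
    subprocess.run(build_copy_command(local_path, bucket, prefix), check=True)
```

In a notebook cell you might then call `save_to_bucket("results.csv")` after writing each output file you want to keep, and before deleting or reconfiguring the cluster.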
What happens if you've lost your notebook data?
Your notebooks and any data explicitly saved to your bucket are still in long-term storage in the workspace bucket. This means you can rerun the notebook to regenerate any output data that was lost (though you will pay for this, of course).