Have you always wished Jupyter notebooks were plain text documents? Wished you could edit them in your favorite IDE? And get clear and meaningful diffs when doing version control? Then, Jupytext may well be the tool you’re looking for!
A Python notebook encoded in the py:percent format has a .py extension and looks like this:
# %% [markdown]
# This is a markdown cell
Only the notebook inputs (and optionally, the metadata) are included. Text notebooks are well suited for version control. You can also edit or refactor them in an IDE - the .py notebook above is a regular Python file.
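To make the cell structure of the percent format concrete, here is a minimal sketch that splits such a file into cells using only the standard library. It is not Jupytext's parser: the function name split_percent_cells is hypothetical, the sketch assumes a cell marker is a line made of "# %%" optionally followed by a bracketed cell type, and it ignores cell titles, cell metadata, and any text before the first marker.

```python
import re

# Assumed cell delimiter: a line "# %%", optionally followed by a cell
# type in brackets, e.g. "# %% [markdown]". Titles and metadata are ignored.
CELL_MARKER = re.compile(r"^# %%(?:\s+\[(\w+)\])?\s*$")


def split_percent_cells(text):
    """Return a list of (cell_type, source) pairs from percent-format text."""
    cells = []
    cell_type, lines = None, []
    for line in text.splitlines():
        match = CELL_MARKER.match(line)
        if match:
            # A new marker closes the cell that was being collected
            if cell_type is not None:
                cells.append((cell_type, "\n".join(lines).strip()))
            cell_type, lines = match.group(1) or "code", []
        elif cell_type is not None:
            lines.append(line)
    if cell_type is not None:
        cells.append((cell_type, "\n".join(lines).strip()))
    return cells


example = """# %% [markdown]
# This is a markdown cell

# %%
print("This is a code cell")
"""

cells = split_percent_cells(example)
for cell_type, source in cells:
    print(cell_type, "->", repr(source))
```

Because the markers are ordinary Python comments, the file stays importable and lintable like any other script.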
We recommend the percent format for notebooks that mostly contain code. The percent format is available for Julia, Python, R and many other languages.
If your notebook is documentation-oriented, a Markdown-based format (text notebooks with a .md extension) might be more appropriate. Depending on what you plan to do with your notebook, you might prefer the MyST Markdown format, which interoperates very well with Jupyter Book, or Quarto Markdown, or even Pandoc Markdown.
Install Jupytext in the Python environment that you use for Jupyter. Use either
pip install jupytext
or
conda install jupytext -c conda-forge
Then, restart your Jupyter Lab server, and make sure Jupytext is activated in Jupyter:
.md files have a Notebook icon, and you can open them as Notebooks with a right click in Jupyter Lab.
Text notebooks with a .md extension are well suited for version control. They can be edited or authored conveniently in an IDE. You can open and run them as notebooks in Jupyter Lab with a right click. However, the notebook outputs are lost when the notebook is closed, as only the notebook inputs are saved in text notebooks.
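What "only the inputs are saved" means can be illustrated on the JSON structure of an .ipynb file. The sketch below is an illustration rather than what Jupytext does internally: the function name inputs_only is hypothetical, and the sample notebook is a hand-written dictionary that follows the version 4 notebook layout, where code cells carry outputs and execution_count fields.

```python
import copy


def inputs_only(notebook):
    """Return a copy of a version 4 notebook dict with the outputs removed."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# Hand-written sample notebook (illustrative data only)
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "This is a markdown cell"},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "source": "1 + 1",
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 1,
                    "data": {"text/plain": "2"},
                    "metadata": {},
                }
            ],
        },
    ],
}

stripped = inputs_only(notebook)
print(stripped["cells"][1]["source"], stripped["cells"][1]["outputs"])
```

The cell sources survive unchanged; everything the kernel produced is gone, which is why a text notebook reopens without outputs.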
Paired notebooks are a convenient alternative to text notebooks. A paired notebook is a set of two files, say .ipynb and .py, that contain the same notebook, but in different formats.
You can edit the .py version of the paired notebook, and get the edits back in Jupyter by selecting reload notebook from disk. The outputs will be reloaded from the .ipynb file, if it exists. The .ipynb version will be updated or recreated the next time you save the notebook in Jupyter.
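The idea of taking the inputs from the .py file and the outputs from the .ipynb file can be pictured with a simplified sketch. The function merge_inputs_and_outputs is hypothetical, and matching code cells by identical source is an assumption made for this illustration; Jupytext's actual matching logic may differ. The output values are placeholder strings, not real notebook outputs.

```python
def merge_inputs_and_outputs(text_cells, ipynb_cells):
    """Attach stored outputs to the code cells whose source is unchanged.

    Both arguments are lists of cell dicts with "cell_type" and "source"
    keys. Each stored cell is used at most once.
    """
    remaining = [c for c in ipynb_cells if c["cell_type"] == "code"]
    merged = []
    for cell in text_cells:
        cell = dict(cell)  # do not modify the caller's cells
        if cell["cell_type"] == "code":
            cell["outputs"] = []  # default: an edited cell has no outputs
            for i, stored in enumerate(remaining):
                if stored["source"] == cell["source"]:
                    cell["outputs"] = stored.get("outputs", [])
                    del remaining[i]
                    break
        merged.append(cell)
    return merged


# Inputs as read from the edited .py file
text_cells = [
    {"cell_type": "markdown", "source": "This is a markdown cell"},
    {"cell_type": "code", "source": "1 + 1"},
    {"cell_type": "code", "source": "print('edited')"},
]
# Cells as stored in the .ipynb file (placeholder outputs)
ipynb_cells = [
    {"cell_type": "markdown", "source": "This is a markdown cell"},
    {"cell_type": "code", "source": "1 + 1", "outputs": ["2"]},
    {"cell_type": "code", "source": "print('original')", "outputs": ["original"]},
]

merged = merge_inputs_and_outputs(text_cells, ipynb_cells)
for cell in merged:
    print(cell["cell_type"], repr(cell["source"]), cell.get("outputs"))
```

In this sketch the unchanged cell keeps its output, while the edited cell comes back empty until it is run again.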
To pair a notebook in Jupyter Lab, use the command Pair Notebook with percent Script from the Command Palette.
To pair all the notebooks in a certain directory, create a configuration file with this content:
# jupytext.toml at the root of your notebook directory
formats = "ipynb,py:percent"
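The formats value is a comma-separated list in which each entry names a file extension, optionally followed by a colon and a format name. As a rough sketch of how such a string decomposes, the hypothetical helper parse_formats below handles only the simple "ext" and "ext:format" entries shown above; it is not part of Jupytext.

```python
def parse_formats(formats):
    """Split a formats string into (extension, format_name) pairs.

    format_name is None when the entry only names an extension.
    """
    pairs = []
    for entry in formats.split(","):
        extension, _, format_name = entry.strip().partition(":")
        pairs.append((extension, format_name or None))
    return pairs


print(parse_formats("ipynb,py:percent"))
# [('ipynb', None), ('py', 'percent')]
```

Read this way, the configuration above asks for two representations of each notebook: an .ipynb file and a .py file in the percent format.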
Jupytext is also available at the command line. You can
pair a notebook with jupytext --set-formats ipynb,py:percent notebook.ipynb
synchronize the paired files with jupytext --sync notebook.py (the inputs are loaded from the most recent paired file)
convert a notebook in one format to another with jupytext --to ipynb notebook.py (use -o if you want a specific output file)
pipe a notebook to a linter with e.g. jupytext --pipe black notebook.ipynb
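The rule that synchronization loads the inputs from the most recent paired file can be sketched with file modification times. This is an illustration under the assumption that "most recent" means the latest modification time on disk; the helper most_recent is hypothetical and the timestamps are set by hand so that the outcome is deterministic.

```python
import os
import tempfile
from pathlib import Path


def most_recent(paths):
    """Return the existing path with the latest modification time."""
    existing = [p for p in paths if Path(p).exists()]
    if not existing:
        raise FileNotFoundError("none of the paired files exist")
    return max(existing, key=lambda p: Path(p).stat().st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    ipynb = Path(tmp) / "notebook.ipynb"
    py = Path(tmp) / "notebook.py"
    ipynb.write_text("{}")
    py.write_text("# %%\n1 + 1\n")
    # Make the .py file explicitly newer than the .ipynb file
    os.utime(ipynb, (1_000_000_000, 1_000_000_000))
    os.utime(py, (1_000_000_100, 1_000_000_100))
    source_of_inputs = most_recent([ipynb, py]).name

print(source_of_inputs)  # notebook.py
```

Here the .py file was edited last, so its content would provide the inputs for both files of the pair.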
Sample use cases
Notebooks under version control
This is a quick how-to:
Save the notebook - this creates a .py notebook
Add the .py notebook to version control
You might exclude .ipynb files from version control (unless you want to see the outputs versioned!). Jupytext will recreate the .ipynb files locally when the users open and save the .py notebooks.
Collaborating on notebooks with Git
Collaborating on Jupyter notebooks through Git becomes as easy as collaborating on text files.
Assume that you have your .py notebooks under version control (see above). Then,
Your collaborator pulls the .py notebook
They open it as a notebook in Jupyter (right-click in Jupyter Lab)
At that stage the notebook has no outputs. They run the notebook and save it. Outputs are regenerated, and a local .ipynb file is created
They edit the notebook, and push the updated notebook.py file. The diff is simply a standard diff on a Python script.
You pull the updated notebook.py script, and refresh your browser. The input cells are updated based on the new content of notebook.py. The outputs are reloaded from your local .ipynb file. Finally, the kernel variables are untouched, so you have the option to run only the modified cells to get the new outputs.
Editing or refactoring a notebook in an IDE
Once your notebook is paired with a .py file, you can easily edit or refactor the .py representation of the notebook in an IDE.
Once you are done editing the .py notebook, just reload the notebook in Jupyter to get the latest edits there.
Note: It is simpler to close the .ipynb notebook in Jupyter when you edit the paired .py file. There is no obligation to do so; however, if you don’t, be prepared to read the pop-up messages carefully. If Jupyter tries to save the notebook while the paired .py file has also been edited on disk since the last reload, a conflict will be detected and you will be asked to decide which version of the notebook (in memory or on disk) is the appropriate one.
Read more about Jupytext in the documentation.