GitHub - cphyc/ipyspaghetti

IpySpaghetti — WORK IN PROGRESS

This extension is composed of a Python package named ipyspaghetti for the server extension and an NPM package named ipyspaghetti for the frontend extension.

It is subject to change, as this is still very much a work in progress. DO NOT EXPECT IT TO WORK NOR RELY ON IT FOR IMPORTANT WORK. If you are interested in the idea, please feel free to contact me, as I don't have much time to progress on the project for the foreseeable future.

Comments, reviews and contributions are more than welcome!

Rationale

I really enjoy working with IPython notebooks, especially in JupyterLab, for the level of interactivity they allow. Notebooks nonetheless have a number of shortcomings (taken from this blog post, but similar posts are numerous!).

I would also add that notebooks do not convey the way I think about data analysis. In my mind, analysing data requires multiple independent steps that get some data in, transform them and yield new data.

This project aims to address these issues by providing a slightly more constrained development environment based on data flows rather than cells. To address the points above, this project provides a file format and a development environment, all integrated in JupyterLab.

Example image

The project is based on the idea of data flows, represented as nodes in a graph. The graph describes how to load, modify and output data, where data is an abstraction that comprises files on disk, resources on the Internet, the result of a plot, etc. The underlying file format is a valid Python file that contains a global variable named ___GRAPH. This variable is a string containing a JSON-formatted description of the graph. The file can either be loaded in the development environment or, eventually, be parsed entirely in Python and executed in headless environments (this hasn't been coded yet).

The project aims to reuse as much as possible what's already been done for JupyterLab to allow a similar level of interactivity (including ipywidgets).

The Python package provides a registry in which you can register any Python function using register_node. Any import will happen in the global scope, and any function which is not registered will be global. Registered functions can then be used as many times as you want as nodes in the graph. A node takes the function's arguments as inputs and returns some outputs. If you add type annotations to your function, they are also taken into account to decide which inputs and outputs are compatible. This is how the project addresses the points above.
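
The registry idea can be illustrated with a small stand-in. The register_node below is defined locally for the sake of the example and is not the one shipped by the package, whose import path and signature may differ. REGISTRY, compatible, load_numbers and total are hypothetical names. The sketch shows how type annotations could be used to decide whether one node's output may feed another node's input.

```python
from typing import Callable, Dict, get_type_hints

# Stand-in registry; the real package keeps its own.
REGISTRY: Dict[str, Callable] = {}


def register_node(func: Callable) -> Callable:
    """Record the function under its name and return it unchanged."""
    REGISTRY[func.__name__] = func
    return func


@register_node
def load_numbers() -> list:
    return [1, 2, 3]


@register_node
def total(values: list) -> int:
    return sum(values)


def compatible(source: Callable, target: Callable, input_name: str) -> bool:
    """Check that the return annotation of `source` matches an input of `target`."""
    output_type = get_type_hints(source).get("return")
    input_type = get_type_hints(target).get(input_name)
    return output_type is not None and output_type == input_type


if __name__ == "__main__":
    print(sorted(REGISTRY))
    print(compatible(load_numbers, total, "values"))
```

Here load_numbers can be connected to the values input of total because both are annotated as list, whereas connecting total to itself would be rejected since int does not match list.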

Of course, this approach has a few shortcomings. First, completion is not as friendly as in a regular notebook, as the global scope is clean (but it should be possible to integrate JupyterLab-LSP, since we're reusing many JupyterLab components). Second, there is some magic happening under the hood to connect the nodes together and manage the inputs/outputs. This may be a source of confusion and hard-to-understand bugs.

Features & TODO list

Requirements

Install

For the moment, the package needs to be installed from source, which you can achieve using

Troubleshoot

If you are seeing the frontend extension, but it is not working, check that the server extension is enabled:

jupyter server extension list

If the server extension is installed and enabled, but you are not seeing the frontend extension, check that the frontend extension is installed:

jupyter labextension list

Contributing

Development install

Note: You will need NodeJS to build the extension package.

The jlpm command is JupyterLab's pinned version of yarn that is installed with JupyterLab. You may use yarn or npm in lieu of jlpm below.

Clone the repo to your local environment

Change directory to the ipyspaghetti directory

Install package in development mode

pip install -e .

Link your development version of the extension with JupyterLab

jupyter labextension develop . --overwrite

Rebuild extension TypeScript source after making changes

jlpm run build

You can watch the source directory and run JupyterLab at the same time in different terminals to watch for changes in the extension's source and automatically rebuild the extension.

Watch the source directory in one terminal, automatically rebuilding when needed

jlpm run watch

Run JupyterLab in another terminal

jupyter lab

With the watch command running, every saved change will immediately be built locally and available in your running JupyterLab. Refresh JupyterLab to load the change in your browser (you may need to wait several seconds for the extension to be rebuilt).

By default, the jlpm run build command generates the source maps for this extension to make it easier to debug using the browser dev tools. To also generate source maps for the JupyterLab core extensions, you can run the following command:

jupyter lab build --minimize=False

Uninstall

pip uninstall ipyspaghetti