PROTOTYPE: Use lexical scope to assign metadata to runs in notebooks

As a user of the Python library in Jupyter, I'd like to be able to re-run arbitrary cells in my notebook in whatever order I see fit and still have correct run metadata generated, so that I can enjoy the interactivity and explorability of Jupyter notebooks along with the provenance tracking and reproducibility of Dotscience.

Currently, I can't, because of #1. Let's try using a different approach and see if it works.

### ACs: 

As this is a prototyping effort, all these ACs are to be considered "aspirational"; we'll see what we can achieve in practice, then decide at the end whether what we have is better than what we ALREADY have.

- [ ] I can run cells in any order a reasonable user would, and get the results I'd expect in my run metadata.
- [ ] No extra user effort is required.
- [ ] Unless I do something really/deliberately silly or unlikely, there's no way for two runs to end up merged together in a Dotscience commit.

These ACs should apply for all of these cases:

- [ ] Notebooks with one run.
- [ ] Notebooks with two or more runs in separate cells, e.g. multiple calls to `ds.publish()`.
- [ ] Notebooks with a loop that generates runs, e.g. a single call to `ds.publish()` inside a loop (for example, trying the same algorithm with a range of input parameters to see what's best).
- [ ] A combination of the previous two cases.

### Implementation plan:

We have a CUNNING PLAN to break this impasse! It's Luke's suggestion:

- Don't store state in memory in the Python library. The history of that in-memory state follows the dynamic flow of execution of Jupyter cells, which may have nothing to do with their order in the notebook; that mismatch is what causes the problems described in #1.
- Instead, every time you call a metadata-registration function like `ds.input()`, it should output a machine-readable tag at that very point.
- `ds.publish()` outputs an "end of this run" tag.
- The parser (be it notebook or command-output) reads the tags from top to bottom, building up in-memory state in notebook lexical order. At each "end of this run" tag, it outputs a run and clears its in-memory state.
- Therefore, the assignment of actions to runs is based purely on the lexical structure, not the dynamic structure.
- For extra niceness, in Jupyter mode, we can output the markers inside "Jupyter widgets" that control their display (rather than plain text) so they're less obtrusive and prettier; but we need to transparently fall back to plain text when not in Jupyter.
- How does this work with "publish inside a loop"? Unless we come up with a clever trick, we'll only keep the results of the last iteration of the loop. Open question: do users actually publish inside a loop to try the same algorithm with different input parameters, or do they copy and paste the cell and edit the parameters in each copy?
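
To make the plan concrete, here is a minimal sketch of the tag-and-parse idea in Python. Everything in it is an assumption for illustration: the `TAG_PREFIX` marker, the JSON tag layout, and the `emit_tag` / `parse_runs` helpers are hypothetical and are not the actual Dotscience library API. The notebook is modelled as a list of cell output strings in lexical (top to bottom) order, so the order in which cells were executed plays no part in the result.

```python
import json

# Hypothetical marker; the real tag format is still to be decided.
TAG_PREFIX = "[[DOTSCIENCE-TAG]]"


def emit_tag(kind, **fields):
    """Build the tag line a metadata call (e.g. ds.input) might print."""
    return TAG_PREFIX + json.dumps({"kind": kind, **fields}, sort_keys=True)


def parse_runs(cell_outputs):
    """Group tags into runs by reading cell outputs in notebook order.

    cell_outputs: list of output strings, one per cell, top to bottom.
    """
    runs = []
    current = {}  # state for the run being built; reset at each end tag
    for output in cell_outputs:
        for line in output.splitlines():
            if not line.startswith(TAG_PREFIX):
                continue  # ordinary user output, not a tag
            tag = json.loads(line[len(TAG_PREFIX):])
            if tag["kind"] == "end-of-run":
                runs.append(current)
                current = {}
            else:
                current.setdefault(tag["kind"], []).append(tag["value"])
    return runs


# Two runs in separate cells, listed in lexical order.
notebook = [
    emit_tag("input", value="train.csv"),
    "accuracy: 0.9\n"
    + emit_tag("output", value="model_a.pkl") + "\n"
    + emit_tag("end-of-run"),
    emit_tag("input", value="test.csv"),
    emit_tag("output", value="model_b.pkl") + "\n" + emit_tag("end-of-run"),
]
runs = parse_runs(notebook)
```

Because the parser keeps no state between invocations and only looks at what is currently in each cell's output, re-running an earlier cell replaces that cell's tags in place rather than appending to some global history. Tags that are never followed by an "end of this run" tag are simply dropped in this sketch; whether that is the right behaviour is a design question for the prototype.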

PROTOTYPE: Use lexical scope to assign metadata to runs in notebooks #6
