Diffing and merging notebooks by martinal · Pull Request #8 · jupyter/enhancement-proposals
Yup @minrk, "we are doing something about this" :) But one key thing that seems missing to me (and I don't find it in the nbdime repo either) is a high-level description of the approach taken.
To me, the notebook diff problem seems separable into two phases: first, identifying changes to the 'container' by finding cell insertions, deletions, transpositions, moves, etc.; and second, within-cell diffing, which then becomes content/mimetype-specific. This is a more complex version of the problem of finding the location of changes in a linear pure-text file and then computing the local diff at each location.
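As a rough illustration of that two-phase idea (a sketch only, not a description of what nbdime actually does), the snippet below aligns cells with Python's standard `difflib.SequenceMatcher` and then runs a line-level text diff inside cells that were replaced one-for-one. The function names `cell_key` and `diff_notebook_cells` are hypothetical. Cells are assumed to be plain dicts with a `cell_type` and a string-valued `source`, and moves/transpositions and non-text outputs are deliberately ignored.

```python
import difflib


def cell_key(cell):
    # Hypothetical helper: identify a cell by its type and source text.
    # Assumes "source" is a single string (not a list of lines).
    return (cell["cell_type"], cell["source"])


def diff_notebook_cells(old_cells, new_cells):
    """Return a list of (op, old_index, new_index, detail) tuples.

    Phase 1 aligns whole cells (the 'container' diff).
    Phase 2 diffs the text of cells that were replaced one-for-one.
    """
    matcher = difflib.SequenceMatcher(
        a=[cell_key(c) for c in old_cells],
        b=[cell_key(c) for c in new_cells],
        autojunk=False,
    )
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of cells on both sides: treat each pair as an
            # in-place modification and compute a within-cell text diff.
            for i, j in zip(range(i1, i2), range(j1, j2)):
                detail = list(
                    difflib.unified_diff(
                        old_cells[i]["source"].splitlines(),
                        new_cells[j]["source"].splitlines(),
                        lineterm="",
                        n=0,
                    )
                )
                changes.append(("modify", i, j, detail))
        else:
            # Otherwise report plain deletions and insertions of whole cells.
            for i in range(i1, i2):
                changes.append(("delete", i, None, None))
            for j in range(j1, j2):
                changes.append(("insert", None, j, None))
    return changes


if __name__ == "__main__":
    old = [
        {"cell_type": "code", "source": "import numpy as np"},
        {"cell_type": "code", "source": "x = 1\ny = 2"},
    ]
    new = [
        {"cell_type": "code", "source": "import numpy as np"},
        {"cell_type": "code", "source": "x = 1\ny = 3"},
        {"cell_type": "markdown", "source": "# Notes"},
    ]
    for change in diff_notebook_cells(old, new):
        print(change)
```

A real implementation would need a fuzzier notion of cell similarity than exact equality, plus mimetype-aware handling of outputs; the sketch only shows how the container-level and within-cell phases can be kept separate.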
That's the approach that seems natural to me, but if that's not the case, then what approach is being used should be described.
Second, I think the document should at least mention what happens to metadata. We have a lot of metadata in the notebook, and there can be differences therein.
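To make the metadata point concrete, here is a minimal sketch of how differences between two metadata dictionaries could be enumerated. It is an assumption-laden illustration rather than anything taken from the proposal or from nbdime: the function name `diff_metadata` is hypothetical, metadata is treated as nested JSON-style dicts with string keys, and lists are compared as opaque values.

```python
def diff_metadata(old, new, path=()):
    """Recursively compare two metadata dicts.

    Returns a list of (op, key_path, old_value, new_value) tuples, where
    op is "add", "remove", or "replace". Hypothetical helper for
    illustration only.
    """
    changes = []
    for key in sorted(set(old) | set(new)):
        here = path + (key,)
        if key not in new:
            changes.append(("remove", here, old[key], None))
        elif key not in old:
            changes.append(("add", here, None, new[key]))
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            # Descend into nested metadata instead of replacing it wholesale.
            changes.extend(diff_metadata(old[key], new[key], here))
        elif old[key] != new[key]:
            changes.append(("replace", here, old[key], new[key]))
    return changes


if __name__ == "__main__":
    old_meta = {"kernelspec": {"name": "python2", "language": "python"}}
    new_meta = {"kernelspec": {"name": "python3", "language": "python"}}
    print(diff_metadata(old_meta, new_meta))
```

Whether such differences should be shown to the user, merged automatically, or ignored is exactly the kind of policy question the document would need to state.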
So, while I apologize for the delay, I'm -1 on merging this document until these high-level ideas are a bit better described. While I realize that the actual implementation in the nbdime repo is where the 'real work' happens, I think it's also important that the high-level description is reasonably complete.
It shouldn't be too difficult to add a bit of language to that effect, and the document will be much more valuable in the long run. We want our JEPs to serve a similar role to Python's PEPs, in that they provide reasonably stand-alone descriptions of the problem and the proposed solution that one can read to understand the whole thing at a high level without digging into the implementation itself.