Fix footnote ordering by aeskildsen · Pull Request #1546 · Python-Markdown/markdown (original) (raw)

So I managed to do a bit of benchmarking.

I benchmarked the processing of two markdown documents using pyperf:

Let me know if you want to see the contents of these files.

The stack.pop(0) fix seems to have a very negligible impact on performance. I'm no statistician, but I would guess the difference is within the margin of error for the benchmarks I ran. collections.deque seems to show a very minor performance improvement in the case of the nested document, but again, likely not far from the margin of error.

I am no expert on how the inline processor works, but I think the reason we are not seeing big differences here is that inline nesting mostly does not go very deep. I believe this is what @waylan mentioned earlier. As I understand it, list.pop(0) is O(n) and deque.popleft() is O(1), but when the list is short the difference is probably negligible.

Data

sample_doc.md, default pyperf timeit benchmark

v3.8.2 with no changes: Mean +- std dev: 50.6 ms +- 0.9 ms
commit d8ecae6 (stack.pop(0) fix): Mean +- std dev: 51.4 ms +- 1.9 ms
commit d8ecae6 + collections.deque instead of list: Mean +- std dev: 51.9 ms +- 2.3 ms

nested_doc.md, default pyperf timeit benchmark

Here I used extensions=['extra'].

v3.8.2 with no changes: Mean +- std dev: 156 ms +- 3 ms
commit d8ecae6 (with stack.pop(0) fix): Mean +- std dev: 160 ms +- 5 ms
commit d8ecae6 + collections.deque instead of list: Mean +- std dev: 158 ms +- 3 ms

nested_doc.md, rigorous pyperf timeit benchmark

Here I used extensions=['extra'].

v3.8.2 with no changes: Mean +- std dev: 160 ms +- 9 ms
commit d8ecae6 (with stack.pop(0) fix): Mean +- std dev: 164 ms +- 11 ms
commit d8ecae6 + collections.deque instead of list: Mean +- std dev: 160 ms +- 7 ms

TLDR

There does not appear to be a noteworthy performance impact from the stack.pop(0) fix. I think it can be merged without major concerns about performance.

It is possible to use collections.deque as a drop-in replacement for the list, which could offer a slight improvement in documents with very complex nested inline structures. We could consider testing how it performs on a wider range of documents first, and so perhaps a different PR would be more appropriate for that.