GitHub - polydbms/sheetreader-core

SheetReader is a blazingly fast and memory-efficient spreadsheet parser for tabular data in Excel OOXML (.xlsx) files, implemented in C++. Other spreadsheet parsers are built on general-purpose XML parsers, which leads to excessive CPU and memory use because of the redundant XML information and the inflated in-memory XML tree representation. In contrast, SheetReader leverages the fixed spreadsheet structure, employs parallelism at different levels, and manages memory efficiently.
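To illustrate what "leveraging the fixed spreadsheet structure" can mean, here is a minimal sketch that pulls cell references and raw values out of worksheet XML with a single forward scan, without building an XML tree. It is not SheetReader's code or API: the function name `scan_cells` is hypothetical, and the sketch assumes cells appear as `<c r="..."> ... <v>...</v></c>` with no namespace prefix. It also skips shared-string lookup, inline strings, decompression, and parallelism.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper (not part of SheetReader): collects (cell reference, raw value)
// pairs by scanning for the fixed <c r="..."><v>...</v></c> pattern.
// Assumes un-prefixed element names and well-formed input.
std::vector<std::pair<std::string, std::string>> scan_cells(std::string_view xml) {
    constexpr std::size_t npos = std::string_view::npos;
    std::vector<std::pair<std::string, std::string>> cells;
    std::size_t pos = 0;

    while ((pos = xml.find("<c ", pos)) != npos) {
        // Locate the end of the opening <c ...> tag.
        const std::size_t tag_end = xml.find('>', pos);
        if (tag_end == npos) break;
        const std::string_view tag = xml.substr(pos, tag_end - pos);

        // A self-closing cell such as <c r="C1"/> carries no value.
        if (!tag.empty() && tag.back() == '/') {
            pos = tag_end + 1;
            continue;
        }

        // Extract the cell reference from the r="..." attribute.
        std::string ref;
        const std::size_t r = tag.find(" r=\"");
        if (r != npos) {
            const std::size_t ref_start = r + 4;
            const std::size_t ref_end = tag.find('"', ref_start);
            if (ref_end != npos) ref = std::string(tag.substr(ref_start, ref_end - ref_start));
        }

        // Look for <v>...</v> inside this cell only.
        const std::size_t cell_end = xml.find("</c>", tag_end);
        if (cell_end == npos) break;
        const std::string_view body = xml.substr(tag_end + 1, cell_end - tag_end - 1);
        const std::size_t v = body.find("<v>");
        if (v != npos && !ref.empty()) {
            const std::size_t v_end = body.find("</v>", v);
            if (v_end != npos) {
                cells.emplace_back(ref, std::string(body.substr(v + 3, v_end - v - 3)));
            }
        }
        pos = cell_end + 4;  // continue after </c>
    }
    return cells;
}
```

Calling `scan_cells` on a fragment such as `<row r="1"><c r="A1"><v>42</v></c><c r="B1"/></row>` yields a single pair for `A1`. A scan like this stores only the values it keeps, whereas a general-purpose parser would first allocate a node for every element and attribute.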

@article{DBLP:journals/is/GavriilidisHZM23,
  author       = {Haralampos Gavriilidis and
                  Felix Henze and
                  Eleni Tzirita Zacharatou and
                  Volker Markl},
  title        = {SheetReader: Efficient Specialized Spreadsheet Parsing},
  journal      = {Inf. Syst.},
  volume       = {115},
  pages        = {102183},
  year         = {2023},
  url          = {https://doi.org/10.1016/j.is.2023.102183},
  doi          = {10.1016/J.IS.2023.102183},
  timestamp    = {Mon, 26 Jun 2023 20:54:32 +0200},
  biburl       = {https://dblp.org/rec/journals/is/GavriilidisHZM23.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}