move io.c from using unbuffered fread()s to read()s. by dgwynne · Pull Request #16039 · pandas-dev/pandas

pandas already buffers reads coming from io.c itself, so it previously
used setbuf() to disable buffering inside fread(). however, certain
implementations of unbuffered stdio reads are sub-optimal. for example,
an unbuffered fread() on solaris ends up doing a separate read() on the
underlying file descriptor for each individual byte, which turns out to
be very slow.

instead, this code now open()s a file descriptor and read()s directly
into the buffer that pandas has already allocated. this is effectively
what other libcs (e.g. glibc) do underneath an unbuffered fread() anyway,
but this is more explicit.
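
as a rough illustration of the open()/read() pattern described above, here
is a minimal sketch. it is not the actual io.c code: fill_buffer() and
read_file_prefix() are hypothetical names, and the caller-supplied buf stands
in for the buffer pandas has already allocated.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * hypothetical helper: read up to nbytes from fd straight into buf.
 * returns the number of bytes read (less than nbytes only at end of
 * file), or -1 on error with errno set.
 */
static ssize_t
fill_buffer(int fd, char *buf, size_t nbytes)
{
	size_t done = 0;

	while (done < nbytes) {
		ssize_t rv = read(fd, buf + done, nbytes - done);
		if (rv == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return (-1);
		}
		if (rv == 0)
			break;			/* end of file */
		done += (size_t)rv;
	}

	return ((ssize_t)done);
}

/*
 * hypothetical helper: open() path and read its first nbytes into buf.
 * no FILE * and no stdio buffer is involved at any point.
 */
static ssize_t
read_file_prefix(const char *path, char *buf, size_t nbytes)
{
	ssize_t rv;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (-1);

	rv = fill_buffer(fd, buf, nbytes);
	close(fd);

	return (rv);
}
```

each read() asks the kernel for as much of the remaining buffer as it can
provide, so the number of syscalls scales with the number of chunks rather
than the number of bytes.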

while here, this tweaks the mmap backend to use open() too, and also
properly checks for mmap failure by comparing its result to MAP_FAILED
instead of NULL.
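
the MAP_FAILED check matters because mmap() reports failure by returning
MAP_FAILED (defined as (void *)-1), not NULL, so a NULL comparison never
catches the error. a minimal sketch of the pattern follows, assuming a
read-only mapping of a whole file; map_file() is a hypothetical helper, not
the function in io.c.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * hypothetical helper: map all of path read-only. returns the mapping
 * and stores its length in *lenp, or returns NULL on any failure.
 */
static void *
map_file(const char *path, size_t *lenp)
{
	struct stat st;
	void *p;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (NULL);

	/* a zero length mmap() fails, so treat an empty file as an error */
	if (fstat(fd, &st) == -1 || st.st_size <= 0) {
		close(fd);
		return (NULL);
	}

	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */

	if (p == MAP_FAILED)	/* not NULL: mmap() fails with (void *)-1 */
		return (NULL);

	*lenp = (size_t)st.st_size;
	return (p);
}
```

the caller is expected to munmap() the returned pointer with the length
stored in *lenp once it is done with the mapping.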