Segfaults: "numeric(0)" input in several fast statistical functions · Issue #101 · SebKrantz/collapse
While working on some data cleaning, a `numeric(0)` vector popped out after base subsetting. This led to a segfault when I tried to use the zero-length vector with `fsum()`. Afterwards I tested all the other fast statistical functions, with the following results:
    collapse::fsum(numeric(0))       # Segfault: address 0x55f7363efff8, cause 'memory not mapped'
    collapse::fprod(numeric(0))      # Segfault: address 0x557a14cc8ff8, cause 'memory not mapped'
    collapse::fmean(numeric(0))      # Segfault: address 0x559c7bdfaff8, cause 'memory not mapped'
    collapse::fmedian(numeric(0))    # Output: [1] 4.668247e-310
    collapse::fmode(numeric(0))      # Output: [1] 4.668247e-310
    collapse::fvar(numeric(0))       # Output: [1] 0 (but the equivalent 'stats::var(numeric(0))' returns NA)
    collapse::fsd(numeric(0))        # Output: [1] 0 (but the equivalent 'stats::sd(numeric(0))' returns NA)
    collapse::fmin(numeric(0))       # Segfault: address 0x55ef52e99ff8, cause 'memory not mapped'
    collapse::fmax(numeric(0))       # Segfault: address 0x5586c9c93ff8, cause 'memory not mapped'
    collapse::fnth(numeric(0))       # Output: [1] 4.668247e-310
    collapse::ffirst(numeric(0))     # Output: [1] 4.668247e-310
    collapse::flast(numeric(0))      # Output: [1] 0 (inconsistent data types compared to 'ffirst()')
    collapse::fNobs(numeric(0))      # Output: [1] 0
    collapse::fNdistinct(numeric(0)) # Output: [1] 0
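Because several of these calls crash the R process, one way to re-run the checks without losing the current session is to evaluate each call in a fresh `Rscript` subprocess. The sketch below is only an illustration: `run_isolated()` is a hypothetical helper, not part of collapse, and it assumes `Rscript` is available under `R.home("bin")`.

```r
# Hypothetical helper (not part of collapse): evaluate one expression in a
# separate Rscript process so that a crash cannot take down this session.
run_isolated <- function(expr) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # --vanilla skips user profiles, so their output cannot pollute the result.
  out <- suppressWarnings(
    system2(rscript, c("--vanilla", "-e", shQuote(expr)),
            stdout = TRUE, stderr = TRUE)
  )
  status <- attr(out, "status")  # NULL when the child exited with status 0
  list(
    expr   = expr,
    status = if (is.null(status)) 0L else status,
    output = as.character(out)
  )
}

# Harmless demonstration that does not require collapse:
res <- run_isolated("cat(length(numeric(0)))")
res$status
res$output
```

To re-check the calls from this report, pass them as strings, e.g. `run_isolated("collapse::fsum(numeric(0))")`; a non-zero `status` then indicates that the child process failed rather than the calling session.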
`fNobs()` and `fNdistinct()` both return the expected output. The others, as noted in the comments above, do one of the following:
- Cause a crash.
- Produce output that is inconsistent with the `stats` package (e.g. `stats::median(numeric(0))` returns NA, not 0).
- Produce internally inconsistent output (e.g. `ffirst()` and `flast()` return different data types).
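For reference, the base R and `stats` behavior used as the point of comparison can be checked directly. This is a sketch of the baseline only; it does not imply what the fast functions are intended to return for empty input.

```r
# Base R / stats results for zero-length numeric input, for comparison.
x <- numeric(0)

sum(x)     # 0   (empty sum)
prod(x)    # 1   (empty product)
mean(x)    # NaN (0 / 0)
median(x)  # NA
var(x)     # NA
sd(x)      # NA

# min() and max() also emit a warning on empty input, silenced here.
suppressWarnings(min(x))  # Inf
suppressWarnings(max(x))  # -Inf
```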
I've attached a gdb session with the backtrace for the segfault with `fsum()`.
debug_session.txt
Session info:
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux bullseye/sid
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_IL.utf8 LC_COLLATE=en_IL.utf8
[5] LC_MONETARY=en_IL.utf8 LC_MESSAGES=en_IL.utf8
[7] LC_PAPER=en_IL.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_IL.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.0.3 parallel_4.0.3 RcppArmadillo_0.10.1.2.0
[4] Rcpp_1.0.5 collapse_1.4.2
Package compiled with gcc:
gcc (Debian 10.2.0-19) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.