GitHub - nigeltao/parse-number-fxx-test-data: parse_number_fxx test data
parse_number_fxx Test Data
This repository contains test data for parse_number_fxx implementations (for fxx being f16, f32 or f64), also known as StringToDouble, strtod, atof, etc. These convert from an ASCII string to a 16-, 32- or 64-bit value (IEEE 754 half-, single- or double-precision floating point).
Most of the data/*.txt files were derived by running script/extract-numbery-strings.go on various repositories or zip files, listed further below. Their contents look like:
3C00 3F800000 3FF0000000000000 1
3D00 3FA00000 3FF4000000000000 1.25
3D9A 3FB33333 3FF6666666666666 1.4
57B7 42F6E979 405EDD2F1A9FBE77 123.456
622A 44454000 4088A80000000000 789
7C00 7F800000 7FF0000000000000 123.456e789
For example, parsing "1.4" as a float32 gives the bits 0x3FB33333.
In this case, the final line's float16, float32 and float64 values are all infinity. The largest finite float{16,32,64} values are approximately 6.55e+4, 3.40e+38 and 1.80e+308.
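As an illustration of the two points above, here is a minimal Go sketch that reproduces the f32 and f64 columns for "1.4" and for the overflowing "123.456e789". The helper names f32Bits and f64Bits are hypothetical and are not part of this repository. The sketch relies on strconv.ParseFloat returning ±Inf together with strconv.ErrRange on overflow, so that error is deliberately not treated as a failure.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
)

// f32Bits parses s as a float32 and returns its IEEE 754 bit pattern.
// Hypothetical helper. Overflow is reported by strconv.ParseFloat as
// strconv.ErrRange while still returning ±Inf, which is the value the
// test data records, so that error is accepted here.
func f32Bits(s string) (uint32, error) {
	f, err := strconv.ParseFloat(s, 32)
	if err != nil && !errors.Is(err, strconv.ErrRange) {
		return 0, err
	}
	return math.Float32bits(float32(f)), nil
}

// f64Bits is the float64 counterpart of f32Bits.
func f64Bits(s string) (uint64, error) {
	f, err := strconv.ParseFloat(s, 64)
	if err != nil && !errors.Is(err, strconv.ErrRange) {
		return 0, err
	}
	return math.Float64bits(f), nil
}

func main() {
	for _, s := range []string{"1.4", "123.456e789"} {
		b32, err32 := f32Bits(s)
		b64, err64 := f64Bits(s)
		if err32 != nil || err64 != nil {
			fmt.Println("parse error:", s)
			continue
		}
		// Prints:
		//   3FB33333 3FF6666666666666 1.4
		//   7F800000 7FF0000000000000 123.456e789
		fmt.Printf("%08X %016X %s\n", b32, b64, s)
	}
	// Largest finite float32 and float64 values: 3.40e+38 1.80e+308
	fmt.Printf("%.2e %.2e\n", math.MaxFloat32, math.MaxFloat64)
}
```

Go's standard library has no float16 type, so the f16 column is left out of this sketch.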
For each line of these data/*.txt files, the f16, f32 and f64 hexadecimal digits and the ASCII string subslices are:
- When column indexes start at 0: [0..4], [5..13], [14..30] and [31..].
- When column indexes start at 1: [1..5], [6..14], [15..31] and [32..].
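The 0-based column ranges above map directly onto Go slice expressions. The following is a small sketch, assuming every line has the fixed layout shown earlier; the Record type and splitLine function are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// Record holds the four fields of one data/*.txt line (hypothetical type).
type Record struct {
	F16, F32, F64, Text string
}

// splitLine slices a line at the fixed 0-based columns
// [0..4], [5..13], [14..30] and [31..].
func splitLine(line string) (Record, error) {
	// The shortest valid line has 31 bytes of hex and spaces plus
	// at least one byte of ASCII number text.
	if len(line) < 32 || line[4] != ' ' || line[13] != ' ' || line[30] != ' ' {
		return Record{}, fmt.Errorf("malformed line: %q", line)
	}
	return Record{
		F16:  line[0:4],
		F32:  line[5:13],
		F64:  line[14:30],
		Text: line[31:],
	}, nil
}

func main() {
	rec, err := splitLine("3D9A 3FB33333 3FF6666666666666 1.4")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", rec)
}
```

Slicing by fixed offsets avoids splitting on spaces, which keeps the sketch independent of what characters the number string itself contains.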
The first half (the high 16 bits) of the f32 hexadecimal digits is also known as the bfloat16 format.
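A brief sketch of that relationship in Go: the bfloat16 bit pattern is obtained by keeping the first four hexadecimal digits of the f32 column, or equivalently by shifting the 32-bit pattern right by 16. The function names are hypothetical. Note that this is a truncation of the float32 value; a converter that rounds to nearest could produce a different bfloat16 for some inputs.

```go
package main

import (
	"fmt"
	"strconv"
)

// bfloat16FromF32Bits keeps the high 16 bits of a float32 bit pattern.
// This truncates the low 16 mantissa bits rather than rounding them.
func bfloat16FromF32Bits(f32 uint32) uint16 {
	return uint16(f32 >> 16)
}

// bfloat16FromF32Hex does the same starting from the 8-digit f32 column.
func bfloat16FromF32Hex(f32Hex string) (uint16, error) {
	if len(f32Hex) != 8 {
		return 0, fmt.Errorf("want 8 hex digits, got %q", f32Hex)
	}
	v, err := strconv.ParseUint(f32Hex[:4], 16, 16)
	return uint16(v), err
}

func main() {
	fmt.Printf("%04X\n", bfloat16FromF32Bits(0x3FB33333)) // 3FB3
	b, _ := bfloat16FromF32Hex("3FB33333")
	fmt.Printf("%04X\n", b) // 3FB3
}
```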
Data
In the data directory:
- exhaustive-float16.txt is an exhaustive list of float16 values.
- freetype-2-7.txt was extracted from Freetype 2.7
- google-double-conversion.txt was extracted from google/double-conversion
- google-wuffs.txt was extracted from google/wuffs
- ibm-fpgen.txt was extracted from IBM's IEEE 754R test suite
- lemire-fast-double-parser.txt was extracted from lemire/fast_double_parser
- lemire-fast-float.txt was extracted from lemire/fast_float
- more-test-cases.txt was extracted from this repository's manually curated collection of more test cases
- tencent-rapidjson.txt was extracted from Tencent/rapidjson
- ulfjack-ryu.txt was extracted from ulfjack/ryu
remyoudompheng/fptest
The data/remyoudompheng-fptest-?.txt files were created by running go test -test.run=TestTortureAtof64 in the remyoudompheng/fptest repository (with the following patch), running the resultant TestTortureAtof64.txt file through script/extract-numbery-strings.go and then using sed to split what would be a 189 MiB file into multiple (million line) files:
diff --git a/torture_test.go b/torture_test.go
index 87ba7e7..59887ff 100644
--- a/torture_test.go
+++ b/torture_test.go
@@ -1,8 +1,11 @@
 package fptest

 import (
+	"bufio"
 	"bytes"
+	"fmt"
 	"math"
+	"os"
 	"strconv"
 	"testing"

@@ -124,6 +127,11 @@ func TestTortureShortest32(t *testing.T) {
 }

 func TestTortureAtof64(t *testing.T) {
+	tmpFile, _ := os.Create("/tmp/TestTortureAtof64.txt")
+	defer tmpFile.Close()
+	tmpWriter := bufio.NewWriter(tmpFile)
+	defer tmpWriter.Flush()
+
 	count := 0
 	buf := make([]byte, 64)
 	roundUp := false
@@ -140,6 +148,7 @@ func TestTortureAtof64(t *testing.T) {
 				t.Errorf("could not parse %q: %s", s, err)
 				return
 			}
+			fmt.Fprintf(tmpWriter, "%s\n", s)
 			expect := x
 			if roundUp {
 				expect = y
Users
Programs that use this test data set:
- script/manual-test-parse-number-f64.cc in google/wuffs
- testsuite/json/test_json_decimal_to_number.adb in AdaCore/VSS
Test Suite Running Time
As of November 2021, data/*.txt contains over 5 million test cases. Parsing them all should take tens of seconds at most. For example, on a mid-range x86_64 laptop (2016; Skylake):
$ grep model.name /proc/cpuinfo | uniq
model name : Intel(R) Core(TM) m3-6Y30 CPU @ 0.90GHz
$ git clone --depth 1 --quiet https://github.com/google/wuffs.git
$ gcc -O3 wuffs/script/manual-test-parse-number-f64.cc
$ time ./a.out data/*.txt
31745 OK in data/exhaustive-float16.txt
3566 OK in data/freetype-2-7.txt
564745 OK in data/google-double-conversion.txt
10744 OK in data/google-wuffs.txt
102792 OK in data/ibm-fpgen.txt
94313 OK in data/lemire-fast-double-parser.txt
3299 OK in data/lemire-fast-float.txt
60 OK in data/more-test-cases.txt
1000000 OK in data/remyoudompheng-fptest-0.txt
1000000 OK in data/remyoudompheng-fptest-1.txt
1000000 OK in data/remyoudompheng-fptest-2.txt
885708 OK in data/remyoudompheng-fptest-3.txt
3563 OK in data/tencent-rapidjson.txt
599458 OK in data/ulfjack-ryu.txt
real 0m6.790s
user 0m6.707s
sys 0m0.082s