HTTP/JSON: route int8 query vectors to byte[] for end-to-end INT8 ingest payload savings (original) (raw)
Background
The INT8 ingest work (#4132) added the {{VectorEncoding.INT8}} option and the BINARY-column / byte[]-key plumbing through the engine: ingest from byte[] no longer round-trips through float, document storage shrinks 4x, and the server-side query path accepts byte[] queries when invoked through the Java API.
The HTTP/JSON entry point does not yet plumb int8 query vectors through to byte[]. JSON has no native byte[] type, so a typical HTTP client today sends either:
- a JSON array of integers ({{[0, 64, 127, -1]}}), which the SQL parameter binder hands over as a {{List}}; or
- a base64-encoded string blob.
Both currently land as something other than {{byte[]}}, so {{VectorUtils.toFloatArray(value, VectorEncoding.INT8)}} routes through the {{List}} / {{String}} branch (interpreting elements as floats, not as int8 bytes) and the engine effectively pays the float32 wire cost even when the index is INT8-encoded. The 4x payload claim materialises for the storage side, not for the wire round-trip from HTTP clients.
Desired behaviour
When the target index uses {{VectorEncoding.INT8}}:
- {{PostCommandHandler}} / {{PostQueryHandler}} (and the corresponding Bolt / PostgreSQL / gRPC adapters) should be able to accept a query vector as either a JSON array of integers in {{[-128, 127]}} or a base64-encoded byte string, and decode it into a {{byte[]}} parameter that the SQL parameter binder forwards verbatim to {{VectorUtils.toFloatArray(value, VectorEncoding.INT8)}}.
- Document this convention so client SDK authors (Python, Node, Java) know the wire format for INT8 query vectors.
- Add an end-to-end test that submits an int8 query vector via HTTP and asserts the stored byte[] is what reaches the index (not a float32 round trip).
Proposed convention
Recommend base64 string for binary inputs (matches existing JSON conventions for binary data). The HTTP handler can detect {{"\$binary": "..."}} or a typed-parameter form and decode to {{byte[]}} before invoking SQL.
Why now
The INT8 ingest landing claim is partially incomplete without this; users reading the comparison matrix may assume HTTP clients see the 4x payload savings, when today they only get the storage savings.
Related
- INT8 ingest landing: feat(vector): pre-quantized INT8 ingest for LSM_VECTOR (Tier 4 follow-up) #4132
- Schema-level encoding plumbing: PR {{vector-int8-ingest}}
- HTTP handler: {{server/src/main/java/com/arcadedb/server/http/handler/AbstractQueryHandler.java}} ({{mapParams}})