Implement new vector index using JVector library

Currently, ArcadeDB has a vector index implementation based on hnswlib that lacks some features and is not well integrated with the rest of the engine (e.g., no SQL support).

JVector is the leading library for implementing an embedded vector search engine.

The plan is to replace the existing implementation with a new one based on JVector and to provide full integration with the ArcadeDB engine and transactions.

For the sake of completeness, issue #1490 is copied here:

We need new functions/methods to expose the following methods from the index:

The easiest way is to create three new SQL functions. Example:

select findNeighborsFromVector( "Word[name,vector]", [1,2,3,4,5,6], 10 )

The Java API returns a List<Pair<Identifiable, ? extends Number>>, where the first element of each pair is the vertex RID and the second is a number (float, double, or whatever type was chosen at index creation) representing the proximity. The list is ordered by proximity, closest first.

In SQL, each entry must be wrapped in a Result with "vertex" and "proximity" properties:

+------------------+---------------------+
| VERTEX           |           PROXIMITY |
+------------------+---------------------+
| #13:4            |                0.12 |
| #19:10           |                0.19 |
+------------------+---------------------+
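As a rough illustration of the wrapping described above, here is a minimal, self-contained Java sketch. It does not use the ArcadeDB classes: `Map.Entry<String, ? extends Number>` is a stand-in for `Pair<Identifiable, ? extends Number>`, a plain `Map` is a stand-in for `Result`, and the class and method names (`NeighborRows`, `toRows`) are hypothetical.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NeighborRows {

  // Converts (rid, proximity) pairs into rows with "vertex" and "proximity" properties.
  // Assumption: the input is already ordered by proximity, closest first, so the
  // order is simply preserved.
  public static List<Map<String, Object>> toRows(
      List<? extends Map.Entry<String, ? extends Number>> neighbors) {
    List<Map<String, Object>> rows = new ArrayList<>();
    for (Map.Entry<String, ? extends Number> neighbor : neighbors) {
      Map<String, Object> row = new LinkedHashMap<>(); // keeps property order stable
      row.put("vertex", neighbor.getKey());
      row.put("proximity", neighbor.getValue());
      rows.add(row);
    }
    return rows;
  }

  public static void main(String[] args) {
    // Sample values taken from the table above.
    List<Map.Entry<String, Float>> neighbors = List.of(
        new SimpleImmutableEntry<>("#13:4", 0.12f),
        new SimpleImmutableEntry<>("#19:10", 0.19f));
    for (Map<String, Object> row : toRows(neighbors)) {
      System.out.println(row);
    }
  }
}
```

In the real implementation the "vertex" property would presumably hold the record (or its RID) rather than a string, so that expand( vertex ) can load it.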

This way you can also traverse the graph starting from embeddings:

select expand( vertex ) from ( select findNeighborsFromVector( "Word[name,vector]", [1,2,3,4,5,6], 10 ) ) where proximity < 0.5

This returns all the neighbors whose proximity to the vector is less than 0.5.
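For comparison, the same threshold filter could be applied on the Java side to the list returned by the index. This is only a sketch under the same assumptions as before: `Map.Entry<String, ? extends Number>` stands in for the real pair type, and `ProximityFilter` and `verticesWithin` are hypothetical names, not part of ArcadeDB.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ProximityFilter {

  // Keeps only the RIDs whose proximity is strictly below maxProximity,
  // mirroring "where proximity < 0.5" in the SQL query.
  public static List<String> verticesWithin(
      List<? extends Map.Entry<String, ? extends Number>> neighbors, double maxProximity) {
    return neighbors.stream()
        .filter(n -> n.getValue().doubleValue() < maxProximity)
        .map(n -> n.getKey())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Illustrative values only; the third entry is invented to show the filter.
    List<Map.Entry<String, Float>> neighbors = List.of(
        new SimpleImmutableEntry<>("#13:4", 0.12f),
        new SimpleImmutableEntry<>("#19:10", 0.19f),
        new SimpleImmutableEntry<>("#21:7", 0.70f));
    System.out.println(verticesWithin(neighbors, 0.5));
  }
}
```

The proximity is converted with doubleValue() because the number type depends on how the index was created.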