Metadata
vecs allows you to associate key-value pairs of metadata with indexes and ids in your collections. You can then add filters to queries that reference that metadata.
Types
Metadata is stored as binary JSON. As a result, allowed metadata types are drawn from JSON primitive types.
- Boolean
- String
- Number
The technical size limit for a metadata field associated with a vector is 1GB. In practice, keep metadata fields as small as possible to maximize performance.
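As a rough illustration of the constraints above, the sketch below checks that every metadata value is one of the listed scalar types and reports how large the dictionary is when encoded as JSON text. `check_metadata` is a hypothetical helper, not part of vecs, and the reported size is only an approximation of what the binary JSON storage will occupy.

```python
import json

# Python types corresponding to the Boolean, String, and Number types above.
ALLOWED_TYPES = (bool, str, int, float)


def check_metadata(metadata: dict) -> int:
    """Validate a metadata dict and return its JSON-encoded size in bytes.

    Hypothetical helper for illustration; not part of vecs.
    """
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} must be a string")
        if not isinstance(value, ALLOWED_TYPES):
            raise TypeError(
                f"metadata value for {key!r} has unsupported type "
                f"{type(value).__name__}"
            )
    # Size of the text encoding; the stored binary JSON size may differ.
    return len(json.dumps(metadata).encode("utf-8"))


size = check_metadata({"year": 2020, "title": "example", "is_public": True})
print(f"metadata is about {size} bytes as JSON text")
```

Running a check like this before upserting records is one way to catch oversized or non-scalar values early.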
Metadata Query Language
The metadata query language is based loosely on MongoDB's selectors.
vecs currently supports a subset of those operators.
Comparison Operators
Comparison operators compare a provided value with a value stored in a metadata field of the vector store.
| Operator | Description |
|---|---|
| $eq | Matches values that are equal to a specified value |
| $ne | Matches values that are not equal to a specified value |
| $gt | Matches values that are greater than a specified value |
| $gte | Matches values that are greater than or equal to a specified value |
| $lt | Matches values that are less than a specified value |
| $lte | Matches values that are less than or equal to a specified value |
| $in | Matches values that are contained in a specified list of scalar values |
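
To make the semantics in the table concrete, here is a minimal in-memory sketch that evaluates a single comparison clause against a metadata dictionary. It only illustrates what each operator means; it is not how vecs evaluates filters. `matches` and `COMPARISONS` are hypothetical names, and the treatment of missing fields is an assumption.

```python
import operator

# Map each operator name from the table to a Python predicate.
COMPARISONS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    # $in: the stored value must appear in the provided list.
    "$in": lambda stored, allowed: stored in allowed,
}


def matches(metadata: dict, clause: dict) -> bool:
    """Evaluate one {"field": {"$op": value}} clause against metadata."""
    (field, condition), = clause.items()
    (op, expected), = condition.items()
    if field not in metadata:
        return False  # assumption: a missing field never matches
    return COMPARISONS[op](metadata[field], expected)


record = {"year": 2020, "gross": 4200.0, "priority": "pro"}
print(matches(record, {"year": {"$eq": 2020}}))                       # True
print(matches(record, {"gross": {"$gte": 5000.0}}))                   # False
print(matches(record, {"priority": {"$in": ["enterprise", "pro"]}}))  # True
```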
Logical Operators
Logical operators compose other operators, and can be nested.
| Operator | Description |
|---|---|
| $and | Joins query clauses with a logical AND; returns all documents that match the conditions of every clause. |
| $or | Joins query clauses with a logical OR; returns all documents that match the conditions of at least one clause. |
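
Because logical operators can be nested, a filter is simply a tree of dictionaries and lists. The sketch below builds one such tree with two small helpers; `all_of` and `any_of` are hypothetical names introduced for illustration and are not part of vecs.

```python
import json


def all_of(*clauses: dict) -> dict:
    """Wrap clauses in $and (hypothetical helper, not part of vecs)."""
    return {"$and": list(clauses)}


def any_of(*clauses: dict) -> dict:
    """Wrap clauses in $or (hypothetical helper, not part of vecs)."""
    return {"$or": list(clauses)}


# year is 2020 AND (gross >= 5000.0 OR priority is "enterprise")
nested = all_of(
    {"year": {"$eq": 2020}},
    any_of(
        {"gross": {"$gte": 5000.0}},
        {"priority": {"$eq": "enterprise"}},
    ),
)
print(json.dumps(nested, indent=2))
```

The printed JSON has the same shape as a filter written by hand, so helpers like these are purely a convenience for composing conditions in code.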
Performance
For best performance, use scalar key-value pairs for metadata and prefer $eq, $and, and $or filters where possible.
These operators are the ones most consistently able to make use of indexes.
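One way to follow this advice is to express a short `$in` list as an `$or` of `$eq` clauses, which matches the same records. The sketch below shows the rewrite; `expand_in` is a hypothetical helper, not part of vecs, and whether the rewritten filter is actually faster depends on your data and indexes, so measure before relying on it.

```python
def expand_in(field: str, values: list) -> dict:
    """Rewrite {"field": {"$in": values}} as an $or of $eq clauses.

    Hypothetical helper for illustration; not part of vecs.
    """
    if not values:
        raise ValueError("values must contain at least one item")
    return {"$or": [{field: {"$eq": value}} for value in values]}


rewritten = expand_in("priority", ["enterprise", "pro"])
print(rewritten)
```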
Examples
year equals 2020
{"year": {"$eq": 2020}}
year equals 2020 or gross greater than or equal to 5000.0
{
    "$or": [
        {"year": {"$eq": 2020}},
        {"gross": {"$gte": 5000.0}}
    ]
}
last_name is less than "Brown" and is_priority_customer is true
{
    "$and": [
        {"last_name": {"$lt": "Brown"}},
        {"is_priority_customer": {"$eq": true}}
    ]
}
priority contained by ["enterprise", "pro"]
{
    "priority": {"$in": ["enterprise", "pro"]}
}
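In Python, these filters are written as plain dictionaries and passed along with a query. The sketch below wraps one of the filters above in a function; it assumes `collection` is a vecs `Collection` whose `query` method accepts `data`, `limit`, `filters`, and `include_metadata` keyword arguments, so check those names against the version of vecs you have installed. The function name is hypothetical.

```python
def search_2020_or_high_gross(collection, query_vector, limit=5):
    """Run a similarity query restricted by a metadata filter.

    `collection` is assumed to be a vecs Collection, or any object with a
    compatible `query` method. Hypothetical helper for illustration.
    """
    # year equals 2020 or gross greater than or equal to 5000.0
    filters = {
        "$or": [
            {"year": {"$eq": 2020}},
            {"gross": {"$gte": 5000.0}},
        ]
    }
    return collection.query(
        data=query_vector,
        limit=limit,
        filters=filters,
        include_metadata=True,
    )
```

With a real collection, a call might look like `search_2020_or_high_gross(docs, [0.1, 0.2, 0.3])`, where `docs` is a collection whose dimension matches the query vector.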