document vector indexes #19595
Some minor comments.
src/current/_includes/v25.2/known-limitations/vector-limitations.md
@dikshant @andy-kimball TFTRs! I've incorporated your comments.
LGTM
Nice work!
LGTM nice work! 🚀
lgtm pending questions/suggestions
very informative
- {% include {{ page.version.version }}/sql/vector-batch-inserts.md %}
- Creating a vector index through a backfill disables mutations ([`INSERT`]({% link {{ page.version.version }}/insert.md %}), [`UPSERT`]({% link {{ page.version.version }}/upsert.md %}), [`UPDATE`]({% link {{ page.version.version }}/update.md %}), [`DELETE`]({% link {{ page.version.version }}/delete.md %})) on the table. [#144443](https://github.com/cockroachdb/cockroach/issues/144443)
- `IMPORT INTO` is not supported on tables with vector indexes. You can import the vectors first and create the index after import is complete. [#145227](https://github.com/cockroachdb/cockroach/issues/145227)
- Only L2 distance (`<->`) searches are accelerated. [#144016](https://github.com/cockroachdb/cockroach/issues/144016)
What do you mean by "accelerated"? Should "accelerated" be "supported"?
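One possible reading of "accelerated", sketched under assumptions: the index can be used to serve nearest-neighbor queries that order by the `<->` operator, while other distance operators still return results without index support. The table name `demo_items` and the three-dimensional vectors below are hypothetical and chosen only to keep the example short; the cluster setting is the one quoted later in this review.

```sql
-- Cluster setting taken from the suggestion later in this thread.
SET CLUSTER SETTING feature.vector_index.enabled = true;

-- Hypothetical table for illustration; not part of the PR.
CREATE TABLE demo_items (
    id UUID DEFAULT gen_random_uuid(),
    embedding VECTOR(3),
    VECTOR INDEX (embedding)
);

-- L2 distance ordering with a LIMIT: the query shape a vector index
-- is meant to serve.
EXPLAIN SELECT id
FROM demo_items
ORDER BY embedding <-> '[0.9,0.1,0]'
LIMIT 5;
```

Running `EXPLAIN` on such a query is one way to check whether the planner chooses the vector index; the plan output is not reproduced here.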
@@ -88,7 +90,10 @@ SELECT category, vector FROM items WHERE category = 'electronics' ORDER BY vecto
  electronics | [0.9,0.1,0]
~~~

You can use a [vector index]({% link {{ page.version.version }}/vector-indexes.md %}) to make searches on large numbers of high-dimensional [`VECTOR`]({% link {{ page.version.version }}/vector.md %}) rows more efficient.
remove link to same page
Suggested change:
- You can use a [vector index]({% link {{ page.version.version }}/vector-indexes.md %}) to make searches on large numbers of high-dimensional [`VECTOR`]({% link {{ page.version.version }}/vector.md %}) rows more efficient.
+ You can use a [vector index]({% link {{ page.version.version }}/vector-indexes.md %}) to make searches on large numbers of high-dimensional `VECTOR` rows more efficient.
(14 rows)
~~~

We also have other resources on indexes:
[Learn more about indexes]({% link {{ page.version.version }}/indexes.md %}).
This sentence does not seem necessary here and could be removed. Maybe "Secondary indexes" in line 262 should have a link to https://www.cockroachlabs.com/docs/v25.2/schema-design-indexes.
CREATE TABLE items (
    id uuid DEFAULT gen_random_uuid(),
    embedding VECTOR (1536),
    VECTOR INDEX (embedding)
);
If the `feature.vector_index.enabled` cluster setting is not enabled, this statement returns an error.
Suggested change:
- CREATE TABLE items (
-     id uuid DEFAULT gen_random_uuid(),
-     embedding VECTOR (1536),
-     VECTOR INDEX (embedding)
- );
+ SET CLUSTER SETTING feature.vector_index.enabled = true;
+ CREATE TABLE items (
+     id uuid DEFAULT gen_random_uuid(),
+     embedding VECTOR (1536),
+     VECTOR INDEX (embedding)
+ );

### Specify an opclass

You can optionally specify an opclass. If not specified, the default is `vector_l2_ops`:
For readers like me who do not know what an opclass is:
Suggested change:
- You can optionally specify an opclass. If not specified, the default is `vector_l2_ops`:
+ You can optionally specify an opclass (short for operator class) that defines how a `VECTOR` data type is handled by the index. If not specified, the default is `vector_l2_ops`:
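For context, a minimal sketch of what an explicit opclass might look like. Placing the opclass name after the indexed column follows the PostgreSQL index convention and is an assumption here, not something confirmed in this thread; `items_l2` is a hypothetical table name.

```sql
-- Cluster setting taken from the earlier suggestion in this review.
SET CLUSTER SETTING feature.vector_index.enabled = true;

-- Hypothetical table; the opclass position after the column name
-- is assumed from PostgreSQL-style index syntax.
CREATE TABLE items_l2 (
    id UUID DEFAULT gen_random_uuid(),
    embedding VECTOR(1536),
    VECTOR INDEX (embedding vector_l2_ops)
);
```

Because `vector_l2_ops` is described as the default, this should behave the same as omitting the opclass.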

Vector indexes on `VECTOR` columns support the following comparison operator:

- **L2 distance**: [`<->`]({% link {{ page.version.version }}/functions-and-operators.md %}#operators)
I think this might be a more informative link:
Suggested change:
- - **L2 distance**: [`<->`]({% link {{ page.version.version }}/functions-and-operators.md %}#operators)
+ - **L2 distance**: [`<->`]({% link {{ page.version.version }}/vector.md %}#syntax)

Partition size and beam size interact to control both the precision of nearest neighbor search and the cost of maintaining the index. You can improve the accuracy of vector searches by increasing either the search beam size or partition size:

- A larger search beam improves accuracy by exploring more partitions, which increases the number of candidate vectors evaluated.
For parallel wording with "partition size":
Suggested change:
- - A larger search beam improves accuracy by exploring more partitions, which increases the number of candidate vectors evaluated.
+ - A larger search beam size improves accuracy by exploring more partitions, which increases the number of candidate vectors evaluated.
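As a rough sketch of how the two knobs discussed here might be adjusted: the session variable `vector_search_beam_size` and the storage parameters `min_partition_size` and `max_partition_size` are assumed names that do not appear in this thread, and the values are arbitrary, so check them against the v25.2 documentation before relying on them.

```sql
-- Assumed session variable: a larger beam explores more partitions
-- per search, trading latency for accuracy.
SET vector_search_beam_size = 64;

-- Assumed storage parameters: partition size bounds are set when the
-- index is created. The items table is the one from the earlier example.
CREATE VECTOR INDEX ON items (embedding)
    WITH (min_partition_size = 16, max_partition_size = 128);
```

The beam size applies per session and can be changed between queries, whereas partition size is a property of the index.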
DOC-13405
DOC-13249
DOC-13585
DOC-13604
DOC-13633
DOC-13634
DOC-13635