Commit 8809bf6

address more reviewer comments
1 parent 706648f commit 8809bf6

4 files changed, +45 -20 lines changed

src/current/_includes/v25.2/known-limitations/vector-limitations.md

+1 -1
@@ -5,4 +5,4 @@
 - Index acceleration with filters is only supported if the filters match prefix columns. [#146145](https://github.com/cockroachdb/cockroach/issues/146145)
 - Index recommendations are not provided for vector indexes. [#146146](https://github.com/cockroachdb/cockroach/issues/146146)
 - Vector index queries may return incorrect results when the underlying table uses multiple column families. [#146046](https://github.com/cockroachdb/cockroach/issues/146046)
-- Queries may ignore filter conditions (e.g., a `WHERE` clause) when multiple vector indexes exist on the same `VECTOR` column, and one has a prefix column. [#146257](https://github.com/cockroachdb/cockroach/issues/146257)
+- Queries against a vector index may ignore filter conditions (e.g., a `WHERE` clause) when multiple vector indexes exist on the same `VECTOR` column, and one has a prefix column. [#146257](https://github.com/cockroachdb/cockroach/issues/146257)
Original file line number | Diff line number | Diff line change
@@ -1 +1 @@
-Large batch inserts of [`VECTOR`]({% link {{ page.version.version }}/vector.md %}) types can cause performance degradation. When inserting vectors, batching should be avoided.
+Large batch inserts of [`VECTOR`]({% link {{ page.version.version }}/vector.md %}) types can cause performance degradation. When inserting vectors, batching should be avoided. For an example, refer to [Create and query a vector index]({% link {{ page.version.version }}/vector-indexes.md %}#create-and-query-a-vector-index).
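Reviewer note: one possible reading of "batching should be avoided" is to send one single-row `INSERT` per vector instead of a large multi-row `VALUES` list. The sketch below only builds the statements and parameters; it does not connect to a cluster. The helper names `vector_literal` and `single_row_inserts`, the `items` table, the `embedding` column, and the `%s` placeholder style (as used by some PostgreSQL-compatible Python drivers) are assumptions for illustration and are not taken from this commit.

```python
def vector_literal(values):
    """Format numbers as a pgvector-style text literal, e.g. "[0.5,1.0]"."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def single_row_inserts(rows, table="items", column="embedding"):
    """Yield one (sql, params) pair per vector rather than one multi-row INSERT.

    The table and column names are hypothetical defaults for this sketch.
    """
    sql = f"INSERT INTO {table} ({column}) VALUES (%s)"
    for values in rows:
        yield sql, (vector_literal(values),)


if __name__ == "__main__":
    # Two tiny 3-dimensional vectors stand in for real embeddings.
    for statement, params in single_row_inserts([[0.5, 1, 2], [3, 4, 5]]):
        print(statement, params)
```

Each yielded pair could then be passed to a driver's `execute` call inside its own statement, which keeps every insert small at the cost of more round trips.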

src/current/v25.2/create-index.md

+1 -1
@@ -48,7 +48,7 @@ Parameter | Description
 `IF NOT EXISTS` | Create a new index only if an index of the same name does not already exist; if one does exist, do not return an error.
 `opt_index_name`<br>`index_name` | The name of the index to create, which must be unique to its table and follow these [identifier rules]({% link {{ page.version.version }}/keywords-and-identifiers.md %}#identifiers).<br><br>If you do not specify a name, CockroachDB uses the format `<table>_<columns>_key/idx`. `key` indicates the index applies the `UNIQUE` constraint; `idx` indicates it does not. Example: `accounts_balance_idx`
 `table_name` | The name of the table you want to create the index on.
-`USING name` | An optional clause for compatibility with third-party tools. Accepted values for `name` are `btree`, `gin`, and `gist`, with `btree` for a standard secondary index, `gin` as the PostgreSQL-compatible syntax for a [GIN index](#create-gin-indexes), `gist` for a [spatial index]({% link {{ page.version.version }}/spatial-indexes.md %}), and `cspann` for a [vector index]({% link {{ page.version.version }}/vector-indexes.md %}). `hnsw` and `ivfflat` are aliased to `cspann` for compatibility with [`pgvector`](https://github.com/pgvector/pgvector) syntax.
+`USING name` | An optional clause for compatibility with third-party tools. Accepted values for `name` are `btree`, `gin`, and `gist`, with `btree` for a standard secondary index, `gin` as the PostgreSQL-compatible syntax for a [GIN index](#create-gin-indexes), `gist` for a [spatial index]({% link {{ page.version.version }}/spatial-indexes.md %}), and `cspann` for a [vector index]({% link {{ page.version.version }}/vector-indexes.md %}). `hnsw` is aliased to `cspann` for compatibility with [`pgvector`](https://github.com/pgvector/pgvector) syntax.
 `name` | The name of the column you want to index. For [multi-region tables]({% link {{ page.version.version }}/multiregion-overview.md %}#table-localities), you can use the `crdb_region` column within the index in the event the original index may contain non-unique entries across multiple, unique regions.
 `ASC` or `DESC`| Sort the column in ascending (`ASC`) or descending (`DESC`) order in the index. How columns are sorted affects query results, particularly when using `LIMIT`.<br><br>__Default:__ `ASC`
 `STORING ...`| Store (but do not sort) each column whose name you include.<br><br>For information on when to use `STORING`, see [Store Columns](#store-columns). Note that columns that are part of a table's [`PRIMARY KEY`]({% link {{ page.version.version }}/primary-key.md %}) cannot be specified as `STORING` columns in secondary indexes on the table.<br><br>`COVERING` and `INCLUDE` are aliases for `STORING` and work identically.

src/current/v25.2/vector-indexes.md

+42 -17
@@ -58,29 +58,59 @@ You can also specify a vector index during table creation. For example:
 {% include_cached copy-clipboard.html %}
 ~~~ sql
 CREATE TABLE items (
-category STRING,
+department_id INT,
+category_id INT,
 embedding VECTOR(1536),
 VECTOR INDEX (embedding)
 );
 ~~~
 
-You can create a vector index with a *prefix column* to pre-filter the search space. This is especially useful for tables containing millions of vectors or more. For an example, refer to [Create and query a vector index](#create-and-query-a-vector-index).
+### Prefix columns
+
+You can create a vector index with one or more *prefix columns* to pre-filter the search space. This is especially useful for tables containing millions of vectors or more.
 
 {% include_cached copy-clipboard.html %}
 ~~~ sql
 CREATE TABLE items (
-category STRING,
+department_id INT,
+category_id INT,
 embedding VECTOR(1536),
-VECTOR INDEX (category, embedding)
+VECTOR INDEX (department_id, category_id, embedding)
 );
 ~~~
 
+A vector index is only used if each prefix column is constrained to a specific value in the query. For example:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+WHERE department_id = 100 AND category_id = 200
+~~~
+
+You can filter on multiple prefix values using `IN`:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+WHERE (department_id, category_id) IN ((100, 200), (300, 400))
+~~~
+
+The following example will not use the vector index:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+WHERE department_id = 100 AND category_id >= 200
+~~~
+
+For an example, refer to [Create and query a vector index](#create-and-query-a-vector-index).
+
+### Vector index opclass
+
 You can optionally specify an opclass. If not specified, the default is `vector_l2_ops`:
 
 {% include_cached copy-clipboard.html %}
 ~~~ sql
 CREATE TABLE items (
-category STRING,
+department_id INT,
+category_id INT,
 embedding VECTOR(1536),
 VECTOR INDEX embed_idx (embedding vector_l2_ops)
 );
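Reviewer note: the prefix-column rule added in this hunk (each prefix column must be constrained to a specific value, by equality or `IN`) can be read as a simple check over a query's predicates. The following is a toy model of that documented rule under stated assumptions, not CockroachDB's optimizer logic. The function name `prefix_fully_constrained` and the `(column, operator)` predicate tuples are hypothetical and exist only for this sketch.

```python
# Toy model of the documented rule, NOT CockroachDB's optimizer logic.
# A predicate is a hypothetical (column, operator) pair taken from a WHERE clause.
POINT_OPERATORS = {"=", "IN"}


def prefix_fully_constrained(prefix_columns, predicates):
    """Return True if every prefix column is pinned by `=` or `IN`."""
    pinned = {column for column, op in predicates if op.upper() in POINT_OPERATORS}
    return all(column in pinned for column in prefix_columns)


prefix = ["department_id", "category_id"]

# WHERE department_id = 100 AND category_id = 200
equality = [("department_id", "="), ("category_id", "=")]
# WHERE (department_id, category_id) IN ((100, 200), (300, 400))
tuple_in = [("department_id", "IN"), ("category_id", "IN")]
# WHERE department_id = 100 AND category_id >= 200
with_range = [("department_id", "="), ("category_id", ">=")]

if __name__ == "__main__":
    print(prefix_fully_constrained(prefix, equality))
    print(prefix_fully_constrained(prefix, tuple_in))
    print(prefix_fully_constrained(prefix, with_range))
```

In this model the first two predicate sets satisfy the rule and the third does not, which mirrors the three `WHERE` examples in the hunk.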
@@ -199,23 +229,18 @@ In the following example, a vector index with a prefix column is used to optimiz
 --http-addr=localhost:8080
 ~~~
 
-1. In a separate terminal, [enable vector indexes](#enable-vector-indexes) on the cluster and session:
-
-{% include_cached copy-clipboard.html %}
-~~~ sql
-SET CLUSTER SETTING feature.vector_index.enabled = true;
-~~~
+1. In a separate terminal, open a SQL shell on the cluster:
 
 {% include_cached copy-clipboard.html %}
-~~~ sql
-SET sql_safe_updates = false;
+~~~ shell
+cockroach sql --insecure
 ~~~
 
-1. Open a SQL shell on the cluster:
+1. [Enable vector indexes](#enable-vector-indexes) on the cluster:
 
 {% include_cached copy-clipboard.html %}
-~~~ shell
-cockroach sql --insecure
+~~~ sql
+SET CLUSTER SETTING feature.vector_index.enabled = true;
 ~~~
 
 1. Create an `items` table that includes a `VECTOR` column called `embedding`, along with a vector index that uses `customer_id` as the prefix column:
@@ -238,7 +263,7 @@ In the following example, a vector index with a prefix column is used to optimiz
 python fast_insert.py
 ~~~
 
-This process should take approximately 20 minutes.
+This process should take approximately 5-10 minutes.
 
 1. When the script is finished executing, verify that `items` is populated:
 