Support using MariaDB as a vector store #23
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
## spring-ai-vector-mariadb | ||
|
||
This module integrates MariaDB's vector storage support and provides a simple example of storing vectors in MariaDB and running similarity searches against them. | |
|
||
[MariaDB documentation](https://mariadb.org/documentation/) | |
|
||
- Introduction to relational databases | |
- MariaDB Server in 10 minutes | |
- List of SQL statements | |
- Useful MariaDB Server queries | |
- MariaDB Server documentation | |
|
||
## Prerequisites | |
|
||
### 1. MariaDB deployment and initialization | |
|
||
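MariaDB is an open-source relational database that, in recent releases (11.7 and later), can also store and search high-dimensional vector data. This project runs MariaDB with Docker, but you can install it another way or point the project at an existing MariaDB service. | |
[MariaDB Docker image](https://hub.docker.com/_/mariadb) | |
|
||
② An `EmbeddingModel` instance to compute the document embeddings. Several [options](https://docs.spring.io/spring-ai/reference/api/embeddings.html#available-implementations) are available. | |
|
||
③ An API key that the `EmbeddingModel` uses to generate the vector data. | |
|
||
|
||
## Auto-configuration | |
Spring AI provides Spring Boot auto-configuration for the MariaDB vector store. To enable it, add the following dependency to your project's Maven pom.xml file: | |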
```xml | ||
<dependencies> | ||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-starter-vector-store-mariadb</artifactId> | ||
</dependency> | ||
</dependencies> | ||
``` | |
|
||
## Configuration file | |
|
||
Before starting the project, edit the `application.yml` file. | |
|
||
```yaml | ||
server: | |
  port: 8080 | |
|
||
spring: | |
  datasource: | |
    url: ${BASE_HOST} | |
    username: ${BASE_NAME} | |
    password: ${BASE_PWD} | |
    driver-class-name: org.mariadb.jdbc.Driver | |
  ai: | |
    vectorstore: | |
      mariadb: | |
        # Enable schema initialization | |
        initialize-schema: true | |
        # Use cosine as the distance type | |
        distance-type: COSINE | |
        # Vector dimension: 1536 | |
        dimensions: 1536 | |
    openai: | |
      api-key: ${API_KEY} | |
      embedding: | |
        base-url: https://dashscope.aliyuncs.com/compatible-mode/v1 | |
        embeddings-path: /embeddings | |
        options: | |
          model: text-embedding-v4 | |
|
||
``` | ||
Once these changes are made, you can run the unit tests from IDEA. | |
|
||
|
||
|
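The `distance-type: COSINE` setting in the README's configuration asks the store to rank results by cosine distance. As a rough illustration of that metric only (not the store's or MariaDB's actual implementation), the sketch below computes it in plain Java using the usual definition, 1 minus cosine similarity. The class and method names are made up for this example.

```java
import java.util.Objects;

/** Illustrative only: cosine similarity and distance between two embedding vectors. */
final class CosineDistanceSketch {

    private CosineDistanceSketch() {
    }

    /** Cosine similarity in [-1, 1]; 1 means both vectors point in the same direction. */
    static double cosineSimilarity(float[] a, float[] b) {
        Objects.requireNonNull(a, "a");
        Objects.requireNonNull(b, "b");
        if (a.length != b.length) {
            // Mirrors the idea behind the `dimensions` setting: all vectors must share one size.
            throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Cosine distance: 0 for identical directions, 1 for orthogonal vectors. */
    static double cosineDistance(float[] a, float[] b) {
        return 1.0 - cosineSimilarity(a, b);
    }

    public static void main(String[] args) {
        float[] query = {1f, 0f};
        float[] sameDirection = {2f, 0f};
        float[] orthogonal = {0f, 3f};
        System.out.println(cosineDistance(query, sameDirection));
        System.out.println(cosineDistance(query, orthogonal));
    }
}
```

Because the metric depends only on direction, scaling a vector does not change its distance to the query, which is why `sameDirection` is treated as a perfect match.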
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<project xmlns="http://maven.apache.org/POM/4.0.0" | ||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
<parent> | ||
<groupId>com.glmapper</groupId> | ||
<artifactId>spring-ai-vector</artifactId> | ||
<version>0.0.1</version> | ||
</parent> | ||
|
||
<artifactId>spring-ai-vector-mariadb</artifactId> | ||
<packaging>jar</packaging> | ||
|
||
<properties> | ||
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> | ||
</properties> | ||
|
||
<dependencies> | ||
|
||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-starter-vector-store-mariadb</artifactId> | ||
<exclusions> | ||
<exclusion> | ||
<groupId>org.mariadb.jdbc</groupId> | ||
<artifactId>mariadb-java-client</artifactId> | ||
</exclusion> | ||
</exclusions> | ||
</dependency> | ||
<!-- The 3.3 client lacks the enquoteIdentifier method, so Spring AI's automatic call fails; use a newer client version instead --> | |
<dependency> | ||
<groupId>org.mariadb.jdbc</groupId> | ||
<artifactId>mariadb-java-client</artifactId> | ||
<version>3.5.2</version> | ||
</dependency> | ||
</dependencies> | ||
</project> |
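The pom comment says the older client lacks `enquoteIdentifier`. One way to confirm which client actually ended up on the classpath is a small reflection check like the sketch below. It assumes the method is exposed on `org.mariadb.jdbc.Driver`, which the pom comment does not state, so adjust the class name if the method lives elsewhere. The helper class itself is hypothetical.

```java
import java.lang.reflect.Method;

/** Illustrative only: checks whether a class on the classpath exposes a public method by name. */
final class DriverMethodCheck {

    private DriverMethodCheck() {
    }

    /** Returns true if the class can be loaded and has a public method with the given name. */
    static boolean hasPublicMethod(String className, String methodName) {
        try {
            // Load without initializing, so no static initializers (e.g. driver registration) run.
            Class<?> type = Class.forName(className, false, DriverMethodCheck.class.getClassLoader());
            for (Method method : type.getMethods()) {
                if (method.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class missing or unloadable: treat as "method not available".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the method is looked up on the driver class; the name comes from the pom comment.
        System.out.println(hasPublicMethod("org.mariadb.jdbc.Driver", "enquoteIdentifier"));
    }
}
```

Running `mvn dependency:tree` is another way to see which `mariadb-java-client` version was resolved after the exclusion.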
15 changes: 15 additions & 0 deletions
...ring-ai-vector-mariadb/src/main/java/com/glmapper/ai/vector/MariadbVectorApplication.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
package com.glmapper.ai.vector; | ||
|
||
import org.springframework.boot.SpringApplication; | ||
import org.springframework.boot.autoconfigure.SpringBootApplication; | ||
|
||
/** | ||
* @author siyuan | ||
* @since 2025/6/14 | ||
*/ | ||
@SpringBootApplication | ||
public class MariadbVectorApplication { | ||
public static void main(String[] args) { | ||
SpringApplication.run(MariadbVectorApplication.class, args); | ||
} | ||
} |
38 changes: 38 additions & 0 deletions
...ng-ai-vector-mariadb/src/main/java/com/glmapper/ai/vector/storage/VectorStoreStorage.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
package com.glmapper.ai.vector.storage; | ||
|
||
import lombok.RequiredArgsConstructor; | ||
import org.springframework.ai.document.Document; | ||
import org.springframework.ai.vectorstore.SearchRequest; | ||
import org.springframework.ai.vectorstore.VectorStore; | ||
import org.springframework.stereotype.Component; | ||
|
||
import java.util.ArrayList; | ||
import java.util.List; | ||
import java.util.Set; | ||
|
||
/** | ||
* @author siyuan | ||
* @since 2025/6/14 | ||
*/ | ||
@Component | ||
@RequiredArgsConstructor | ||
public class VectorStoreStorage { | ||
|
||
private final VectorStore vectorStore; | ||
|
||
|
||
public void delete(Set<String> ids) { | ||
vectorStore.delete(new ArrayList<>(ids)); | ||
} | ||
|
||
public void store(List<Document> documents) { | ||
if (documents == null || documents.isEmpty()) { | ||
return; | ||
} | ||
vectorStore.add(documents); | ||
} | ||
|
||
public List<Document> search(String query) { | ||
return vectorStore.similaritySearch(SearchRequest.builder().query(query).topK(5).build()); | ||
} | ||
} |
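`search` asks the store for the five best matches via `topK(5)`. Conceptually, that means scoring every candidate against the query and keeping the highest-scoring few. The sketch below shows only that selection step on a hypothetical map of document IDs to similarity scores. It is a plain-Java illustration, not how Spring AI or MariaDB performs the search internally.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustrative only: pick the k highest-scoring document IDs. */
final class TopKSketch {

    private TopKSketch() {
    }

    /**
     * Returns up to k IDs ordered by descending score.
     * Ties are broken by ID so the result is deterministic.
     */
    static List<String> topK(Map<String, Double> scoresById, int k) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be >= 0");
        }
        Comparator<Map.Entry<String, Double>> byScoreDescending =
                Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder());
        Comparator<Map.Entry<String, Double>> ordering =
                byScoreDescending.thenComparing(Map.Entry.<String, Double>comparingByKey());
        return scoresById.entrySet().stream()
                .sorted(ordering)
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical similarity scores for three stored documents.
        Map<String, Double> scores = Map.of("doc-a", 0.9, "doc-b", 0.2, "doc-c", 0.7);
        System.out.println(topK(scores, 2));
    }
}
```

If fewer than k documents are stored, the result simply contains all of them, which is why the test in this PR only asserts that the result is non-empty.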
27 changes: 27 additions & 0 deletions
spring-ai-vector/spring-ai-vector-mariadb/src/main/resources/application.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
server: | |
  port: 8080 | |
|
||
spring: | |
  application: | |
    name: mariadb-vector-store | |
|
||
  datasource: | |
    url: ${BASE_HOST} | |
    username: ${BASE_NAME} | |
    password: ${BASE_PWD} | |
    driver-class-name: org.mariadb.jdbc.Driver | |
  ai: | |
    vectorstore: | |
      mariadb: | |
        # Enable schema initialization | |
        initialize-schema: true | |
        # Use cosine as the distance type | |
        distance-type: COSINE | |
        # Vector dimension: 1536 | |
        dimensions: 1536 | |
    openai: | |
      api-key: ${API_KEY} | |
      embedding: | |
        base-url: https://dashscope.aliyuncs.com/compatible-mode/v1 | |
        embeddings-path: /embeddings | |
        options: | |
          model: text-embedding-v4 |
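The `${BASE_HOST}`, `${BASE_NAME}`, `${BASE_PWD}` and `${API_KEY}` entries are placeholders that must be supplied from the environment before startup. Judging by the test profile, `BASE_HOST` is expected to hold a full JDBC URL such as `jdbc:mariadb://localhost:3308/vector_test`. The sketch below is a deliberately simplified stand-in for placeholder substitution, written in plain Java. Spring Boot's own resolver is more capable (for example, it supports `${NAME:default}`), and the class name here is hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a minimal ${NAME} substitution, not Spring's resolver. */
final class PlaceholderSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([A-Za-z0-9_.]+)\\}");

    private PlaceholderSketch() {
    }

    /** Replaces every ${NAME} in the template with its value, failing fast on missing names. */
    static String resolve(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder resolved = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = values.get(name);
            if (value == null) {
                throw new IllegalStateException("Missing value for placeholder: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated as regex syntax.
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }

    public static void main(String[] args) {
        // Example values only; real ones would come from the environment.
        Map<String, String> env = Map.of("BASE_HOST", "jdbc:mariadb://localhost:3308/vector_test");
        System.out.println(resolve("${BASE_HOST}", env));
    }
}
```

Failing fast on a missing name is a design choice in this sketch: a silently empty datasource URL is harder to diagnose than an explicit error at startup.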
58 changes: 58 additions & 0 deletions
spring-ai-vector/spring-ai-vector-mariadb/src/test/java/storage/VectorStoreStorageTest.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
package storage; | ||
|
||
import com.glmapper.ai.vector.MariadbVectorApplication; | ||
import com.glmapper.ai.vector.storage.VectorStoreStorage; | ||
import org.junit.jupiter.api.AfterEach; | ||
import org.junit.jupiter.api.Assertions; | ||
import org.junit.jupiter.api.Test; | ||
import org.springframework.ai.document.Document; | ||
import org.springframework.beans.factory.annotation.Autowired; | ||
import org.springframework.boot.test.context.SpringBootTest; | ||
import org.springframework.test.context.ActiveProfiles; | ||
|
||
import java.util.List; | ||
import java.util.Map; | ||
import java.util.Set; | ||
import java.util.stream.Collectors; | ||
|
||
/** | ||
* Tests the store and search functionality of the MariaDB-backed VectorStoreStorage | |
* | ||
* @author siyuan | ||
* @since 2025/6/14 | ||
*/ | ||
@SpringBootTest( | ||
classes = MariadbVectorApplication.class, | ||
webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT | ||
) | ||
@ActiveProfiles("test") | ||
public class VectorStoreStorageTest { | ||
|
||
@Autowired | ||
private VectorStoreStorage vectorStoreStorage; | ||
|
||
//prepare test data | ||
private static final List<Document> documents = List.of( | ||
new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")), | ||
new Document("The World is Big and Salvation Lurks Around the Corner"), | ||
new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2"))); | ||
|
||
@AfterEach | ||
public void cleanUp() { | ||
// clear the vector store after each test | ||
Set<String> ids = documents.stream().map(Document::getId) | ||
.collect(Collectors.toSet()); | ||
vectorStoreStorage.delete(ids); | ||
} | ||
|
||
@Test | ||
public void testStoreAndSearch() { | ||
// store documents | ||
vectorStoreStorage.store(documents); | ||
// do search | ||
String query = "Spring AI rocks!!"; | ||
List<Document> results = vectorStoreStorage.search(query); | ||
// assertions | ||
Assertions.assertFalse(results.isEmpty(), "Search results should not be empty"); | |
} | ||
} |
28 changes: 28 additions & 0 deletions
spring-ai-vector/spring-ai-vector-mariadb/src/test/resources/application-test.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
server: | |
  port: 8080 | |
|
||
spring: | |
  application: | |
    name: mariadb-vector-store | |
|
||
  datasource: | |
    url: jdbc:mariadb://localhost:3308/vector_test | |
    username: root | |
    password: root | |
    driver-class-name: org.mariadb.jdbc.Driver | |
  ai: | |
    vectorstore: | |
      mariadb: | |
        # Schema initialization (disabled in the test profile, so the table must already exist) | |
        initialize-schema: false | |
        # Use cosine as the distance type | |
        distance-type: COSINE | |
        # Vector dimension: 1536 | |
        dimensions: 1536 | |
    openai: | |
      api-key: ${API_KEY} | |
      embedding: | |
        # doc reference: https://bailian.console.aliyun.com/?switchAgent=12095181&productCode=p_efm&switchUserType=3&tab=api#/api/?type=model&url=https%3A%2F%2Fhelp.aliyun.com%2Fdocument_detail%2F2712515.html&renderType=iframe | |
        base-url: https://dashscope.aliyuncs.com/compatible-mode/v1 | |
        embeddings-path: /embeddings | |
        options: | |
          model: text-embedding-v4 |
This can be removed; nothing below refers to it.
Updated the README.