Spring AI RAG Using Embedding Models and Vector Databases

2024-06-13

This article covers the following topics:

An introduction to embedding models.
Loading data with DocumentReader.
Storing embeddings in a VectorStore.
Implementing RAG (Retrieval-Augmented Generation), also known as Prompt Stuffing.

You can find the sample code for this article on GitHub.

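Large language models (LLMs), such as those from OpenAI, Azure OpenAI, and Google Vertex AI, are trained on large datasets. They are not trained on your private data, however, so they may be unable to answer questions specific to your domain. Training a model on your private data can be expensive and time-consuming. So how can we use these LLMs to answer domain-specific questions?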

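One approach is RAG (Retrieval-Augmented Generation), also known as Prompt Stuffing. With RAG, relevant documents are retrieved from a data store and passed to the LLM to generate the answer. As part of this process, an embedding model converts the documents into embeddings, which are stored in a vector database.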

Understanding Retrieval-Augmented Generation (RAG)

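You might store structured data in relational databases, unstructured data in NoSQL databases, or even keep data in files. You can query relational databases efficiently with SQL, and NoSQL databases with their own query languages. You can also use full-text search engines such as Elasticsearch or Solr to query unstructured data.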

However, you may want to retrieve data using natural language, based on its meaning.

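For example, "I like the Java programming language" and "Java is always my first-choice language" have the same meaning but use different words. Searching for the exact words may not find both.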

This is where embeddings come in. An embedding is a vector representation of a word, sentence, or document. With embeddings, you can retrieve data using natural language.
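To make "similar meaning" concrete, here is a minimal sketch of how two embeddings can be compared using cosine similarity, a common measure of how closely two vectors point in the same direction. The three-dimensional vectors are made-up values chosen for illustration, not the output of any real model, and the class name is hypothetical.

```java
// Illustrative only: the vectors below are made-up toy values, not real model output.
class CosineSimilarityDemo {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1.0 mean the vectors point the same way.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical embeddings of the two Java sentences and of an unrelated sentence.
        double[] likeJava = {0.9, 0.1, 0.3};
        double[] javaFirstChoice = {0.8, 0.2, 0.4};
        double[] unrelated = {0.1, 0.9, 0.0};

        System.out.println("Java vs Java:      " + cosine(likeJava, javaFirstChoice));
        System.out.println("Java vs unrelated: " + cosine(likeJava, unrelated));
    }
}
```

With these toy values, the two Java sentences score much higher against each other than against the unrelated sentence, which is the property that semantic retrieval relies on.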

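You can convert both structured and unstructured data into embeddings and store them in a vector database. You can then query the vector database in natural language to retrieve the relevant data, and pass that data to the AI model along with your question to get a response.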

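Retrieval-Augmented Generation (RAG) is the process of optimizing an LLM's output by drawing on an additional knowledge base, outside its training data, before generating a response.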

Embedding API

The Embedding API converts words, sentences, documents, or images into embeddings, that is, into vectors of numbers.

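For example, the word "Apple" might be represented as the vector [0.1, 0.2, 0.3, 0.4, 0.5], and the sentence "I love Apple" as the vector [0.1, 10.3, -10.2, 90.3, 2.4, -0.5]. These values are purely illustrative; a real embedding model returns vectors of one fixed dimension, typically hundreds or thousands of numbers long.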

Spring AI provides an EmbeddingModel interface for converting text or documents into embeddings. You can use any supported EmbeddingModel implementation, such as OpenAiEmbeddingModel, OllamaEmbeddingModel, AzureOpenAiEmbeddingModel, or VertexAiEmbeddingModel.

Depending on the implementation you choose, add the corresponding dependency and configure its properties in the application.properties file.

For example, to use OpenAI's EmbeddingModel, add the following dependency to your pom.xml file.

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>1.0.0-M1</version>
</dependency>

Then configure the properties in the application.properties file.

spring.ai.openai.api-key=${OPENAI_API_KEY}

# You can override the common api-key above for embeddings with the following property
spring.ai.openai.embedding.api-key=${OPENAI_API_KEY}
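
The embedding model itself can also be chosen through configuration rather than in code. The fragment below is a sketch that assumes the spring.ai.openai.embedding.options.model property of the OpenAI starter and uses text-embedding-3-small as an example model name; check both against the Spring AI version you are using.

```properties
# Assumed example: select a specific OpenAI embedding model
spring.ai.openai.embedding.options.model=text-embedding-3-small
```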

With this configuration in place, you can inject an EmbeddingModel and convert text or documents into embeddings as follows:

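import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingRequest;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.ai.openai.OpenAiEmbeddingOptions;
import org.springframework.stereotype.Component;

@Component
class MyComponent {
    private final EmbeddingModel embeddingModel;

    public MyComponent(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    public void convertTextToEmbedding() {
        // Example 1: convert text to an embedding
        List<Double> embeddings1 = embeddingModel.embed("I like Spring Boot");

        // Example 2: convert a Document to an embedding
        List<Double> embeddings2 = embeddingModel.embed(new Document("I like Spring Boot"));

        // Example 3: convert text to an embedding with explicit options
        EmbeddingRequest embeddingRequest =
                new EmbeddingRequest(List.of("I like Spring Boot"),
                        OpenAiEmbeddingOptions.builder()
                                // the name of an OpenAI embedding model
                                .withModel("text-embedding-3-small")
                                .build());
        EmbeddingResponse embeddingResponse = embeddingModel.call(embeddingRequest);
        List<Double> embeddings3 = embeddingResponse.getResult().getOutput();
    }
}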

Vector Databases

A vector database is a database that stores embeddings. You can store the embeddings of words, sentences, or documents in it, then query those embeddings in natural language and retrieve the relevant data.

(Translator's note: the Chinese terms 矢量 and 向量 both mean "vector". The former is more common in physics, the latter in mathematics and geometry.)

Spring AI provides a VectorStore interface for storing and retrieving embeddings. Currently, the VectorStore implementations provided by Spring AI include SimpleVectorStore, ChromaVectorStore, Neo4jVectorStore, PgVectorStore, RedisVectorStore, and others.

Let's see how to use SimpleVectorStore to store and retrieve embeddings.

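import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.document.Document;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Component;

@Configuration
class AppConfig {
    @Bean
    VectorStore vectorStore(EmbeddingModel embeddingModel) {
        return new SimpleVectorStore(embeddingModel);
    }
}

@Component
class MyComponent {
    private final VectorStore vectorStore;

    public MyComponent(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public void storeAndRetrieveEmbeddings() {
        // Store embeddings: the VectorStore embeds each document as it is added
        List<Document> documents =
                List.of(new Document("I like Spring Boot"),
                        new Document("I love Java programming language"));
        vectorStore.add(documents);

        // Retrieve the documents most similar to the query text
        SearchRequest query = SearchRequest.query("Spring Boot").withTopK(2);
        List<Document> similarDocuments = vectorStore.similaritySearch(query);
        String relevantData = similarDocuments.stream()
                            .map(Document::getContent)
                            .collect(Collectors.joining(System.lineSeparator()));
    }
}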

As shown above, we add documents to the VectorStore, which internally uses the EmbeddingModel to convert them into embeddings and stores them in the vector database.

We then query the VectorStore in natural language and retrieve the relevant data. The withTopK() method specifies the maximum number of similar documents to return.
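
To illustrate what a top-K similarity search does conceptually, here is a minimal in-memory sketch in plain Java. It is a hypothetical toy, not how SimpleVectorStore is actually implemented: the class name, the two-dimensional vectors, and the use of cosine similarity as the ranking measure are all assumptions made for the example, and the caller supplies the vectors instead of an EmbeddingModel.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical toy store for illustration; not Spring AI's SimpleVectorStore.
class ToyVectorStore {

    // One stored entry: the original text plus its (made-up) embedding vector.
    record Entry(String content, double[] vector) {}

    private final List<Entry> entries = new ArrayList<>();

    void add(String content, double[] vector) {
        entries.add(new Entry(content, vector));
    }

    // Return the contents of at most topK entries, most similar to the query first.
    List<String> similaritySearch(double[] query, int topK) {
        return entries.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> cosine(query, e.vector())).reversed())
                .limit(topK)
                .map(Entry::content)
                .toList();
    }

    private static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        ToyVectorStore store = new ToyVectorStore();
        store.add("I like Spring Boot", new double[]{0.9, 0.1});
        store.add("I love Java programming language", new double[]{0.6, 0.4});
        store.add("It rained all day", new double[]{0.0, 1.0});

        // Query vector standing in for the embedding of "Spring Boot".
        System.out.println(store.similaritySearch(new double[]{1.0, 0.0}, 2));
    }
}
```

A real vector store does the same ranking, but obtains the vectors from the embedding model and usually relies on an index so that it does not have to compare the query against every entry.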

DocumentReader and DocumentWriter

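In the example above, we built Document instances directly from strings. In a real application, however, you will likely want to read documents from files, databases, or other sources.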

Spring AI provides the DocumentReader and DocumentWriter interfaces for reading documents from, and writing them to, different sources.

Currently, Spring AI provides DocumentReader implementations such as JsonReader, TextReader, and PagePdfDocumentReader.

The VectorStore interface extends the DocumentWriter interface, so you can use any VectorStore implementation as a DocumentWriter.
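
The reader and writer roles fit together like a supplier and a consumer: as far as I know, in Spring AI DocumentReader extends Supplier<List<Document>> and DocumentWriter extends Consumer<List<Document>>. The sketch below shows that shape using only the JDK's functional interfaces. Plain strings stand in for Document, an ArrayList stands in for a VectorStore, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Simplified stand-ins: plain strings replace Spring AI's Document type.
class ReaderWriterPipeline {

    // Pull the documents from the reader and hand them to the writer in one step.
    static void load(Supplier<List<String>> reader, Consumer<List<String>> writer) {
        writer.accept(reader.get());
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();                         // stands in for a VectorStore
        Supplier<List<String>> reader = () -> List.of("I like Spring Boot");
        Consumer<List<String>> writer = store::addAll;

        load(reader, writer);
        System.out.println(store);
    }
}
```

Because the two sides only agree on a list of documents, any reader can be combined with any writer.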

Let's see how to use TextReader to read a text document and store it in the VectorStore.

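import java.nio.charset.Charset;
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.reader.TextReader;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Component;

@Component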
class MyComponent {
    private final VectorStore vectorStore;
    
    @Value("classpath:myfile.txt")
    private Resource resource;
    
    public MyComponent(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }
    
    public void storeEmbeddingsFromTextFile() {
        var textReader = new TextReader(resource);
        textReader.setCharset(Charset.defaultCharset());
        List<Document> documents = textReader.get();

        vectorStore.add(documents);
    }
}

As shown above, the text is read from a classpath file and stored in the VectorStore.

Implementing RAG (Retrieval-Augmented Generation)

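Now that we know how to convert documents into embeddings, store them in a vector database, and retrieve relevant documents using natural language, let's see how to implement RAG.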

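import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.chat.prompt.SystemPromptTemplate;
import org.springframework.ai.converter.BeanOutputConverter;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController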
class RAGController {
    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    RAGController(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }
    
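    // Assume the documents containing people's information have already been read
    // from a file and stored in the VectorStore, as described in the previous section.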
    
    @GetMapping("/ai/rag/people")
    Person chatWithRag(@RequestParam String name) {
        // Query the VectorStore in natural language for information about the person.
        List<Document> similarDocuments = 
                vectorStore.similaritySearch(SearchRequest.query(name).withTopK(2));
        String information = similarDocuments.stream()
                .map(Document::getContent)
                .collect(Collectors.joining(System.lineSeparator()));
        
        // Build the system message, instructing the AI model to answer using the supplied information.
        var systemPromptTemplate = new SystemPromptTemplate("""
              You are a helpful assistant.
              
              Use the following information to answer the question:
              {information}
              """);
        var systemMessage = systemPromptTemplate.createMessage(
                Map.of("information", information));

        // Use a BeanOutputConverter to parse the response into a Person instance.
        var outputConverter = new BeanOutputConverter<>(Person.class);
        
        // Build the user message, asking the AI model to tell us about the person.
        PromptTemplate userMessagePromptTemplate = new PromptTemplate("""
        Tell me about {name} as if current date is {current_date}.

        {format}
        """);
        Map<String,Object> model = Map.of("name", name,
                "current_date", LocalDate.now(),
                "format", outputConverter.getFormat());
        var userMessage = new UserMessage(userMessagePromptTemplate.create(model).getContents());

        var prompt = new Prompt(List.of(systemMessage, userMessage));

        var response = chatClient.prompt(prompt).call().content();

        return outputConverter.convert(response);
    }
}

record Person(String name,
              String dateOfBirth,
              int experienceInYears,
              List<String> books) {
}

The comments in the code explain each step.

Overall, the RAG flow consists of the following steps:

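Load documents from different sources using a DocumentReader.
Convert the documents into embeddings using the EmbeddingModel and store them in the VectorStore.
Query the VectorStore in natural language and retrieve the relevant documents.
Build a SystemMessage that instructs the AI model to answer using the supplied information.
Build a UserMessage that asks the AI model for the information.
Build the prompt and call the AI model to get a response.
Parse the response into the desired format using an OutputConverter.
Return the response.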
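
The "stuffing" step in the middle of this flow can be sketched in plain Java. The example below is illustrative only: simple string replacement stands in for SystemPromptTemplate, the class and method names are hypothetical, and the retrieved documents are made-up sample data.

```java
import java.util.List;

// Illustrative sketch: plain string handling stands in for SystemPromptTemplate.
class PromptStuffingDemo {

    static final String SYSTEM_TEMPLATE = """
            You are a helpful assistant.

            Use the following information to answer the question:
            {information}
            """;

    // Join the retrieved documents and substitute them for the {information} placeholder.
    static String buildSystemPrompt(List<String> retrievedDocuments) {
        String information = String.join(System.lineSeparator(), retrievedDocuments);
        return SYSTEM_TEMPLATE.replace("{information}", information);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the documents returned by the VectorStore.
        System.out.println(buildSystemPrompt(List.of(
                "Alice has 10 years of Java experience.",
                "Alice wrote two books on Spring.")));
    }
}
```

The model never sees the vector database itself. It only sees the retrieved text that has been placed into the prompt, which is why this technique is also called Prompt Stuffing.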

Summary

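This article showed how to use the Embedding API to convert text or documents into embeddings, and how to use a vector database to store and retrieve them. Finally, it implemented RAG (Retrieval-Augmented Generation), in which the AI model answers questions using the retrieved information.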

Ref: https://www.sivalabs.in/spring-ai-rag-using-embedding-models-vector-databases/