使用聊天/嵌入响应 Usage（用量）

本站(springdoc.cn)中的内容来源于 spring.io ，原始版权归属于 spring.io。由 springdoc.cn 进行翻译，整理。可供个人学习、研究，未经许可，不得进行任何转载、商用或与之相关的行为。商标声明：Spring 是 Pivotal Software, Inc. 在美国以及其他国家的商标。

概览

Spring AI 通过引入 Usage 接口的 getNativeUsage() 方法和提供 DefaultUsage 实现，增强了模型用量处理能力。这一改进简化了不同 AI 模型跟踪和报告用量指标的流程，同时保持了框架的一致性。

主要变化

Usage 接口增强

Usage 接口现在包含一个新方法：

Object getNativeUsage();

该方法允许访问模型特定的原生用量数据，可在需要时实现更详细的用量跟踪。

与 `ChatModel` 配合使用

以下是一个完整示例，展示如何使用 OpenAI 的 ChatModel 跟踪用量（即，使用情况）：

@SpringBootConfiguration
public class Configuration {

        @Bean
        public OpenAiApi chatCompletionApi() {
            return OpenAiApi.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .build();
        }

        @Bean
        public OpenAiChatModel openAiClient(OpenAiApi openAiApi) {
            return OpenAiChatModel.builder()
                .openAiApi(openAiApi)
                .build();
        }

    }

@Service
public class ChatService {

    private final OpenAiChatModel chatModel;

    public ChatService(OpenAiChatModel chatModel) {
        this.chatModel = chatModel;
    }

    public void demonstrateUsage() {
        // Create a chat prompt
        Prompt prompt = new Prompt("What is the weather like today?");

        // Get the chat response
        ChatResponse response = this.chatModel.call(prompt);

        // Access the usage information
        Usage usage = response.getMetadata().getUsage();

        // Get standard usage metrics
        System.out.println("Prompt Tokens: " + usage.getPromptTokens());
        System.out.println("Completion Tokens: " + usage.getCompletionTokens());
        System.out.println("Total Tokens: " + usage.getTotalTokens());

        // Access native OpenAI usage data with detailed token information
        if (usage.getNativeUsage() instanceof org.springframework.ai.openai.api.OpenAiApi.Usage) {
            org.springframework.ai.openai.api.OpenAiApi.Usage nativeUsage =
                (org.springframework.ai.openai.api.OpenAiApi.Usage) usage.getNativeUsage();

            // Detailed prompt token information
            System.out.println("Prompt Tokens Details:");
            System.out.println("- Audio Tokens: " + nativeUsage.promptTokensDetails().audioTokens());
            System.out.println("- Cached Tokens: " + nativeUsage.promptTokensDetails().cachedTokens());

            // Detailed completion token information
            System.out.println("Completion Tokens Details:");
            System.out.println("- Reasoning Tokens: " + nativeUsage.completionTokenDetails().reasoningTokens());
            System.out.println("- Accepted Prediction Tokens: " + nativeUsage.completionTokenDetails().acceptedPredictionTokens());
            System.out.println("- Audio Tokens: " + nativeUsage.completionTokenDetails().audioTokens());
            System.out.println("- Rejected Prediction Tokens: " + nativeUsage.completionTokenDetails().rejectedPredictionTokens());
        }
    }
}

与 `ChatClient` 配合使用

如果你使用 ChatClient，可以通过 ChatResponse 对象访问用量信息：

// Create a chat prompt
Prompt prompt = new Prompt("What is the weather like today?");

// Create a chat client
ChatClient chatClient = ChatClient.create(chatModel);

// Get the chat response
ChatResponse response = chatClient.prompt(prompt)
        .call()
        .chatResponse();

// Access the usage information
Usage usage = response.getMetadata().getUsage();

优势

标准化：为不同 AI 模型提供一致的用量处理方式
灵活性：通过原生用量功能支持模型特定的用量数据
简单化：通过默认实现减少样板代码
可扩展性：在保持兼容性的同时，易于扩展以满足特定模型需求

类型安全考量

处理原生用量数据时，请谨慎考虑类型转换：

// Safe way to access native usage
if (usage.getNativeUsage() instanceof org.springframework.ai.openai.api.OpenAiApi.Usage) {
    org.springframework.ai.openai.api.OpenAiApi.Usage nativeUsage =
        (org.springframework.ai.openai.api.OpenAiApi.Usage) usage.getNativeUsage();
    // Work with native usage data
}