版本：3.10.x

在网关层实现检索增强生成

本文介绍如何使用 ai-rag 在 API7 AI 网关层实现检索增强生成（RAG）。网关会在请求到达模型前，从知识库检索相关上下文并注入提示词，避免每个业务服务重复实现 RAG 编排逻辑。

备注

当前 API7 AI 网关中的 RAG 能力主要面向 Azure 场景：需要使用 Azure OpenAI 生成向量，并使用 Azure AI Search 执行向量检索。其他服务提供方的支持需以实际版本能力为准。

请求流程

客户端向 API7 AI 网关发送聊天请求。
网关使用 ai-rag 调用 Azure OpenAI 生成查询向量。
网关在 Azure AI Search 中检索相关知识片段。
网关把检索结果注入提示词。
网关通过 ai-proxy 将增强后的请求转发到 Azure OpenAI。

前提条件

API7 Gateway 中可用 ai-proxy 和 ai-rag 插件。
已有 Azure OpenAI 资源和用于生成回答的模型部署。
已有 Azure OpenAI 向量模型访问权限。
已有 Azure AI Search 服务，并完成知识库索引构建。

配置 RAG

在同一路由上同时配置 ai-rag 和 ai-proxy：

Admin API
ADC

curl "http://127.0.0.1:7080/apisix/admin/routes?gateway_group_id=default" -X PUT \
  -H "X-API-KEY: $ADMIN_API_KEY" \
  -d '{
  "id": "rag-azure",
  "service_id": "'"$SERVICE_ID"'",
  "paths": ["/v1/chat/completions"],
  "plugins": {
    "ai-proxy": {
      "provider": "azure-openai",
      "auth": {
        "header": {
          "api-key": "YOUR_AZURE_OPENAI_KEY"
        }
      },
      "options": {
        "model": "gpt-4o-mini"
      },
      "override": {
        "endpoint": "https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT/chat/completions?api-version=2024-10-21"
      }
    },
    "ai-rag": {
      "embeddings_provider": {
        "azure_openai": {
          "endpoint": "https://YOUR-RESOURCE.openai.azure.com/openai/deployments/text-embedding-3-large/embeddings?api-version=2023-05-15",
          "api_key": "YOUR_AZURE_OPENAI_KEY"
        }
      },
      "vector_search_provider": {
        "azure_ai_search": {
          "endpoint": "https://YOUR-SEARCH.search.windows.net",
          "api_key": "YOUR_SEARCH_API_KEY",
          "index_name": "knowledge-base"
        }
      }
    }
  }
}'

adc.yaml
services:
  - name: RAG Service
    routes:
      - name: rag-azure
        uris:
          - /v1/chat/completions
        plugins:
          ai-proxy:
            provider: azure-openai
            auth:
              header:
                api-key: YOUR_AZURE_OPENAI_KEY
            options:
              model: gpt-4o-mini
            override:
              endpoint: https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT/chat/completions?api-version=2024-10-21
          ai-rag:
            embeddings_provider:
              azure_openai:
                endpoint: https://YOUR-RESOURCE.openai.azure.com/openai/deployments/text-embedding-3-large/embeddings?api-version=2023-05-15
                api_key: YOUR_AZURE_OPENAI_KEY
            vector_search_provider:
              azure_ai_search:
                endpoint: https://YOUR-SEARCH.search.windows.net
                api_key: YOUR_SEARCH_API_KEY
                index_name: knowledge-base

运行建议

控制检索片段数量和上下文长度，避免令牌成本失控。
对知识库索引版本、检索命中和生成模型分别记录日志。
对高风险回答场景结合内容审核和提示词防护。
对内部知识库问题设置引用来源或审计字段，便于追踪答案依据。

请求流程​

前提条件​

配置 RAG​

运行建议​

请求流程

前提条件

配置 RAG

运行建议