Proxy Vertex AI Requests

Vertex AI provides access to Google Gemini models through an OpenAI-compatible API.

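This guide shows how to integrate APISIX with Vertex AI using the ai-proxy plugin. With provider set to vertex-ai, you configure your project and region through provider_conf and do not need to specify a custom endpoint.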

Prerequisites

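Install Docker.
Install cURL to send requests to the services for validation.
Follow the Getting Started tutorial to start a new APISIX instance in Docker or on Kubernetes.
Have a Google Cloud project with the Vertex AI API enabled.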

Obtain a Vertex AI Service Account Key

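Follow the Google Cloud service account documentation to create a service account and a JSON key. Make sure the service account has permission to call Vertex AI (for example, the Vertex AI User role).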

You can optionally save the service account JSON to an environment variable:

export GCP_SERVICE_ACCOUNT_JSON="$(cat /path/to/service-account.json)"
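Before wiring the key into a route, it can help to confirm that the variable holds a parseable service account key. The following is a minimal sketch, not part of APISIX: the function name check_service_account_json is hypothetical, and the field list is an assumption based on fields commonly found in Google Cloud service account key files, not an exhaustive schema.

```python
import json
from typing import List

# Assumed subset of fields found in a service account key file (illustrative only).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_json(raw: str) -> List[str]:
    """Return a list of problems found in the key; an empty list means it looks usable."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level value is not a JSON object"]
    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # A present but different type suggests the wrong kind of credential file.
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

For example, calling check_service_account_json(os.environ.get("GCP_SERVICE_ACCOUNT_JSON", "")) after importing os reports whether the exported value parses and carries the assumed fields.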

Create a Route to Vertex AI

Create a route with the ai-proxy plugin as follows:

Admin API
ADC
Ingress Controller

curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -d '{
  "id": "vertex-ai-chat",
  "uri": "/anything",
  "plugins": {
    "ai-proxy": {
      "provider": "vertex-ai",
      "provider_conf": {
        "project_id": "evident-xxx",
        "region": "us-central1"
      },
      "auth": {
        "gcp": {
          "service_account_json": "'"$GCP_SERVICE_ACCOUNT_JSON"'"
        }
      },
      "options": {
        "model": "google/gemini-2.5-flash"
      }
    }
  }
}'

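❶ Set provider to vertex-ai and configure project_id and region in provider_conf.

❷ Replace service_account_json with your service account JSON.

❸ Set model to a model supported by Vertex AI, such as google/gemini-2.5-flash.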
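Because a service account key is itself JSON, pasting it verbatim into the service_account_json string may produce an invalid request body unless its quotes and newlines are escaped. The sketch below shows one way to build the same Admin API body with proper escaping. The helper name build_route is hypothetical; the field names mirror the route configuration in this guide.

```python
import json


def build_route(service_account_json: str, project_id: str, region: str, model: str) -> str:
    """Return the Admin API request body as a JSON string."""
    route = {
        "id": "vertex-ai-chat",
        "uri": "/anything",
        "plugins": {
            "ai-proxy": {
                "provider": "vertex-ai",
                "provider_conf": {"project_id": project_id, "region": region},
                # The key is embedded as a string value; json.dumps escapes
                # its double quotes and newlines so the body stays valid JSON.
                "auth": {"gcp": {"service_account_json": service_account_json}},
                "options": {"model": model},
            }
        },
    }
    return json.dumps(route)
```

The returned string can be written to a file or piped to curl with -d @- in place of the inline body.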

adc.yaml
services:
  - name: Vertex AI Service
    routes:
      - uris:
          - /anything
        name: vertex-ai-chat
        plugins:
          ai-proxy:
            provider: vertex-ai
            provider_conf:
              project_id: evident-xxx
              region: us-central1
            auth:
              gcp:
                service_account_json: |
                  {
                    ...
                  }
            options:
              model: google/gemini-2.5-flash

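❶ Set provider to vertex-ai and configure project_id and region in provider_conf.

❷ Replace service_account_json with your service account JSON.

❸ Set model to a model supported by Vertex AI, such as google/gemini-2.5-flash.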

Synchronize the configuration to APISIX:

adc sync -f adc.yaml

Create a Kubernetes manifest file to configure the route:

Gateway API
APISIX CRD

vertex-ai-route.yaml
apiVersion: apisix.apache.org/v1alpha1
kind: PluginConfig
metadata:
  namespace: ingress-apisix
  name: ai-proxy-plugin-config
spec:
  plugins:
    - name: ai-proxy
      config:
        provider: vertex-ai
        provider_conf:
          project_id: evident-xxx
          region: us-central1
        auth:
          gcp:
            service_account_json: |
              {
                ...
              }
        options:
          model: google/gemini-2.5-flash
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  namespace: ingress-apisix
  name: vertex-ai-chat
spec:
  parentRefs:
  - name: apisix
  rules:
  - matches:
    - path:
        type: Exact
        value: /anything
    filters:
    - type: ExtensionRef
      extensionRef:
        group: apisix.apache.org
        kind: PluginConfig
        name: ai-proxy-plugin-config

vertex-ai-route.yaml
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  namespace: ingress-apisix
  name: vertex-ai-route
spec:
  ingressClassName: apisix
  http:
    - name: vertex-ai-route
      match:
        paths:
          - /anything
      plugins:
      - name: ai-proxy
        enable: true
        config:
          provider: vertex-ai
          provider_conf:
            project_id: evident-xxx
            region: us-central1
          auth:
            gcp:
              service_account_json: |
                {
                  ...
                }
          options:
            model: google/gemini-2.5-flash

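❶ Set provider to vertex-ai and configure project_id and region in provider_conf.

❷ Replace service_account_json with your service account JSON.

❸ Set model to a model supported by Vertex AI, such as google/gemini-2.5-flash.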

Apply the configuration to your cluster:

kubectl apply -f vertex-ai-route.yaml

Verify

Send a request to the route with the following prompt:

curl "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "What is 1+1?" }
    ]
  }'

You should receive a response similar to the following:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "1 + 1 = 2\n"
      },
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 8,
    "extra_properties": {
      "google": {
        "traffic_type": "ON_DEMAND"
      }
    },
    "total_tokens": 19,
    "prompt_tokens": 11
  },
  "object": "chat.completion",
  "model": "google/gemini-2.5-flash",
  ...
}
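A client usually needs only the assistant text and the token usage from such a response. The following is a minimal sketch of pulling those out, assuming the response has the shape shown above; the helper name extract_reply is hypothetical.

```python
from typing import Any, Dict, Tuple


def extract_reply(response: Dict[str, Any]) -> Tuple[str, int]:
    """Return the assistant text and total token count from a chat completion."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response has no choices")
    # Use the first choice and trim trailing whitespace such as the final newline.
    content = choices[0].get("message", {}).get("content") or ""
    total_tokens = response.get("usage", {}).get("total_tokens", 0)
    return content.strip(), total_tokens
```

Applied to the sample response, this returns the text 1 + 1 = 2 together with the total_tokens value of 19.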

Next Steps

You have now learned how to integrate APISIX with Vertex AI. See the Vertex AI documentation and the Gemini models page for more details.

To stream responses, enable streaming in your request and use the proxy-buffering plugin to disable NGINX's proxy_buffering directive so that server-sent events (SSE) are not buffered.
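On the client side, a streamed response arrives as a sequence of SSE lines. The sketch below assumes the stream follows the OpenAI-style convention (each event is a data: line carrying a JSON chunk with choices[].delta.content, terminated by data: [DONE]); this guide does not confirm the exact chunk shape, so treat it as an assumption. The helper name iter_stream_text is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines until the [DONE] marker."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments reconstructs the full assistant message. The input can be any iterable of decoded lines, such as the lines read from an HTTP response body.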
