
ai-aliyun-content-moderation

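The ai-aliyun-content-moderation plugin integrates with the enhanced edition of Aliyun (Alibaba Cloud) Content Moderation (阿里云内容安全增强版). When proxying LLM requests, it checks the risk level of the request body for categories such as pornography, political content, abuse, and violence, and rejects the request if the assessed risk level reaches or exceeds the configured threshold.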

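Make sure access_key_secret is configured correctly in the plugin. If it is wrong, all requests bypass the plugin and are forwarded directly to the LLM upstream, and the gateway error log shows the error Specified signature is not matched with our calculation.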
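Because a wrong secret fails open rather than closed, it can help to search the gateway error log for that message after deploying the plugin. The following is a minimal sketch: the function name has_signature_error is hypothetical, and the log path in the usage comment is an assumption that depends on your installation.

```shell
# Hypothetical helper: exits 0 if the given log file contains the
# signature-mismatch error, and non-zero otherwise.
has_signature_error() {
  grep -qF "Specified signature is not matched with our calculation" "$1"
}

# Example usage (the log path is an assumption, adjust to your deployment):
# if has_signature_error logs/error.log; then
#   echo "access_key_secret is probably wrong; moderation is being bypassed"
# fi
```

Running a check like this right after creating the route catches the misconfiguration before unmoderated traffic reaches the upstream unnoticed.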

The ai-aliyun-content-moderation plugin should be used together with the ai-proxy or ai-proxy-multi plugin, which proxies the LLM requests.

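Examples

The following examples use OpenAI as the upstream service provider.

Before you begin, create an OpenAI account and obtain an API key. If you use another LLM provider, refer to that provider's documentation to obtain an API key.

In addition, create an Aliyun account, enable the enhanced edition of the Content Moderation service, and obtain the endpoint, region ID, access key ID, and access key secret.

You can optionally save this information to environment variables:

# Replace with your values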
export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
export ALIYUN_ENDPOINT=<YOUR_ALIYUN_ENDPOINT>
export ALIYUN_REGION_ID=<YOUR_ALIYUN_REGION_ID>
export ALIYUN_ACCESS_KEY_ID=<YOUR_ALIYUN_ACCESS_KEY_ID>
export ALIYUN_ACCESS_KEY_SECRET=<YOUR_ALIYUN_ACCESS_KEY_SECRET>
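Since an empty variable would silently produce an invalid plugin configuration, a small check before calling the Admin API can be useful. The sketch below is illustrative only: the function name missing_vars is hypothetical, and ADMIN_API_KEY is included because the Admin API commands in this document reference it even though it is not exported above.

```shell
# Hypothetical helper: prints the name of every required variable that is
# unset or empty, one per line. Prints nothing when all of them are set.
missing_vars() (
  for name in ADMIN_API_KEY OPENAI_API_KEY ALIYUN_ENDPOINT ALIYUN_REGION_ID \
              ALIYUN_ACCESS_KEY_ID ALIYUN_ACCESS_KEY_SECRET; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
)

# Example usage:
# if [ -n "$(missing_vars)" ]; then
#   echo "Missing variables:"; missing_vars
# fi
```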

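Moderate Request Content Toxicity

Create a route to the LLM chat completion endpoint with the ai-proxy plugin, and configure the integration details, the rejection code, and the rejection message in the ai-aliyun-content-moderation plugin: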

curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${ADMIN_API_KEY}" \
  -d '{
    "id": "ai-aliyun-content-moderation-route",
    "uri": "/anything",
    "plugins": {
      "ai-aliyun-content-moderation": {
        "endpoint": "'"$ALIYUN_ENDPOINT"'",
        "region_id": "'"$ALIYUN_REGION_ID"'",
        "access_key_id": "'"$ALIYUN_ACCESS_KEY_ID"'",
        "access_key_secret": "'"$ALIYUN_ACCESS_KEY_SECRET"'",
        "deny_code": 400,
        "deny_message": "Request contains forbidden content, such as hate speech or violence."
      },
      "ai-proxy": {
        "provider": "openai",
        "auth": {
          "header": {
            "Authorization": "Bearer '"$OPENAI_API_KEY"'"
          }
        }
      }
    }
  }'

In this configuration, deny_code sets the HTTP status code returned for rejected requests, and deny_message sets the rejection message.

Send a POST request to the route with a system prompt and a user question that contains an insult in the request body:

curl -i "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "Stupid, what is 1+1?" }
    ]
  }'

You should receive an HTTP/1.1 400 Bad Request response with the following message:

{
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 124,
    "prompt_tokens": 31,
    "total_tokens": 155
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Request contains forbidden content, such as hate speech or violence."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "model": "gpt-4",
  "id": "c9466bbf-e010-469d-949a-a10f25525964"
}

Send another request to the route with a harmless question in the request body:

curl -i "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "What is 1+1?" }
    ]
  }'

You should receive an HTTP/1.1 200 OK response with the model output:

{
  ...,
  "model": "gpt-4-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "1+1 equals 2.",
        "refusal": null
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  ...
}
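When these two checks are repeated in a script, it can be convenient to classify each response automatically. The sketch below is an illustration under stated assumptions: the function name moderation_verdict and the temporary file path are hypothetical, and the deny message must match the deny_message configured on the route. The curl options -s, -o, and -w '%{http_code}' capture the body and status code separately.

```shell
# Must match the deny_message configured in the plugin.
DENY_MESSAGE="Request contains forbidden content, such as hate speech or violence."

# Hypothetical helper: classifies a response from its HTTP status code ($1)
# and a file containing the response body ($2).
moderation_verdict() (
  status="$1"
  body_file="$2"
  if [ "$status" = "400" ] && grep -qF "$DENY_MESSAGE" "$body_file"; then
    echo "denied"
  elif [ "$status" = "200" ]; then
    echo "allowed"
  else
    echo "unexpected"
  fi
)

# Example usage (the body file path is arbitrary):
# status=$(curl -s -o /tmp/body.json -w '%{http_code}' \
#   "http://127.0.0.1:9080/anything" -X POST \
#   -H "Content-Type: application/json" \
#   -d '{"model":"gpt-4","messages":[{"role":"user","content":"What is 1+1?"}]}')
# moderation_verdict "$status" /tmp/body.json
```

Checking both the status code and the message distinguishes a moderation rejection from an unrelated 400 returned by the upstream.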

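Adjust the Risk Level Threshold

The following example demonstrates how to adjust the risk level threshold, which controls whether a request or response is allowed through.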
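The decision can be pictured as a comparison between the risk level reported by Aliyun and the configured risk_level_bar. The sketch below is not the plugin's implementation. It assumes the levels are ordered none < low < medium < high < max; only high and max appear in this document, and the function names are hypothetical.

```shell
# Maps a risk level name to a number. The ordering is an assumption;
# only "high" and "max" are exercised in this document.
risk_rank() {
  case "$1" in
    none) echo 0 ;;
    low) echo 1 ;;
    medium) echo 2 ;;
    high) echo 3 ;;
    max) echo 4 ;;
    *) return 1 ;;
  esac
}

# Exits 0 when content with risk level $1 would be rejected under
# threshold $2, exits 1 when it would pass, and exits 2 on unknown levels.
is_rejected() (
  level=$(risk_rank "$1") || exit 2
  bar=$(risk_rank "$2") || exit 2
  [ "$level" -ge "$bar" ]
)

# Example usage:
# is_rejected high high && echo "rejected"   # level equals the threshold
# is_rejected high max  || echo "passed"     # level is below the threshold
```

Under this model, content rated high is rejected when risk_level_bar is high and passes when it is max, which matches the behavior shown in this section.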

Create a route to the LLM chat completion endpoint with the ai-proxy plugin, and set risk_level_bar in ai-aliyun-content-moderation to high:

curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${ADMIN_API_KEY}" \
  -d '{
    "id": "ai-aliyun-content-moderation-route",
    "uri": "/anything",
    "plugins": {
      "ai-aliyun-content-moderation": {
        "endpoint": "'"$ALIYUN_ENDPOINT"'",
        "region_id": "'"$ALIYUN_REGION_ID"'",
        "access_key_id": "'"$ALIYUN_ACCESS_KEY_ID"'",
        "access_key_secret": "'"$ALIYUN_ACCESS_KEY_SECRET"'",
        "deny_code": 400,
        "deny_message": "Request contains forbidden content, such as hate speech or violence.",
        "risk_level_bar": "high"
      },
      "ai-proxy": {
        "provider": "openai",
        "auth": {
          "header": {
            "Authorization": "Bearer '"$OPENAI_API_KEY"'"
          }
        },
        "model": "gpt-4"
      }
    }
  }'

Send a POST request to the route with a system prompt and a user question that contains an insult in the request body:

curl -i "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "Stupid, what is 1+1?" }
    ]
  }'

You should receive an HTTP/1.1 400 Bad Request response with the following message:

{
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 124,
    "prompt_tokens": 31,
    "total_tokens": 155
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Request contains forbidden content, such as hate speech or violence."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "model": "gpt-4",
  "id": "c9466bbf-e010-469d-949a-a10f25525964"
}

Update risk_level_bar in the plugin to max:

curl "http://127.0.0.1:9180/apisix/admin/routes/ai-aliyun-content-moderation-route" -X PATCH \
  -H "X-API-KEY: ${ADMIN_API_KEY}" \
  -d '{
    "plugins": {
      "ai-aliyun-content-moderation": {
        "risk_level_bar": "max"
      }
    }
  }'

Send the same request to the route:

curl -i "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "Stupid, what is 1+1?" }
    ]
  }'

You should receive an HTTP/1.1 200 OK response with the model output:

{
  ...,
  "model": "gpt-4-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "1+1 equals 2.",
        "refusal": null
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  ...
}

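This is because the word "stupid" has a risk level of high, which is lower than the configured threshold of max. To see Aliyun's moderation results, you can set the gateway log level to debug as follows:

conf/config.yaml
nginx_config:
  error_log_level: debug

Reload the gateway for the configuration change to take effect.

For example, for the request above, you should see a debug log entry similar to the following: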

{
  "RequestId": "29F7AD19-074B-54AC-B240-B297AD96883F",
  "Message": "OK",
  "Data": {
    ...,
    "RiskLevel": "high",
    "Result": [
      {
        "RiskWords": "are&you&stupid",
        ...
      }
    ]
  },
  "Code": 200
}
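To pull just the risk level out of such log entries, a small text filter is often enough. The sketch below is illustrative: the function name extract_risk_level is hypothetical, the log path in the usage comment is an assumption, and the pattern only relies on the "RiskLevel" field shown above.

```shell
# Hypothetical helper: reads text on stdin and prints the RiskLevel value
# from the first line that contains one.
extract_risk_level() {
  sed -n 's/.*"RiskLevel"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example usage (the log path is an assumption, adjust to your deployment):
# grep -F '"RiskLevel"' logs/error.log | tail -n 1 | extract_risk_level
```

Comparing the extracted value with risk_level_bar shows why a given request was rejected or allowed.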