
Common Model Issues

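Applies to: chat models (LLM), embedding models (Embedding), tool calling (Tools/Function Call), and issues related to indexing/building vectors.
Troubleshooting principle: first send a minimal request directly to the model service (to confirm the model itself works) → then check the gateway/proxy → finally check business-side parameters and timeouts.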


1. Minimal request verification (required)

1.1 Chat model: Chat Completions

Request example:

Note: set BASE_URL to the Base URL configured in the platform's admin settings.

curl <BASE_URL>/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API_KEY>" \
-d '{
"model": "<MODEL_NAME>",
"messages": [
{ "role": "user", "content": "Hello" }
],
"stream": false
}'

Example of a successful response (key fields):

{
"id": "chatcmpl-xxx",
"object": "chat.completion",
"created": 1730000000,
"model": "<MODEL_NAME>",
"choices": [
{
"index": 0,
"message": { "role": "assistant", "content": "Hi! How can I help you?" },
"finish_reason": "stop"
}
],
"usage": { "prompt_tokens": 8, "completion_tokens": 9, "total_tokens": 17 }
}
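
The key fields in the response above can also be checked in code rather than by eye. The following is a minimal sketch, assuming the response body has the shape shown in the example; `extract_reply` is a hypothetical helper name, not part of any SDK.

```python
import json


def extract_reply(body: str) -> str:
    """Return the assistant text from a non-streaming chat completion body.

    Raises ValueError with a readable reason when the body does not match
    the success shape (choices[0].message.content present and non-empty).
    """
    data = json.loads(body)
    if "error" in data:
        # Error bodies carry an "error" object instead of "choices".
        raise ValueError(f"service returned an error: {data['error']}")
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response has no choices")
    content = (choices[0].get("message") or {}).get("content")
    if not content:
        raise ValueError("choices[0].message.content is empty")
    return content
```

Feed it the raw body returned by the curl command; a ValueError tells you which key field is missing.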

1.2 Embedding model: Embeddings

Request example:

curl <BASE_URL>/embeddings \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API_KEY>" \
-d '{
"model": "<EMB_MODEL_NAME>",
"input": "test"
}'

Example of a successful response (key fields):

{
"object": "list",
"model": "<EMB_MODEL_NAME>",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [0.0123, -0.0456, 0.0789]
}
],
"usage": { "prompt_tokens": 2, "total_tokens": 2 }
}

Note: a real embedding is a long array of floats (its dimension depends on the model); the example above is shortened.
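
Because the dimension depends on the model, it can be useful to read it from a test response. The sketch below assumes the response shape shown above; `embedding_dimension` is a hypothetical helper name.

```python
import json
import math


def embedding_dimension(body: str) -> int:
    """Return the length of data[0].embedding after basic sanity checks."""
    data = json.loads(body)
    items = data.get("data") or []
    if not items:
        raise ValueError("response has no data entries")
    vector = items[0].get("embedding")
    if not isinstance(vector, list) or not vector:
        raise ValueError("data[0].embedding is missing or empty")
    for value in vector:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("embedding contains a non-numeric value")
        if not math.isfinite(value):
            raise ValueError("embedding contains NaN or infinity")
    return len(vector)
```

Comparing the returned dimension with the dimension your vector index was built with is a quick way to spot a model mismatch.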


1.3 Image request: verify that the image is accessible

Request example (multimodal Chat Completions):

curl <BASE_URL>/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API_KEY>" \
-d '{
"model": "<VISION_MODEL_NAME>",
"stream": false,
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "Describe the main content of this image in 20 words or fewer." },
{
"type": "image_url",
"image_url": { "url": "<IMAGE_URL>" }
}
]
}
]
}'

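How to read the result (either outcome is conclusive):

  1. HTTP 200 with a non-empty choices[0].message.content: the image was fetched and processed
  2. The response carries an error field: the request failed; an inaccessible image commonly produces an error message here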

Note: if the error says the model does not support images/multimodal input, switch to a model that accepts image input and retry.
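
The two outcomes can be told apart mechanically. This is a minimal sketch under the assumption that errors arrive as an `error` object like the ones shown elsewhere on this page; `read_image_check` and its return labels are hypothetical.

```python
import json


def read_image_check(status: int, body: str) -> str:
    """Classify the result of the image request as ok, failed, or inconclusive."""
    data = json.loads(body)
    if "error" in data:
        error = data["error"]
        # Some services return a plain string instead of an object.
        message = error.get("message") if isinstance(error, dict) else str(error)
        return f"failed: {message}"
    choices = data.get("choices") or []
    content = (choices[0].get("message") or {}).get("content") if choices else None
    if status == 200 and content:
        return "ok"
    # Neither a usable answer nor an explicit error: inspect the raw body.
    return "inconclusive"
```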


2. Four common categories of model issues

2.1 Model unavailable / 503 / connection failure

Typical symptoms

  • HTTP 503 / 502 / 504
  • Connection timed out or connection refused (timeout / connection refused)
  • 401/403 (authentication failure)

Common error response examples

Authentication failure:

{
"error": {
"message": "Incorrect API key provided",
"type": "invalid_request_error"
}
}

Model does not exist:

{
"error": {
"message": "The model `<MODEL_NAME>` does not exist",
"type": "invalid_request_error"
}
}

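Troubleshooting checklist

  1. Start by sending the "minimal request" directly to the model service (Section 1)
  2. If the direct request succeeds but the request through the gateway fails: check gateway forwarding, authentication, the upstream address, and timeout configuration
  3. If the direct request fails: first check whether the model service is running, whether the model loaded successfully, and whether resources are sufficient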
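
The triage order above can be sketched as a small probe-and-decide helper. This is a minimal sketch under assumptions: `probe` and `diagnose` are hypothetical names, the category labels are illustrative, and the request body mirrors the minimal request from Section 1.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def probe(base_url: str, api_key: str, model: str, timeout: float = 10.0) -> Optional[int]:
    """Send the minimal chat request; return the HTTP status, or None if no connection."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": False,
    }).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx still carry a status code
    except OSError:
        return None  # timeout, connection refused, DNS failure


def diagnose(direct_status: Optional[int], gateway_status: Optional[int]) -> str:
    """Map the two probe results to the area to inspect first."""
    if direct_status is None or direct_status >= 500:
        return "model-service"   # is it running, loaded, and resourced?
    if direct_status in (401, 403):
        return "credentials"     # API key rejected by the model service
    if direct_status != 200:
        return "request"         # read the error body (e.g. unknown model)
    if gateway_status != 200:
        return "gateway"         # forwarding, auth, upstream address, timeouts
    return "business-side"       # both paths work: check parameters and timeouts
```

Call `probe` once with the model service's own address and once with the gateway address, then pass both results to `diagnose`.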

2.2 Stream response is empty or ends prematurely

Typical symptoms

  • With stream=true, almost nothing is returned or the connection drops quickly
  • The client sees only EOF / connection closed, not the real error

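How to handle

  1. Switch to non-streaming (stream=false) first, so that you get an explicit error message
  2. Reproduce with the "minimal request body" method: keep only model + messages (+ stream), confirm that it works, then add the other fields back one at a time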

Streaming request example:

curl <BASE_URL>/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API_KEY>" \
-d '{
"model": "<MODEL_NAME>",
"messages": [
{ "role": "user", "content": "Hello" }
],
"stream": true
}'

Streaming response example (SSE, typical chunks):

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"role":"assistant"},"index":0}]}

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Hi"},"index":0}]}

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"!"},"index":0}]}

data: [DONE]
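
To tell a complete stream from one that was cut off, a client can assemble the delta chunks and note whether the `[DONE]` marker arrived. The following is a minimal sketch, assuming the chunk format shown above; `collect_stream_text` is a hypothetical helper name.

```python
import json
from typing import Iterable, Tuple


def collect_stream_text(lines: Iterable[str]) -> Tuple[str, bool]:
    """Join delta.content pieces from SSE lines.

    Returns (text, saw_done). saw_done is False when the stream ended
    without the "data: [DONE]" marker, which points to an early disconnect.
    """
    parts = []
    saw_done = False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts), saw_done
```

An empty text together with `saw_done` equal to False matches the symptom described in this section, and is the cue to retry with stream=false.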

2.3 Tool calling (tools / function call) anomalies

Typical symptoms

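  • No tool_calls are returned; the model answers in natural language only
  • The returned structure is malformed, so parsing fails
  • The gateway/proxy does not support the tools field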

Minimal verification example (checks whether the request path supports tools)

curl <BASE_URL>/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API_KEY>" \
-d '{
"model": "deepseek/deepseek-v3.2",
"stream": false,
"messages": [
{ "role": "system", "content": "You can call tools when needed." },
{ "role": "user", "content": "Call get_time tool." }
],
"tools": [
{
"type": "function",
"function": {
"name": "get_time",
"description": "Get current time in ISO format",
"parameters": { "type": "object", "properties": {} }
}
}
]
}'

Example response when tools are supported (key fields):

{
"choices": [
{
"message": {
"role": "assistant",
"tool_calls": [
{
"id": "call_xxx",
"type": "function",
"function": {
"name": "get_time",
"arguments": "{}"
}
}
]
},
"finish_reason": "tool_calls"
}
]
}

If you never receive tool_calls: first check whether the model supports tool calling, and whether the gateway passes the tools field through.
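
Parsing failures often come from treating `arguments` as an object when it is a JSON-encoded string. The sketch below assumes the non-streaming response shape shown above; `extract_tool_calls` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, List, Tuple


def extract_tool_calls(body: str) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (function name, decoded arguments) pairs from a chat response.

    An empty list means the model replied without tool_calls.
    """
    data = json.loads(body)
    choices = data.get("choices") or []
    if not choices:
        return []
    message = choices[0].get("message") or {}
    calls = []
    for call in message.get("tool_calls") or []:
        function = call.get("function") or {}
        # "arguments" is a JSON string, so it needs its own decode step.
        raw_arguments = function.get("arguments") or "{}"
        calls.append((function.get("name"), json.loads(raw_arguments)))
    return calls
```

If this returns an empty list for the minimal tools request, the response contained no tool_calls, which is the case described in the note above.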