
Model Issues

Applicable to: conversational models (LLMs), embedding models, tool/function calls, and issues related to vector indexing/building. Troubleshooting principle: first verify by calling the model service directly with a minimal request (to confirm the model itself works) → then check the gateway/proxy → finally check business-side parameters and timeouts.


1. Minimal Request Verification (Required)

1.1 Chat Completions

Request Example:

Note: Please fill in the BASE_URL configured in Platform Mgt.

curl <BASE_URL>/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d '{
    "model": "<MODEL_NAME>",
    "messages": [
      { "role": "user", "content": "Hello" }
    ],
    "stream": false
  }'

Successful Response Example (Key Fields):

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1730000000,
  "model": "<MODEL_NAME>",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hi! How can I help you?" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 8, "completion_tokens": 9, "total_tokens": 17 }
}
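The success check can be scripted so it is repeatable; a minimal sketch in Python (the `validate_chat_response` helper is illustrative, not part of any API — it only inspects the key fields shown above):

```python
def validate_chat_response(resp: dict) -> bool:
    """Check the key fields of a successful chat completion:
    correct object type, non-empty assistant content, a normal
    finish_reason, and the usage token counters."""
    if resp.get("object") != "chat.completion":
        return False
    choices = resp.get("choices") or []
    if not choices:
        return False
    first = choices[0]
    content = (first.get("message") or {}).get("content")
    if not content:
        return False
    if first.get("finish_reason") != "stop":
        return False
    usage = resp.get("usage") or {}
    return all(k in usage for k in ("prompt_tokens", "completion_tokens", "total_tokens"))

# The sample response from above.
sample = {
    "id": "chatcmpl-xxx",
    "object": "chat.completion",
    "created": 1730000000,
    "model": "<MODEL_NAME>",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hi! How can I help you?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 8, "completion_tokens": 9, "total_tokens": 17},
}
print(validate_chat_response(sample))  # True
```

Any missing field pinpoints what the model service (or a gateway in between) is dropping.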

1.2 Embeddings

Request Example:

curl <BASE_URL>/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d '{
    "model": "<EMB_MODEL_NAME>",
    "input": "test"
  }'

Successful Response Example (Key Fields):

{
  "object": "list",
  "model": "<EMB_MODEL_NAME>",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789]
    }
  ],
  "usage": { "prompt_tokens": 2, "total_tokens": 2 }
}

Note: The embedding field will be a long array of floating-point numbers (the dimension depends on the model).
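Beyond eyeballing the array, two quick sanity checks are that every returned vector has the same dimension and that similar inputs score a plausible cosine similarity; a sketch (the helper names are illustrative):

```python
import math

def check_dimensions(data: list) -> int:
    """Verify all returned embeddings share one dimension; return it."""
    dims = {len(item["embedding"]) for item in data}
    if len(dims) != 1:
        raise ValueError(f"inconsistent embedding dimensions: {dims}")
    return dims.pop()

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# The truncated sample vector from above; real vectors are much longer.
data = [{"object": "embedding", "index": 0, "embedding": [0.0123, -0.0456, 0.0789]}]
print(check_dimensions(data))                       # 3
print(round(cosine_similarity([1, 0], [1, 0]), 3))  # 1.0
```

A dimension mismatch across requests usually means the wrong model name was used when building part of a vector index.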


1.3 Image Request: Verify Whether the Image Is Accessible

Request Example (Multimodal Chat Completions):

curl <BASE_URL>/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d '{
    "model": "<VISION_MODEL_NAME>",
    "stream": false,
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Please describe the main content of this image within 20 words." },
          {
            "type": "image_url",
            "image_url": { "url": "<IMAGE_URL>" }
          }
        ]
      }
    ]
  }'

Success criteria (either condition is sufficient):

  1. HTTP 200 and choices[0].message.content is non-empty
  2. No error field (an inaccessible image commonly returns an error message)

Note: If an error such as "Model does not support image/multimodal" is reported, switch to a model that supports image input and retry.
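To rule out image accessibility before blaming the model, you can check locally that the bytes are a commonly accepted image format and, if the remote URL is the problem, embed the image as a base64 data URL instead; a sketch (the helper names and signature table are illustrative assumptions, not a complete format detector):

```python
import base64

# Magic-byte prefixes for common image formats (assumed subset).
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
    b"RIFF": "image/webp",  # RIFF container; a stricter check reads "WEBP" at offset 8
}

def detect_image_type(data: bytes):
    """Return a MIME type guessed from the leading bytes, or None."""
    for sig, mime in _SIGNATURES.items():
        if data.startswith(sig):
            return mime
    return None

def to_data_url(data: bytes) -> str:
    """Embed image bytes as a data URL usable in image_url.url."""
    mime = detect_image_type(data)
    if mime is None:
        raise ValueError("unrecognized image format")
    return f"data:{mime};base64,{base64.b64encode(data).decode()}"

# A fake PNG header stands in for real image bytes.
png_header = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
print(detect_image_type(png_header))  # image/png
print(to_data_url(png_header)[:22])   # data:image/png;base64,
```

If a data URL works where a remote URL fails, the model service cannot reach the image host (network policy, auth, or an expired link).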


2. Common Model Issues

2.1 Model Unavailable / 503 / Connection Failure

Typical symptoms

  • HTTP 503 / 502 / 504
  • Connection timeout / connection refused
  • 401/403 (authentication failure)

Common error response examples

Authentication failed:

{
  "error": {
    "message": "Incorrect API key provided",
    "type": "invalid_request_error"
  }
}

The model does not exist:

{
  "error": {
    "message": "The model `<MODEL_NAME>` does not exist",
    "type": "invalid_request_error"
  }
}

Troubleshooting steps

  1. First, call the model service directly with the minimal request (Section 1)
  2. If the direct call succeeds but requests through the gateway fail: check the gateway's forwarding, authentication, upstream address, and timeout configuration
  3. If the direct call fails: check whether the model service has started, whether the model loaded successfully, and whether resources are sufficient
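502/503/504 are often transient (the model is restarting or briefly overloaded), so a short retry with exponential backoff on the client side can distinguish a blip from a real outage; a sketch (retry_request and the simulated send function are illustrative):

```python
import time

def retry_request(send, max_attempts: int = 4, base_delay: float = 0.5):
    """Call send() until it returns a non-5xx-gateway status or
    attempts run out. send() must return (status_code, body)."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in (502, 503, 504):
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: 0.5s, 1s, 2s, ...
            time.sleep(base_delay * (2 ** attempt))
    return status, body

# Simulated upstream: fails twice with 503, then succeeds.
responses = iter([(503, ""), (503, ""), (200, "ok")])
status, body = retry_request(lambda: next(responses), base_delay=0.0)
print(status, body)  # 200 ok
```

If every attempt still returns 5xx, treat it as a real outage and move to step 3 above.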

2.2 Streaming Response Is Empty or Ends Prematurely

Typical symptoms

  • With stream=true, little or no content is returned, or the connection drops quickly
  • The client only sees EOF/connection closed and cannot see the real error

Handling

  1. First switch to non-streaming (stream=false) to obtain a clear error message
  2. Reproduce with a minimal request body: keep only model + messages (+ stream), confirm it works, then add the other fields back one by one

Example of streaming request:

curl <BASE_URL>/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d '{
    "model": "<MODEL_NAME>",
    "messages": [
      { "role": "user", "content": "Hello" }
    ],
    "stream": true
  }'

Streaming response example (SSE, typical fragment):

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"role":"assistant"},"index":0}]}

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Hi"},"index":0}]}

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"!"},"index":0}]}

data: [DONE]
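The fragments above are reassembled client-side by reading each `data:` line, concatenating the delta contents, and stopping at [DONE]; a sketch of that parsing loop (assemble_stream is an illustrative name, not a library function):

```python
import json

def assemble_stream(lines):
    """Accumulate assistant text from SSE 'data:' lines until [DONE]."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# The SSE fragment shown above.
sse = [
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Hi"},"index":0}]}',
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"!"},"index":0}]}',
    'data: [DONE]',
]
print(assemble_stream(sse))  # Hi!
```

If this loop yields an empty string on a real stream, log the raw lines: a gateway that buffers or rewrites SSE (or an upstream error serialized into a data: line) will be visible there.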

2.3 Tool/Function Call Exceptions

Typical symptoms

  • The model does not return tool_calls and only outputs natural language
  • A malformed response structure causes parsing failures
  • The gateway/proxy does not support the tools field

Request example:

curl <BASE_URL>/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "stream": true,
    "messages": [
      { "role": "system", "content": "You can call tools when needed." },
      { "role": "user", "content": "Call get_time tool." }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_time",
          "description": "Get current time in ISO format",
          "parameters": { "type": "object", "properties": {} }
        }
      }
    ]
  }'

Example response when tools are supported (key fields):

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_xxx",
            "type": "function",
            "function": {
              "name": "get_time",
              "arguments": "{}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}

If you never receive tool_calls: first check whether the model supports tool calling, and whether the gateway passes the tools field through to the upstream unchanged.
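Once tool_calls do arrive, the client is expected to execute the named function locally and send the result back in a role "tool" message keyed by tool_call_id; a sketch of that dispatch step (the local get_time implementation and the dispatch helper are illustrative):

```python
import json
from datetime import datetime, timezone

# Local implementations of the tools advertised in the request.
TOOLS = {
    "get_time": lambda: datetime.now(timezone.utc).isoformat(),
}

def dispatch_tool_calls(message: dict) -> list:
    """Execute each tool_call and build the follow-up 'tool' messages
    to append to the conversation before the next request."""
    follow_ups = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        # arguments is a JSON-encoded string; empty string means no args.
        args = json.loads(fn["arguments"] or "{}")
        result = TOOLS[fn["name"]](**args)
        follow_ups.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(result),
        })
    return follow_ups

# The assistant message from the example response above.
assistant_msg = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_xxx",
        "type": "function",
        "function": {"name": "get_time", "arguments": "{}"},
    }],
}
msgs = dispatch_tool_calls(assistant_msg)
print(msgs[0]["role"], msgs[0]["tool_call_id"])  # tool call_xxx
```

A json.loads failure on fn["arguments"] at this step is the "malformed return structure" symptom above; log the raw string before parsing it.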