OpenAI API - Vision#
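SGLang provides an OpenAI-compatible API, which makes it straightforward to move from OpenAI services to self-hosted local models. For the complete API, see the OpenAI API Reference. This tutorial covers the vision API for vision language models.
SGLang supports a range of vision language models, including Llama 3.2, LLaVA-OneVision, Qwen2.5-VL, Gemma3, and more.
As an alternative to the OpenAI API, you can also use the SGLang offline engine.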
Launch the Server#
Launch the server in your terminal and wait for it to finish initializing.
from sglang.test.doc_patch import launch_server_cmd
from sglang.utils import wait_for_server, print_highlight, terminate_process
vision_process, port = launch_server_cmd(
"""
python3 -m sglang.launch_server --model-path Qwen/Qwen2.5-VL-7B-Instruct --log-level warning
"""
)
wait_for_server(f"http://localhost:{port}")
Using cURL#
Once the server is up, you can send test requests with curl or requests.
import subprocess

# Doubled braces ({{ }}) are literal braces inside the f-string;
# "\\" becomes a single backslash, i.e. a shell line continuation.
curl_command = f"""
curl -s http://localhost:{port}/v1/chat/completions \\
  -H "Content-Type: application/json" \\
  -d '{{
    "model": "Qwen/Qwen2.5-VL-7B-Instruct",
    "messages": [
      {{
        "role": "user",
        "content": [
          {{
            "type": "text",
            "text": "What is in this image?"
          }},
          {{
            "type": "image_url",
            "image_url": {{
              "url": "https://github.com/sgl-project/sglang/blob/main/examples/assets/example_image.png?raw=true"
            }}
          }}
        ]
      }}
    ],
    "max_tokens": 300
  }}'
"""

response = subprocess.check_output(curl_command, shell=True).decode()
print_highlight(response)
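Hand-escaping braces and quotes inside an f-string is easy to get wrong, especially when the prompt itself contains a quote. As a minimal sketch, the same curl command can be assembled from a Python dict with `json.dumps` and `shlex.quote`. The helper name `build_curl_command` and the port value `30000` are illustrative assumptions, not part of SGLang; in the tutorial, `port` comes from `launch_server_cmd`.

```python
import json
import shlex

IMAGE_URL = (
    "https://github.com/sgl-project/sglang/blob/main/"
    "examples/assets/example_image.png?raw=true"
)


def build_curl_command(port, model, prompt, image_url, max_tokens=300):
    """Return a curl command string for a single-image chat completion request."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }
    # shlex.quote wraps the JSON body so the shell passes it through unchanged,
    # even when the prompt contains single quotes.
    body = shlex.quote(json.dumps(payload))
    return (
        f"curl -s http://localhost:{port}/v1/chat/completions "
        '-H "Content-Type: application/json" '
        f"-d {body}"
    )


# 30000 is a placeholder port for illustration only.
cmd = build_curl_command(
    30000, "Qwen/Qwen2.5-VL-7B-Instruct", "What's in this image?", IMAGE_URL
)
print(cmd)
```

The resulting string can be passed to `subprocess.check_output(cmd, shell=True)` in the same way as the handwritten command.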
Using Python Requests#
import requests

url = f"http://localhost:{port}/v1/chat/completions"

# One user message whose content mixes a text part and an image part.
data = {
    "model": "Qwen/Qwen2.5-VL-7B-Instruct",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://github.com/sgl-project/sglang/blob/main/examples/assets/example_image.png?raw=true"
                    },
                },
            ],
        }
    ],
    "max_tokens": 300,
}

response = requests.post(url, json=data)
print_highlight(response.text)
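The request above prints the raw JSON body. To pull out only the model's reply, the body can be parsed as a chat completion object. The sketch below assumes the response follows the OpenAI chat completion layout (`choices[0].message.content`) and that failures are reported under an `error` key, as in OpenAI's format. The helper `extract_reply` and the sample body are hypothetical and only for illustration.

```python
def extract_reply(body):
    """Return the assistant text from a chat completion response dict."""
    # Assumption: errors arrive under an "error" key, as in OpenAI's format.
    if "error" in body:
        raise RuntimeError(f"server returned an error: {body['error']}")
    choices = body.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hypothetical response body, trimmed to the fields read above.
sample = {
    "choices": [
        {
            "message": {"role": "assistant", "content": "example reply"},
            "finish_reason": "stop",
        }
    ]
}
print(extract_reply(sample))
```

With a live server, this would be called as `extract_reply(response.json())`.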
Using the OpenAI Python Client#
from openai import OpenAI

# The client requires an api_key value; this tutorial passes the placeholder string "None".
client = OpenAI(base_url=f"http://localhost:{port}/v1", api_key="None")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is in this image?",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://github.com/sgl-project/sglang/blob/main/examples/assets/example_image.png?raw=true"
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print_highlight(response.choices[0].message.content)
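The requests so far pass a remote image URL. In the OpenAI API format, an `image_url` entry may also carry a base64 data URL, which is useful for local files. The sketch below assumes the server accepts data URLs in the same field. The helper `to_data_url` is hypothetical, and the bytes used here are only a stand-in for a real image file.

```python
import base64
import mimetypes


def to_data_url(image_bytes, filename):
    """Encode raw image bytes as a data URL, guessing the MIME type from the filename."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Stand-in bytes (the PNG file signature); a real call would read a file instead,
# e.g. open(path, "rb").read().
fake_png = b"\x89PNG\r\n\x1a\n"

image_part = {
    "type": "image_url",
    "image_url": {"url": to_data_url(fake_png, "example.png")},
}
print(image_part["image_url"]["url"])
```

The resulting `image_part` dict can take the place of the URL-based image entry in the `content` list of a message.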
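Multiple-Image Inputs#
The server also accepts multiple images, as well as interleaved text and image inputs, provided the model supports them.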
from openai import OpenAI

client = OpenAI(base_url=f"http://localhost:{port}/v1", api_key="None")

# Two image parts followed by one text part, all in a single user message.
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://github.com/sgl-project/sglang/blob/main/examples/assets/example_image.png?raw=true",
                    },
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://raw.githubusercontent.com/sgl-project/sglang/main/assets/logo.png",
                    },
                },
                {
                    "type": "text",
                    "text": "I have two very different images. They are not related at all. Please describe the first image in one sentence, and then describe the second image in another sentence.",
                },
            ],
        }
    ],
    temperature=0,
)

print_highlight(response.choices[0].message.content)
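The example above places both images before the text. For interleaved input, the same `content` list simply alternates text and image entries in the order the model should read them. Here is a minimal sketch using two small helpers, `text_part` and `image_part`, which are illustrative names rather than part of SGLang or the OpenAI client. Whether a given model handles interleaved input well depends on the model.

```python
def text_part(value):
    """Build a text entry for a chat message's content list."""
    return {"type": "text", "text": value}


def image_part(url):
    """Build an image_url entry for a chat message's content list."""
    return {"type": "image_url", "image_url": {"url": url}}


FIRST_IMAGE = (
    "https://github.com/sgl-project/sglang/blob/main/"
    "examples/assets/example_image.png?raw=true"
)
SECOND_IMAGE = "https://raw.githubusercontent.com/sgl-project/sglang/main/assets/logo.png"

# Entries are kept in list order, so text and images can alternate freely.
content = [
    text_part("Describe this image in one sentence:"),
    image_part(FIRST_IMAGE),
    text_part("Now describe this one in another sentence:"),
    image_part(SECOND_IMAGE),
]
messages = [{"role": "user", "content": content}]
print([part["type"] for part in content])
```

The `messages` list can be passed directly as the `messages` argument of `client.chat.completions.create`.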
terminate_process(vision_process)