Structured Output
ecnu-plus and ecnu-turbo now support structured output based on XGrammar.
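Key points
XGrammar uses constrained decoding to guarantee that the output is 100% structurally valid against the given schema. Enabling structured output does not, however, guarantee that the content is what you want: you still need to give clear instructions and examples in the prompt so that the model picks sensible tokens during constrained decoding. Also set max_tokens high enough that the model has room to generate the complete structure.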
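The two caveats above (content quality and sufficient max_tokens) can be checked on the client side. The following is a minimal sketch, not part of the platform's API: `parse_structured_output` is a hypothetical helper, and it assumes the endpoint reports `finish_reason == "length"` when generation stops at max_tokens, as the OpenAI Chat Completions API does.

```python
import json


def parse_structured_output(content, finish_reason, required_keys):
    """Parse a structured-output reply and sanity-check it (hypothetical helper)."""
    # Assumption: as in the OpenAI Chat Completions API, finish_reason == "length"
    # means generation stopped at max_tokens, so the JSON may be cut off.
    if finish_reason == "length":
        raise ValueError("output truncated: increase max_tokens")
    data = json.loads(content)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data


# Stand-in for a real reply; no network call is made here.
sample = '{"name": "张三", "department": "法律事务部", "technical": "高级总监"}'
record = parse_structured_output(sample, "stop", ["name", "department", "technical"])
print(record["department"])
```

With a real response object, the first two arguments would be `response.choices[0].message.content` and `response.choices[0].finish_reason`.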
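Structured output parameters
response_format: specifies the format of the structured output. Set its type field to json_schema and pass a json_schema parameter that defines the JSON structure of the output. See the examples below.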
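As a sketch of the payload shape described above, the helper below wraps a JSON Schema in the dictionary passed as response_format. The function name `build_response_format` and the schema name "person" are illustrative assumptions; the nesting of `type`, `json_schema`, `name`, and `schema` follows the examples on this page.

```python
def build_response_format(schema, name="foo"):
    """Wrap a JSON Schema dict in a response_format payload (hypothetical helper)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema},
    }


# A small schema for illustration only.
person_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
}

response_format = build_response_format(person_schema, name="person")
print(response_format["json_schema"]["name"])
```

The resulting dictionary can be passed as the `response_format` argument of `client.chat.completions.create`.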
Example code
JSON object output
```python
from openai import OpenAI

client = OpenAI(
    api_key='sk-xxxxx',  # Replace with your API key
    base_url="https://chat.ecnu.edu.cn/open/api/v1",
)

# JSON Schema for the object the model must return.
# "technical" holds the person's job title.
json_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "department": {"type": "string"},
        "technical": {"type": "string"}
    },
    "required": ["name", "department", "technical"],
}

# System prompt (in Chinese): extract the name, department, and job title
# from the user's text. One worked example is included.
system_prompt = '''
你需要根据用户提供的文本提取其中的关键信息,包含姓名,部门和职称。以下是一组示例。
<example>
<input>
张三,法律事务部高级总监。主要负责法律合规和风险管理工作。
</input>
<output>
{
    "name": "张三",
    "department": "法律事务部",
    "technical": "高级总监"
}
</output>
</example>
'''

# User input (in Chinese): one sentence describing a person.
user_prompt = "冯骐,信息化治理办公室高级工程师,主要负责学校数据治理与大语言模型平台的开发工作。"

response = client.chat.completions.create(
    model="ecnu-turbo",
    messages=[
        {
            "role": "system",
            "content": system_prompt,
        },
        {
            "role": "user",
            "content": user_prompt,
        }
    ],
    temperature=0.3,
    # Constrain decoding to the schema defined above.
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "foo", "schema": json_schema},
    },
    max_tokens=1024
)
print(response.choices[0].message.content)

# Expected output
'''
{
    "department": "信息化治理办公室",
    "name": "冯骐",
    "technical": "高级工程师"
}
'''
```
JSON array output
```python
from openai import OpenAI

client = OpenAI(
    api_key='sk-xxxxxxx',  # Replace with your API key
    base_url="https://chat.ecnu.edu.cn/open/api/v1",
)

# JSON Schema for a non-empty array of person records.
json_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "id": {
                "type": "integer",
                "minimum": 1
            },
            "name": {
                "type": "string",
            },
            "sex": {
                "type": "string",
            },
            "age": {
                "type": "integer",
                "minimum": 0
            },
            "height": {
                "type": "integer",
                "minimum": 0
            },
            "weight": {
                "type": "integer",
                "minimum": 0
            },
        },
        "required": ["id", "name", "sex", "age", "height", "weight"]
    },
    "minItems": 1
}

# System prompt (in Chinese): extract name, sex, age, height, and weight
# from the user's text. One worked example is included.
system_prompt = '''
你需要根据用户提供的文本提取其中的关键信息,包含姓名,性别,年龄,身高,体重。以下是一组示例。
<example>
<input>
小明,男,25岁,身高180cm,体重75kg。
小李,女,30岁,身高165cm,体重60kg。
</input>
<output>
[
    {
        "id": 1,
        "name": "小明",
        "sex": "男",
        "age": 25,
        "height": 180,
        "weight": 75
    },
    {
        "id": 2,
        "name": "小李",
        "sex": "女",
        "age": 30,
        "height": 165,
        "weight": 60
    }
]
</output>
</example>
'''

# User input (in Chinese): five people, one per line.
user_prompt = '''
老冯,男,36岁,身高188cm,体重85kg。
老王,男,35岁,身高175cm,体重70kg。
老陈,女,32岁,身高160cm,体重55kg。
老李,女,28岁,身高170cm,体重65kg。
老张,男,40岁,身高182cm,体重80kg。
'''

response = client.chat.completions.create(
    model="ecnu-turbo",
    messages=[
        {
            "role": "system",
            "content": system_prompt,
        },
        {
            "role": "user",
            "content": user_prompt,
        }
    ],
    temperature=0.3,
    # Constrain decoding to the schema defined above.
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "foo", "schema": json_schema},
    },
    max_tokens=2048
)
print(response.choices[0].message.content)

# Expected output
'''
[
    {
        "age": 36,
        "height": 188,
        "id": 1,
        "name": "老冯",
        "sex": "男",
        "weight": 85
    },
    {
        "age": 35,
        "height": 175,
        "id": 2,
        "name": "老王",
        "sex": "男",
        "weight": 70
    },
    {
        "age": 32,
        "height": 160,
        "id": 3,
        "name": "老陈",
        "sex": "女",
        "weight": 55
    },
    {
        "age": 28,
        "height": 170,
        "id": 4,
        "name": "老李",
        "sex": "女",
        "weight": 65
    },
    {
        "age": 40,
        "height": 182,
        "id": 5,
        "name": "老张",
        "sex": "男",
        "weight": 80
    }
]
'''
```
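Once the array is returned, it can be loaded into typed records for further processing. The following is a minimal sketch: the `Person` dataclass and the `parse_people` function are illustrative names rather than part of the platform, and the input string is a shortened copy of the expected output above instead of a live response.

```python
import json
from dataclasses import dataclass


@dataclass
class Person:
    # Fields mirror the "required" list of the array schema above.
    id: int
    name: str
    sex: str
    age: int
    height: int
    weight: int


def parse_people(content):
    """Convert the JSON array returned by the model into Person records."""
    return [Person(**item) for item in json.loads(content)]


# Stand-in for response.choices[0].message.content
content = '[{"age": 36, "height": 188, "id": 1, "name": "老冯", "sex": "男", "weight": 85}]'
people = parse_people(content)
print(people[0].name, people[0].height)
```

Because the schema marks all six fields as required, `Person(**item)` should receive every argument. If the schema is later extended with extra properties, this constructor call would raise a `TypeError` and would need adjusting.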