Structured Output

ecnu-plus and ecnu-turbo now support structured output based on XGrammar.

Key points

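XGrammar uses constrained decoding to ensure that the output is 100% structurally correct. Enabling structured output does not by itself guarantee a correct result, however: you still need to give good instructions and examples in the prompt, so that the model can pick reasonable, correct tokens during constrained decoding. You also need to set max_tokens high enough for the model to generate the complete structured content.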
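The max_tokens caveat can be checked on the client side. The following is a minimal sketch, assuming the service follows the OpenAI-compatible convention of reporting a finish_reason of "length" when generation stops at the token limit. `parse_complete_json` is a hypothetical helper name, not part of any SDK.

```python
import json


def parse_complete_json(content: str, finish_reason: str):
    """Parse a structured-output reply, rejecting truncated generations.

    Assumption: finish_reason is "length" when max_tokens cut the output short,
    as in the OpenAI chat completions response format.
    """
    if finish_reason == "length":
        raise ValueError("output was truncated; increase max_tokens")
    try:
        return json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc


# A complete reply parses normally.
print(parse_complete_json('{"name": "张三"}', "stop"))

# A reply cut off by max_tokens is reported instead of parsed.
try:
    parse_complete_json('{"name": "张', "length")
except ValueError as err:
    print(err)
```

With a real response object, the two arguments would be `response.choices[0].message.content` and `response.choices[0].finish_reason`.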

Structured output parameters

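  • response_format: specifies the format of the structured output. Set its type to json_schema and pass a json_schema parameter that defines the JSON structure of the output. See the examples below.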
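To make the shape of this parameter concrete, here is a small sketch that wraps a JSON Schema dict in the same response_format structure the examples in this document pass to the API. `build_response_format` and the schema name "person" are hypothetical, chosen only for illustration.

```python
def build_response_format(name: str, schema: dict) -> dict:
    """Wrap a JSON Schema dict in the response_format structure used in this document."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema},
    }


# A tiny illustrative schema: an object with one required string field.
person_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
}

response_format = build_response_format("person", person_schema)
print(response_format["type"])
```

The resulting dict can be passed as the `response_format` argument of `client.chat.completions.create`.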

Example code

JSON object output

python
import json
from openai import OpenAI

client = OpenAI(
    api_key='sk-xxxxx',  # Replace with your API key
    base_url="https://chat.ecnu.edu.cn/open/api/v1",
)

json_schema = json.dumps(
    {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "department": {"type": "string"},
            "technical": {"type": "string"}
        },
        "required": ["name", "department", "technical"],
    }
)

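# The system prompt (in Chinese) tells the model to extract the name, department,
# and job title from the user's text, and gives one worked example.
system_prompt = '''
你需要根据用户提供的文本提取其中的关键信息,包含姓名,部门和职称。以下是一组示例。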

<example>
<input>
张三,法律事务部高级总监。主要负责法律合规和风险管理工作。
</input>
<output>
{
    "name": "张三",
    "department": "法律事务部",
    "technical": "高级总监"
}
</output>
</example>
'''

# User input (in Chinese): one sentence giving a person's name, department, title, and duties.
user_prompt = "冯骐,信息化治理办公室高级工程师,主要负责学校数据治理与大语言模型平台的开发工作。"
response = client.chat.completions.create(
    model="ecnu-turbo",
    messages=[
        {
            "role": "system",
            "content": system_prompt,
        },
        {
            "role": "user",
            "content": user_prompt,
        }
    ],
    temperature=0.3,
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "foo", "schema": json.loads(json_schema)},
    },
    max_tokens=1024
)

print(response.choices[0].message.content)
# Expected output
'''
{
    "department": "信息化治理办公室",
    "name": "冯骐",
    "technical": "高级工程师"
}
'''
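The returned content is a JSON string, so it still has to be parsed before use. A minimal sketch of mapping it onto a typed record follows; `StaffRecord` and `to_staff_record` are hypothetical names, and the sample string simply mirrors the expected output above rather than a live API response.

```python
import json
from dataclasses import dataclass


@dataclass
class StaffRecord:
    name: str
    department: str
    technical: str  # job title, matching the schema key used in the example


def to_staff_record(content: str) -> StaffRecord:
    """Parse the model's JSON text and map it onto StaffRecord."""
    data = json.loads(content)
    missing = {"name", "department", "technical"} - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return StaffRecord(data["name"], data["department"], data["technical"])


# Sample reply text, copied from the expected output of the example.
sample = '{"department": "信息化治理办公室", "name": "冯骐", "technical": "高级工程师"}'
record = to_staff_record(sample)
print(record.name, record.department, record.technical)
```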

JSON array output

python
import json
from openai import OpenAI

client = OpenAI(
    api_key='sk-xxxxxxx',  # Replace with your API key
    base_url="https://chat.ecnu.edu.cn/open/api/v1",
)

json_schema = json.dumps(
    {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "id": {
                    "type": "integer",
                    "minimum": 1
                },
                "name": {
                    "type": "string",
                },
                "sex": {
                    "type": "string",
                },
                "age": {
                    "type": "integer",
                    "minimum": 0
                },
                "height": {
                    "type": "integer",
                    "minimum": 0
                },
                "weight": {
                    "type": "integer",
                    "minimum": 0
                },
            },
            "required": ["id", "name", "sex", "age", "height", "weight"]
        },
        "minItems": 1
    }
)

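# The system prompt (in Chinese) tells the model to extract each person's name, sex,
# age, height, and weight from the user's text, and gives one worked example.
system_prompt = '''
你需要根据用户提供的文本提取其中的关键信息,包含姓名,性别,年龄,身高,体重。以下是一组示例。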

<example>
<input>
小明,男,25岁,身高180cm,体重75kg。
小李,女,30岁,身高165cm,体重60kg。
</input>
<output>
[
    {
        "id": 1,
        "name": "小明",
        "sex": "男",
        "age": 25,
        "height": 180,
        "weight": 75
    },
    {
        "id": 2,
        "name": "小李",
        "sex": "女",
        "age": 30,
        "height": 165,
        "weight": 60
    }
]

</output>
</example>
'''

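# User input (in Chinese): five people, one per line, each with name, sex, age, height, and weight.
user_prompt = '''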
老冯,男,36岁,身高188cm,体重85kg。
老王,男,35岁,身高175cm,体重70kg。
老陈,女,32岁,身高160cm,体重55kg。
老李,女,28岁,身高170cm,体重65kg。
老张,男,40岁,身高182cm,体重80kg。

'''
response = client.chat.completions.create(
    model="ecnu-turbo",
    messages=[
        {
            "role": "system",
            "content": system_prompt,
        },
        {
            "role": "user",
            "content": user_prompt,
        }
    ],
    temperature=0.3,
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "foo", "schema": json.loads(json_schema)},
    },
    max_tokens=2048
)

print(response.choices[0].message.content)
# Expected output
'''
[
    {
        "age": 36,
        "height": 188,
        "id": 1,
        "name": "老冯",
        "sex": "男",
        "weight": 85
    },
    {
        "age": 35,
        "height": 175,
        "id": 2,
        "name": "老王",
        "sex": "男",
        "weight": 70
    },
    {
        "age": 32,
        "height": 160,
        "id": 3,
        "name": "老陈",
        "sex": "女",
        "weight": 55
    },
    {
        "age": 28,
        "height": 170,
        "id": 4,
        "name": "老李",
        "sex": "女",
        "weight": 65
    },
    {
        "age": 40,
        "height": 182,
        "id": 5,
        "name": "老张",
        "sex": "男",
        "weight": 80
    }
]
'''
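Because structured output constrains the shape of the reply but not its correctness, a defensive client may want to re-check the parsed array before using it. The following is a hand-written sketch covering only the constraints in the array schema above (required fields, types, minimum values, and minItems); `check_people` is a hypothetical helper, and a general-purpose validator such as the jsonschema package could be used instead.

```python
import json

# Integer fields and their "minimum" values, copied from the array schema.
INTEGER_MINIMUMS = {"id": 1, "age": 0, "height": 0, "weight": 0}
STRING_FIELDS = ("name", "sex")


def check_people(content: str) -> list:
    """Parse the reply and re-check each item against the schema's constraints."""
    people = json.loads(content)
    if not isinstance(people, list) or not people:
        raise ValueError("expected a non-empty JSON array")  # minItems: 1
    for index, person in enumerate(people):
        if not isinstance(person, dict):
            raise ValueError(f"item {index} is not an object")
        for field in STRING_FIELDS:
            if not isinstance(person.get(field), str):
                raise ValueError(f"item {index}: {field!r} must be a string")
        for field, minimum in INTEGER_MINIMUMS.items():
            value = person.get(field)
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < minimum:
                raise ValueError(f"item {index}: {field!r} must be an integer >= {minimum}")
    return people


# Sample reply text with one item, taken from the expected output.
sample = '[{"id": 1, "name": "老冯", "sex": "男", "age": 36, "height": 188, "weight": 85}]'
print(len(check_people(sample)))
```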