## Result

Here is our final training result; 50% of the SFT data comes from GraphGen.
| Domain | Dataset | our-7B-model | Qwen2.5-7B-Instruct |
|---|---|---|---|
| Plant | SeedBench | 65.9 | 51.5 |
| Common | CMMLU | 73.6 | 75.8 |
| Logic | GPQA-Diamond | 40.0 | 33.3 |
| Math | AIME24 | 20.6 | 16.7 |
| Math | AIME25 | 22.7 | 7.2 |
## Garbage in, garbage out

First, it is essential to ensure the input chunks are of high quality; a heuristic sketch follows the examples below.
- Positive example: a complete, self-contained story segment
- Negative example: a fragment of a paper citation that contains only the title and lacks context
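
As a rough illustration, a pre-filter for chunk quality might look like the sketch below. The threshold, the sentence-ending heuristic, and the `raw_chunks` input are all assumptions for illustration, not part of GraphGen itself; tune them for your corpus.

```python
import re

MIN_CHARS = 200  # illustrative threshold, not a GraphGen default

def is_good_chunk(chunk: str) -> bool:
    """Heuristically accept chunks that look like self-contained passages
    rather than truncated fragments such as bare citation titles."""
    text = chunk.strip()
    if len(text) < MIN_CHARS:
        return False
    # A complete passage should contain at least one sentence-ending mark.
    return bool(re.search(r"[.!?。！？]", text))

# Hypothetical input; replace with your own chunked corpus.
raw_chunks = ["Once upon a time, a farmer planted a seed...", "Smith et al., 2021"]
good_chunks = [c for c in raw_chunks if is_good_chunk(c)]
```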
Second, filter the QA pairs according to business needs. The synthetic QA data contains entity words, but not every entity is worth keeping (see the sketch after these examples).
- Positive example: the notable deeds of the company's boss (a business-relevant entity)
- Negative example: meaningless coreference targets, e.g. "fig 5.1" or "it"
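
One way to apply such a filter, sketched under the assumption that each synthetic pair carries the entity it was generated from (the field names and stop list here are hypothetical):

```python
import re

# Illustrative stop list of coreference targets that carry no meaning alone.
BAD_ENTITIES = {"it", "this", "that", "they"}
# Matches bare figure/table/equation references such as "fig 5.1".
REF_PATTERN = re.compile(r"^(fig|figure|table|eq)\.?\s*[\d.]+$", re.IGNORECASE)

def keep_pair(pair: dict) -> bool:
    entity = pair["entity"].strip().lower()  # hypothetical field name
    return entity not in BAD_ENTITIES and not REF_PATTERN.match(entity)

qa_pairs = [
    {"entity": "Acme Corp", "question": "...", "answer": "..."},
    {"entity": "fig 5.1", "question": "...", "answer": "..."},
]
filtered = [p for p in qa_pairs if keep_pair(p)]  # drops the "fig 5.1" pair
```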
## API usage
- Make sure the LLM API supports `logprobs` (e.g. `vllm serve` with `v0.6.6.post1`) and enable the Trainee Model for hard-case mining; a quick probe is sketched after this list. SiliconCloud on the OpenXLab web page is only a free trial; production use will not be free.
- Use a larger synthesizer model, and ensure the synthesizer and the trainee come from the same model family.
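
A minimal probe to confirm the endpoint actually returns token log-probabilities, using the OpenAI-compatible client; the base URL and model name are placeholders for your own deployment:

```python
from openai import OpenAI

# Point at your vLLM server, e.g. started with:
#   vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=1,
    logprobs=True,
    top_logprobs=1,
)

# If the endpoint supports logprobs, each token carries a log-probability;
# an endpoint without support will error or return logprobs=None here.
token = resp.choices[0].logprobs.content[0]
print(token.token, token.logprob)
```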