
Conversation

liuyanyi
Contributor

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

We are training an LLM with a customized architecture using swift. With MegatronModelMeta we can plug in our own model_provider, but some of the data it needs cannot be passed to the provider through the existing arguments, and adding them would have required modifying the argument definitions in both megatron and swift. This PR therefore adds support for custom arguments.

The core changes are as follows:

  1. extra_megatron_kwargs: a new command-line argument that takes a JSON string and is passed through to megatron unchanged. This makes it possible to directly configure megatron arguments that swift does not yet cover, such as vpp_size.
  2. extra_args_provider: a new field on MegatronModelMeta (default None) that extends megatron's argument parser, so the extra arguments become visible inside the model_provider; a sketch follows this list.
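
For reference, extra_args_provider follows Megatron-LM's existing convention: a callable that receives the argparse parser, registers additional arguments, and returns the parser. A minimal sketch in Python (the --my-custom-dim argument is a made-up example, not something defined by swift or megatron):

def custom_extra_args_provider(parser):
    # Megatron-LM convention: add an argument group, register the extra
    # arguments, then return the parser.
    group = parser.add_argument_group(title='custom model args')
    group.add_argument('--my-custom-dim', type=int, default=None,
                       help='hypothetical value consumed by our model_provider')
    return parser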

Examples:

  1. Use extra_megatron_kwargs in convert_hf_config (a hypothetical sketch of the idea follows):

[screenshot: extra_megatron_kwargs used in convert_hf_config]
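
As a rough, hypothetical sketch of the idea (the function signature and key mapping below are assumptions for illustration, not swift's actual code), the kwargs can be merged into the converted config so that keys swift does not map explicitly still reach megatron:

def convert_hf_config(config, extra_megatron_kwargs=None):
    # Map the HF config fields that swift knows about (illustrative subset).
    megatron_config = {
        'num_layers': config['num_hidden_layers'],
        'hidden_size': config['hidden_size'],
    }
    # Merge the user-supplied --extra-megatron-kwargs on top, so unmapped
    # keys are still passed through to megatron.
    megatron_config.update(extra_megatron_kwargs or {})
    return megatron_config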

  2. Add arguments to megatron through extra_args_provider (see the sketch below):

[screenshot: custom arguments registered via extra_args_provider]
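
Once the provider has registered the argument, it becomes visible through megatron's global args object inside the custom model_provider. A sketch (the import path of get_args differs between Megatron-LM versions; a recent layout is assumed):

from megatron.training import get_args  # older releases: from megatron import get_args

def model_provider(pre_process=True, post_process=True):
    args = get_args()
    # --my-custom-dim registered by custom_extra_args_provider above is
    # now available on the parsed args.
    custom_dim = args.my_custom_dim
    ...  # build and return the model using custom_dim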

  3. Pass through an existing megatron argument (the JSON decoding is shown after the command):

megatron sft \
    <other arguments> \
    --extra-megatron-kwargs "{\"num_layers_per_virtual_pipeline_stage\":2}"
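
Note that the keys in the JSON string use the underscored python argument names (num_layers_per_virtual_pipeline_stage), not the dashed CLI form. Decoding such a value is plain json.loads; the general pattern, not swift's exact code:

import json

extra = json.loads('{"num_layers_per_virtual_pipeline_stage": 2}')
assert extra == {'num_layers_per_virtual_pipeline_stage': 2}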


@Jintao-Huang
Collaborator

Hi! That's a really necessary feature!
Please merge the main branch and run the following commands to format the code:

pip install pre-commit
pre-commit run --all-files

@liuyanyi
Contributor Author


done

@Jintao-Huang Jintao-Huang merged commit 5479249 into modelscope:main May 23, 2025
2 checks passed
tastelikefeet added a commit to tastelikefeet/swift that referenced this pull request May 23, 2025
…o_padding_ulysses

* commit 'e9475f1a306614b30fc6314cc08eb5b40a3f17aa':
  qwen2_5_vl support video use image_dir (modelscope#4326)
  [megatron] Add extra args and provider support for easily customize megatron (modelscope#4240)
  Update internvl.py, solve the exception when setting customized INPUT_SIZE. (modelscope#4320)
  [grpo] support liger loss (modelscope#3781)
  compat transformer_engine update (modelscope#4317)
  compat transformers==4.52 (modelscope#4308)
  [grpo] support dp in external mode (modelscope#4279)
  fix vllm engine return empty in stream generation (modelscope#4303)
  fix (modelscope#4316)
  update swift image (modelscope#4309)
  update load_args (modelscope#4296)
  fix n > 1 with vLLM V1 Engine (modelscope#4295)
  Reuse existing code
  [grpo] fix num of reward_model > 1  (modelscope#4287)
  modify grpo system
  fix grpo tab
  support grpo web_ui

# Conflicts:
#	swift/trainers/sequence_parallel/ulysses.py