Skip to content

Static dependency pyproject.toml proposal#9337

Open
Seas0 wants to merge 4 commits into
modelscope:mainfrom
Seas0:pyproject-static-deps
Open

Static dependency pyproject.toml proposal#9337
Seas0 wants to merge 4 commits into
modelscope:mainfrom
Seas0:pyproject-static-deps

Conversation

@Seas0
Copy link
Copy Markdown

@Seas0 Seas0 commented May 14, 2026

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Add a secondary pyproject.toml project description in addition to legacy setup.py to make uv sync work for ms-swift main repo.

All requirements/*.txt have been imported into the pyproject.toml statically, with python version guards addition for some package occasionally to allow the declarative dependency resolver happy.

Experiment results

Just enough for uv sync.

Seas0 added 2 commits May 13, 2026 18:09
To modernize pkg management workflow
and remove redundant dep entries
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a pyproject.toml file to define the build system, metadata, and dependencies for the ms_swift project. Feedback highlights several dependency issues, including a version conflict between datasets and the required Python version, the inclusion of the unmaintained attrdict package, and redundant or deprecated dependencies like modelscope and recommonmark. Additionally, improvements were suggested for dependency naming, environment markers for importlib-metadata, and adjusting restrictive Python version requirements for trl and Megatron-related packages.

Comment thread pyproject.toml Outdated
"charset_normalizer",
"cpm_kernels",
"dacite",
"datasets>=3.0,<4.8.5",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

datasets>=3.0 requires Python 3.9 or higher. This conflicts with the project's declared requires-python = ">=3.8" on line 9. This will cause installation to fail on Python 3.8 environments. Consider lowering the minimum version of datasets to 2.0 or increasing the minimum Python version to 3.9.

Suggested change
"datasets>=3.0,<4.8.5",
"datasets>=2.0,<4.8.5",

Comment thread pyproject.toml Outdated
"accelerate",
"addict",
"aiohttp",
"attrdict",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

attrdict is unmaintained and incompatible with Python 3.10+ (it fails with AttributeError: module 'collections' has no attribute 'Mapping'). Since this project supports Python 3.10, 3.11, and 3.12, this dependency will cause runtime errors. It is recommended to remove it and use addict (already listed on line 28) instead.

Comment thread pyproject.toml Outdated
"einops",
"fastapi",
"gradio>=3.40.0,<6.0",
"importlib_metadata",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The canonical package name is importlib-metadata (with a hyphen). Additionally, since the project requires Python >= 3.8, the standard library importlib.metadata is available. If the backport is still needed for specific features (like the improved entry_points API in 3.10+), it should use an environment marker to avoid unnecessary installation on newer Python versions.

Suggested change
"importlib_metadata",
"importlib-metadata; python_version < '3.10'",

Comment thread pyproject.toml
"tqdm",
"transformers>=4.33,<5.9.0",
"transformers_stream_generator",
"trl>=0.15,<1.0; python_version>='3.10'",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Including trl in the main dependencies with a python_version>='3.10' marker means it won't be installed on Python 3.8/3.9. If the codebase imports trl without runtime checks, it will crash. If trl is a core dependency, the project's minimum Python version should be bumped to 3.10. If it is optional, it should be moved to [project.optional-dependencies].

Comment thread pyproject.toml
Comment on lines +84 to +86
"mcore-bridge>=1.2.0; python_version>='3.12'",
"megatron-core>=0.15; python_version>='3.12'",
"peft>=0.15; python_version>='3.12'",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The python_version>='3.12' markers for Megatron-related dependencies are very restrictive. megatron-core and mcore-bridge typically support Python 3.10+. This restriction prevents users on Python 3.10 and 3.11 from using the Megatron integration. Unless there is a specific technical reason for 3.12+, consider lowering this to 3.10.

Comment thread pyproject.toml Outdated
"expecttest",
"flake8",
"isort>=4.3.21",
"modelscope",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

modelscope is already listed in the main dependencies (line 42), so it is redundant to include it in the tests optional dependencies.

Comment thread pyproject.toml
docs = [
"docutils>=0.16.0",
"myst_parser",
"recommonmark",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

recommonmark is deprecated and has been superseded by myst-parser, which is already included in the docs extra. It should be removed to avoid redundancy.

Do as gemini-code-assist said
@Seas0
Copy link
Copy Markdown
Author

Seas0 commented May 14, 2026

@gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a pyproject.toml file to manage project metadata, dependencies, and build configuration for the ms_swift package. The review feedback suggests several improvements to dependency management, including lowering the Python version requirement for Megatron integration to increase compatibility, standardizing version bounds for datasets and transformers according to semantic versioning, and adding a fallback for the trl library on Python versions below 3.10. There is also a recommendation to consolidate dependency definitions to prevent potential synchronization issues between configuration files.

Comment thread pyproject.toml
Comment on lines +85 to +87
"mcore-bridge>=1.2.0; python_version>='3.12'",
"megatron-core>=0.15; python_version>='3.12'",
"peft>=0.15; python_version>='3.12'",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The Python version restriction python_version>='3.12' for the megatron optional dependencies appears overly restrictive. megatron-core and its associated packages typically support Python 3.10 and 3.11. Restricting these to 3.12+ will prevent many users in common research environments (which often use Python 3.10 or 3.11) from utilizing the Megatron integration features.

Suggested change
"mcore-bridge>=1.2.0; python_version>='3.12'",
"megatron-core>=0.15; python_version>='3.12'",
"peft>=0.15; python_version>='3.12'",
"mcore-bridge>=1.2.0; python_version>='3.10'",
"megatron-core>=0.15; python_version>='3.10'",
"peft>=0.15; python_version>='3.10'",

Comment thread pyproject.toml
"charset_normalizer",
"cpm_kernels",
"dacite",
"datasets>=3.0,<4.8.5; python_version>='3.9'",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The upper bound <4.8.5 for datasets seems arbitrary. Since datasets is currently on version 3.x, using a major version bound like <4.0.0 is more standard and avoids confusion with specific patch versions that do not yet exist.

Suggested change
"datasets>=3.0,<4.8.5; python_version>='3.9'",
"datasets>=3.0,<4.0.0; python_version>='3.9'",

Comment thread pyproject.toml
"tensorboard",
"tiktoken",
"tqdm",
"transformers>=4.33,<5.9.0",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The upper bound <5.9.0 for transformers is highly specific. Unless there is a known reason to cap it at exactly 5.9.0, it is better to use a major version bound like <5.0.0 to follow standard semantic versioning practices.

Suggested change
"transformers>=4.33,<5.9.0",
"transformers>=4.33,<5.0.0",

Comment thread pyproject.toml
"tqdm",
"transformers>=4.33,<5.9.0",
"transformers_stream_generator",
"trl>=0.15,<1.0; python_version>='3.10'",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The trl dependency is guarded by python_version>='3.10'. Since the project supports Python 3.8+, users on older versions will not have trl installed, which may lead to runtime errors if the package depends on it. Consider adding a fallback for Python < 3.10 using an older version of trl that still supports those environments.

Suggested change
"trl>=0.15,<1.0; python_version>='3.10'",
"trl>=0.15,<1.0; python_version>='3.10'",
"trl<0.15; python_version<'3.10'",

Comment thread pyproject.toml
Comment on lines +26 to +67
dependencies = [
"accelerate",
"addict; python_version>='3.10'",
"aiohttp",
"attrdict; python_version<'3.10'",
"binpacking",
"charset_normalizer",
"cpm_kernels",
"dacite",
"datasets>=3.0,<4.8.5; python_version>='3.9'",
"datasets>=2.0,<3; python_version<'3.9'",
"einops",
"fastapi",
"gradio>=3.40.0,<6.0",
"importlib-metadata; python_version<'3.10'",
"json_repair",
"matplotlib",
"modelscope>=1.23",
"nltk",
"numpy",
"openai",
"oss2",
"pandas",
"peft>=0.11,<0.20",
"pillow",
"PyYAML>=5.4",
"requests",
"rouge",
"safetensors",
"scipy",
"sentencepiece",
"simplejson>=3.3.0",
"sortedcontainers>=1.5.9",
"tensorboard",
"tiktoken",
"tqdm",
"transformers>=4.33,<5.9.0",
"transformers_stream_generator",
"trl>=0.15,<1.0; python_version>='3.10'",
"uvicorn",
"zstandard",
]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoding a static list of dependencies in pyproject.toml while setup.py continues to parse them from requirements/*.txt introduces a maintenance risk where the two sources of truth can fall out of sync. It is recommended to consolidate dependency management into pyproject.toml and either remove setup.py or refactor it to read from the TOML file to maintain a single source of truth.

@Seas0 Seas0 marked this pull request as ready for review May 14, 2026 04:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant