【Hackathon 8th No.28】Reproduce Phi3 in PaddleNLP by robinbg · Pull Request #10688 · PaddlePaddle/PaddleNLP

robinbg pushed a commit to robinbg/PaddleNLP that referenced this pull request

Jun 8, 2025

@google-labs-jules

Fix(phi3): Address comments from PR PaddlePaddle#10688

This commit addresses the suggestions and requirements from the review comments on PR PaddlePaddle#10688 for the Phi3 model implementation.

The following changes were made:

  1. Tokenizer Configuration Cleanup:

    • Removed pretrained_resource_files_map, pretrained_init_configuration, and max_model_input_sizes from paddlenlp/transformers/phi3/tokenizer.py, as requested in the review, to decouple the tokenizer from specific pre-trained model download paths.
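A minimal sketch of what the cleanup leaves behind (the base class here is a standalone stand-in for paddlenlp.transformers.PretrainedTokenizer, and the kept attribute name is illustrative, not the exact PaddleNLP definition):

```python
class PretrainedTokenizer:  # stand-in so the sketch runs without paddlenlp
    pass


class Phi3Tokenizer(PretrainedTokenizer):
    # Kept: only the local resource-file naming convention.
    resource_files_names = {"vocab_file": "tokenizer.model"}
    # Removed per review: pretrained_resource_files_map,
    # pretrained_init_configuration, and max_model_input_sizes, so the
    # class no longer hard-codes download paths for specific checkpoints.


# After the cleanup, none of the download-path attributes remain on the class.
removed = (
    "pretrained_resource_files_map",
    "pretrained_init_configuration",
    "max_model_input_sizes",
)
assert not any(hasattr(Phi3Tokenizer, name) for name in removed)
```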
  2. Test Init File Completion:

    • Added a docstring to tests/transformers/phi3/__init__.py to ensure it's a valid and non-empty Python module initialization file.
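The added file could look like the following (the docstring wording is a hypothetical example, not the exact text committed):

```python
# Hypothetical contents of tests/transformers/phi3/__init__.py
"""Unit tests for the Phi3 model and tokenizer in PaddleNLP."""
```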
  3. PretrainedModel Mapping Methods:

    • Implemented _get_name_mappings, _get_tensor_parallel_mappings, and _get_fuse_or_split_param_mappings in the Phi3PreTrainedModel class in paddlenlp/transformers/phi3/modeling.py. These methods, modeled on the Qwen2 implementation, are required for model conversion and tensor parallelism.
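The core idea behind a tensor-parallel mapping method can be sketched in plain Python: each parameter name is mapped to how it is partitioned across model-parallel ranks. The parameter names and split directions below are illustrative assumptions in the Qwen2/Phi3 style, not the exact PaddleNLP API:

```python
def get_tensor_parallel_mappings(num_hidden_layers):
    """Map parameter names to how they are split across model-parallel ranks."""
    mappings = {}
    for i in range(num_hidden_layers):
        prefix = f"layers.{i}."
        # Column-parallel weights: the output dimension is split across ranks.
        mappings[prefix + "self_attn.qkv_proj.weight"] = "column_parallel"
        mappings[prefix + "mlp.gate_up_proj.weight"] = "column_parallel"
        # Row-parallel weights: the input dimension is split; outputs are
        # combined with an all-reduce.
        mappings[prefix + "self_attn.o_proj.weight"] = "row_parallel"
        mappings[prefix + "mlp.down_proj.weight"] = "row_parallel"
    # The embedding table is split along the vocabulary dimension.
    mappings["embed_tokens.weight"] = "column_parallel"
    return mappings


m = get_tensor_parallel_mappings(2)
assert m["layers.1.mlp.down_proj.weight"] == "row_parallel"
assert m["layers.0.self_attn.qkv_proj.weight"] == "column_parallel"
```

In the real implementation these mappings drive how a full checkpoint is sliced when loading under tensor parallelism, and how sharded weights are merged back for export.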
  4. Parallel Strategy Support:

    • Integrated support for sequence parallelism and recomputation into paddlenlp/transformers/phi3/modeling.py.
    • This includes:
      • Configuration flags for enabling/disabling these features.
      • Modifications to Phi3Model, Phi3DecoderLayer, Phi3Attention, and Phi3MLP to handle sequence-parallel linear layers and recomputation logic (full layer, full attention, and core attention granularities).
      • Necessary imports and utilities for sequence parallelism (ScatterOp, GatherOp, sequence-parallel linear layers) and recomputation.
      • Tensor parallelism considerations for weight initialization and layer configurations.
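The scatter/gather idea behind sequence parallelism can be illustrated with a pure-Python stand-in (this is a sketch of the concept, not the paddle.distributed ScatterOp/GatherOp implementation): each rank keeps only a contiguous slice of the sequence dimension, so per-rank activation memory shrinks by the world size.

```python
def scatter_sequence(hidden_states, rank, world_size):
    """Keep this rank's contiguous chunk of the sequence (list of token vectors)."""
    assert len(hidden_states) % world_size == 0, "sequence must divide evenly"
    chunk = len(hidden_states) // world_size
    return hidden_states[rank * chunk:(rank + 1) * chunk]


def gather_sequence(chunks):
    """Reassemble the full sequence from every rank's chunk (an all-gather)."""
    full = []
    for chunk in chunks:
        full.extend(chunk)
    return full


seq = [[float(t)] for t in range(8)]          # 8 tokens, 1-dim "hidden states"
parts = [scatter_sequence(seq, r, 4) for r in range(4)]
assert all(len(p) == 2 for p in parts)        # each of 4 ranks holds 2 tokens
assert gather_sequence(parts) == seq          # gather restores the original
```

Recomputation is complementary: instead of storing activations for the backward pass, the wrapped region (full layer, full attention, or core attention, per the granularities above) is re-executed during backprop, trading compute for memory.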
  5. Code Formatting:

    • Applied pre-commit to all modified files to ensure code style consistency and address linting issues. This included removing some unused imports and a duplicated code segment.