Update dependency accelerate to v0.34.2 #57
This PR contains the following updates:
accelerate ==0.28.0 -> ==0.34.2

Warning
Some dependencies could not be looked up. Check the warning logs for more information.
Release Notes
huggingface/accelerate (accelerate)
v0.34.2 (Compare Source)
v0.34.1: Patchfix (Compare Source)
Bug fixes
DataLoaders could no longer be pickled, fixed in #3074 thanks to @byi8220
default_transformers_cls_names_to_wrap would separate _no_split_modules by characters instead of keeping it as a list of layer names, fixed in #3075
Full Changelog: huggingface/accelerate@v0.34.0...v0.34.1
v0.34.0: StatefulDataLoader Support, FP8 Improvements, and PyTorch Updates! (Compare Source)
Dependency Changes
safetensors version 0.4.3
numpy 2.0.0

Core
New Script Behavior Changes
The accelerate library will handle this automatically with accelerator.end_training(), or you can do it manually using PartialState().destroy_process_group().
transfer_to_npu, ensuring better performance and compatibility.
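A minimal, hedged sketch of the cleanup behavior described above: only accelerator.end_training() and PartialState().destroy_process_group() are taken from the notes, the surrounding script is a placeholder.

```python
# Hedged sketch of the new process-group cleanup, assuming a typical
# Accelerate training script; the training loop itself is a placeholder.
from accelerate import Accelerator, PartialState

accelerator = Accelerator()
# ... prepare model/optimizer/dataloaders and run the training loop ...

# Let Accelerate tear the process group down for you at the end:
accelerator.end_training()

# Or, as the notes mention, do it manually:
# PartialState().destroy_process_group()
```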

DataLoader Enhancements
StatefulDataLoader from torchdata, allowing better handling of data loading states. Enable by passing use_stateful_dataloader=True to the DataLoaderConfiguration, and when calling load_state() the DataLoader will automatically be resumed from its last step, no more having to iterate through passed batches. See the sketch below this list.
The prepare_data_loader() function is now independent of the Accelerator, giving you more flexibility towards which API levels you would like to use.
DataLoader states, ensuring smoother training sessions.
set_epoch function for MpDeviceLoaderWrapper.
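A hedged sketch of enabling the StatefulDataLoader support named above; the dataset and checkpoint handling are placeholders, and the dataloader_config keyword follows Accelerate's Accelerator API.

```python
# Hedged sketch: enable torchdata's StatefulDataLoader via the
# DataLoaderConfiguration flag from the notes. Dataset is a placeholder.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator
from accelerate.utils import DataLoaderConfiguration

dataset = TensorDataset(torch.arange(100).float())
dataloader = DataLoader(dataset, batch_size=8)

accelerator = Accelerator(
    dataloader_config=DataLoaderConfiguration(use_stateful_dataloader=True)
)
dataloader = accelerator.prepare(dataloader)

# After accelerator.save_state(...), a later accelerator.load_state(...)
# resumes the dataloader from its last step instead of replaying batches.
```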

FP8 Training Improvements
TransformerEngine FP8 training, including better defaults for the quantized FP8 weights.
TransformerEngine integration works exactly as intended. These scripts run one half using 🤗 Accelerate's integration, the other with raw TransformersEngine, providing users with a nice example of what we do under the hood with accelerate, and a good sanity check to make sure nothing breaks down over time. Find them here
TransformerEngine and accelerate as well. Use docker pull huggingface/accelerate@gpu-fp8-transformerengine to quickly get an environment going.

torchpippy no more, long live torch.distributed.pipelining
torchpippy is now fully integrated into torch core, and as a result we are exclusively supporting the PyTorch implementation from now on
[1, n, n] rather than [2, n, n] as before.
pipelining no longer supports encoder/decoder models, so the t5 example has been removed.
torchpippy potentially if needed.

Fully Sharded Data Parallelism (FSDP)
FullyShardedDataParallelPlugin yourself manually with no need for environment patching (see the sketch below):
accelerate launch and need to ensure the env variables are setup properly for model loading:
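A hedged sketch of constructing the FSDP plugin yourself, as mentioned above; the options are left at defaults here and the script is assumed to be started with accelerate launch.

```python
# Hedged sketch: build a FullyShardedDataParallelPlugin manually and hand
# it to the Accelerator, per the note above. Intended to be run under
# `accelerate launch`; sharding/state-dict options are left as defaults.
from accelerate import Accelerator, FullyShardedDataParallelPlugin

fsdp_plugin = FullyShardedDataParallelPlugin()  # configure options as needed
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
```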

New Examples
axolotl library, so very big kudos to their wonderful work

Bug Fixes
step when loading the state by @muellerzr in #2992
find_tied_params for models with shared layers by @qubvel in #2986
transformer_engine on import by @oraluben in #3056
skip_first_batches support for StatefulDataloader and fix all the tests by @muellerzr in #3068

New Contributors
Full Changelog:
step when loading the state by @muellerzr in #2992
find_tied_params for models with shared layers by @qubvel in #2986
end_training by @SunMarc in #3012
torchdata.stateful_dataloader.StatefulDataLoader within the Accelerator by @byi8220 in #2895
prepare_data_loader() from Accelerator by @siddk in #3047
transformer_engine on import by @oraluben in #3056
skip_first_batches support for StatefulDataloader and fix all the tests by @muellerzr in #3068

Detailed Full Changelog:
v0.33.0: MUSA backend support and bugfixes (Compare Source)
MUSA backend support and bugfixes
A small release this month, focused on added backend support and bug fixes:
torch.float8_e4m3fn format dtype_byte_size by @SunMarc in #2945

What's Changed
device_map="auto" by @muellerzr in #2914
multi_gpu was being set and warning being printed even with num_processes=1 by @HarikrishnanBalagopal in #2921
pip caching in CI by @SauravMaheshkar in #2952

New Contributors
Full Changelog: huggingface/accelerate@v0.32.1...v0.33.0
v0.32.1 (Compare Source)
v0.32.0: Profilers, new hooks, speedups, and more! (Compare Source)
Core
huggingface_hub rather than our own implementation (#2795)
dispatch_model (#2855)
Accelerator.step number is now restored when using save_state and load_state (#2765)
import accelerate and any other major core import by 68%, now should be only slightly longer than doing import torch (#2845)
get_backend and added a clear_device_cache utility (#2857)
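A short hedged sketch of the step restoration noted above; the checkpoint directory name is a placeholder.

```python
# Hedged sketch: Accelerator.step now round-trips through save_state /
# load_state (per #2765). "checkpoint" is a placeholder directory.
from accelerate import Accelerator

accelerator = Accelerator()
accelerator.save_state("checkpoint")   # persists model/optimizer/RNG state and the step counter

# In a resumed run:
accelerator.load_state("checkpoint")
print(accelerator.step)                # restored rather than reset to 0
```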

Distributed Data Parallelism
allreduce. (#2841)
log_line_prefix_template optional for the notebook_launcher (#2888)

FSDP
accelerate merge-weights, one will be automatically created (#2854)
safetensors (#2853)

XPU
torch>=2.4 (#2825)
@require_triton test decorator and enable test_dynamo work on xpu (#2878)
load_state_dict not working on xpu and refine xpu safetensors version check (#2879)

XLA
Examples
accelerate launch (#2902)

Full Changelog
dispatch_model by @panjd123 in #2855
test_tracking.ClearMLTest by @faaany in #2863
torch_device instead of 0 for device check by @faaany in #2861
test_zero3_integration by @faaany in #2864
log_line_prefix_template Optional in Elastic Launcher for Backward Compatibility by @yhna940 in #2888
require_triton and enable test_dynamo work on xpu by @faaany in #2878
load_state_dict for xpu and refine xpu safetensor version check by @faaany in #2879

New Contributors
Full Changelog: huggingface/accelerate@v0.31.0...v0.32.0
v0.31.0: Better support for sharded state dict with FSDP and Bugfixes (Compare Source)
Core
timeout default to PyTorch defaults based on backend by @muellerzr in #2758
notebook_launcher by @yhna940 in #2788
Megatron
What's Changed
logging to log the actual user call site (instead of the call site inside the logger wrapper) of log functions by @luowyang in #2730
notebook_launcher by @yhna940 in #2788
get_balanced_memory by @faaany in #2826
stage3_prefetch_bucket_size value to an integer by @adk9 in #2814

New Contributors
Full Changelog: huggingface/accelerate@v0.30.1...v0.31.0
v0.30.1: Bugfixes (Compare Source)
Patchfix
Full Changelog: huggingface/accelerate@v0.30.0...v0.30.1
v0.30.0: Advanced optimizer support, MoE DeepSpeed support, add upcasting for FSDP, and more (Compare Source)
Core
tqdm wrapper to make it fully passthrough, no need to have tqdm(main_process_only, *args), it is now just tqdm(*args) and you can pass in is_main_process as a kwarg.
cann version info to command accelerate env for NPU by @statelesshz in #2689

Documentation
DeepSpeed
deepspeed-specific Docker image by @muellerzr in #2707. To use, pull the gpu-deepspeed tag: docker pull huggingface/accelerate:cuda-deepspeed-nightly

Megatron
Big Modeling
Bug Fixes
is_train_batch_min type in DeepSpeedPlugin by @yhna940 in #2646
free_memory to deal with garbage collection by @muellerzr in #2716

Full Changelog
execution_device by @faaany in #2612
is_train_batch_min type in DeepSpeedPlugin by @yhna940 in #2646

Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
To execute skipped test pipelines, write the comment /ok-to-test.

Documentation
Find out how to configure dependency updates in MintMaker documentation or see all available configuration options in Renovate documentation.