Skip to content

[pull] main from python:main #38

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 23, 2025
Merged

[pull] main from python:main #38

merged 4 commits into from
May 23, 2025

Conversation

pull[bot]
Copy link

@pull pull bot commented May 23, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

Summary by Sourcery

Introduce explicit PyMutex-based locking across the Zstd extension to replace critical section macros, refactor and unify dictionary loading logic, strengthen thread-safety through new concurrency tests, and apply related fixes in asyncio components.

New Features:

  • Add multithreaded tests for safe concurrent use of shared dictionaries in ZstdCompressor and ZstdDecompressor.

Bug Fixes:

  • Skip closing pipes in asyncio.BaseSubprocess.close() when the event loop is already closed to prevent errors.

Enhancements:

  • Introduce PyMutex locks in ZstdCompressor, ZstdDecompressor, and ZstdDict to ensure thread safety.
  • Refactor dictionary loading into shared internal _zstd_load_impl helpers for both compressor and decompressor.
  • Remove @critical_section macros from Clinic-generated getters in Zstd bindings.
  • Replace the dataclass decorator with an explicit init in CycleFoundException.

Documentation:

Tests:

  • Add test_compress_shared_dict and test_decompress_shared_dict to cover concurrent dictionary access.

emmatyping and others added 4 commits May 22, 2025 23:30
Move from using critical sections to locks for the (de)compression methods.
Since the methods allow other threads to run, we should use a lock rather
than a critical section.
@pull pull bot added the ⤵️ pull label May 23, 2025
@pull pull bot merged commit 366d95d into Powerm4:main May 23, 2025
Copy link

sourcery-ai bot commented May 23, 2025

Reviewer's Guide

This PR enhances thread-safety in the _zstd extension by replacing critical-section macros with explicit PyMutex locks across compressor, decompressor, and dictionary modules, refactors and unifies dictionary loading logic, cleans up generated clinic code, adds multithreading tests for shared dictionaries, and includes minor library fixes in asyncio and types.

Sequence Diagram for ZstdCompressor Dictionary Loading with Locking

sequenceDiagram
    participant C as Caller
    participant ZSC as ZstdCompressor
    participant ZDict as ZstdDict

    C->>ZSC: _zstd_load_c_dict(dict_obj, type)
    activate ZSC
    ZSC->>ZDict: PyMutex_Lock(&zd->lock)
    activate ZDict
    ZSC->>ZSC: Calls _zstd_load_impl(zd, type)
    activate ZSC
    alt type == DICT_TYPE_DIGESTED
        ZSC->>ZDict: _get_CDict(zd, compression_level)
        activate ZDict
        note right of ZDict: asserts zd->lock is held
        ZDict-->>ZSC: ZSTD_CDict*
        deactivate ZDict
        ZSC->>ZSC: ZSTD_CCtx_refCDict(cctx, c_dict)
    else type == DICT_TYPE_UNDIGESTED
        ZSC->>ZSC: ZSTD_CCtx_loadDictionary(cctx, zd->dict_content)
    else type == DICT_TYPE_PREFIX
        ZSC->>ZSC: ZSTD_CCtx_refPrefix(cctx, zd->dict_content)
    end
    deactivate ZSC
    ZSC->>ZDict: PyMutex_Unlock(&zd->lock)
    deactivate ZDict
    ZSC-->>C: result
    deactivate ZSC
Loading

Sequence Diagram for ZstdCompressor Compression with Locking

sequenceDiagram
    participant C as Caller
    participant ZSC as ZstdCompressor

    C->>ZSC: _zstd_ZstdCompressor_compress_impl(data, mode)
    activate ZSC
    ZSC->>ZSC: PyMutex_Lock(&self->lock)
    activate ZSC
    alt use_multithread and mode == ZSTD_e_continue
        ZSC->>ZSC: compress_mt_continue_lock_held(data)
        note right of ZSC: asserts self->lock is held
    else
        ZSC->>ZSC: compress_lock_held(data, mode)
        note right of ZSC: asserts self->lock is held
    end
    ZSC->>ZSC: PyMutex_Unlock(&self->lock)
    deactivate ZSC
    ZSC-->>C: compressed_data
    deactivate ZSC
Loading

Sequence Diagram for ZstdDecompressor Dictionary Loading with Locking

sequenceDiagram
    participant C as Caller
    participant ZSD as ZstdDecompressor
    participant ZDict as ZstdDict

    C->>ZSD: _zstd_load_d_dict(dict_obj, type)
    activate ZSD
    ZSD->>ZDict: PyMutex_Lock(&zd->lock)
    activate ZDict
    ZSD->>ZSD: Calls _zstd_load_impl(zd, type)
    activate ZSD
    alt type == DICT_TYPE_DIGESTED
        ZSD->>ZDict: _get_DDict(zd)
        activate ZDict
        note right of ZDict: asserts zd->lock is held
        ZDict-->>ZSD: ZSTD_DDict*
        deactivate ZDict
        ZSD->>ZSD: ZSTD_DCtx_refDDict(dctx, d_dict)
    else type == DICT_TYPE_UNDIGESTED
        ZSD->>ZSD: ZSTD_DCtx_loadDictionary(dctx, zd->dict_content)
    else type == DICT_TYPE_PREFIX
        ZSD->>ZSD: ZSTD_DCtx_refPrefix(dctx, zd->dict_content)
    end
    deactivate ZSD
    ZSD->>ZDict: PyMutex_Unlock(&zd->lock)
    deactivate ZDict
    ZSD-->>C: result
    deactivate ZSD
Loading

Sequence Diagram for ZstdDecompressor Decompression with Locking

sequenceDiagram
    participant C as Caller
    participant ZSD as ZstdDecompressor

    C->>ZSD: _zstd_ZstdDecompressor_decompress_impl(data, max_length)
    activate ZSD
    ZSD->>ZSD: PyMutex_Lock(&self->lock)
    activate ZSD
    ZSD->>ZSD: stream_decompress_lock_held(data, max_length)
    activate ZSD
    note right of ZSD: asserts self->lock is held
    ZSD->>ZSD: decompress_lock_held(in_buffer, max_length)
    opt Error or Completion
        ZSD->>ZSD: decompressor_reset_session_lock_held()
        note right of ZSD: asserts self->lock is held
    end
    deactivate ZSD
    ZSD->>ZSD: PyMutex_Unlock(&self->lock)
    deactivate ZSD
    ZSD-->>C: decompressed_data
    deactivate ZSD
Loading

Sequence Diagram for Accessing ZstdDecompressor.unused_data with Locking

sequenceDiagram
    participant C as Caller
    participant ZSD as ZstdDecompressor

    C->>ZSD: get unused_data
    activate ZSD
    ZSD->>ZSD: PyMutex_Lock(&self->lock)
    activate ZSD
    ZSD->>ZSD: Access internal state (self->eof, self->unused_data)
    ZSD->>ZSD: PyMutex_Unlock(&self->lock)
    deactivate ZSD
    ZSD-->>C: unused_data_bytes
    deactivate ZSD
Loading

Class Diagram for _zstd Module Structs with Added Locks

classDiagram
    direction LR
    class ZstdCompressor {
        +PyMutex lock
        +int compression_level
        +PyObject* dict
        +ZSTD_CCtx* cctx
        +int use_multithread
    }
    class ZstdDecompressor {
        +PyMutex lock
        +PyObject* dict
        +ZSTD_DCtx* dctx
        +PyObject* unused_data
        +int eof
    }
    class ZstdDict {
        +PyMutex lock
        +PyObject* dict_content
        +uint32_t dict_id
        +PyObject* c_dicts
        +ZSTD_DDict* d_dict
    }
Loading

Updated Class Diagram for asyncio.tools.CycleFoundException

classDiagram
    class Exception {
        <<Python Base Class>>
    }
    class CycleFoundException {
        +cycles: list~list~int~~
        +id2name: dict~int, str~~
        +__init__(self, cycles, id2name) None
    }
    CycleFoundException --|> Exception
Loading

File-Level Changes

Change Details Files
Introduce explicit PyMutex locks for thread safety throughout the zstd modules
  • Add PyMutex field to compressor, decompressor, and ZstdDict structs
  • Replace Py_BEGIN_CRITICAL_SECTION/Py_END_CRITICAL_SECTION macros with PyMutex_Lock/Unlock and assertions
  • Initialize and assert lock state in constructors and deallocators
Modules/_zstd/compressor.c
Modules/_zstd/decompressor.c
Modules/_zstd/zstddict.c
Modules/_zstd/zstddict.h
Refactor dictionary loading into shared _load_impl functions
  • Extract common _zstd_load_impl to handle DIGESTED, UNDIGESTED, PREFIX cases
  • Lock around dictionary operations when loading
  • Simplify _get_CDict and error paths with ret codes
Modules/_zstd/compressor.c
Modules/_zstd/decompressor.c
Clean up clinic-generated code by removing critical_section decorators
  • Remove Py_BEGIN_CRITICAL_SECTION imports and usages in clinic headers
  • Return directly from generated getters without wrapping locks
Modules/_zstd/clinic/zstddict.c.h
Modules/_zstd/clinic/decompressor.c.h
Add multithreaded tests for shared dictionary usage
  • Add test_compress_shared_dict and test_decompress_shared_dict using threading.Barrier
  • Use threading_helper decorators to enforce concurrency
Lib/test/test_zstd.py
Apply minor library adjustments outside _zstd
  • Convert CycleFoundException to explicit init instead of dataclass
  • Guard subprocess pipe.close against closed loops
  • Remove obsolete TODOs in types.py
Lib/asyncio/tools.py
Lib/asyncio/base_subprocess.py
Lib/types.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants