Fix: -i and -d flags now respected in both classify modes (v2.7.1) #17
Merged
Backport of the isoform-flag-parity fix from the v2.4.x maintenance line (v2.4.3) onto origin/main (v2.7.0).

The in-memory post-extraction filter hardcoded `longest_only=True` and `include_duplicates=False`, ignoring CLI flags. The streaming per-contig worker hardcoded `longest_only=True` and did not have `allow_multiple_isoforms` plumbed into its `config_dict`. The extraction prefilter `should_extract_sequences_for` always skipped coord-duplicates regardless of `include_duplicates`. All three sites now consult config consistently; streaming/in-memory equivalence holds across all (`-i`, `-d`) flag combinations.

Tests added: synthetic alt-isoform fixture (1 gene, 2 isoforms, 1 shared intron, 1 alt-spliced intron) + 16 behavior assertions across (in-memory, streaming) x 4 flag combos. Existing streaming-equivalence suite extended with `test_streaming_matches_in_memory_with_flags`, parametrized over the same flag combos. `tests/unit/test_filters.py` updated to reflect corrected `include_duplicates` semantics in `should_extract_sequences_for`.
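For reviewers who want the intended behavior at a glance, here is a minimal sketch of deriving both filter decisions from config instead of hardcoding them. It is not the project's code. `Intron`, `FilterConfig`, `select_introns`, `select_introns_streaming`, and the toy data are hypothetical names invented for illustration. The sketch assumes `-i` maps to `allow_multiple_isoforms`, `-d` maps to `include_duplicates`, "longest" means the longest transcript per gene, and a coord-duplicate is an intron with the same contig, start, and stop as one already kept.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Intron:
    # Hypothetical record type; the real project's data model may differ.
    contig: str
    gene: str
    transcript: str
    transcript_length: int
    start: int
    stop: int


@dataclass(frozen=True)
class FilterConfig:
    allow_multiple_isoforms: bool = False  # assumed to mirror -i
    include_duplicates: bool = False       # assumed to mirror -d


def select_introns(introns, config):
    """Filter introns using only values taken from config."""
    # Derived from config rather than hardcoded to True.
    longest_only = not config.allow_multiple_isoforms
    if longest_only:
        longest = {}  # gene -> (transcript, length) of its longest isoform
        for intron in introns:
            best = longest.get(intron.gene)
            if best is None or intron.transcript_length > best[1]:
                longest[intron.gene] = (intron.transcript, intron.transcript_length)
        introns = [i for i in introns if longest[i.gene][0] == i.transcript]
    if config.include_duplicates:
        return list(introns)
    seen, kept = set(), []
    for intron in introns:
        key = (intron.contig, intron.start, intron.stop)
        if key not in seen:  # keep only the first intron at each coordinate
            seen.add(key)
            kept.append(intron)
    return kept


def select_introns_streaming(introns, config):
    """Per-contig variant: same filter, same config, one contig at a time."""
    by_contig = {}
    for intron in introns:
        by_contig.setdefault(intron.contig, []).append(intron)
    kept = []
    for group in by_contig.values():
        kept.extend(select_introns(group, config))
    return kept


# Toy data loosely modeled on the fixture described above: one gene, two
# isoforms, one shared intron, one intron present only in the shorter isoform.
TOY_INTRONS = [
    Intron("chr1", "g1", "t1", 3000, 100, 200),  # shared, longest isoform
    Intron("chr1", "g1", "t2", 2000, 100, 200),  # shared, coord-duplicate
    Intron("chr1", "g1", "t2", 2000, 300, 400),  # alt-spliced, t2 only
]

for allow_iso, include_dup in product([False, True], repeat=2):
    cfg = FilterConfig(allow_iso, include_dup)
    in_memory = select_introns(TOY_INTRONS, cfg)
    streaming = select_introns_streaming(TOY_INTRONS, cfg)
    print(allow_iso, include_dup, len(in_memory), set(in_memory) == set(streaming))
```

Routing both paths through one function that reads a single config object is one way to keep the two modes from drifting apart again; the actual fix may be structured differently.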