feat(heuristics): add SimilarProjectAnalyzer to detect structural similarity across packages from same maintainer #1089

Open
wants to merge 1 commit into
base: main
from

AmineRaouane
Member

Summary

This PR adds a new heuristic analyzer, SimilarProjectAnalyzer. It checks whether a PyPI package has a file and folder structure similar to that of other packages maintained by the same user. This helps identify potentially malicious packages that replicate the structure of existing ones.

Description of changes

  • Created a new analyzer: SimilarProjectAnalyzer.
  • The analyzer fetches the list of maintainers of the target package and retrieves other packages published by those maintainers.
  • For each of those packages, it computes a normalized structure hash from the sdist tarball and compares it to the structure hash of the target package.
  • If any match is found, the heuristic fails, flagging potential structural duplication.
  • Added this analyzer to the heuristics.py registry.
  • Modified detect_malicious_metadata_check.py to include and utilize the new heuristic.
  • Added test cases to validate the functionality and edge cases of the analyzer.
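
To illustrate the structure-hash comparison described above, here is a minimal sketch. It is not the code from this PR: the function names `structure_hash` and `has_similar_structure`, the `<PKG>` placeholder, and the normalization rules (dropping the top-level sdist directory and masking the package's own name) are all assumptions made for illustration.

```python
import hashlib
import io
import tarfile


def structure_hash(sdist_bytes: bytes, package_name: str) -> str:
    """Hash the sorted, normalized member paths of an sdist tarball.

    Hypothetical helper: the normalization below is an assumption,
    not necessarily what SimilarProjectAnalyzer does.
    """
    paths = []
    with tarfile.open(fileobj=io.BytesIO(sdist_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            # Drop the top-level "<name>-<version>/" directory.
            parts = member.name.split("/")[1:]
            if not parts:
                continue
            # Mask the package's own name so renamed copies still match.
            normalized = [part.replace(package_name, "<PKG>") for part in parts]
            paths.append("/".join(normalized))
    # Sort so that archive member order does not affect the hash.
    joined = "\n".join(sorted(paths))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def has_similar_structure(target_hash: str, other_hashes: list[str]) -> bool:
    """Return True if any other package shares the target's structure hash."""
    return target_hash in other_hashes
```

In this sketch only file paths are hashed, not file contents, so two packages with the same layout but different code produce the same hash. That matches the "structure" wording in the description, but how the actual analyzer treats contents is not stated here.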

Related issues

None

  • I have reviewed the contribution guide.
  • My PR title and commits follow the Conventional Commits convention.
  • My commits include the "Signed-off-by" line.
  • I have signed my commits following the instructions provided by GitHub. Note that we run GitHub's commit verification tool to check the commit signatures. A green verified label should appear next to all of your commits on GitHub.
  • I have updated the relevant documentation, if applicable.
  • I have tested my changes and verified they work as expected.

…ilarity across packages from same maintainer

Signed-off-by: Amine <[email protected]>
The oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) on May 20, 2025.