Integrate Semgrep for static analysis (SAST) #51
Open
connorcarpenter15 wants to merge 1 commit into main
Add Semgrep, an AI-assisted SAST tool, to provide static analysis for the repository. Run Semgrep on the codebase and save the output to both a JSON file and a text file.
Tool Overview & Analysis Type
Types of Problems Caught
This run flagged several highly relevant issues across different layers of the stack:
Customization
Development Process Integration
Semgrep is lightweight enough to run locally and in CI, though it does require a Python environment:
Finding Accuracy & Noise
Summary
This PR integrates Semgrep into the project on a dedicated testing branch named for the tool and documents installation, run artifacts, and an assessment of the tool.
1. Evidence of successful installation (trackable file changes)
The following files were added or modified to install and configure Semgrep:
- requirements-semgrep.txt: requirements file for Semgrep; install with pip install -r requirements-semgrep.txt.
- .semgrepignore: excludes paths (e.g. node_modules/, vendor/, build/, public/language/, *.min.js) so the scan focuses on project code.
- package.json: adds the script "semgrep": "semgrep scan --config auto" so the tool can be run via npm run semgrep after Semgrep is installed (Python/pip).
Note: Semgrep is distributed as a Python package, not an NPM package. No new NPM packages were added; the script assumes semgrep is on PATH after installing from requirements-semgrep.txt.
2. Artifacts demonstrating successful run
The following artifacts show that Semgrep was run successfully on this repository:
- semgrep-output.txt: full human-readable report (1,868 lines) from semgrep scan --config auto, including all findings with file paths, line numbers, rule IDs, and messages.
- semgrep-output.json: machine-readable JSON output of the same run (e.g. for CI or tooling).
Summary from the run:
- Exclusions were applied from .semgrepignore (e.g. node_modules/, public/language/).
3. Assessment: pros, cons, and customization
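The per-rule counts quoted in this assessment can be recomputed from semgrep-output.json. The following is a minimal sketch, not the script used for this PR. It assumes the report follows Semgrep's usual JSON layout (a top-level "results" list whose entries carry a dotted "check_id") and that the last dotted segment is the short rule name; the helper names load_report and count_by_rule are hypothetical.

```python
import json
from collections import Counter


def load_report(path):
    """Load a Semgrep JSON report from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def count_by_rule(report):
    """Count findings per short rule name.

    Assumption: each finding has a dotted "check_id" and the last
    segment is the rule name (e.g. "path-join-resolve-traversal").
    """
    counts = Counter()
    for finding in report.get("results", []):
        rule = finding.get("check_id", "unknown").rsplit(".", 1)[-1]
        counts[rule] += 1
    return counts
```

Calling count_by_rule(load_report("semgrep-output.json")).most_common(10) would list the ten most frequent rules, which is one way to decide which rules to tune first.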
3.1 Customization
A priori (before use):
- Installation: pip install -r requirements-semgrep.txt. No NPM packages; the only project change is the semgrep script in package.json and the config/ignore files above.
- Rules: the run used --config auto, which pulls the community rule set. No custom rules were written for this test.
- Ignore file: .semgrepignore was added to exclude node_modules/, vendor/, build/, public/language/, coverage/, .nyc_output/, and *.min.js, reducing noise and scan time.
Over time:
- Rules can be tuned with a custom rule file (e.g. .semgrep.yml) or by choosing a subset of rule IDs. Severity and blocking behavior can be adjusted.
3.2 Strengths (with evidence)
- Findings link (https://sg.run/...) to documentation and remediation.
- Coverage spans infrastructure configuration (e.g. services running without no-new-privileges or with writable root filesystem) and common app-security patterns (e.g. path traversal, regex injection, prototype pollution, session/cookie settings).
Quantitative:
- Most frequent rule: path-join-resolve-traversal (113); others include detect-non-literal-regexp (17), express-path-join-resolve-traversal (8), prototype-pollution-loop (7).
3.3 Weaknesses (with evidence)
- False positives: many path.join/path.resolve usages are flagged as possible path traversal even when inputs are constrained (e.g. internal config or fixed segments). Tuning or disabling rules per file/pattern will be needed for a clean baseline.
- Python dependency: Semgrep must be installed with pip install -r requirements-semgrep.txt. There is no npm install-only story.
- Strictness: with --config auto, 188 blocking findings may be too strict for an initial rollout; teams may want to start with a smaller rule set or non-blocking severity and then tighten over time.
- Relevance: some findings (e.g. in install/web.js) may not apply to production code paths.
Quantitative:
- The run depends on .semgrepignore; without that, the run would be slower and noisier.
3.4 Conclusion
Semgrep is a strong fit for broad, multi-language static analysis with minimal setup. For this repo, it quickly surfaced infrastructure and application-security issues and produced both human- and machine-readable artifacts. The main follow-up work is to reduce noise (e.g. by adjusting or disabling rules and refining .semgrepignore) and to integrate the command into CI with a policy for which findings are blocking.
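One possible shape for such a blocking policy is sketched below. This is an illustration under stated assumptions, not part of this PR: it assumes each finding in semgrep-output.json exposes "check_id", "path", "start.line", and "extra.severity", and that severities use the ERROR/WARNING/INFO labels (newer Semgrep versions may emit different labels). The policy constants and the function names blocking_findings and gate are hypothetical.

```python
import json

# Hypothetical policy: only these severities fail the build.
BLOCKING_SEVERITIES = {"ERROR"}
# Hypothetical policy: noisy rules that are reported but never block.
NON_BLOCKING_RULES = {"path-join-resolve-traversal"}


def blocking_findings(report, severities=BLOCKING_SEVERITIES,
                      skip_rules=NON_BLOCKING_RULES):
    """Return the findings that should fail CI under the policy above."""
    blocked = []
    for finding in report.get("results", []):
        # Assumption: the last dotted segment of check_id is the rule name.
        rule = finding.get("check_id", "").rsplit(".", 1)[-1]
        severity = finding.get("extra", {}).get("severity", "")
        if severity in severities and rule not in skip_rules:
            blocked.append(finding)
    return blocked


def gate(path):
    """Print blocking findings and return a process exit code (0 or 1)."""
    with open(path, encoding="utf-8") as fh:
        report = json.load(fh)
    blocked = blocking_findings(report)
    for finding in blocked:
        line = finding.get("start", {}).get("line", "?")
        print(f"{finding.get('path', '?')}:{line}: {finding.get('check_id', '?')}")
    return 1 if blocked else 0
```

A CI step could run the scan with JSON output and then call sys.exit(gate("semgrep-output.json")), so that only findings matching the policy fail the build while the rest stay visible in the report.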