Developing RAG to address LLM hallucinations in SOCs
This project implements a Retrieval-Augmented Generation (RAG) system designed to validate the use of Common Vulnerabilities and Exposures (CVEs) in security reports. The system leverages Meta's Llama 3 to ensure that CVEs are accurately referenced and used in context within threat intelligence reports.
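As a rough illustration of the idea, the snippet below sketches how Llama 3 could be prompted to check a CVE reference against the surrounding report text. The checkpoint name, prompt wording, and the `validate_cve_usage` helper are assumptions for illustration, not the project's actual code.

```python
# A minimal sketch of the contextual-validation step, assuming access to a
# Llama 3 instruct checkpoint via the Hugging Face Hub.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed checkpoint name
    device_map="auto",
)

def validate_cve_usage(cve_id: str, cve_description: str, report_excerpt: str) -> str:
    """Ask the model whether the report uses the CVE consistently with its description."""
    messages = [
        {"role": "system",
         "content": "You are a SOC analyst assistant. Answer VALID or INVALID with a short justification."},
        {"role": "user",
         "content": f"CVE {cve_id} is described as: {cve_description}\n\n"
                    f"Report excerpt: {report_excerpt}\n\n"
                    "Is the CVE used in the correct context?"},
    ]
    out = generator(messages, max_new_tokens=128)
    # The pipeline appends the assistant turn to the conversation; return its text.
    return out[0]["generated_text"][-1]["content"]
```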
This document explains how to run the application, how to update the database, and how the program works. The system provides the following features:
- CVE Retrieval: Retrieves relevant CVE information from a database or API.
- Contextual Validation: Validates that CVEs are used correctly in the context of security reports.
- Report Integration: Integrates with security reports to provide feedback on CVE usage.
- CVE Recommendation: Suggests a similar CVE when a referenced CVE does not exist in the database (see the sketch after this list).
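The sketch below shows one way the retrieval and recommendation features could work together: look up the cited CVE in a local store, and if it is missing, fall back to semantic similarity over CVE descriptions. The DataFrame columns, embedding model, and sample records are illustrative assumptions, not the project's actual schema.

```python
# A minimal sketch of CVE retrieval and recommendation, assuming CVE records
# are kept in a pandas DataFrame with "cve_id" and "description" columns.
import pandas as pd
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
cve_db = pd.DataFrame({
    "cve_id": ["CVE-2021-44228", "CVE-2017-0144"],
    "description": [
        "Remote code execution in Apache Log4j2 via JNDI lookup (Log4Shell).",
        "SMBv1 remote code execution exploited by EternalBlue.",
    ],
})
cve_embeddings = embedder.encode(cve_db["description"].tolist(), convert_to_tensor=True)

def retrieve_or_recommend(cve_id: str, report_excerpt: str, top_k: int = 1) -> dict:
    """Return the stored CVE entry, or recommend the most similar one if the ID is unknown."""
    match = cve_db[cve_db["cve_id"] == cve_id]
    if not match.empty:
        return match.iloc[0].to_dict()
    # Unknown CVE: fall back to semantic similarity against the report excerpt.
    query = embedder.encode(report_excerpt, convert_to_tensor=True)
    hits = util.semantic_search(query, cve_embeddings, top_k=top_k)[0]
    return cve_db.iloc[hits[0]["corpus_id"]].to_dict()
```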
Running the project requires Python 3.10 and the following packages; a model-loading sketch using several of them follows this list:
- accelerate
- bitsandbytes
- langchain
- sentence-transformers
- transformers
- tqdm
- torch (PyTorch)
- tokenizers
- torchaudio
- torchvision
- huggingface_hub
- pandas
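The snippet below sketches why accelerate and bitsandbytes appear in the list: they allow loading a Llama 3 checkpoint in 4-bit precision on modest GPU hardware. The checkpoint name is an assumption; substitute whichever model you have access to.

```python
# A minimal sketch of quantized model loading, assuming a CUDA-capable GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                        # bitsandbytes 4-bit weights
    bnb_4bit_compute_dtype=torch.bfloat16,    # compute in bfloat16
)

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # accelerate handles device placement
)
```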
For a more interactive learning experience, check out our YouTube videos: