This repository contains a simple Bash script that crawls a user-provided webpage, extracts important points (headings, paragraphs, and links), and identifies potential vulnerabilities such as insecure forms and links.
- Extracts:
  - Headings (`h1`, `h2`, `h3`, etc.)
  - Paragraphs
  - Links
- Checks for vulnerabilities:
  - Insecure forms (`http` action URLs)
  - External scripts loaded from different domains
  - Absence of a Content Security Policy (CSP)
  - Links using `http` instead of `https`
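For illustration, here is a minimal sketch of the kind of extraction and checks described above. It is not the actual `crawler.sh`; the function names and regular expressions are assumptions, and the real script may use different patterns.

```shell
#!/usr/bin/env bash
# Illustrative sketch only -- not the actual crawler.sh. Function names and
# regular expressions are assumptions; the real script may differ.

# Print the text of h1-h3 headings found in HTML read from stdin.
extract_headings() {
  grep -oE '<h[1-3][^>]*>[^<]*</h[1-3]>' | sed -E 's/<[^>]+>//g'
}

# Print every href target found in HTML read from stdin.
extract_links() {
  grep -oE 'href="[^"]+"' | sed -E 's/^href="//; s/"$//'
}

# Warn about plain-http links in HTML read from stdin.
check_insecure_links() {
  extract_links | grep '^http://' | sed 's/^/WARNING: insecure link: /'
}

# Warn if the response headers read from stdin lack a CSP header.
check_csp() {
  grep -qi '^content-security-policy:' || echo 'WARNING: no Content-Security-Policy header'
}

# Example usage against inline HTML (a real run would pipe in
# `curl -fsSL "$url"` for the body and `curl -fsSI "$url"` for headers):
html='<h1>Title</h1><a href="http://example.com/a">a</a>'
printf '%s\n' "$html" | extract_headings
printf '%s\n' "$html" | check_insecure_links
```

Regex-based HTML parsing like this is fragile on real-world pages (multi-line tags, nested markup), which is a known limitation of shell-based crawlers.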
Before running the script, ensure your system meets the following requirements:
- Bash: Installed by default on Linux/macOS. For Windows, install Git Bash or use WSL (Windows Subsystem for Linux).
- Utilities: `curl`, `grep`, `sed`, and other standard Unix utilities are required. These come pre-installed on most Unix-like systems.
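As a quick sanity check, a short snippet like the following can verify that the required utilities are on your `PATH` before you run the crawler (`check_tools` is a hypothetical helper, not part of this repository):

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): verify required tools exist.
check_tools() {
  local status=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool" >&2
      status=1
    fi
  done
  return "$status"
}

check_tools bash curl grep sed \
  || echo 'Install the missing tools before running crawler.sh.' >&2
```

`command -v` is the portable way to test for a tool's presence; it avoids the pitfalls of parsing `which` output.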
Clone the repository, make the script executable, and run it:

```bash
git clone https://github.com/avidzcheetah/bash-web-crawler.git
cd bash-web-crawler
chmod +x crawler.sh
./crawler.sh
```
When prompted, enter the URL of the webpage you want to analyze.
When you run the script, it will:
- Extract headings, paragraphs, and links from the webpage.
- Display potential vulnerabilities, such as insecure forms or links.
Below is a screenshot of the tool's output:
Contributions are welcome! If you:
- Encounter any issues
- Have suggestions or feature requests
Feel free to open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for more details.
Developed by Avidu Witharana (Avidz).
For inquiries or suggestions, feel free to contact me at [email protected].