ariantron/WebSearchEngine

WebSearchEngine

A final project that designs and implements a web search engine in Java, PHP (Laravel), and MySQL. It includes components for crawling, indexing, ranking, and searching web pages.

Features

  • Web Crawler: Fetches web pages and their links for processing.
  • Indexer: Extracts, analyzes, and indexes content from crawled pages.
  • Ranker: Implements algorithms to rank web pages based on relevance.
  • Search Module: Provides a search interface to query indexed data.

Technologies Used

Component        | Programming Language | Framework/Library | Database
Web Crawler      | Java                 | Jsoup             | MySQL
Indexer          | Java                 | Custom Logic      | MySQL
Ranker           | Java                 | Custom Logic      | MySQL
Search Interface | PHP                  | Laravel           | MySQL

Project Structure

Web Crawler

  • Fetches web pages recursively and extracts links using Jsoup.
  • Stores discovered links and their relationships in the database.
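
The crawl loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class `CrawlSketch`, the `Link` record, and the `fetchLinks` parameter are all hypothetical names, and the sketch uses an explicit queue (breadth-first) rather than recursion. The page download is passed in as a function so the example runs without network access; with Jsoup it would presumably be something like `Jsoup.connect(url).get().select("a[href]")`, reading each element's `absUrl("href")`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Breadth-first crawl sketch; every name here is hypothetical. */
final class CrawlSketch {

    /** A directed link between two pages, i.e. one row of a links table. */
    record Link(String from, String to) {}

    /**
     * Visits at most maxPages pages, starting from seed.
     * fetchLinks stands in for the real download-and-parse step (assumed to be Jsoup).
     */
    static List<Link> crawl(String seed, int maxPages,
                            Function<String, List<String>> fetchLinks) {
        Set<String> seen = new LinkedHashSet<>();   // URLs already queued
        Deque<String> frontier = new ArrayDeque<>(); // URLs waiting to be fetched
        List<Link> links = new ArrayList<>();        // discovered relationships
        seen.add(seed);
        frontier.add(seed);
        int visited = 0;
        while (!frontier.isEmpty() && visited < maxPages) {
            String url = frontier.poll();
            visited++;
            for (String target : fetchLinks.apply(url)) {
                links.add(new Link(url, target));
                // Queue each target only the first time it is seen.
                if (seen.add(target)) {
                    frontier.add(target);
                }
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // A three-page in-memory "web" used instead of real HTTP requests.
        Map<String, List<String>> web = Map.of(
            "a", List.of("b", "c"),
            "b", List.of("a", "c"),
            "c", List.of());
        crawl("a", 10, url -> web.getOrDefault(url, List.of()))
            .forEach(System.out::println);
    }
}
```

The returned `Link` list corresponds to the "links and their relationships" that the crawler stores; writing it to MySQL is left out of the sketch.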

Indexer

  • Processes HTML content to extract keywords.
  • Removes stop words and applies stemming for indexing.
  • Stores indexed data in the MySQL database.
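
As a rough illustration of the indexing steps above, the sketch below tokenizes text, drops stop words, applies a deliberately naive suffix-stripping stemmer, and builds an inverted index. The class `IndexSketch`, its stop-word list, and the stemming rules are assumptions made for this example; the project's own "custom logic" may differ. The input is assumed to be plain text already extracted from HTML (with Jsoup, for instance, via `Jsoup.parse(html).text()`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Indexing sketch; names, stop words, and stemming rules are hypothetical. */
final class IndexSketch {

    // Tiny illustrative stop-word list; a real one is much longer.
    static final Set<String> STOP_WORDS =
        Set.of("a", "an", "and", "the", "of", "to", "in", "is", "for");

    /** Naive suffix stripping, standing in for a real stemmer such as Porter's. */
    static String stem(String word) {
        for (String suffix : List.of("ing", "ed", "s")) {
            // Keep at least three characters so short words are left alone.
            if (word.endsWith(suffix) && word.length() - suffix.length() >= 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    /** Lower-cases, splits on non-alphanumerics, drops stop words, stems. */
    static List<String> terms(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                out.add(stem(token));
            }
        }
        return out;
    }

    /** Inverted index: term -> ids of the pages that contain it. */
    static Map<String, Set<Integer>> index(Map<Integer, String> pages) {
        Map<String, Set<Integer>> inverted = new TreeMap<>();
        pages.forEach((id, text) -> {
            for (String term : terms(text)) {
                inverted.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        });
        return inverted;
    }

    public static void main(String[] args) {
        System.out.println(index(Map.of(
            1, "Crawling the crawled pages",
            2, "Indexed pages")));
    }
}
```

Each entry of the resulting map corresponds to the kind of (keyword, page) row that the indexer would store in MySQL.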

Ranker

  • Calculates the relevance of pages based on indexed keywords and other metrics.
  • Stores the ranking results in the database for efficient retrieval.
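
The README does not name the ranking algorithm, so the following is only one plausible reading of "relevance based on indexed keywords": a TF-IDF score, where a term counts for more the more often it appears in a page and the fewer pages contain it. The class `RankSketch` and its method names are hypothetical, and pages are assumed to be already reduced to lists of index terms.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** TF-IDF ranking sketch; an assumed technique, not the project's confirmed one. */
final class RankSketch {

    /** Sum over query terms of tf(term, page) * ln(N / df(term)). */
    static double score(List<String> query, List<String> page,
                        List<List<String>> corpus) {
        double total = 0.0;
        for (String term : query) {
            // Term frequency: occurrences of the term in this page.
            long tf = page.stream().filter(term::equals).count();
            // Document frequency: number of pages that contain the term.
            long df = corpus.stream().filter(doc -> doc.contains(term)).count();
            if (tf > 0 && df > 0) {
                total += tf * Math.log((double) corpus.size() / df);
            }
        }
        return total;
    }

    /** Returns page positions in the corpus, most relevant first. */
    static List<Integer> rank(List<String> query, List<List<String>> corpus) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < corpus.size(); i++) {
            order.add(i);
        }
        order.sort(Comparator
            .comparingDouble((Integer i) -> score(query, corpus.get(i), corpus))
            .reversed());
        return order;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = List.of(
            List.of("search", "engine", "search"),
            List.of("web", "crawl"),
            List.of("search", "web"));
        System.out.println(rank(List.of("search"), corpus));
    }
}
```

Since the ranker stores its results in the database, per-term weights like these could be computed once after indexing rather than at query time; the "other metrics" the README mentions are not covered here.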

Search Module

  • Laravel-based front-end for querying the indexed data.
  • Displays ranked search results to the user.

Installation and Setup

  1. Clone the repository:

    git clone https://github.com/ariantron/WebSearchEngine.git
    cd WebSearchEngine
  2. Set up the database:

    • Install MySQL and create a database named search_engine.
    • Import the schema provided in the project.
  3. Configure the database connection:

    • Update the connection settings in Java and Laravel projects.
  4. Build and run the components:

    • Web Crawler: Compile and execute the Java classes under the Crawler package.
    • Indexer: Run the Indexer package for indexing web pages.
    • Search Interface: Start the Laravel application for querying and displaying results.
  5. Start the Laravel development server:

    php artisan serve
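
For step 3, the Java components need to know where the search_engine database lives. The sketch below shows one way the connection settings might look, assuming the Java code talks to MySQL over JDBC with MySQL Connector/J on the classpath; the README does not confirm this. The class `DbConfigSketch`, its methods, and the host and port values are hypothetical; only the database name comes from step 2.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Connection-settings sketch; assumes JDBC with MySQL Connector/J. */
final class DbConfigSketch {

    /** Builds a MySQL JDBC URL of the form jdbc:mysql://host:port/database. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    /**
     * Opens a connection to a local search_engine database.
     * Fails with SQLException if no MySQL driver or server is available.
     */
    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(
            jdbcUrl("localhost", 3306, "search_engine"), user, password);
    }

    public static void main(String[] args) {
        // Prints the URL only; no connection is attempted here.
        System.out.println(jdbcUrl("localhost", 3306, "search_engine"));
    }
}
```

On the Laravel side, the same host, port, database name, and credentials are normally set through the `DB_*` entries of the application's `.env` file.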

License

This project is licensed under the MIT License. See the LICENSE file for details.
