An AI-powered tool that finds and validates LinkedIn profiles matching a persona, using a hybrid scoring approach.
- Generate a persona from social media profiles (GitHub, Twitter)
- Enrich persona data with People Data Labs API for comprehensive professional details
- Further enhance persona with AI using Google's Gemini model
- Generate optimized LinkedIn search queries based on persona details
- Find and score potential LinkedIn profile matches
- Validate matches with image similarity using CLIP
- Complete profile scoring with weighted matching (name, semantic, location, etc.)
- Streamlit UI for easy interaction
- Persona Creation: Start with basic information about the person you're looking for (name, social media profiles, professional details)
- Profile Enrichment: Scrape social profiles to gather more details about the person
- Professional Data Enrichment: Use People Data Labs API to fill in professional details, skills, and background
- AI Enhancement: Use Gemini AI to infer additional professional details, skills, and other attributes
- Search Query Generation: Create optimized search queries for LinkedIn
- LinkedIn Search: Find potential matching profiles using SerpAPI
- Profile Scoring: Score each candidate with a hybrid approach:
  - Name similarity using fuzzy matching
  - Semantic similarity of professional descriptions using BERT
  - Industry and location matching
  - Social profile validation
  - Image similarity using CLIP (optional)
- Final Ranking: Combine all scores for a final confidence score
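
The search query generation step above can be illustrated with a small sketch. It is not the project's actual code from core/search.py: the function name `build_linkedin_queries`, its parameters, and the `site:linkedin.com/in` query pattern are all assumptions made for illustration. It only builds query strings and does not call SerpAPI.

```python
from typing import List, Optional


def build_linkedin_queries(
    name: str,
    company: Optional[str] = None,
    location: Optional[str] = None,
) -> List[str]:
    """Return search queries restricted to LinkedIn profile pages, most specific first.

    Hypothetical helper: the query format is an assumption, not the project's own.
    """
    base = f'site:linkedin.com/in "{name.strip()}"'
    # Keep only the optional attributes that were actually provided.
    extras = [f'"{v.strip()}"' for v in (company, location) if v and v.strip()]
    queries = []
    # Start with every known attribute, then drop one at a time from the right
    # so that broader fallback queries come last.
    for n in range(len(extras), -1, -1):
        queries.append(" ".join([base] + extras[:n]))
    return queries


if __name__ == "__main__":
    for query in build_linkedin_queries(
        "Ada Lovelace", company="Analytical Engines", location="London"
    ):
        print(query)
```

Ordering the queries from narrow to broad lets a caller stop at the first query that returns enough candidates, which may help conserve search API quota.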
- Python 3.8+
- Required API keys:
  - SERPAPI_API_KEY: For LinkedIn searches
  - PEOPLE_API_KEY: For professional data enrichment (People Data Labs)
  - GEMINI_API_KEY: For AI enrichment (Google Gemini)
  - SCRAPINGDOG_API_KEY: For LinkedIn profile image extraction (optional)
  - TWITTER_BEARER_TOKEN: For Twitter profile scraping (optional)
- Clone the repository
git clone https://github.com/yourusername/linkedin-profile-finder.git
cd linkedin-profile-finder
- Install dependencies
pip install -r requirements.txt
- Create a .env file with your API keys
SERPAPI_API_KEY=your_serpapi_api_key_here
PEOPLE_API_KEY=your_people_data_labs_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
SCRAPINGDOG_API_KEY=your_scrapingdog_api_key_here
TWITTER_BEARER_TOKEN=your_twitter_bearer_token_here
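
Before launching the app, it can be useful to confirm that the keys are visible to the process. The following is a minimal sketch using only the standard library. The helper `check_api_keys` is hypothetical and not part of the project, and the sketch assumes the variables have already been exported into the environment (the project itself may load the .env file differently).

```python
import os
from typing import List, Mapping, Tuple

# Key names come from the list of API keys above; the last two are marked optional there.
REQUIRED_KEYS = ("SERPAPI_API_KEY", "PEOPLE_API_KEY", "GEMINI_API_KEY")
OPTIONAL_KEYS = ("SCRAPINGDOG_API_KEY", "TWITTER_BEARER_TOKEN")


def check_api_keys(env: Mapping[str, str]) -> Tuple[List[str], List[str]]:
    """Return (missing required keys, missing optional keys) for the given environment.

    Hypothetical helper for illustration; empty values count as missing.
    """
    missing_required = [key for key in REQUIRED_KEYS if not env.get(key)]
    missing_optional = [key for key in OPTIONAL_KEYS if not env.get(key)]
    return missing_required, missing_optional


if __name__ == "__main__":
    required, optional = check_api_keys(os.environ)
    if required:
        print("Missing required keys:", ", ".join(required))
    if optional:
        print("Missing optional keys (related features may be skipped):", ", ".join(optional))
    if not required and not optional:
        print("All API keys are set.")
```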
Launch the Streamlit UI:
streamlit run app.py
Or run the command-line demo:
python main.py

The profile scoring system uses a hybrid approach that considers:
| Score Type | Weight | Description |
|---|---|---|
| Name Score | 35% | Fuzzy matching of names, accounting for variations |
| Semantic Score | 25% | BERT-based similarity of professional introductions |
| Industry Score | 10% | Matching of industry and professional domain |
| Location Score | 15% | Geographic proximity and timezone alignment |
| Social Score | 10% | Validation through social media profiles |
| Image Score | 5% | Visual similarity of profile photos (using CLIP) |
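
As a rough illustration of how the weights in the table could combine into one confidence score, here is a minimal sketch. The weights are taken from the table; everything else is an assumption, including the function name `combine_scores`, the expectation that each component score lies in [0, 1], and the choice to redistribute the weight of any missing component (such as the optional image score) across the remaining ones. The project's own logic in core/profile_scoring.py may differ.

```python
from typing import Dict, Optional

# Weights from the table above (they sum to 1.0).
WEIGHTS = {
    "name": 0.35,
    "semantic": 0.25,
    "industry": 0.10,
    "location": 0.15,
    "social": 0.10,
    "image": 0.05,
}


def combine_scores(scores: Dict[str, Optional[float]]) -> float:
    """Weighted average of the component scores that are present.

    Assumption: a component that is absent or None (for example, no profile
    photo to compare) is left out and the remaining weights are renormalized.
    """
    used = {key: weight for key, weight in WEIGHTS.items() if scores.get(key) is not None}
    total_weight = sum(used.values())
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(scores[key] * weight for key, weight in used.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    candidate = {
        "name": 0.9,
        "semantic": 0.7,
        "industry": 1.0,
        "location": 0.5,
        "social": 0.0,
        "image": None,  # image comparison skipped for this candidate
    }
    print(f"Final confidence: {combine_scores(candidate):.3f}")
```

Renormalizing keeps scores comparable between candidates with and without a photo. Treating a missing component as zero is an equally simple alternative, but it would penalize candidates for data that could not be checked.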
- Recruiting: Find potential candidates matching a specific profile
- Business Development: Locate decision-makers at target companies
- Research: Find professionals in specific domains
- Networking: Locate colleagues or contacts with limited information
- main.py: Entry point for command-line usage
- app.py: Streamlit web app
- core/profile_scraper.py: Social media profile scraping
- api/people_api.py: Professional data enrichment with People Data Labs
- api/gemini_api.py: AI enrichment with Gemini
- core/image_similarity.py: Handles lightweight perceptual hash-based image comparison
- core/name_utils.py: Name parsing and expansion
- core/search.py: LinkedIn search query generation
- core/profile_scoring.py: Candidate matching and scoring functions
This project is licensed under the MIT License - see the LICENSE file for details.
This tool was developed as a learning project and is intended for educational and research purposes only. Always respect LinkedIn's terms of service and privacy policies when using it. The creators are not responsible for any misuse of this tool.