Aspiring Data Scientist | Machine Learning Enthusiast | Cloud Practitioner
I completed my B.Tech in Computer Science and Engineering at KL University, maintaining a CGPA of 9.2/10.
Passionate about building data-driven solutions and transforming raw data into actionable insights.
I love working on projects involving data engineering, analytics, and cloud technologies that bring efficiency and scalability.
Constantly learning new tools and techniques in data pipelines, cloud infrastructure, and applied ML.
| Category | Skills / Tools |
|---|---|
| Programming | Python, Java |
| Data Manipulation | Pandas, NumPy |
| Databases & Querying | MySQL, SQL |
| Data Visualization | Power BI, Tableau, Matplotlib, Seaborn |
| Machine Learning | Supervised Learning, Unsupervised Learning |
| Statistical Analysis | Descriptive Statistics, Hypothesis Testing, Predictive Analytics |
| Cloud Technologies | AWS (Lambda, S3, CloudFront, IAM) |
| Tools & Platforms | Excel, Jupyter Notebook, Git, VS Code |
Built a complete data analysis workflow on 50K+ loan records to identify key drivers of loan default risk.
- Engineered a data integrity pipeline using Pandas and NumPy, improving data accuracy by 99%.
- Conducted EDA to uncover 3 key demographic factors affecting default risk.
- Created interactive Power BI dashboards and 8+ static visualizations with Matplotlib and Seaborn.
Tools: Pandas, NumPy, Matplotlib, Seaborn, Power BI, Jupyter Notebook
Built an end-to-end Autism dataset preprocessing pipeline (cleaning, encoding, scaling, leakage removal).
- Developed and evaluated a Decision Tree classifier with clear, interpretable performance metrics.
- Improved model accuracy to 83.5% through enhanced preprocessing and Decision Tree tuning, then deployed the model with Flask for real-time predictions.
- Implemented a modular, reusable ML pipeline with train-test consistency, model persistence (pickle), and input validation to support reliable real-time inference.
Tools: Python, Machine Learning, NumPy, Pandas, Flask, Scikit-learn, Jupyter Notebook
Designed and deployed a serverless architecture using AWS for large-scale media processing.
- Reduced processing costs by 40% and scaled to handle 200K+ files monthly.
- Integrated AWS CloudFront, S3, and Lambda with a React frontend, improving file retrieval speeds by 60%.
Tools: AWS (Lambda, CloudFront, S3, IAM), HTML, CSS, JavaScript
- SAS Statistical Business Analyst (Coursera): Regression, Predictive Analytics, and Hypothesis Testing
- AWS Cloud Practitioner: Cloud architecture, scalability, and cost optimization
- Email: [email protected]
- LinkedIn: linkedin.com/in/hemanth-polineni
- GitHub: github.com/Hemanthpolineni
- Portfolio: polineni-hemanth.lovable.app

"Data is the new oil; I strive to refine it into intelligence that drives innovation."