hrs19/README.md

👋 Hi, I'm Harshit!

I’m a Machine Learning Engineer and Data Scientist, currently pursuing my Master’s in Computer Engineering with a Machine Learning concentration at Northeastern University, graduating in December 2024. I am passionate about solving challenging problems and building impactful ML applications.

🚀 About Me

  • 🎓 Master's Student in Computer Engineering (ML Concentration) at Northeastern University.
  • 👨‍💻 Previously worked as a Data Scientist at Kemper Insurance, where I improved ML model accuracy and engineered data pipelines that drove significant improvements in customer targeting and product adoption.
  • 🤖 Former Machine Learning Engineer at Accenture, applying ML models to solve real-world problems in energy forecasting, route optimization, and operational efficiency.
  • ๐Ÿ” I have strong skills in machine learning, deep learning, generative AI, LLM fine-tuning, retrieval-augmented generation (RAG), data engineering, data visualization, and MLOps, using tools like Python, TensorFlow, PyTorch, and cloud services like AWS, Azure, and GCP.

🔧 Technologies & Skills

  • Languages: Python, R, C++, SQL, Scala, MATLAB
  • ML & Data Science: TensorFlow, PyTorch, Scikit-learn, XGBoost, Pandas, NumPy
  • Data Engineering: Apache Spark, Hadoop, Kafka, Airflow, Snowflake, ETL Pipelines
  • Cloud & Tools: AWS (SageMaker, EC2, Lambda), GCP, Azure, Docker, Terraform, Jenkins, Git
  • Data Visualization: Tableau, Power BI, Plotly, Matplotlib, Seaborn

📈 What I’ve Been Working On

Fine-tuned BART and FLAN-T5 models using parameter-efficient techniques like LoRA to improve meeting summarization accuracy. Implemented tokenization strategies for informal content, improving summarization performance.

Built an interactive data visualization app using Streamlit to explore Airbnb listings across U.S. cities. It features interactive maps, word clouds, calendar heatmaps, and more, providing a detailed neighborhood-level analysis.

Used PySpark to develop a distributed ML pipeline, reducing training time for credit card fraud detection by 8x on large datasets.

Developed an image captioning model using an encoder-decoder architecture with ResNet50 and LSTM. Achieved significant accuracy improvement using attention mechanisms.

📫 Connect With Me!

Pinned

  1. airbnb-data-visualization Public

    Streamlit-based data visualization app for exploring Airbnb listings. Features interactive maps, word clouds, charts, and filters to provide insightful analytics for travelers and market analysts.

    Python

  2. ComputerVision Public

    Jupyter Notebook

  3. Dialogue-Summarization Public

    Fine-tuning and evaluating BART and FLAN-T5 models for dialogue summarization using the SAMSum dataset. Includes Jupyter notebooks for training and a Streamlit app for summarization demonstrations.

    Jupyter Notebook

  4. distributed-ml-credit-fraud-pyspark Public

    A scalable machine learning pipeline for credit card fraud detection using PySpark, featuring distributed implementations of Logistic Regression, Support Vector Machines and Random Forests to signi…

    Python

  5. Image2Text-CaptionGen Public

    This project implements an image captioning model using an encoder-decoder architecture. ResNet50 is utilized as the CNN encoder, and an LSTM decoder with attention mechanisms generates text captio…