- Text preprocessing is one of the most important steps when working with text.
- Spending more time on cleaning the data leads to better results.
- I have created a module to preprocess text.
- Link to the module: Text preprocessing
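As a rough illustration of what a text preprocessing step might involve, here is a minimal sketch that uses only the standard library. It is not the linked module: the function name `preprocess_text` and the tiny `STOP_WORDS` set are assumptions made for this example.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in"}


def preprocess_text(text):
    """Lowercase, strip URLs and punctuation, and drop stop words."""
    text = text.lower()
    # Remove anything that looks like a URL.
    text = re.sub(r"https?://\S+", " ", text)
    # Replace every punctuation character with a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    # Split on whitespace and filter out stop words.
    return [tok for tok in text.split() if tok not in STOP_WORDS]


print(preprocess_text("The cleaning of text is important! See https://example.com"))
# -> ['cleaning', 'text', 'important', 'see']
```

A real pipeline would likely add steps such as stemming or lemmatization, which are left out here to keep the sketch dependency-free.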
- Pandas is one of the libraries most used by data scientists.
- This notebook covers some basic usage of Pandas.
- Link to the notebook: Pandas Part 1
- Source - Data analysis in Python with pandas
- Link to the official Pandas Documentation
- There are many tricks that make our lives easier and save us from writing long code. So, I explored the Pandas library thoroughly to save time while coding.
- Please feel free to explore the notebook.
- Link to the notebook: Pandas Part 2
- Source - Data analysis in Python with pandas
- Link to the official Pandas Documentation
- Participated in the ZS Challenge organised by Hacker Earth. It was a four-day challenge.
- Link to the Challenge
- Link to the Repository
- The Hackathon was awesome and the problem was challenging enough to test your skills.
- Secured a rank of 223 out of 4743 (top 5%).
- Participated in the Machine Learning Challenge - "Predict the damage to a building" organised by Hacker Earth.
- Link to the Challenge
- Link to the Repository
- It is a classification problem that involves a lot of preprocessing and joining of tables. The dataset is huge, which also pushes us to solve the challenge in an optimized way.
- I faced several problems, such as data balancing, duplicate data, and hyperparameter tuning. It was quite interesting.
- This project helped me a lot in learning different ways to handle huge data.
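Two of the problems mentioned above, duplicate rows and unbalanced classes, can be sketched in plain Python. This is only an illustration under stated assumptions, not the approach used in the repository: the toy rows and the `damage_grade` label are made up, and the weighting shown is simple inverse-frequency weighting, `n_samples / (n_classes * class_count)`.

```python
from collections import Counter


def drop_duplicates(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Made-up toy rows: (feature_1, feature_2, damage_grade)
rows = [(1, 0, "low"), (1, 0, "low"), (2, 1, "low"), (3, 1, "high")]
unique_rows = drop_duplicates(rows)
weights = balanced_class_weights([row[-1] for row in unique_rows])
print(unique_rows)  # -> [(1, 0, 'low'), (2, 1, 'low'), (3, 1, 'high')]
print(weights)      # -> {'low': 0.75, 'high': 1.5}
```

Rarer classes receive larger weights, so a model that accepts per-class weights pays more attention to them during training.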
- Participated in the American Express AI Challenge (Problem 2) - "Supervised Modeling with Emphasis on LAUC" organised by Hacker Earth.
- Link to the Challenge
- Link to the Repository
- It is a binary classification problem. It involves choosing the right algorithm for modelling.
- Data visualization using Seaborn
- Link to the Notebook
- Link to the official Seaborn Documentation
- Source - Python for Data Visualization - using Seaborn
- This notebook contains different plots explored on the sample datasets provided by the Seaborn library.
- Resumed my work on Machine Learning Challenge - "Predict the damage to a building" organised by Hacker Earth.
- Link to the Challenge
- Link to the Repository
- Increased my score from 0.68571 to 0.72056 after spending a lot of time tuning hyperparameters.
- A Complete Tutorial on Tree Based Modeling
- This tutorial covers topics such as:
- What is a Decision Tree? How does it work?
- Regression Trees vs Classification Trees
- How does a tree decide where to split?
- What are the key parameters of model building and how can we avoid over-fitting in decision trees?
- Are tree based models better than linear models?
- Working with Decision Trees in R and Python
- What are the ensemble methods in tree-based modeling?
- What is Bagging? How does it work?
- What is Random Forest? How does it work?
- What is Boosting? How does it work?
- Which is more powerful: GBM or Xgboost?
- Working with GBM in R and Python
- Working with Xgboost in R and Python
- Where to Practice?
- The tutorial is very intuitive and anyone can understand the concepts easily. It is well worth the time to read.
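To make the question "How does a tree decide where to split?" concrete, here is a minimal sketch of one common criterion, Gini impurity, applied to a single numeric feature. It is not taken from the tutorial: the function names and the toy data are assumptions, and real libraries use more efficient and more general implementations.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Find the threshold on one numeric feature with the lowest weighted Gini."""
    best_threshold, best_score = None, float("inf")
    n = len(values)
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        if not left or not right:
            continue  # a split must send samples to both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy data, made up for illustration.
ages = [22, 25, 30, 35, 40, 45]
bought = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(ages, bought))  # -> (30, 0.0)
```

A weighted impurity of 0.0 means both child nodes are pure: every sample with `age <= 30` is "no" and every other sample is "yes".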
- Resumed my work on Machine Learning Challenge - "Predict the damage to a building" organised by Hacker Earth.
- Link to the Challenge
- Link to the Repository
- Increased my score from 0.72056 to 0.72498 after spending a lot of time tuning hyperparameters.
- Resumed my work on American Express AI Challenge (Problem 2) - "Supervised Modeling with Emphasis on LAUC" organised by Hacker Earth.
- Link to the Challenge
- Link to the Repository
- Increased my score from 0.96655 to 0.966556 after spending a lot of time on feature extraction and hyperparameter tuning.
- Successfully completed the challenge - American Express AI Challenge (Problem 2) - "Supervised Modeling with Emphasis on LAUC" organised by Hacker Earth.
- Link to the Challenge
- Link to the Repository
- Rank: 36 out of 1354 participants
- Successfully completed the Machine Learning Challenge - "Predict the damage to a building" organised by Hacker Earth.
- Link to the Challenge
- Link to the Repository
- Rank: 242 out of 7540 participants.
- Participated in the Machine Hack Challenge - "Predict house prices in Bangalore".
- Link to the Challenge
- Link to the Repository
- It is a regression problem. It involves a lot of preprocessing and handling of missing values. It's really fun, though.
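One simple way to handle missing values in a numeric column is median imputation. The sketch below is an illustration only, not the method used in the repository; the helper `impute_median` and the `total_sqft` column with its values are hypothetical.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column of house areas in square feet, with gaps.
total_sqft = [1200, None, 950, 1500, None]
print(impute_median(total_sqft))  # -> [1200, 1200, 950, 1500, 1200]
```

The median is often preferred over the mean for skewed columns such as prices or areas, because a few extreme values do not pull it far from the typical value.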
- Started reading about Time series analysis from scratch.
- Textbook: Time Series Analysis: Forecasting and Control by Box and Jenkins.
- Completed reading the first 2 chapters, which introduce time series analysis and the types of methods used.
- Resumed my work on the Machine Hack Challenge - "Predict house prices in Bangalore".
- Link to the Challenge
- Link to the Repository
- Read the third chapter of the textbook: Time Series Analysis: Forecasting and Control by Box and Jenkins.
- Chapter 3: Linear stationary models (autoregressive, moving average, and mixed autoregressive-moving average). It also explains the conditions for stationarity and invertibility, and the autocorrelation and partial autocorrelation functions.
- This textbook explains the concepts mathematically in a very simple way.
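The ideas from Chapter 3 can be tried out numerically. The sketch below simulates a first-order autoregressive process, AR(1), and compares its sample autocorrelation function with the theoretical value `phi ** k` at lag `k`. It is a minimal illustration, not code from the book; the function names, the seed, and the choice `phi = 0.7` are assumptions.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate z_t = phi * z_{t-1} + a_t with standard normal shocks a_t."""
    if abs(phi) >= 1:
        raise ValueError("an AR(1) process is stationary only when |phi| < 1")
    rng = random.Random(seed)
    series, z = [], 0.0
    for _ in range(n):
        z = phi * z + rng.gauss(0.0, 1.0)
        series.append(z)
    return series


def sample_acf(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series) / n
    acf = []
    for k in range(max_lag + 1):
        ck = sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


series = simulate_ar1(phi=0.7, n=2000)
acf = sample_acf(series, 3)
for lag, r in enumerate(acf):
    print(lag, round(r, 3), "theory:", round(0.7 ** lag, 3))
```

With a few thousand observations the sample values should sit close to the geometric decay `0.7 ** k`, which is the signature of an AR(1) process in the autocorrelation function.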
- Working on different time series datasets.
- Applying different stochastic models and comparing them with the latest models.