Welcome to the Machine Learning Course Repository! This repository contains code and resources related to various topics in machine learning covered in the course. Below you will find a description of each topic, along with the relevant code and examples.
This repository contains a collection of machine learning algorithms, techniques, and examples.
Supervised learning involves learning a function that maps an input to an output based on example input-output pairs. It includes tasks such as classification and regression.
Decision Trees are used for classification and regression tasks. They work by splitting the data into subsets based on the value of input features.
- Code: See the `DecisionTrees` directory for implementation details.
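As a quick illustration (this is a minimal sketch, not the code in the `DecisionTrees` directory), a decision tree classifier can be fit with scikit-learn on the built-in Iris dataset; the depth limit here is an arbitrary choice to keep the tree small:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small labeled dataset and split it into train/test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Limit tree depth to reduce overfitting (illustrative choice)
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))
```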
Feature Engineering involves transforming raw data into meaningful features that better represent the underlying problem to predictive models.
- Code: See the `FeatureEngineering-NumericalTransformations` directory for implementation details.
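For example, two common numerical transformations are a log transform and standardization. The snippet below is an illustrative sketch on toy data (not the directory's code):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy numerical feature with a skewed distribution
income = np.array([[20_000.0], [35_000.0], [50_000.0], [250_000.0]])

# log1p compresses the long tail; StandardScaler then rescales
# the feature to zero mean and unit variance
log_income = np.log1p(income)
scaled = StandardScaler().fit_transform(log_income)
print(scaled.ravel())
```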
Feature Selection is the process of selecting a subset of relevant features for use in model construction.
- Code: See the `FeatureSelection` directory for implementation details.
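A minimal sketch of univariate feature selection with scikit-learn (the choice of `k=2` and the ANOVA F-score are illustrative assumptions, not taken from the directory):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Keep the 2 features with the highest ANOVA F-score against the target
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)
print("Selected feature mask:", selector.get_support())
print("Reduced shape:", X_selected.shape)
```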
K-Nearest Neighbors is an instance-based learning algorithm used for classification and regression.
- Code: See the `K_nearestNeighbors` directory for implementation details.
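As a quick sketch (illustrative only, with `n_neighbors=5` chosen arbitrarily), a KNN classifier in scikit-learn looks like this:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Classify each point by a majority vote of its 5 nearest neighbors
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("Test accuracy:", knn.score(X_test, y_test))
```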
Linear Regression is a linear approach to modeling the relationship between a dependent variable and one or more independent variables.
- Code: See the `LinearRegression` directory for implementation details.
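A minimal sketch using scikit-learn's ordinary least squares implementation on the built-in diabetes dataset (not the directory's code):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit an ordinary least squares hyperplane to the training data
model = LinearRegression()
model.fit(X_train, y_train)
print("R^2 on test set:", model.score(X_test, y_test))
print("Coefficients:", model.coef_)
```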
Logistic Regression is used for binary classification problems and models the probability that a given input point belongs to a certain class.
- Code: See the `LogisticRegression` directory for implementation details.
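A minimal binary-classification sketch with scikit-learn (illustrative only; the breast cancer dataset and the raised `max_iter` are assumptions made here, not taken from the directory):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# max_iter is raised so the solver converges on this unscaled data
clf = LogisticRegression(max_iter=5000)
clf.fit(X_train, y_train)

# predict_proba returns the estimated probability of each class
print("P(class) for first test sample:", clf.predict_proba(X_test[:1]))
print("Test accuracy:", clf.score(X_test, y_test))
```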
Evaluation metrics for classification are quantitative measures used to assess the performance of a classification model. They provide insights into the accuracy, precision, recall, F1 score, and overall effectiveness of the model in correctly predicting class labels.
- Code: See the `EvaluationMetricsClassification` directory for implementation details.
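A short sketch of computing these metrics with scikit-learn; the labels and predictions below are made-up toy values for illustration:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

# Hypothetical true labels and model predictions for a binary problem
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
```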
Naive Bayes classifiers are simple probabilistic classifiers based on Bayes' theorem with strong independence assumptions between the features.
- Code: See the `NaiveBayes` directory for implementation details.
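A minimal Gaussian Naive Bayes sketch with scikit-learn (illustrative only; the Gaussian variant is just one choice of Naive Bayes model):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# GaussianNB assumes each feature is normally distributed within a class
# and that features are conditionally independent given the class
nb = GaussianNB()
nb.fit(X_train, y_train)
print("Test accuracy:", nb.score(X_test, y_test))
```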
Support Vector Machines are supervised learning models used for classification and regression by finding the hyperplane that best divides a dataset into classes.
- Code: See the `SupportVectorMachine` directory for implementation details.
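A minimal sketch of an SVM classifier in scikit-learn; the RBF kernel, `C=1.0`, and the scaling pipeline are illustrative assumptions, not the directory's setup:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# SVMs are sensitive to feature scale, so scaling is applied first;
# the RBF kernel allows a decision boundary that is non-linear in input space
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(X_train, y_train)
print("Test accuracy:", svm.score(X_test, y_test))
```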
Regularization adds a penalty on model complexity to prevent overfitting, while hyperparameter tuning searches for the hyperparameter values that give a model its best performance.
- Code: See the `Regularization` directory for implementation details.
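A sketch that combines both ideas (illustrative only): Ridge regression applies an L2 penalty, and `GridSearchCV` tunes the penalty strength `alpha` by cross-validation. The alpha grid is an arbitrary example.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = load_diabetes(return_X_y=True)

# Ridge adds an L2 penalty on the coefficients; alpha controls its strength.
# GridSearchCV tunes alpha with 5-fold cross-validation.
search = GridSearchCV(Ridge(), param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)
print("Best alpha:", search.best_params_)
print("Best CV R^2:", search.best_score_)
```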
Random Forests are an ensemble learning method that constructs multiple decision trees and merges them to get a more accurate and stable prediction.
- Code: See the `RandomForests` directory for implementation details.
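A minimal random forest sketch with scikit-learn (illustrative only; 100 trees is the library default and an arbitrary choice here):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 100 trees, each trained on a bootstrap sample with random feature subsets;
# their votes are combined into the final prediction
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
print("Test accuracy:", forest.score(X_test, y_test))
```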
Boosting is an ensemble technique that trains base estimators sequentially, with each new estimator focusing on the errors of the previous ones, combining their predictions into a model that is more accurate and robust than any single estimator.
- Code: See the `Boosting` directory for implementation details.
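A minimal gradient boosting sketch with scikit-learn (illustrative only; gradient boosting is one boosting variant, and the hyperparameters below are arbitrary examples):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Each new tree is fit to the residual errors of the current ensemble
booster = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
booster.fit(X_train, y_train)
print("Test accuracy:", booster.score(X_test, y_test))
```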
Unsupervised learning involves learning patterns from unlabeled data. It includes tasks such as clustering and dimensionality reduction.
K-Means Clustering is an unsupervised learning algorithm used to partition a dataset into K distinct, non-overlapping clusters.
- Code: See the `KmeansClustering` directory for implementation details.
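A minimal K-Means sketch with scikit-learn on synthetic data (illustrative only; `K=3` is an assumption matching how the toy data is generated):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D data with 3 underlying groups
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Partition the points into K=3 clusters by minimizing within-cluster variance
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
print("Cluster sizes:", [int((labels == k).sum()) for k in range(3)])
print("Cluster centers:\n", kmeans.cluster_centers_)
```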
Principal Component Analysis is a dimensionality reduction technique that projects data onto the directions of greatest variance, emphasizing variation and bringing out strong patterns in a dataset.
- Code: See the `Principal_component_analysis` directory for implementation details.
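A minimal PCA sketch with scikit-learn (illustrative only; standardizing first and keeping 2 components are assumptions made for this example):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Standardize, then project onto the 2 directions of greatest variance
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Reduced shape:", X_2d.shape)
```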
- Code: See the `Deep_Learning_TensorFlow` directory for implementation details.
- Code: See the `PandasPractice` directory for examples and exercises on using pandas for data manipulation and analysis.
- Code: See the `mlModelsScikitLearn` directory for various machine learning models implemented using scikit-learn.
- Code: See the `ClassificationMilestoneProject1` directory. This project builds an end-to-end machine learning pipeline for heart disease classification: the goal is to predict the presence of heart disease in patients based on various medical attributes, covering all the essential steps from data preprocessing to model evaluation and deployment.
- Uses Logistic Regression, K-Nearest Neighbors, and Random Forest classifiers.
- Explanations are inspired by the Complete Machine Learning and Data Science Zero to Mastery course on Udemy.
- Explanations are inspired by the Tensorflow 2.0: Deep Learning and Artificial Intelligence course on Udemy.
- Explanations are inspired by Codecademy.