A complete machine learning library built in PHP from the ground up, demonstrating fundamental concepts without external dependencies. This library implements all 12 articles of our comprehensive ML series - from basic linear algebra to production-ready neural networks!
This codebase proves that you can understand and implement complete, working machine learning systems in any language - even PHP! While Python dominates ML, working through the concepts in a familiar language builds deeper comprehension - and these networks really do learn and improve.
| Directory | Articles | Concepts Implemented |
|---|---|---|
| LinearAlgebra/ | Article 2 | Vectors, Matrices, Dot Products, Matrix Operations |
| Algorithms/ | Article 3 | Sorting (5 algorithms), Searching (Linear, Binary) |
| DataProcessing/ | Article 4 | Data Cleaning, Missing Values, Outlier Detection |
| NeuralNetwork/ | Articles 5-7 | Perceptrons, Neural Networks, Loss Functions, Backpropagation |
| Training/ | Articles 8, 10-11 | Gradient Descent, Hyperparameter Tuning, Complete Learning |
| Evaluation/ | Article 9 | Metrics, Data Splitting, Cross-Validation |
| Examples/ | Article 12 | Complete ML Pipeline |
```
ml-php/
├── src/
│   ├── LinearAlgebra/              # Article 2: Mathematical foundations
│   │   ├── Vector.php              # Vector operations and dot products
│   │   └── Matrix.php              # Matrix operations and transformations
│   ├── Algorithms/                 # Article 3: Algorithmic thinking
│   │   ├── Sorting.php             # 5 sorting algorithms with performance analysis
│   │   └── Searching.php           # Linear and Binary search
│   ├── DataProcessing/             # Article 4: The "car wash" for data
│   │   └── DataCleaner.php         # Missing values, outliers, validation
│   ├── NeuralNetwork/              # Articles 5-7: The core ML components
│   │   ├── Perceptron.php          # The "bouncer" - basic learning unit
│   │   ├── NeuralNetwork.php       # The "assembly line" with Backpropagation
│   │   └── LossFunctions.php       # The "teacher's red pen" - performance measurement
│   ├── Training/                   # Articles 8, 10-11: Complete Learning
│   │   ├── GradientDescent.php     # The "mountain hiking" optimizer
│   │   └── HyperparameterTuner.php # Automated parameter optimization
│   ├── Evaluation/                 # Article 9: Performance Measurement
│   │   ├── Metrics.php             # Accuracy, precision, recall, F1, AUC
│   │   └── DataSplitter.php        # Train/test splits, cross-validation
│   └── Examples/                   # Article 12: Full Demonstration
│       └── CompleteExample.php     # Full ML pipeline + production checks
├── comprehensive_demo.php          # COMPLETE 12-ARTICLE DEMONSTRATION
├── test_run.php                    # Quick component testing
├── tests/                          # Unit tests
├── data/                           # Sample datasets
└── README.md                       # This file
```
Run the full demo:

```bash
cd ml-php
php comprehensive_demo.php
```

This runs the complete demonstration showing all 12 articles:
- ✅ Linear algebra operations (Article 2)
- ✅ Sorting and searching algorithms (Article 3)
- ✅ Data cleaning and preprocessing (Article 4)
- ✅ Perceptron learning with logic gates (Article 5)
- ✅ Neural network forward propagation (Article 6)
- ✅ Loss function comparisons (Article 7)
- ✅ Gradient descent training (Article 8)
- ✅ Performance evaluation & metrics (Article 9)
- ✅ Hyperparameter tuning (Article 10)
- ✅ Backpropagation learning (Article 11)
- ✅ Complete ML pipeline (Article 12)
For a quick component test:

```bash
php test_run.php
```

**Linear Algebra (Article 2)** - the mathematical foundation of machine learning:

```php
use MLPyHP\LinearAlgebra\Vector;
use MLPyHP\LinearAlgebra\Matrix;
// Vector operations (like movie recommendations)
$userPrefs = new Vector([4, 2, 5, 1]); // Comedy, Action, Drama, Horror
$movie = new Vector([3, 1, 4, 0]);
$similarity = $userPrefs->dotProduct($movie); // How much you'd like this movie
// Matrix operations (data transformations)
$data = new Matrix([[25, 50000], [35, 75000]]); // Age, Income
$weights = new Matrix([[0.1], [0.001]]); // Feature weights
$scores = $data->multiply($weights); // Customer scores
```
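Vector.php itself isn't reproduced in this README, but the dot product behind that similarity score fits in a few lines. A standalone sketch (an illustrative function, not the actual Vector API):

```php
/**
 * Illustrative dot product: multiply matching components and sum the results.
 * For [4, 2, 5, 1] . [3, 1, 4, 0] that is 4*3 + 2*1 + 5*4 + 1*0 = 34.
 */
function dotProduct(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length.');
    }
    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += $value * $b[$i];
    }
    return $sum;
}

echo dotProduct([4, 2, 5, 1], [3, 1, 4, 0]); // 34
```

The higher the sum, the more the two vectors point in the same direction - which is exactly why it works as a similarity score.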
**Algorithms (Article 3)** - understanding efficiency and trade-offs:

```php
use MLPyHP\Algorithms\Sorting;
use MLPyHP\Algorithms\Searching;
// Compare different sorting approaches
$data = [64, 34, 25, 12, 22, 11, 90];
$bubbleSorted = Sorting::bubbleSort($data); // Simple but slow O(nΒ²)
$quickSorted = Sorting::quickSort($data); // Fast O(n log n)
// Search efficiently
$index = Searching::binarySearch($bubbleSorted, 25); // O(log n) - much faster!
```
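Searching.php is only summarized above; binary search itself is compact enough to sketch standalone (illustrative, not necessarily the library's exact code):

```php
/**
 * Illustrative binary search over a sorted array: halve the search
 * range each step, giving O(log n) lookups. Returns the index of
 * $target, or -1 if it is absent.
 */
function binarySearch(array $sorted, int $target): int
{
    $low = 0;
    $high = count($sorted) - 1;
    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        if ($sorted[$mid] === $target) {
            return $mid;
        }
        if ($sorted[$mid] < $target) {
            $low = $mid + 1;  // target is in the upper half
        } else {
            $high = $mid - 1; // target is in the lower half
        }
    }
    return -1;
}

echo binarySearch([11, 12, 22, 25, 34, 64, 90], 25); // 3
```

Note the precondition: the input must already be sorted, which is why the example above searches $bubbleSorted rather than the raw data.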
**Data Processing (Article 4)** - the "car wash" for messy real-world data:

```php
use MLPyHP\DataProcessing\DataCleaner;
// Handle messy real-world data
$messyData = [
['age' => 25, 'income' => 50000],
['age' => null, 'income' => 75000], // Missing age
['age' => 35, 'income' => 'N/A'] // Missing income
];
// Clean it up
$cleaned = DataCleaner::handleMissingValues($messyData, 'mean');
$outliers = DataCleaner::detectOutliers([50000, 75000, 45000, 200000]); // Find the outlier
```
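The README doesn't spell out how detectOutliers decides what counts as "too far out"; one common rule is the z-score test, sketched here as a standalone function (DataCleaner.php may well use a different rule, such as the IQR method):

```php
/**
 * Illustrative z-score outlier detection: flag values more than
 * $threshold standard deviations away from the mean.
 */
function detectOutliersByZScore(array $values, float $threshold = 2.0): array
{
    $n = count($values);
    $mean = array_sum($values) / $n;
    $variance = 0.0;
    foreach ($values as $v) {
        $variance += ($v - $mean) ** 2;
    }
    $stdDev = sqrt($variance / $n);

    $outliers = [];
    foreach ($values as $v) {
        if ($stdDev > 0 && abs($v - $mean) / $stdDev > $threshold) {
            $outliers[] = $v;
        }
    }
    return $outliers;
}

// A tiny sample inflates the standard deviation, so a lower threshold
// is needed here to flag the obvious outlier.
print_r(detectOutliersByZScore([50000, 75000, 45000, 200000], 1.5)); // [200000]
```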
**Perceptron (Article 5)** - the "bouncer at the club", making binary decisions:

```php
use MLPyHP\NeuralNetwork\Perceptron;
// Train a perceptron to learn the AND gate
$perceptron = new Perceptron(2, 0.1, 'step');
$andData = Perceptron::createAndGateData();
$history = $perceptron->train($andData, 100);
// Test it
$result = $perceptron->predict([1, 1]); // Should output 1 (true AND true = true)
```
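Behind train() sits the perceptron learning rule: nudge each weight by learning rate × error × input until the predictions stop being wrong. A self-contained sketch of that loop learning the AND gate (illustrative, not the Perceptron.php internals):

```php
// Weights and bias start at zero; the step activation fires at sum >= 0.
$weights = [0.0, 0.0];
$bias = 0.0;
$lr = 0.1; // learning rate
$andData = [[[0, 0], 0], [[0, 1], 0], [[1, 0], 0], [[1, 1], 1]];

for ($epoch = 0; $epoch < 20; $epoch++) {
    foreach ($andData as [$x, $target]) {
        $sum = $weights[0] * $x[0] + $weights[1] * $x[1] + $bias;
        $prediction = $sum >= 0 ? 1 : 0;
        $error = $target - $prediction; // -1, 0, or +1
        // The perceptron learning rule: w += lr * error * x
        $weights[0] += $lr * $error * $x[0];
        $weights[1] += $lr * $error * $x[1];
        $bias += $lr * $error;
    }
}
// AND is linearly separable, so this converges; only [1, 1] now fires.
```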
**Neural Networks (Article 6)** - the "assembly line", multiple layers working together:

```php
use MLPyHP\NeuralNetwork\NeuralNetwork;
// Create a multi-layer network
$network = new NeuralNetwork([2, 4, 1], 'sigmoid'); // 2 inputs, 4 hidden, 1 output
// Forward propagation - data flows through the assembly line
$output = $network->forwardPropagate([0.5, 0.8]);
// Solve XOR (impossible for single perceptron!)
NeuralNetwork::demonstrateXorSolution();
```
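Each neuron on that assembly line computes a weighted sum plus bias, then squashes it through an activation function. A minimal standalone sketch of one fully connected sigmoid layer (illustrative; NeuralNetwork.php will differ in structure):

```php
/** Sigmoid activation: squashes any real number into (0, 1). */
function sigmoid(float $x): float
{
    return 1.0 / (1.0 + exp(-$x));
}

/**
 * Illustrative forward pass through one dense layer:
 * each output neuron computes sigmoid(w . inputs + bias).
 */
function forwardLayer(array $inputs, array $weights, array $biases): array
{
    $outputs = [];
    foreach ($weights as $n => $neuronWeights) {
        $sum = $biases[$n];
        foreach ($inputs as $i => $value) {
            $sum += $neuronWeights[$i] * $value;
        }
        $outputs[] = sigmoid($sum);
    }
    return $outputs;
}

// Two inputs feeding two hidden neurons (weights chosen arbitrarily).
print_r(forwardLayer([0.5, 0.8], [[0.4, -0.2], [0.3, 0.9]], [0.1, -0.1]));
```

A full network just chains such layers: the outputs of one become the inputs of the next.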
**Loss Functions (Article 7)** - the "teacher's red pen", measuring performance:

```php
use MLPyHP\NeuralNetwork\LossFunctions;
// For regression (predicting numbers)
$predictions = [200000, 250000, 180000];
$actual = [210000, 240000, 190000];
$mse = LossFunctions::meanSquaredError($predictions, $actual);
// For classification (predicting categories)
$probabilities = [0.9, 0.1, 0.8, 0.3];
$labels = [1, 0, 1, 0];
$crossEntropy = LossFunctions::binaryCrossEntropy($probabilities, $labels);
```
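Both losses are short formulas: MSE averages the squared differences, and binary cross-entropy heavily penalizes confident wrong answers. Standalone sketches (LossFunctions.php may add further safeguards):

```php
/** Mean squared error: the average of squared prediction errors. */
function mse(array $predictions, array $actual): float
{
    $sum = 0.0;
    foreach ($predictions as $i => $p) {
        $sum += ($p - $actual[$i]) ** 2;
    }
    return $sum / count($predictions);
}

/** Binary cross-entropy, with clipping to guard against log(0). */
function binaryCrossEntropy(array $probabilities, array $labels): float
{
    $eps = 1e-12;
    $sum = 0.0;
    foreach ($probabilities as $i => $p) {
        $p = max($eps, min(1 - $eps, $p));
        $sum += $labels[$i] * log($p) + (1 - $labels[$i]) * log(1 - $p);
    }
    return -$sum / count($probabilities);
}

echo mse([200000, 250000, 180000], [210000, 240000, 190000]); // 100000000
```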
**Complete Training (Articles 8-12)** - the full pipeline: networks that actually learn!

```php
use MLPyHP\NeuralNetwork\NeuralNetwork;
use MLPyHP\Training\GradientDescent;
use MLPyHP\Training\HyperparameterTuner;
use MLPyHP\Evaluation\DataSplitter;
use MLPyHP\Evaluation\Metrics;
// 1. Prepare data with proper splitting
$inputs = [[0,0], [0,1], [1,0], [1,1]];
$targets = [[0], [1], [1], [0]]; // XOR problem
$split = DataSplitter::trainValidationTestSplit($inputs, $targets, 0.2, 0.2);
$trainingData = $split['train'];           // assumed keys, mirroring $split['test'] below
$validationData = $split['validation'];
// 2. Find optimal hyperparameters
$searchSpace = [
'learning_rate' => [0.01, 0.05, 0.1],
'architecture' => [[2, 4, 1], [2, 8, 1], [2, 8, 4, 1]],
'momentum' => [0.0, 0.5, 0.9]
];
$tuner = new HyperparameterTuner($searchSpace, 'f1_score');
$results = $tuner->randomSearch($trainingData, $validationData, 20);
// 3. Train with best parameters using gradient descent + backpropagation
$network = new NeuralNetwork($results['best_parameters']['architecture']);
$optimizer = new GradientDescent(
$results['best_parameters']['learning_rate'],
'mini-batch',
$results['best_parameters']['momentum']
);
$history = $optimizer->train($network, $trainingData, 1000, 32);
// 4. Comprehensive evaluation
$predictions = [];
foreach ($split['test']['inputs'] as $input) {
$predictions[] = $network->forwardPropagate($input)[0];
}
$report = Metrics::evaluationReport($predictions, $split['test']['targets']);
Metrics::printReport($report);
// Networks that actually learn and improve!
```
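The heart of that optimizer is the momentum update: each weight keeps a running "velocity" so that updates accelerate along consistently downhill directions and damp out oscillation. A sketch of the standard rule (illustrative; GradientDescent.php's exact code may differ):

```php
/**
 * Illustrative gradient descent step with momentum:
 *   velocity = momentum * velocity - learningRate * gradient
 *   weight  += velocity
 */
function momentumStep(array $weights, array $gradients, array &$velocity,
                      float $learningRate, float $momentum): array
{
    foreach ($weights as $i => $w) {
        $velocity[$i] = $momentum * ($velocity[$i] ?? 0.0)
                      - $learningRate * $gradients[$i];
        $weights[$i] = $w + $velocity[$i];
    }
    return $weights;
}

// One step on a toy loss w^2 (gradient 2w), starting from w = 1.0.
$w = [1.0];
$v = [];
$w = momentumStep($w, [2.0 * $w[0]], $v, 0.1, 0.9);
print_r($w); // [0.8] - moved downhill toward the minimum at 0
```

With momentum 0.0 this reduces to plain gradient descent; the 0.9 searched by the tuner above lets steps compound when successive gradients agree.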