
Ahmed Ikram


About Me

I'm a 3rd-year BSc Computer Science (Data Science & AI) student at the University of Dundee, on track for a First-Class degree. I build production-grade systems across three tracks: event-driven data pipelines (Kafka, Airflow, AWS, Star Schema), end-to-end ML and LLM systems (XGBoost, MLflow, RAG, LangChain, LLM-as-judge evaluation), and full-stack cloud applications (React, Flask/FastAPI, AWS, CI/CD) - backed by 1,452 automated tests in one project and 670 in another. I care about engineering rigour: dead-letter routing before data hits a database, leakage prevention before any cross-validation fold runs, and deployment pipelines that abort on failure rather than hoping nothing breaks.

I'm currently seeking a post-graduate role starting in 2027, in Data Engineering, ML/AI Engineering, or Software Engineering.


How I Build

I treat reliability and observability as non-negotiable from the start, not retrofitted after the fact. That means dead-letter routing before data reaches a database, leakage prevention baked into sklearn Pipelines before any cross-validation fold runs, and CI/CD gates that abort deployment on any test failure rather than hoping nothing breaks in production. I'm drawn to problems where silent failures are the hardest kind to debug - concurrent write contention, foreign key mismatches, retrieval quality in RAG systems - and I build systems that make those failures impossible to miss. I work from requirements before writing code - functional, non-functional, and acceptance criteria first - and treat API documentation, schema contracts, and test plans as deliverables in their own right, not afterthoughts.


Featured Projects

ATM Log Aggregation, Analysis & Diagnostics Platform

Python FastAPI PostgreSQL Apache Kafka Redis XGBoost Scikit-learn LangChain ChromaDB MLflow React Vite Docker AWS

Industry project for NCR Atleos - a production-grade log ingestion pipeline with 3-layer anomaly detection and an Agentic RAG diagnostic assistant. Led backend, data engineering, and ML end-to-end across a 7-person Agile team.

  • Kafka event streaming: KRaft mode, 2 topics × 3 partitions, at-least-once delivery with manual offset commits. Hybrid deduplication: Redis SET with 1h TTL + 10K-entry in-memory LRU fallback.
  • 3-layer detection engine: ML_ENSEMBLE (XGBoost + Isolation Forest, 99.8% CV accuracy) + ZSCORE (rolling 20-window sigma) + HEURISTIC (7 deterministic multi-source correlators). 600s configurable sliding window, 10-min cross-layer dedup.
  • Agentic RAG: Cross-encoder reranking (ms-marco-MiniLM), 3-sample self-consistency with 3-gram Jaccard similarity, Reflexion (self-critique → regenerate), citation grounding with regex entity verification. 4-signal confidence fusion: retrieval (30%) + consistency (25%) + verbalized (25%) + grounding (20%).
  • 8 Redis patterns: Rate limiting (sorted set), deduplication (set + TTL), JWT blacklist (string + TTL), distributed locks (SET NX EX), Pub/Sub streaming, response caching, dead-letter queue (streams with exponential backoff), analytics counters (INCR + HLL + ZINCRBY).
  • MLOps via MLflow: Experiment tracking, model registry with "champion" aliases. 7 artifacts per training run: xgb_classifier, isolation_forest, scaler, label_encoder, feature names, IF feature indices, calibrated UNKNOWN threshold.
  • 670 automated tests (521 backend + 149 frontend) across 10 tiers: unit, integration, stress, security, ML, RAG, Redis, Kafka, generators, parsers
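
A minimal sketch of how the self-consistency score and the 4-signal confidence fusion listed above could be computed. The weights come from the bullet; everything else is an assumption for illustration: word-level 3-grams, mean pairwise Jaccard across samples, signals already normalised to [0, 1], and all function and key names.

```python
from itertools import combinations

# Weights quoted in the Agentic RAG bullet; the key names are illustrative.
WEIGHTS = {"retrieval": 0.30, "consistency": 0.25, "verbalized": 0.25, "grounding": 0.20}


def word_ngrams(text: str, n: int = 3) -> set:
    """Return the set of word-level n-grams in `text` (lowercased)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def self_consistency(samples: list, n: int = 3) -> float:
    """Mean pairwise n-gram Jaccard similarity across the sampled answers."""
    grams = [word_ngrams(sample, n) for sample in samples]
    pairs = list(combinations(grams, 2))
    if not pairs:
        return 1.0  # a single sample is trivially consistent with itself
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def fuse_confidence(signals: dict) -> float:
    """Weighted sum of the four signals, each clamped to [0, 1]."""
    return sum(w * min(max(signals[name], 0.0), 1.0) for name, w in WEIGHTS.items())
```

With three sampled answers, `self_consistency` would supply the `consistency` entry passed to `fuse_confidence` alongside the other three signals.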

DevSync - Real-Time Project Tracker with GitHub Integration

React Tailwind CSS Flask Socket.IO PostgreSQL Docker AWS GitHub Actions Pytest Jest Cypress

Multi-service full-stack application - Flask API with Socket.IO real-time sync, React SPA served through nginx, and PostgreSQL on AWS RDS. Designed for team collaboration: task management, GitHub issue/PR linking, and role-based access control with OIDC-authenticated CI/CD.

  • Multi-stage Docker: Backend image reduced to 330MB (python:3.11-slim runtime, build dependencies left behind in stage 1). Frontend built on node:20-alpine, served by nginx:1.27-alpine with envsubst template for API_UPSTREAM. Docker resolver (127.0.0.11) for runtime DNS. Two-compose-file pattern cleanly separates PostgreSQL from the app stack.
  • Real-time collaboration: Flask-SocketIO with JWT-authenticated WebSocket handshake, project-scoped rooms preventing cross-project data leaks. The Socket.IO client on the React side broadcasts task updates, comments, and notifications to all room members - zero polling.
  • Full OIDC CI/CD pipeline: GitHub Actions with OIDC federation (no static credentials). Path-filtered test gates run backend (pytest) and frontend (Jest) suites independently. On merge: image push to ECR, then an ECS Fargate rolling update gated by a 200-second health check, then S3/CloudFront frontend distribution. Any test failure aborts the pipeline.
  • JWT dual auth: Access + refresh token flow with both cookie and Bearer header transport. 3-tier RBAC (Developer / Team Lead / Admin) enforced at endpoint level via decorators - middleware validates the numerical hierarchy so higher roles inherit all lower permissions.
  • Database design: 12 PostgreSQL tables with SQLAlchemy ORM, composite indexes on (project_id, status) and (user_id, notification_type) for common query patterns. Full-text search on task titles. Audit logging with automatic timestamping across all entity mutations.
  • 1,452 automated tests (518 Pytest + 929 Jest + 5 Cypress) gate every PR. Backend tests run on in-memory SQLite, so no external database is required. Coverage thresholds at 85% for both backend and frontend.
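
A minimal sketch of the numeric-hierarchy RBAC check described in the JWT dual auth bullet above. It is framework-free rather than Flask-specific, and the names `Role`, `Forbidden`, `require_role`, and `archive_project` are hypothetical, not taken from the DevSync codebase.

```python
from enum import IntEnum
from functools import wraps


class Role(IntEnum):
    # Numeric order encodes the hierarchy: a higher value inherits lower permissions.
    DEVELOPER = 1
    TEAM_LEAD = 2
    ADMIN = 3


class Forbidden(Exception):
    """Raised when the caller's role is below the required level."""


def require_role(minimum: Role):
    """Decorator factory: allow the call only if user['role'] >= minimum."""
    def decorator(func):
        @wraps(func)
        def wrapper(user: dict, *args, **kwargs):
            if Role(user["role"]) < minimum:
                raise Forbidden(f"{minimum.name} or higher required")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@require_role(Role.TEAM_LEAD)
def archive_project(user: dict, project_id: int) -> str:
    """Example endpoint body; only Team Leads and Admins get this far."""
    return f"project {project_id} archived by {user['name']}"
```

Because the comparison is a single `<` on integers, adding a new role only means choosing where it sits in the ordering.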

W3C Web Logs ETL Pipeline

Apache Airflow Python PostgreSQL AWS RDS Power BI Power Automate

Fully automated ETL pipeline transforming raw W3C IIS logs into a 9-dimension Star Schema on AWS RDS. 9-way parallel Airflow fan-out makes phase three 8× faster than sequential. Geolocation enrichment across 78 countries, −1 surrogate key fallback ensuring zero dropped records, and Power Automate failure alerting. 7-page Power BI dashboard including P95 response time via DAX.
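
A minimal sketch of the −1 surrogate key fallback mentioned above, assuming dimension lookups are held as in-memory dictionaries and that an "Unknown" dimension row exists with key −1. The function names and the row layout are illustrative only.

```python
from typing import Optional

UNKNOWN_KEY = -1  # surrogate key of the assumed "Unknown" dimension row


def resolve_key(dimension: dict, natural_key: Optional[str]) -> int:
    """Map a natural key to its surrogate key, falling back to -1 when unmatched."""
    if natural_key is None:
        return UNKNOWN_KEY
    return dimension.get(natural_key.strip().upper(), UNKNOWN_KEY)


def build_fact_rows(log_rows: list, country_dim: dict) -> list:
    """Produce one fact row per log row; unmatched countries keep the row with key -1."""
    return [
        {"country_key": resolve_key(country_dim, row.get("country")), "bytes": row["bytes"]}
        for row in log_rows
    ]
```

Every input row yields a fact row, so an unmatched lookup shows up as a −1 key in the fact table instead of a silently missing record.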


StockLens - FinTech Spending & Investment App

React Native TypeScript Firebase Node.js Jest Alpha Vantage API

Full-stack mobile app converting physical receipts via OCR into structured financial records, mapping spending to stock tickers via Alpha Vantage, and projecting portfolio performance using ARIMA forecasting and Linear Regression. AES encryption at rest, biometric auth, 78 Jest tests.


Haggis Species Classification & Predictive Modelling

Python Scikit-learn XGBoost Pandas Matplotlib Jupyter Notebook

End-to-end ML pipeline: 7 classifiers benchmarked (~90% accuracy), 2 novel ratio-based features engineered that ranked among the top-3 predictors, GridSearchCV with 5-fold CV, K-Means + DBSCAN clustering, and Linear Regression (R²=0.756). Strict leakage prevention via sklearn Pipelines with ColumnTransformer throughout.
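
A dependency-free illustration of the leakage-prevention idea above: inside each cross-validation fold, preprocessing statistics are fitted on the training split only and then applied to the held-out split. This is what placing a scaler inside an sklearn Pipeline achieves during cross-validation. The helper names and the contiguous fold layout are assumptions made for the sketch, not the project's code.

```python
from statistics import mean, pstdev


def fit_scaler(values: list) -> tuple:
    """Return (mean, std) of the given values; std falls back to 1.0 if zero."""
    return mean(values), pstdev(values) or 1.0


def kfold_indices(n: int, k: int):
    """Yield (train_idx, test_idx) for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def scaled_folds(values: list, k: int = 5):
    """Scale each fold using statistics from its training split only."""
    for train_idx, test_idx in kfold_indices(len(values), k):
        mu, sigma = fit_scaler([values[i] for i in train_idx])
        train = [(values[i] - mu) / sigma for i in train_idx]
        test = [(values[i] - mu) / sigma for i in test_idx]  # never used for fitting
        yield train, test
```

Fitting the scaler once on the full dataset before splitting would let held-out values influence the mean and standard deviation, which is the leak this structure avoids.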


CineMatch - AI/ML Movie Recommendation System

Python Scikit-learn Flask MovieLens Dataset NumPy Pandas

Hybrid recommendation engine (collaborative filtering + content-based) on MovieLens. ~78% hit rate, ~0.22 Precision@10. Dependency-injected strategy pattern means recommendation algorithms are fully swappable without touching the API layer. Cold-start problem addressed via hybrid signal combination.
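
A minimal sketch of the two metrics quoted above, using one common definition of each: Precision@k as the fraction of the top-k list that is relevant, and hit rate as the share of users with at least one relevant item in their top-k. How CineMatch computes them is not stated here, so these definitions and names are assumptions.

```python
def precision_at_k(recommended: list, relevant: set, k: int = 10) -> float:
    """Fraction of the top-k recommendations that are relevant."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k


def hit_rate(recs_by_user: dict, relevant_by_user: dict, k: int = 10) -> float:
    """Share of users with at least one relevant item in their top-k list."""
    hits = sum(
        1
        for user, recs in recs_by_user.items()
        if any(item in relevant_by_user.get(user, set()) for item in recs[:k])
    )
    return hits / len(recs_by_user)
```

The two numbers answer different questions, which is why a high hit rate can sit alongside a modest Precision@10: one relevant item in ten is enough for a hit.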


Rental Car Management System

C++

Modular C++ OOP system - polymorphic vehicle hierarchy, generic repository template pattern, zero raw pointer usage (smart pointers throughout). Levenshtein distance fuzzy search, automated late fee and loyalty rewards logic, file-based persistence, and an 8-scenario E2E test suite.
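
A minimal sketch of Levenshtein-based fuzzy search as described above. The project itself is written in C++; this Python version only illustrates the technique, and the `max_distance` threshold, case-insensitive matching, and function names are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with two rolling rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_search(query: str, candidates: list, max_distance: int = 2) -> list:
    """Return candidates within max_distance edits of the query, closest first."""
    scored = [(levenshtein(query.lower(), c.lower()), c) for c in candidates]
    return [c for distance, c in sorted(scored) if distance <= max_distance]
```

For example, a mistyped `"toyta"` is one insertion away from `"Toyota"`, so it would still match with the default threshold.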


Unix Version Control System

Bash Unix

Git-like VCS built from scratch in pure Bash - zero external dependencies beyond native Unix utilities. File locking, timestamped versioning, automatic diff generation, filterable activity logs with user attribution, multi-repo support, and compressed archive export. Currently implementing branching, three-way merge, and benchmarking against Git.


Tech Stack

Organised to mirror the track structure of my CV - Software Engineering, ML & AI Engineering, and Data Engineering.

Software Engineering

Python TypeScript JavaScript SQL Bash Java C++

React React Native Tailwind CSS Vite HTML5 CSS3 Bootstrap Chart.js

Flask FastAPI Node.js GraphQL Socket.IO Gunicorn Redis

ML & AI Engineering

Scikit-learn XGBoost LangChain ChromaDB Ollama MLflow

Pandas NumPy Matplotlib Seaborn Jupyter Power BI

Data Engineering

Apache Airflow Apache Kafka

PostgreSQL MongoDB MySQL SQLite

Power Automate

Cloud & DevOps

AWS Azure Docker nginx GitHub Actions Linux

Testing

Pytest Jest React Testing Library Cypress



Pinned

  1. devsync (Public)

    A Project Tracker Application with GitHub Integration

    JavaScript

  2. laad (Public)

    Log Aggregation, Analysis & Diagnostics Platform, Developed for NCR Atleos

    Python

  3. w3c-etl-pipeline (Public)

    ETL Pipeline Utilising Python, Apache Airflow, Power BI and Power Automate

    Python

  4. stocklens (Public)

    Scan Your Spending Receipts, See Your Missed Investing

    TypeScript

  5. haggis-predictive-modeling (Public)

    Haggis Dataset, data mining and prediction modelling notebook

  6. movie-recommendation-system (Public)

    AI powered movie recommendation system made using Python and Flask

    Python