hackdavid/README.md

Daud Ibrahim

Senior Machine Learning Engineer | LLM Systems & Production AI

London, United Kingdom
ibrahimdaud03@gmail.com • LinkedIn • GitHub • Medium dev • Reddit


🎯 Building Production AI Systems That Scale

I architect and deploy end-to-end machine learning systems, from distributed training to low-latency inference. I specialize in LLM fine-tuning, inference optimization, and MLOps infrastructure that powers real-world applications serving millions of users.


📊 Impact Metrics

  • 7B+ Parameters Trained (Multilingual LLM)
  • 50k+ Requests/Hour (Peak Production Traffic)
  • 25% Latency Reduction (Inference Optimization)
  • 500+ Stores Deployed (Retail AI Platform)
  • 240 H100 GPUs (Distributed Training)
  • 5TB+ Data Processed (Training Pipelines)
  • 94% Model Accuracy (Document Classification)
  • 100k+ Daily Users (Production APIs)

🚀 Core Expertise

🤖 LLM & Generative AI

Model Architectures

  • GPT-4, BERT, LLaMA
  • Mistral, MPT, Qwen
  • Custom transformer variants

Advanced Training

  • LoRA, QLoRA, PEFT
  • DPO, PPO, RLHF alignment
  • DeepSpeed, Megatron
  • Distributed GPU training

Inference Systems

  • TensorRT-LLM, vLLM, TGI
  • KV caching, dynamic batching
  • Flash Attention, PagedAttention
  • INT8/FP16 quantization

Agentic AI

  • Multi-agent orchestration
  • Model Context Protocol (MCP)
  • Tool use and function calling
  • Autonomous reasoning systems

🔧 MLOps & Infrastructure

Production Deployment

  • FastAPI microservices
  • Docker containerization
  • Kubernetes orchestration
  • Auto-scaling strategies

Cloud Platforms

  • AWS (SageMaker, EC2, S3)
  • Lambda serverless
  • CI/CD pipelines
  • Infrastructure as Code

Model Operations

  • A/B testing frameworks
  • Performance monitoring
  • Model versioning
  • Automated retraining

High Availability

  • Load balancing
  • Fault tolerance
  • Disaster recovery
  • 99.9% uptime SLAs

📊 Data & RAG Systems

Vector Intelligence

  • Pinecone, Weaviate, Chroma
  • FAISS similarity search
  • Embedding optimization
  • Hybrid search strategies

RAG Pipelines

  • Retrieval-Augmented Generation
  • Context window management
  • Re-ranking strategies
  • Multi-modal retrieval

Data Engineering

  • Apache Spark, Airflow
  • ETL at petabyte scale
  • Real-time streaming (Kafka)
  • Feature stores

Databases

  • PostgreSQL, MongoDB
  • Time-series optimization
  • Query performance tuning
  • Sharding strategies

💻 Programming & Tools

Core Languages

  • Python (PyTorch, TensorFlow)
  • CUDA kernel programming
  • SQL optimization
  • JavaScript/TypeScript

ML Frameworks

  • Hugging Face ecosystem
  • LangChain, LlamaIndex
  • Scikit-learn, XGBoost
  • Custom neural architectures

Development Tools

  • Git version control
  • REST API design
  • GraphQL
  • WebSockets

👁️ Computer Vision

Vision Models

  • CNNs, Vision Transformers
  • Object detection (YOLO, R-CNN)
  • Semantic segmentation
  • GANs for image generation

OCR & Document AI

  • Tesseract, EasyOCR
  • Layout analysis
  • Form understanding
  • 94% extraction accuracy

Libraries

  • OpenCV, PIL
  • Detectron2
  • MMDetection
  • Albumentations

💼 Professional Experience

SymphonyAI · Senior Machine Learning Engineer

Aug 2024 - Present | Remote, London, UK

AI-Driven Retail Intelligence Platform

Impact: Deployed across 500+ stores, improving on-shelf availability by 20%

Technical Implementation:

  • Architected multi-agent LLM workflows for automated data analysis and insight generation
  • Built business analytics assistant automating sales trend identification, reducing manual reporting by 5+ hours/week
  • Designed scalable inference services handling 10k+ daily queries with 50k requests/hour peak traffic
  • Implemented RAG pipelines with vector search for real-time product and inventory intelligence

Stack: LLMs, Multi-agent systems, RAG, FastAPI, Kubernetes, AWS, Vector DBs

Ola Krutrim · Data Scientist

Oct 2023 - Aug 2024 | Bangalore, India

Multilingual LLM Training & Deployment

Impact: Trained a 7B-parameter model from scratch and reduced inference latency by 25%

Technical Implementation:

  • Orchestrated distributed training across 240 H100-80GB GPUs, processing 2TB+ of text data
  • Built large-scale data pipelines (5TB) with automated filtering, deduplication, and quality validation
  • Implemented custom tokenization for 12 Indic languages using SentencePiece
  • Optimized MPT architecture inference using TensorRT-LLM and vLLM frameworks
  • Developed custom FastAPI serving layer with KV caching and dynamic batching (1024 context length)
  • Conducted comprehensive benchmark evaluation across math, reasoning, and coding tasks

Stack: PyTorch, DeepSpeed, TensorRT-LLM, vLLM, FastAPI, H100 GPUs, SentencePiece

Publication: Krutrim LLM: Multilingual Foundational Model for Large-Scale Deployment

RenewBuy · Senior Software Engineer

Sep 2022 - Oct 2023 | Gurgaon, India

Insurance Document Processing Automation

Impact: F1 score improvement from 0.78 to 0.94, processing time reduced from hours to minutes

Technical Implementation:

  • Fine-tuned transformer models (BERT-based) for document classification and entity extraction
  • Applied parameter-efficient fine-tuning (LoRA, PEFT) reducing training cost by 60%
  • Implemented model alignment using DPO and RLHF for business-specific outputs
  • Integrated ML models into production backend handling tens of thousands of daily transactions
  • Built monitoring and retraining pipelines ensuring model performance stability

Stack: Transformers, LoRA, PEFT, DPO, RLHF, FastAPI, PostgreSQL

Unified Credit Solution · Python Developer

Feb 2021 - Sep 2022 | Gurgaon, India

OCR & Document Processing at Scale

Impact: 94% extraction accuracy, processing millions of records monthly

Technical Implementation:

  • Deployed OCR pipelines using Tesseract and custom CNN models for structured data extraction
  • Built ML-based form automation reducing manual processing by 80%
  • Developed ETL workflows with Apache Airflow processing millions of records
  • Maintained production APIs serving 100k+ users with optimized database performance

Stack: OpenCV, Tesseract, CNNs, Apache Airflow, PostgreSQL, FastAPI


🛠️ Technical Stack

Category: Technologies
LLM Frameworks: PyTorch • TensorFlow • Hugging Face • LangChain • LlamaIndex
Model Architectures: GPT-4 • BERT • LLaMA • Mistral • MPT • Qwen
Training & Fine-tuning: LoRA • QLoRA • PEFT • DPO • PPO • RLHF • DeepSpeed • Megatron
Inference Optimization: TensorRT-LLM • vLLM • Hugging Face TGI • ONNX • Triton
AI Orchestration: Multi-agent Systems • MCP (Model Context Protocol) • RAG Pipelines
Vector Databases: Pinecone • Weaviate • Chroma • FAISS
MLOps & Cloud: Docker • Kubernetes • AWS (SageMaker, EC2, S3) • FastAPI • GitHub Actions
Data Engineering: Apache Spark • Apache Airflow • Kafka • PostgreSQL • MongoDB
Programming: Python • CUDA • SQL • JavaScript • C • Git
Computer Vision: OpenCV • OCR • Tesseract • CNNs • Object Detection

🎓 Education

MSc in Artificial Intelligence
University of Roehampton, London
Sep 2025 - Jun 2026

BTech in Computer Science and Engineering
Uttarakhand Technical University, India
Aug 2018 - Jul 2022


📝 Publications & Research

Krutrim LLM: Multilingual Foundational Model for Large-Scale Deployment
Large-scale multilingual LLM training and deployment strategies for Indic languages

Virtual Try-On Clothing Using Deep Learning
Computer vision approach for virtual garment fitting using GANs

Human Body Measurement Estimation from 2D Images
CNN-based body measurement extraction from single images


🌟 Current Focus

  • Agentic AI Systems: Building autonomous agents with MCP and tool use
  • LLM Inference at Scale: Sub-100ms latency for production workloads
  • MLOps Best Practices: End-to-end model lifecycle management

💬 Open to collaborations on LLM systems, agentic AI, and open source tooling.

If you're working on something interesting, let's talk.

📧 Email Me • 💼 LinkedIn • 🐙 GitHub


Last updated: March 2026

Popular repositories

  1. engram-memory Public

    Python 6

  2. Human-body-measurement-using-image-processing Public

    Python 1

  3. LLM-and-RAG-Enabled-Slack-Bot-for-Document-Insights Public

    Python 1

  4. Language-Attention-Guided-Reconstruction-for-Robot-Manipulation Public

    Python 1

  5. Elevate.api Public

    Forked from kausmeows/Elevate.api

    Python

  6. LLM-model-using-torch-from-scratch-Build-GPT- Public

    In this repo, I am going to build an LLM from scratch using torch, and we will also cover the transformer architecture.

    Jupyter Notebook