Hello, I'm Dara 👋

I am a machine learning engineer and an aspiring AGI researcher.

My work primarily revolves around multimodal foundation models, training optimization, and inference optimization. I care about AI safety and interpretability, so I occasionally do mechanistic interpretability probing on weekends and write about my findings here.


My Work

  • Infuse audio: A framework for aligning audio representations with the embedding space of LLMs (multimodality)
  • Ablate compliance: Finding jailbreak directions within the activation subspace of LLMs
  • Flash Attention and Diffusion Kernels in Triton: Highly performant, highly optimized Flash Attention, linear attention, and diffusion model kernels
  • Upcycle MoE: A framework for upcycling any dense model into a sparse Mixture-of-Experts architecture
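The Ablate compliance project works by projecting a found "jailbreak" direction out of the model's activations. A minimal sketch of directional ablation (NumPy stand-in for the actual PyTorch hooks; the function name is mine, not the repo's API):

```python
import numpy as np

def ablate_direction(acts: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove each activation's component along a single direction.

    acts: (batch, d_model) activations; direction: (d_model,) found direction.
    Computes x - (x . v) v for unit vector v, leaving acts orthogonal to it.
    """
    v = direction / np.linalg.norm(direction)  # normalize the found direction
    return acts - np.outer(acts @ v, v)        # subtract the projection

# Toy check: after ablation, activations have zero component along the direction.
x = np.random.randn(4, 8)
v = np.random.randn(8)
y = ablate_direction(x, v)
```

In practice this projection is applied at every layer's residual stream via forward hooks, which is what suppresses the behavior the direction mediates.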

Get in touch

My Résumé

Link

Pinned

  1. ablate-compliance (Python)

     Identifying and ablating the activation-space directions that enable jailbreaks in large language models.

  2. probing_gcg (Python)

     Investigating why most GCG adversarial suffixes succeed or fail at jailbreaking language models.

  3. smollm-experiments (Python)

     (Unofficial) implementation of Hugging Face SmolLM, a blazingly fast, small language model, in PyTorch with grouped-query attention (GQA).

  4. transformer-attenttion (Python)

     Barebones implementation of every transformer component.

  5. gpt-1-from-scratch (Python)

     Rewriting and pretraining GPT-1 from scratch. Implementing multi-head attention (MHA) in PyTorch from the original paper, Improving Language Understanding by Generative Pre-Training (https://cdn.open…).

  6. multimodal-llms (Python)

     Framework for fusing continuous audio embeddings into a causal language model for audio understanding.
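The smollm-experiments repo centers on grouped-query attention, where several query heads share one key/value head to shrink the KV cache. A minimal sketch (NumPy for brevity; shapes and names are illustrative, not the repo's API):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def grouped_query_attention(q, k, v, n_kv_heads):
    """q: (n_heads, seq, d); k, v: (n_kv_heads, seq, d).

    Consecutive query heads are grouped onto one shared K/V head, so the
    KV cache holds n_kv_heads heads instead of n_heads.
    """
    n_heads, seq, d = q.shape
    group = n_heads // n_kv_heads              # query heads per K/V head
    out = np.empty_like(q)
    for h in range(n_heads):
        kh, vh = k[h // group], v[h // group]  # shared K/V for this group
        scores = q[h] @ kh.T / np.sqrt(d)      # scaled dot-product scores
        out[h] = softmax(scores) @ vh
    return out

# 8 query heads sharing 2 K/V heads: a 4x smaller KV cache than full MHA.
q = np.random.randn(8, 5, 16)
k = np.random.randn(2, 5, 16)
vals = np.random.randn(2, 5, 16)
o = grouped_query_attention(q, k, vals, n_kv_heads=2)
```

With n_kv_heads equal to n_heads this reduces to standard MHA, and with n_kv_heads equal to 1 it becomes multi-query attention.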