    Repositories list

    • truss

      Public
      The simplest way to serve AI/ML models in production
      Python
      MIT License
      849716316 Updated Apr 1, 2025
    • Examples of models deployable with Truss
      Python
      MIT License
      411671248 Updated Apr 1, 2025
    • Provides Slack notifications for GitHub Actions.
      TypeScript
      MIT License
      137001 Updated Mar 28, 2025
    • TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
      C++
      Apache License 2.0
      1.3k000 Updated Mar 27, 2025
    • A GitHub action to create a pull request for changes to your repository in the actions workspace
      TypeScript
      MIT License
      457000 Updated Mar 26, 2025
    • 0000 Updated Mar 24, 2025
    • Add Honeycomb Markers to your GitHub Actions workflows.
      Dockerfile
      Other
      6000 Updated Mar 17, 2025
    • ✨ A GitHub Action that sets the base and head SHAs required for `nx affected` commands in CI
      TypeScript
      MIT License
      83000 Updated Mar 17, 2025
    • setup-mpi

      Public
      Set up your GitHub Actions workflow to use MPI
      Shell
      MIT License
      4000 Updated Mar 17, 2025
    • Reports JUnit test results as a GitHub Pull Request Check
      TypeScript
      Apache License 2.0
      140001 Updated Mar 17, 2025
    • GitHub Action to retrieve all (added, copied, modified, deleted, renamed, type changed, unmerged, unknown) files and directories.
      TypeScript
      MIT License
      274000 Updated Mar 17, 2025
    • A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
      Python
      Other
      62001 Updated Mar 13, 2025
    • lws

      Public
      LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
      Go
      Apache License 2.0
      63001 Updated Mar 13, 2025
    • llm-tools

      Public
      Python
      MIT License
      0000 Updated Mar 12, 2025
    • FlashInfer: Kernel Library for LLM Serving
      Cuda
      Apache License 2.0
      262000 Updated Feb 6, 2025
    • .github

      Public
      0100 Updated Jan 13, 2025
    • Autoscaling components for Kubernetes
      Go
      Apache License 2.0
      4.1k003 Updated Dec 11, 2024
    • axolotl

      Public
      Go ahead and axolotl questions
      Python
      Apache License 2.0
      985002 Updated Nov 7, 2024
    • Jupyter Notebook
      1200 Updated Sep 14, 2024
    • Python
      121700 Updated Jun 26, 2024
    • NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
      Go
      Apache License 2.0
      334003 Updated Apr 19, 2024
    • The Triton Inference Server provides an optimized cloud and edge inferencing solution.
      Python
      BSD 3-Clause "New" or "Revised" License
      1.6k000 Updated Jan 11, 2024
    • The Triton TensorRT-LLM Backend
      Python
      Apache License 2.0
      119000 Updated Jan 11, 2024
    • Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.
      C++
      BSD 3-Clause "New" or "Revised" License
      161000 Updated Jan 11, 2024
    • langchain

      Public
      ⚡ Building applications with LLMs through composability ⚡
      Python
      MIT License
      17k000 Updated Dec 22, 2023
    • diffusers

      Public
      🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
      Python
      Apache License 2.0
      5.8k000 Updated Nov 27, 2023
    • A public GitHub repo for testing the Truss deploy flow
      Python
      0000 Updated Oct 25, 2023
    • Chainlit's cookbook repo
      Python
      362000 Updated Aug 17, 2023
    • pygmalion-6b-truss

      Public archive
      A Truss to deploy Pygmalion 6B on Baseten.
      Python
      0100 Updated Jul 25, 2023
    • mpt-7b-base-truss

      Public archive
      A deployment "truss" for the MPT-7B Base model from MosaicML
      Python
      Apache License 2.0
      4002 Updated Jul 23, 2023