    Repositories list

    • ROCm Implementation of torchac_cuda from LMCache
      Cuda
      1000 Updated Nov 17, 2024
    • vllm

      Public
      vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      4.6k8940 Updated Nov 16, 2024
    • Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
      Python
      MIT License
      113001 Updated Nov 16, 2024
    • Go ahead and axolotl questions
      Python
      Apache License 2.0
      872000 Updated Nov 16, 2024
    • Efficient Triton Kernels for LLM Training
      Python
      BSD 2-Clause "Simplified" License
      202000 Updated Nov 16, 2024
    • ROCm support of Ultra-Fast and Cheaper Long-Context LLM Inference
      Python
      Apache License 2.0
      23000 Updated Nov 14, 2024
    • Typescript Documentation of JamAISDK
      HTML
      0000 Updated Nov 14, 2024
    • skypilot

      Public
      SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
      Python
      Apache License 2.0
      510000 Updated Nov 7, 2024
    • A CI/CD pipeline that builds Docker images with flash attention already compiled in, to speed up development and deployment of other frameworks.
      Shell
      Apache License 2.0
      0100 Updated Oct 26, 2024
    • ROCm fork of Fast and memory-efficient exact attention (this branch aims to produce a flash attention PyPI package that can be readily installed and used).
      Python
      BSD 3-Clause "New" or "Revised" License
      1.3k000 Updated Oct 26, 2024
    • A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      4.6k000 Updated Oct 23, 2024
    • etalon

      Public
      LLM Serving Performance Evaluation Harness
      Python
      Apache License 2.0
      5000 Updated Oct 17, 2024
    • A Python client for the Unstructured hosted API
      Python
      MIT License
      17001 Updated Oct 14, 2024
    • EmbeddedLLM: API server for embedded device deployment. Currently supports CUDA/OpenVINO/IpexLLM/DirectML/CPU.
      Python
      02162 Updated Oct 6, 2024
    • Go
      1000 Updated Sep 26, 2024
    • JamAIBase

      Public
      The collaborative spreadsheet for AI. Chain cells into powerful pipelines, experiment with prompts and models, and evaluate LLM responses in real-time. Work together seamlessly to build and iterate on AI applications.
      Python
      Apache License 2.0
      1742621 Updated Sep 23, 2024
    • PowerToys

      Public
      Windows system utilities to maximize productivity
      C#
      MIT License
      6.6k000 Updated Aug 9, 2024
    • Arena-Hard-Auto: An automatic LLM benchmark.
      Jupyter Notebook
      Apache License 2.0
      71000 Updated Jul 15, 2024
    • Python
      Apache License 2.0
      117000 Updated Jul 11, 2024
    • Python
      Apache License 2.0
      53000 Updated Jul 9, 2024
    • Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
      HTML
      Apache License 2.0
      755100 Updated Jul 9, 2024
    • workshop

      Public
      Jupyter Notebook
      0000 Updated Jun 25, 2024
    • ai-town

      Public
      An MIT-licensed, deployable starter kit for building and customizing your own version of AI town, a virtual town where AI characters live, chat, and socialize.
      TypeScript
      MIT License
      708000 Updated Jun 23, 2024
    • JamAI Base cookbook repo
      Python
      Apache License 2.0
      0400 Updated Jun 10, 2024
    • TypeScript
      1000 Updated May 31, 2024
    • TypeScript
      0100 Updated May 31, 2024
    • The 𝗣𝗼𝘄𝗲𝗿𝗳𝘂𝗹 Conversational AI JavaScript Library
      TypeScript
      Other
      64000 Updated May 31, 2024
    • Python
      Apache License 2.0
      1.1k500 Updated Apr 22, 2024
    • dspy

      Public
      DSPy: The framework for programming—not prompting—foundation models
      Python
      MIT License
      1.4k000 Updated Apr 19, 2024
    • Causal depthwise conv1d in CUDA, with a PyTorch interface
      Cuda
      BSD 3-Clause "New" or "Revised" License
      61000 Updated Apr 12, 2024