sglang

SGLang is a fast serving framework for large language models and vision language models.

Python

Website

📍 🏆

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python

Website

51k ⭐

cline

Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.

TypeScript

Website

47k ⭐

ColossalAI

Making large AI models cheaper, faster and more accessible

Python

Website

41k ⭐

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python

Website

39k ⭐

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python

38k ⭐

ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python

Website

37k ⭐

mlx

MLX: An array framework for Apple silicon

C++

Website

21k ⭐

flash-attention

Fast and memory-efficient exact attention

Python

18k ⭐

ai

The AI Toolkit for TypeScript. From the creators of Next.js, the AI SDK is a free open-source library for building AI-powered applications and agents.

TypeScript

Website

15k ⭐

Megatron-LM

Ongoing research training transformer models at scale

Python

Website

12k ⭐

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++

Website

11k ⭐
