2 changes: 1 addition & 1 deletion AGENTS.md
@@ -2,7 +2,7 @@ Keep this document concise.
- Core user, developer, and design docs are in-repo under fluxon_doc_cn/ and fluxon_doc_en/
- Detailed bilingual doc writing rules are indexed at `fluxon_doc_en/dev_doc/Developer - 3 - Documentation Writing Rules.md` and `fluxon_doc_cn/dev_doc/开发者 - 3 - 文档写作规约.md`
- teststack has two steps: start testbed and testrunner
- teststack has UI support; testrunner should own the UI, and the UI should reuse the ops interfaces underneath
- teststack has UI support; testrunner should own the UI authority and API surface, and the UI should run as a long-lived service that reuses the ops interfaces underneath
- All Python code in this project must be compatible with Python >=3.10
- YAML files in this project are examples by default. Do not edit them directly; create a YAML file for your specific development environment
- Start long-running commands in `tmux`. Do not run long-lived services directly in the foreground.
2 changes: 1 addition & 1 deletion AGENTS_CN.md
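@@ -2,7 +2,7 @@
- Core user, developer, and design docs are in-repo under `fluxon_doc_cn/` and `fluxon_doc_en/`
- Detailed bilingual doc writing rules are indexed at `fluxon_doc_cn/dev_doc/开发者 - 3 - 文档写作规约.md` and `fluxon_doc_en/dev_doc/Developer - 3 - Documentation Writing Rules.md`
- `teststack` has two steps: `start testbed` and `testrunner`
- `teststack` has UI support; `testrunner` should own the UI; the UI should reuse the ops interfaces underneath
- `teststack` has UI support; `testrunner` should own the UI authority and API surface, but the UI should run as a long-lived service and reuse the ops interfaces underneath
- All Python code in this project must be compatible with Python `>= 3.10`
- YAML files in this project are examples by default. Do not edit them directly; create a separate YAML file for your specific development environment
- Start long-running commands in `tmux`. Do not run long-lived services directly in the foreground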
45 changes: 34 additions & 11 deletions README.md
@@ -16,7 +16,9 @@

Fluxon is a high-performance distributed communication and caching substrate for world models and other AI-native training and inference systems. It uses a single Rust-based integrated storage-and-transport foundation to provide unified key-value caching and remote procedure call (`KV/RPC`), message queue (`MQ`), and S3-compatible file and object caching (`FS`) interfaces, focusing on three classes of problems: cross-process and cross-node reuse of inference-side `KVCache` and `latent cache`, decoupled elastic message transport across heterogeneous resource pools, and remote access, `S3` forwarding, cache acceleration, and large-scale cross-cluster data migration for AI data and model files. As GPU performance keeps increasing, bottlenecks and wasted resources on CPU and IO paths become more visible. This increasingly calls for more efficient infrastructure to handle this high-performance work and reuse it across different business scenarios. Fluxon addresses this by first consolidating the complexity of low-level storage and transport in Rust, then exposing scenario-oriented `KV/RPC`, `MQ`, and `FS` interfaces on top.

## Contents
<a id="contents"></a>

## 🧭 Contents

- [Foundation Capabilities](#foundation-capabilities)
- [Interface Capabilities](#interface-capabilities)
@@ -29,7 +31,9 @@ Fluxon is a high-performance distributed communication and caching substrate for
- [License](#license)
- [Stargazers over time](#stargazers-over-time)
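
The explicit `<a id="...">` anchors added next to each heading can be checked against the Contents links mechanically. The following is a minimal sketch, not part of the change itself: the function name `missing_anchors` and both regular expressions are illustrative assumptions, and the check only looks at explicit `<a id>` anchors, not at any heading anchors the renderer may generate on its own.

```python
import re

# Hypothetical helper, not part of the Fluxon repository.
ANCHOR_RE = re.compile(r'<a id="([^"]+)"></a>')   # explicit anchors
LINK_RE = re.compile(r'\]\(#([^)]+)\)')           # in-page links: [text](#target)


def missing_anchors(markdown: str) -> list[str]:
    """Return in-page link targets that have no matching <a id="..."> anchor."""
    anchors = set(ANCHOR_RE.findall(markdown))
    missing: list[str] = []
    for target in LINK_RE.findall(markdown):
        # Keep first-seen order and report each missing target once.
        if target not in anchors and target not in missing:
            missing.append(target)
    return missing


if __name__ == "__main__":
    sample = (
        '<a id="contents"></a>\n'
        "## Contents\n"
        "- [Contents](#contents)\n"
        "- [License](#license)\n"
    )
    print(missing_anchors(sample))
```

Run against the full README text, an empty result would suggest that every Contents entry has an explicit anchor to land on.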

## Foundation Capabilities
<a id="foundation-capabilities"></a>

## 🧱 Foundation Capabilities

- End-to-end Rust: moves connection handling, protocol encoding/decoding, state-machine progression, shared-memory management, and observability collection into Rust hot paths
- Integrated storage and transport: prioritizes the cross-process shared-memory fast path and optimizes storage and transport within one unified data plane
@@ -46,7 +50,9 @@ Fluxon is a high-performance distributed communication and caching substrate for

![](./pics/topology_ui.png)

## Interface Capabilities
<a id="interface-capabilities"></a>

## 🔌 Interface Capabilities

### Fluxon KV/RPC

@@ -82,7 +88,9 @@ Fluxon FS is an S3-compatible file and object cache for AI data and model files.
- Specialized optimization for small-file / large-file reads and writes: optimizes concurrency and transport paths by file granularity and read / write path to improve bandwidth utilization and overall throughput
- Large-scale cross-cluster migration: supports `PB`-scale data migration and keeps caching, transport, and failure recovery in one unified path

## Benchmark
<a id="benchmark"></a>

## 📊 Benchmark

The benchmark section mainly covers the `RPC`, `KV`, and `FS` data planes, and the related scripts and configurations are primarily under `fluxon_test_stack/`.

@@ -108,15 +116,19 @@ The benchmark results show that small-file reads and large-file writes are alrea

`MQ` currently focuses mainly on scenario problems and data-plane design. The automated runtime entrypoints are `test_runner.py` and `fluxon_test_stack/`.

## Runtime Requirements
<a id="runtime-requirements"></a>

## 🧰 Runtime Requirements

- Linux only
- Python `>= 3.10`
- When building from source, the Rust toolchain follows [fluxon_rs/rust-toolchain.toml](./fluxon_rs/rust-toolchain.toml), currently pinned to `1.93.0`
- External middleware dependencies: the minimum service plane requires `etcd` and `greptime`; `FluxonFS` features such as directory transfer and pre-scan that persist task state also require `TiKV PD` and `TiKV`
- Quick Start and runtime packaging workflows depend on Docker
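
The host-side requirements above can be verified before attempting Quick Start. This is a minimal sketch under stated assumptions: the function name `find_problems` is hypothetical, it covers only the OS, Python version, and Docker items, and it does not probe `etcd`, `greptime`, `TiKV`, or the Rust toolchain.

```python
import platform
import shutil
import sys

MIN_PYTHON = (3, 10)  # from the "Python >= 3.10" requirement


def find_problems(system: str, python_version: tuple, docker_path) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed.

    Hypothetical helper for illustration, not part of the Fluxon codebase.
    """
    problems = []
    if system != "Linux":
        problems.append(f"unsupported OS: {system} (Linux only)")
    if tuple(python_version[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python {found} is older than 3.10")
    if docker_path is None:
        problems.append("docker not found on PATH")
    return problems


if __name__ == "__main__":
    # Gather the real values from the current host and report.
    issues = find_problems(
        platform.system(), sys.version_info, shutil.which("docker")
    )
    for issue in issues:
        print(f"preflight: {issue}")
    if not issues:
        print("preflight: OK")
```

Passing the host facts in as arguments keeps the check easy to exercise without depending on the machine it runs on.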

## Quick Start
<a id="quick-start"></a>

## 🚀 Quick Start

Quick Start is the shortest path to try Fluxon. For formal installation, deployment, and operations, see [User Docs](https://tele-ai.github.io/fluxon/user_doc/).

@@ -222,7 +234,9 @@ Related interface docs:

- [FS Interface](https://tele-ai.github.io/fluxon/user_doc/User---5---FS-Interface/)

## Repository Structure
<a id="repository-structure"></a>

## 🗂️ Repository Structure

- `fluxon_rs/`: Rust core implementation and low-level capabilities
- `fluxon_py/`: Python interfaces, runtime, and bindings
@@ -232,15 +246,20 @@ Related interface docs:
- `examples/fluxon_quick_start/`: minimal runnable environment entrypoint
- `fluxon_test_stack/`: test stack, benchmarks, and gitops entrypoint

## Contributing
<a id="contributing"></a>

## 🤝 Contributing

Contributions are welcome. Before you start, please read the developer docs on GitHub Pages:

- [Developer Docs](https://tele-ai.github.io/fluxon/dev_doc/)
- [Developer - 1 - Package core install artifacts](https://tele-ai.github.io/fluxon/dev_doc/Developer---1---Package-Core-Install-Artifacts/)
- [Developer - 2 - Package middleware and images](https://tele-ai.github.io/fluxon/dev_doc/Developer---2---Package-Middleware-and-Images/)
- [Developer - 4 - Publish a release](https://tele-ai.github.io/fluxon/dev_doc/Developer---4---Publish-a-Release/)

<a id="contributors"></a>

## Contributors
## 👥 Contributors

<a href="https://github.com/Tele-AI/fluxon/graphs/contributors">
<img src="https://contrib.rocks/image?repo=Tele-AI/fluxon" />
@@ -268,10 +287,14 @@ Some earlier contribution records are no longer fully reflected in the current c
- `RuileLu`: KV lease support
- `Summage`: Initial KV architecture optimization

## License
<a id="license"></a>

## 📄 License

Fluxon is open-sourced under Apache License 2.0, see [LICENSE](./LICENSE).

## Stargazers over time
<a id="stargazers-over-time"></a>

## ⭐ Stargazers over time

[![Stargazers over time](https://starchart.cc/Tele-AI/fluxon.svg)](https://starchart.cc/Tele-AI/fluxon)
45 changes: 34 additions & 11 deletions README_CN.md
@@ -18,7 +18,9 @@ Fluxon 是一套面向世界模型与其他 AI-native 场景训推系统的高

</div>

## Contents
<a id="当前目录"></a>

## 🧭 Contents

- [Foundation Capabilities](#底座能力)
- [Interface Capabilities](#接口能力)
@@ -31,7 +33,9 @@ Fluxon 是一套面向世界模型与其他 AI-native 场景训推系统的高
- [License](#许可证)
- [Star Growth Trend](#star-增长趋势)

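## Foundation Capabilities
<a id="底座能力"></a>

## 🧱 Foundation Capabilities

- End-to-end Rust: moves connection handling, protocol encoding/decoding, state-machine progression, shared-memory management, and observability collection into Rust hot paths
- Integrated storage and transport: prioritizes the cross-process shared-memory fast path and optimizes "storage" and "transport" together within one unified data plane
@@ -48,7 +52,9 @@ Fluxon 是一套面向世界模型与其他 AI-native 场景训推系统的高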

![](./pics/topology_ui.png)

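## Interface Capabilities
<a id="接口能力"></a>

## 🔌 Interface Capabilities

### Fluxon KV/RPC Interface

@@ -84,7 +90,9 @@ Fluxon FS 是一款面向 AI 数据与模型文件、兼容 `S3` 的高性能文
- Specialized optimization for small-file / large-file reads and writes: tunes concurrency and transport paths separately for each file granularity and read / write path to improve bandwidth utilization and overall throughput
- Large-scale cross-cluster migration: supports `PB`-scale data migration and keeps caching, transport, and failure recovery in one unified path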

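## Benchmark
<a id="基准测试"></a>

## 📊 Benchmark

The benchmark mainly covers the `RPC`, `KV`, and `FS` data planes; the related scripts and configurations are mainly under `fluxon_test_stack/`.

@@ -110,15 +118,19 @@ benchmark 显示,小文件读和大文件写已显著领先 `Alluxio`,大文

`MQ` currently focuses mainly on scenario problems and data-plane design; for the automated runtime entrypoints, see `test_runner.py` and `fluxon_test_stack/`.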

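## Runtime Requirements
<a id="运行要求"></a>

## 🧰 Runtime Requirements

- Linux only
- Python `>= 3.10`
- When building from source, the Rust toolchain follows [fluxon_rs/rust-toolchain.toml](./fluxon_rs/rust-toolchain.toml), currently pinned to `1.93.0`
- External middleware dependencies: the minimum service plane requires `etcd` and `greptime`; enabling `FluxonFS` capabilities that persist task state, such as directory transfer and pre-scan, also requires `TiKV PD` and `TiKV`
- Quick Start and runtime packaging workflows depend on Docker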

## Quick Start
<a id="快速开始"></a>

## 🚀 Quick Start

Quick Start is the shortest path to try Fluxon; for formal installation, deployment, and operations, see [User Docs](https://tele-ai.github.io/fluxon/cn/user_doc/).

@@ -224,7 +236,9 @@ FS Quick Start 会额外打印:

- [FS Interface](https://tele-ai.github.io/fluxon/cn/user_doc/%E7%94%A8%E6%88%B7---5---FS%E6%8E%A5%E5%8F%A3/)

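## Repository Structure
<a id="项目结构"></a>

## 🗂️ Repository Structure

- `fluxon_rs/`: Rust core implementation and low-level capabilities
- `fluxon_py/`: Python interfaces, runtime, and bindings
@@ -234,15 +248,20 @@ FS Quick Start 会额外打印:
- `examples/fluxon_quick_start/`: minimal runnable environment entrypoint
- `fluxon_test_stack/`: test stack, benchmarks, and gitops entrypoint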

## Contributing
<a id="贡献"></a>

## 🤝 Contributing

Contributions are welcome. Before you start, we recommend reading the developer docs on GitHub Pages:

- [Developer Docs](https://tele-ai.github.io/fluxon/cn/dev_doc/)
- [Developer - 1 - Package core install artifacts](https://tele-ai.github.io/fluxon/cn/dev_doc/%E5%BC%80%E5%8F%91%E8%80%85---1---%E6%89%93%E5%8C%85%E6%A0%B8%E5%BF%83%E5%AE%89%E8%A3%85%E5%8C%85/)
- [Developer - 2 - Package middleware and images](https://tele-ai.github.io/fluxon/cn/dev_doc/%E5%BC%80%E5%8F%91%E8%80%85---2---%E6%89%93%E5%8C%85%E4%B8%AD%E9%97%B4%E4%BB%B6%E5%92%8C%E9%95%9C%E5%83%8F/)
- [Developer - 4 - Publish a release](https://tele-ai.github.io/fluxon/cn/dev_doc/%E5%BC%80%E5%8F%91%E8%80%85---4---%E5%8F%91%E5%B8%83-Release/)

<a id="contributors"></a>

## Contributors
## 👥 Contributors

<a href="https://github.com/Tele-AI/fluxon/graphs/contributors">
<img src="https://contrib.rocks/image?repo=Tele-AI/fluxon" />
@@ -270,10 +289,14 @@ FS Quick Start 会额外打印:
- `RuileLu`: KV lease support
- `Summage`: Initial KV architecture design optimization

## License
<a id="许可证"></a>

## 📄 License

Fluxon is open-sourced under Apache License 2.0; see [LICENSE](./LICENSE).

## Star Growth Trend
<a id="star-增长趋势"></a>

## ⭐ Star Growth Trend

[![Stargazers over time](https://starchart.cc/Tele-AI/fluxon.svg)](https://starchart.cc/Tele-AI/fluxon)