Scala3GPT: Pure PyTorch GPT Implementation in Scala3 🌟

* 🎖️ AI Infra 3.0 ON Scala3 ! 🎖️

Overview 📇 🏠 🪟 🍎 🐧

Scala3GPT is the first pure PyTorch implementation of the GPT (Generative Pre-trained Transformer) model using Scala and Java. This project demonstrates that Scala, combined with Java, can serve as a viable and efficient language for large-scale machine learning model development, challenging the dominance of Python in the LLM (Large Language Model) ecosystem.

Built on the Storch library (Scala bindings for PyTorch), ScalaGPT showcases the practicality of using JVM languages for cutting-edge AI research and production-grade LLM development.

Key Innovations 🐍 🏠 🪟 🐧

First Pure PyTorch GPT in Scala/Java:突破了 Python 对 PyTorch 生态的垄断，首次实现了完全基于 Scala 和 Java 的 PyTorch GPT 模型，证明了 JVM 语言在大模型开发中的可行性。
Scala for Large Model Development: Demonstrates Scala's suitability for LLM engineering through its strong type system, functional programming paradigms, and seamless JVM integration—critical for building maintainable, large-scale AI systems.
Storch-Powered: Leverages the Storch library to bridge Scala with PyTorch, offering unique advantages over traditional Python-based PyTorch workflows.

Why STorch AI (Scala3) Over Python PyTorch? 🏎 🏠/☁️ 🐧/🍎

Storch brings Scala's strengths to PyTorch development, offering compelling benefits for LLM engineering:

1. Static Typing & Compile-Time Safety

Scala's static type system catches errors at compile time, reducing runtime bugs common in dynamic Python code—essential for maintaining large codebases with hundreds of model components (e.g., Block.scala, MultiHeadAttention.scala in this project).

2. Conciseness & Expressiveness

Scala's functional syntax (e.g., pattern matching, higher-order functions) enables more compact and readable model implementations compared to Python. For example, Storch's tensor operations combine PyTorch's flexibility with Scala's elegance:

3. JVM Ecosystem Integration

Seamlessly interoperates with Java libraries and enterprise tools (e.g., Spark for distributed training, Kafka for data pipelines), eliminating Python's "glue code" overhead in production environments.

4. Performance Optimizations

Storch leverages JVM's mature garbage collection and just-in-time (JIT) compilation, delivering comparable or superior runtime performance to Python for long-running training workloads.

5. Scalability

Scala's actor model (via Akka) and parallel collections simplify distributed training implementation, a critical requirement for scaling LLMs to billions of parameters. Project Stru

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
project		project
src/main/scala		src/main/scala
.gitignore		.gitignore
.scalafmt.conf		.scalafmt.conf
build.sbt		build.sbt
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Scala3GPT: Pure PyTorch GPT Implementation in Scala3 🌟

* 🎖️ AI Infra 3.0 ON Scala3 ! 🎖️

Overview 📇 🏠 🪟 🍎 🐧

Key Innovations 🐍 🏠 🪟 🐧

Why STorch AI (Scala3) Over Python PyTorch? 🏎 🏠/☁️ 🐧/🍎

1. Static Typing & Compile-Time Safety

2. Conciseness & Expressiveness

3. JVM Ecosystem Integration

4. Performance Optimizations

5. Scalability

About

Uh oh!

Releases

Packages

Languages

bytedeco/scala3gpt-llm

Folders and files

Latest commit

History

Repository files navigation

Scala3GPT: Pure PyTorch GPT Implementation in Scala3 🌟

* 🎖️ AI Infra 3.0 ON Scala3 ! 🎖️

Overview 📇 🏠 🪟 🍎 🐧

Key Innovations 🐍 🏠 🪟 🐧

Why STorch AI (Scala3) Over Python PyTorch? 🏎 🏠/☁️ 🐧/🍎

1. Static Typing & Compile-Time Safety

2. Conciseness & Expressiveness

3. JVM Ecosystem Integration

4. Performance Optimizations

5. Scalability

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages