GitHub - martyanov/papers: The most influential papers about distributed systems, databases, algorithms and data structures


Overview

The list is highly subjective and by no means complete. If you need a more comprehensive list of papers, Papers We Love is probably a much better resource.

Algorithms and data structures

Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
FIFO Queues are All You Need for Cache Eviction
Hashed and Hierarchical Timing Wheels: Efficient Data Structures for Implementing a Timer Facility
SPFresh: Incremental In-Place Update for Billion-Scale Vector Search

Consensus

A simple totally ordered broadcast protocol
Amazon Aurora: On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes
Calvin: Fast Distributed Transactions for Partitioned Database Systems
In Search of an Understandable Consensus Algorithm
Logical Physical Clocks and Consistent Snapshots in Globally Distributed Databases
Paxos Made Live - An Engineering Perspective
Paxos Made Simple
Time, Clocks, and the Ordering of Events in a Distributed System
Unreliable Failure Detectors for Reliable Distributed Systems
Viewstamped Replication Revisited

Correctness, testing and implementation

Are You Sure You Want to Use MMAP in Your Database Management System?
Can Applications Recover from fsync Failures?
Simple Testing Can Prevent Most Critical Failures

Database

A Critique of ANSI SQL Isolation Levels
Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases
Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service
Cassandra - A Decentralized Structured Storage System
Dynamo: Amazon’s Highly Available Key-value Store
F1: A Distributed SQL Database That Scales
High Performance Transactions via Early Write Visibility
Highly Available Transactions: Virtues and Limitations
Large-scale Incremental Processing Using Distributed Transactions and Notifications
Linearizability: A Correctness Condition for Concurrent Objects
Procella: Unifying serving and analytical data at YouTube
Spanner, TrueTime & The CAP Theorem
Spanner: Becoming a SQL System
Spanner: Google’s Globally-Distributed Database

Datastore

Bigtable: A Distributed Storage System for Structured Data
CFS: A Distributed File System for Large Scale Container Platforms
CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data
Ceph: A Scalable, High-Performance Distributed File System
Dynamic Metadata Management for Petabyte-scale File Systems
Facebook’s Tectonic Filesystem: Efficiency from Exascale
Finding a needle in Haystack: Facebook’s photo storage
Megastore: Providing Scalable, Highly Available Storage for Interactive Services
RADOS: A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters
Replex: A Scalable, Highly Available Multi-Index Data Store
SLM-DB: Single-Level Key-Value Store with Persistent Memory
The Google File System
WiscKey: Separating Keys from Values in SSD-conscious Storage
f4: Facebook’s Warm BLOB Storage System

Gossip

SWIM: Scalable Weakly-consistent Infection-style Process Group Membership

Locking

The Chubby lock service for loosely-coupled distributed systems

Scheduler

Borg, Omega, and Kubernetes
Large-scale cluster management at Google with Borg
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
Omega: flexible, scalable schedulers for large compute clusters

License

This work is licensed under a Creative Commons Attribution 4.0 International License.
