Jepsen

Jepsen is an effort to improve the safety of distributed databases, queues, consensus systems, etc. We maintain an open source software library for systems testing, as well as blog posts and conference talks exploring particular systems’ failure modes. In each analysis we explore whether the system lives up to its documentation’s claims, file new bugs, and make recommendations for operators.

Since 2013, Jepsen has analyzed over two dozen databases, coordination services, and queues—and we’ve found replica divergence, data loss, stale reads, read skew, lock conflicts, and much more. Here’s every analysis we’ve published.

Delta Lake & Iceberg

Performance Benchmarking

Indexes

Concepts

  • Fsync Machines vs Join Machines Two Machines
  • How Query Engines Work - A query engine is a piece of software that can execute queries against data to produce answers to questions.
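
To make the query engine definition above concrete, here is a minimal sketch, assuming rows are plain Python dicts held in memory. The operator names (`scan`, `filter_rows`, `project`, `run_query`) and the `employees` table are hypothetical and are not taken from the linked material; real engines add parsing, planning, and optimization on top of a pipeline like this.

```python
from typing import Callable, Iterable, Iterator, List

Row = dict  # hypothetical row representation: column name -> value


def scan(table: Iterable[Row]) -> Iterator[Row]:
    """Yield every row of an in-memory table."""
    yield from table


def filter_rows(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Keep only the rows for which the predicate holds (the WHERE clause)."""
    return (row for row in rows if predicate(row))


def project(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Keep only the requested columns of each row (the SELECT list)."""
    return ({c: row[c] for c in columns} for row in rows)


def run_query(table: Iterable[Row], predicate: Callable[[Row], bool], columns: List[str]) -> List[Row]:
    """Chain scan -> filter -> project and collect the answer."""
    return list(project(filter_rows(scan(table), predicate), columns))


# Hypothetical sample data, used only for illustration.
employees = [
    {"name": "Ada", "dept": "eng", "salary": 120},
    {"name": "Bob", "dept": "ops", "salary": 90},
    {"name": "Cy", "dept": "eng", "salary": 100},
]

# Roughly: SELECT name FROM employees WHERE dept = 'eng'
result = run_query(employees, lambda r: r["dept"] == "eng", ["name"])
print(result)  # [{'name': 'Ada'}, {'name': 'Cy'}]
```

Each operator is a lazy generator, so rows stream through the pipeline one at a time and only the final `list(...)` call materializes the answer.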

Databases

A list of databases and links to relevant information shared in the channel.

SQL Server

PostgreSQL

Amazon Aurora

ClickHouse

  • CPU Dispatch in ClickHouse - How vectorization works, what CPU dispatch is, how to find places for CPU dispatch optimizations and how we use CPU dispatch in ClickHouse.

Megastore

  • Megastore: Providing Scalable, Highly Available Storage for Interactive Services - Megastore is a storage system developed to meet the requirements of today’s interactive online services. Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability. We provide fully serializable ACID semantics within fine-grained partitions of data. This partitioning allows us to synchronously replicate each write across a wide area network with reasonable latency and support seamless failover between datacenters. This paper describes Megastore’s semantics and replication algorithm. It also describes our experience supporting a wide range of Google production services built with Megastore.