
System Design

Cheat Sheet

Tips
  • Lead the discussion
  • Take notes
  • Create TODO list
  • Back-of-envelope calculations
  • API
    • Error handling
  • Discuss trade-offs!
  • Define success metrics
  • Cleanup: Eviction, workers to clean things up
Powers of 10
10^x Bytes
10^0 1 Byte
10^2 128 Bytes (2^7)
- 256 Bytes (2^8)
10^3 KB (Kilo)
10^6 MB (Mega)
10^9 GB (Giga)
10^12 TB (Tera)
10^15 PB (Peta)
10^18 EB (Exa)
10^21 ZB (Zetta)
10^24 YB (Yotta)

1 byte = 8 bits

Timing 35m

Remember! Discuss Tradeoffs

5m Discovery / Estimations
  • Functional - How the user interacts with the system
  • Non-Functional - Consistency, Availability, Latency, Fault Tolerance, Write vs Read ratio
  • Estimations (QPS) - Back of envelope: read, write, data storage (worked example below)
    1. Determine use cases
    2. API calls per day (DAU x calls per user)
    3. Divide by 100k (~10^5 seconds per day) to get QPS
    4. Storage
      1. Incoming/Outgoing data
      2. GB per day / 86,400 = GB per second
  • Worst Case - What is the worst case?
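
A worked example of these estimation steps as a quick Python sketch; every input below is a made-up assumption (10M DAU, 10 API calls per user per day, ~1 KB stored per write):

# Back-of-envelope estimation sketch; all inputs are assumptions.
dau = 10_000_000              # daily active users (assumed)
calls_per_user = 10           # API calls per user per day (assumed)
write_ratio = 0.1             # 10% of calls are writes (assumed)
bytes_per_write = 1_000       # ~1 KB stored per write (assumed)

requests_per_day = dau * calls_per_user            # 100M requests/day
qps = requests_per_day / 100_000                   # divide by ~10^5 s/day -> ~1,000 QPS
peak_qps = qps * 2                                 # rule of thumb: ~2x average at peak

writes_per_day = requests_per_day * write_ratio                   # 10M writes/day
storage_per_day_gb = writes_per_day * bytes_per_write / 10**9     # ~10 GB/day
storage_5_years_tb = storage_per_day_gb * 365 * 5 / 1_000         # ~18 TB over 5 years

print(f"{qps:,.0f} QPS avg, {peak_qps:,.0f} QPS peak")
print(f"{storage_per_day_gb:,.1f} GB/day, {storage_5_years_tb:,.1f} TB over 5 years")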

2-5m on API

5m on High Level (diagram) - breadth first!
  • Stay high level, avoid going into details here
  • Microservices
2-5m on Data Model
  • Tables logical structure and relations
  • Total DB Storage
15m dive deeper into the core components
  • Walk through data flow
  • Handle Edge conditions
  • How does this get stored
  • What can go wrong
  • Prove your design is correct
2m on Scaling
  • How many servers to handle the QPS
Data Stores

SQL - A collection of data items organized in tables
  • Pros: ACID, Transactions, Relational, Strict Schema, Consistent, Durable
  • Cons: Scaling Writes

NoSQL - A collection of data items which favors availability over consistency
  • Pros: Available, Flexible Schema, Non-Relational, Large Storage Capacity (TB or PB), High IOPS, Horizontally Scalable
  • Cons: Eventually Consistent

NoSQL Key/Value - High performance for simple data models due to being stored in memory
  • Pros: O(1) Read/Write, Very Fast, Simple Data Model
  • Cons: Limited Storage Capacity

NoSQL Document - A document store is basically a fancy key-value store. The values are blobs of semi-structured data, such as JSON or XML, and we treat the data store like it's just a big array of these blobs. The query language of the document store then lets you filter or sort based on the content inside those document blobs.
  • Pros: Documents, Scalable, Available
  • Cons: Not good for data aggregation, Not ACID

NoSQL Column - A wide column store sits somewhere between a document store and a relational DB. It still uses tables, rows, and columns like a relational DB, but the names and formats of the columns can differ across rows in the same table. This strategy combines the strict table structure of a relational database with the flexible content of a document store.
  • Pros: Retrieve Aggregate Data Efficiently, Scalable, Available, Flexible Schema
  • Cons: Not ACID

NoSQL Graph - Each node is a record and each arc is a relationship between two nodes. Graph databases are optimized to represent complex relationships with many foreign keys or many-to-many relationships.
  • Pros: Fast traversal of relationships, Relations

Search Engine - Special data store for searching text-based documents. Could use a Trie or an Inverted Index.
  • Pros: Very good text searches, Trie (Type Ahead), Inverted Index (Full Text)

Message Queues - Good when you need to ship data between services in a way that is fast, reliable, and distributed.
  • Pros: Reliable, Fast, Support Distributing Loads, Asynchronous

QuadTrees - Tree structure where every non-leaf node has 4 children.
  • Pros: Geolocation + Caching, Hit Detection, Find Nearest Neighbor, Searching Sparse Data
  • Cons: Need to balance precision against the validity of results
Byte Estimations
Name          Bytes   Use
ASCII Char    1       Backend
Unicode Char  2       User Facing
Int           8       16 for more precision
ASCII Word    10      Text
Storage Assumptions
Type        Quality   Size        Example
Images      Low       20 KB       Thumbnails
Images      Medium    200 KB      Most website images
Images      High      2 MB        Most smartphone images
Video       Low       256 MB      One min 720p video
Video       Medium    403 MB      One min 1080p video
Video       High      2 GB        One min 4K video
eBook       -         1 - 5 MB
Song (MP3)  -         3 - 4 MB
Comment     -         100 bytes
API Tips
  • Use Nouns
  • Follow CRUD
Methods
Method Purpose Success Response Codes Notes
GET Read 200 Transfer a current representation of the target resource
POST Create 200 created + result, 201 created Perform resource-specific processing on the request payload
PUT Update 201 created, 200/204 updated Idempotent, Replace all current representations of the target resource with the request payload
DELETE Delete 202 likely to succeed, 204 completed, 200 result Remove all current representations of the target resource
  • POST-PUT Creation Pattern
  • Client sends empty POST which creates an empty resource with no side effects
  • Client sends a PUT with content for the resource, since PUT is idempotent resending it multiple times will have no ill effects
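
A minimal client-side sketch of the POST-PUT creation pattern, assuming a hypothetical /documents endpoint and using the requests library:

import requests

BASE = "https://api.example.com"   # hypothetical API used only for illustration

# 1. Empty POST creates an empty resource with no side effects and returns its id
#    (assumes the API responds with the new resource id).
resp = requests.post(f"{BASE}/documents")
resp.raise_for_status()
doc_id = resp.json()["id"]

# 2. PUT the full content. PUT is idempotent, so retrying on a timeout
#    cannot create duplicates or other ill effects.
body = {"title": "Design notes", "text": "..."}
resp = requests.put(f"{BASE}/documents/{doc_id}", json=body, timeout=5)
resp.raise_for_status()
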
Status Codes
Code Name Description
200 OK Request succeeded
201 Created Request succeeded, new resource created
202 Accepted Request received but not acted upon
204 No Content Request succeeded, but there is no content to return; the client can stay on the current page
301 Moved Permanently The URL has been moved permanently
302 Found The URL has been moved temporarily
304 Not Modified The response has not been modified so the client can use the same cached response
400 Bad Request Invalid request syntax
401 Unauthorized The client must authenticate itself to get the requested response
403 Forbidden The client is authenticated, but it’s not allowed to access a resource
404 Not Found Server cannot find the resource
405 Method Not Allowed The request method is known by the server but not supported by the target resource
409 Conflict The request conflicts with the current state of the server
500 Internal Server Error The server encountered a situation it does not know how to handle
502 Bad Gateway The server while working as a gateway got an invalid response from an upstream server
503 Service Unavailable The server is not ready to handle the request, could be overloaded
Hashing / IDs
Name Bits / Bytes Chars Purpose
UUIDv4 128 / 16 36 Unique Random ID
MD5 128 / 16 32 Simple hash, vulnerable
SHA-1 160 / 20 40 Better, still vulnerable
SHA-2 (256) 256 / 32 64 NIST recommended, secure
SHA-3 (256) 256 / 32 64 May be slightly better than SHA-2
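
Generating the IDs and digests above with Python's standard library (output sizes match the table):

import hashlib
import uuid

uid = uuid.uuid4()                          # UUIDv4: 128 bits / 16 bytes
print(uid, len(str(uid)))                   # 36 chars including hyphens

data = b"hello world"
print(hashlib.md5(data).hexdigest())        # 32 hex chars (128 bits) - vulnerable, non-security uses only
print(hashlib.sha1(data).hexdigest())       # 40 hex chars (160 bits) - also vulnerable
print(hashlib.sha256(data).hexdigest())     # 64 hex chars (256 bits) - SHA-2 family
print(hashlib.sha3_256(data).hexdigest())   # 64 hex chars (256 bits) - SHA-3 family
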
Encodings
Type Details Notes
Base64 10 Digits + 26 Lower Char + 26 Upper Char Includes + and /; the 65th char = is used for padding
Base62 10 Digits + 26 Lower Char + 26 Upper Char All alphanumeric, no padding
Base58 Base62 without 0 I O l
Base32 6 Digits (2-7) + 26 Upper Char Includes = for padding. Takes 20% more space than Base64
z-base-32 8 Digits (1-9 exclude 2) + 24 Lower Char (excludes l and v) No padding
Base16 (Hex) Uses 0–9 and A-F
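
A small Base62 encode/decode sketch (handy for short URLs); the alphabet ordering here is an arbitrary choice:

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 62 chars

def base62_encode(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    out = []
    while n > 0:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))           # most significant digit first

def base62_decode(s: str) -> int:
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n

assert base62_decode(base62_encode(123_456_789)) == 123_456_789
print(base62_encode(123_456_789))           # e.g. an auto-increment id -> short code
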
Request Types
Type Details
Polling Request made from client every x seconds
Long Polling Request made from client, if data is not available it hangs until the data is available then responds or times out. Client sends the next request
WebSockets Bidirectional, persistent connection
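
A rough long-polling client loop, assuming a hypothetical /updates endpoint that holds each request open until data arrives or ~30 seconds pass:

import requests

URL = "https://api.example.com/updates"   # hypothetical endpoint for illustration
cursor = None

while True:
    try:
        # The server holds this request open until new data is available or it times out.
        resp = requests.get(URL, params={"cursor": cursor}, timeout=35)
        resp.raise_for_status()
        payload = resp.json()
        for event in payload.get("events", []):
            print("got event:", event)
        cursor = payload.get("cursor", cursor)   # resume from the last seen position
    except requests.Timeout:
        continue                                 # nothing new; immediately issue the next long poll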

Tradeoffs

Databases

MySQL vs NoSQL

Requests

Long Polling vs WebSockets

Powers Table

Power 10 Bytes Name
10^0 2^0 = 1 Byte
10^2 2^7 = 128 Bytes
- 2^8 = 256 Bytes
10^3 2^10 = 1k 1 KB (Kilo)
10^6 2^20 = 1mil 1 MB (Mega)
10^9 2^30 = 1bil 1 GB (Giga)
10^12 2^40 = 1tril 1 TB (Tera)
10^15 2^50 = 1quadr 1 PB (Peta)
10^18 2^60 = 1quint 1 EB (Exa)
10^21 2^70 = 1sext 1 ZB (Zetta)
10^24 2^80 = 1sept 1 YB (Yotta)

Shortcuts

  • K * K = M
  • G / M = K

Bytes Storage

* 1 bit: a binary decision
* 1 byte: a character
* 5 Megabytes: The complete works of Shakespeare
* 2 Gigabytes: 20 meters of shelved books
* 10 Terabytes: The printed collection of the US Library of Congress
* 200 Petabytes: All printed material
* 5 Exabytes: All words ever spoken by human beings

Unique Characters

10 Characters
Base62 Encoded
62^10 ≈ 8.4 * 10^17 (over 800 quadrillion)

6 Characters
62^6 ≈ 57B (56.8 billion)
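
Quick check of those counts:

print(f"{62**10:.2e}")   # ~8.39e17 unique 10-char Base62 strings
print(f"{62**6:,}")      # 56,800,235,584 (~57 billion) unique 6-char strings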

Latency

Latency Comparison Numbers
--------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy            10,000   ns       10 us
Send 1 KB bytes over 1 Gbps network     10,000   ns       10 us
Read 4 KB randomly from SSD*           150,000   ns      150 us          ~1GB/sec SSD
Read 1 MB sequentially from memory     250,000   ns      250 us
Round trip within same datacenter      500,000   ns      500 us
Read 1 MB sequentially from SSD*     1,000,000   ns    1,000 us    1 ms  ~1GB/sec SSD, 4X memory
HDD seek                            10,000,000   ns   10,000 us   10 ms  20x datacenter roundtrip
Read 1 MB sequentially from 1 Gbps  10,000,000   ns   10,000 us   10 ms  40x memory, 10X SSD
Read 1 MB sequentially from HDD     30,000,000   ns   30,000 us   30 ms 120x memory, 30X SSD
Send packet CA->Netherlands->CA    150,000,000   ns  150,000 us  150 ms

Notes
-----
1 ns = 10^-9 seconds
1 us = 10^-6 seconds = 1,000 ns
1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns

Caching

Caching improves page load times and can reduce the load on your servers and databases. In this model, the dispatcher first looks up whether the request has been made before and tries to return the previous result, saving the actual execution.

Databases often benefit from a uniform distribution of reads and writes across their partitions. Popular items can skew the distribution, causing bottlenecks. Putting a cache in front of a database can help absorb uneven loads and spikes in traffic.

Disadvantage(s): cache

  • Need to maintain consistency between caches and the source of truth such as the database through cache invalidation.
  • Cache invalidation is a difficult problem; there is additional complexity associated with deciding when to update the cache.
  • Need to make application changes such as adding Redis or memcached.

Types

Client

Caches can be located on the client side (OS or browser), server side, or in a distinct cache layer.

CDN

CDNs are considered a type of cache.

Web Server

Reverse proxies and caches such as Varnish can serve static and dynamic content directly. Web servers can also cache requests, returning responses without having to contact application servers.

Database

Your database usually includes some level of caching in a default configuration, optimized for a generic use case. Tweaking these settings for specific usage patterns can further boost performance.

Application

In-memory caches such as Memcached and Redis are key-value stores between your application and your data storage. Since the data is held in RAM, it is much faster than typical databases where data is stored on disk. RAM is more limited than disk, so cache invalidation algorithms such as least recently used (LRU) can help invalidate 'cold' entries and keep 'hot' data in RAM.
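
A tiny LRU cache sketch built on an OrderedDict to show the eviction idea (real deployments would rely on Redis/Memcached eviction policies instead):

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the least recently used ('cold') entry

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")                              # "a" is now hot
cache.put("c", 3)                           # evicts "b", the coldest entry
print(list(cache.data))                     # ['a', 'c']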

Redis has the following additional features:
  • Persistence option
  • Built-in data structures such as sorted sets and lists

There are multiple levels you can cache that fall into two general categories: database queries and objects:
  • Row level
  • Query-level
  • Fully-formed serializable objects
  • Fully-rendered HTML

Generally, you should try to avoid file-based caching, as it makes cloning and auto-scaling more difficult

Database Query Level

Whenever you query the database, hash the query as a key and store the result in the cache. This approach suffers from expiration issues:
  • Hard to delete a cached result with complex queries
  • If one piece of data changes, such as a table cell, you need to delete all cached queries that might include the changed cell
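
A sketch of query-level caching: hash the SQL text plus parameters into a cache key. The plain dict here stands in for Redis/Memcached, and run_query is a placeholder for the real database call:

import hashlib
import json

cache = {}   # stand-in for Redis/Memcached

def cached_query(sql: str, params: tuple, run_query):
    # Hash the query text + params into a stable cache key.
    key = hashlib.sha256(json.dumps([sql, list(params)]).encode()).hexdigest()
    if key in cache:
        return cache[key]
    result = run_query(sql, params)   # hits the real database only on a miss
    cache[key] = result
    return result

# Usage with a fake query runner. Note the expiration problem from above:
# if the users table changes, every cached query touching it must be invalidated.
rows = cached_query("SELECT * FROM users WHERE id = ?", (42,), lambda s, p: [{"id": 42}])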

Object Level

See your data as an object, similar to what you do with your application code. Have your application assemble the dataset from the database into a class instance or a data structure(s):
  • Remove the object from cache if its underlying data has changed
  • Allows for asynchronous processing: workers assemble objects by consuming the latest cached object

Suggestions of what to cache:
  • User sessions
  • Fully rendered web pages
  • Activity streams
  • User graph data

Updating the Cache

Since you can only store a limited amount of data in cache, you'll need to determine which cache update strategy works best for your use case.

Cache-aside


The application is responsible for reading and writing from storage. The cache does not interact with storage directly. The application does the following:

  1. Look for entry in cache, resulting in a cache miss
  2. Load entry from the database
  3. Add entry to cache
  4. Return entry

Subsequent reads of data added to cache are fast. Cache-aside is also referred to as lazy loading. Only requested data is cached, which avoids filling up the cache with data that isn't requested.
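
A cache-aside sketch following the four steps above; the dicts stand in for Redis and the database:

cache = {}                          # stand-in for Redis/Memcached
db = {"user:1": {"name": "Ada"}}    # stand-in for the database

def get_user(key):
    value = cache.get(key)          # 1. look for the entry in the cache
    if value is not None:
        return value                # cache hit
    value = db.get(key)             # 2. cache miss: load the entry from the database
    if value is not None:
        cache[key] = value          # 3. add the entry to the cache (optionally with a TTL)
    return value                    # 4. return the entry

print(get_user("user:1"))   # miss: reads the DB, then populates the cache
print(get_user("user:1"))   # hit: served straight from the cache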

Disadvantage(s): cache-aside

  • Each cache miss results in three trips, which can cause a noticeable delay.
  • Data can become stale if it is updated in the database. This issue is mitigated by setting a time-to-live (TTL) which forces an update of the cache entry, or by using write-through.
  • When a node fails, it is replaced by a new, empty node, increasing latency.

Write-through


The application uses the cache as the main data store, reading and writing data to it, while the cache is responsible for reading and writing to the database:

  1. Application adds/updates entry in cache
  2. Cache synchronously writes entry to data store
  3. Return

Write-through is a slow overall operation due to the write operation, but subsequent reads of just written data are fast. Users are generally more tolerant of latency when updating data than reading data. Data in the cache is not stale.
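
A write-through sketch: the application writes to the cache, and the cache layer synchronously persists to the database before returning, so reads never see stale data:

db = {}       # stand-in for the data store
cache = {}    # stand-in for the cache that fronts it

def write_through(key, value):
    cache[key] = value     # 1. application adds/updates the entry in the cache
    db[key] = value        # 2. cache synchronously writes the entry to the data store
    return value           # 3. return only after both writes succeed

def read(key):
    return cache.get(key, db.get(key))

write_through("user:1", {"name": "Ada"})
print(read("user:1"))      # fresh from the cache, already durable in the data store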

Disadvantage(s): write through

  • When a new node is created due to failure or scaling, the new node will not cache entries until the entry is updated in the database. Cache-aside in conjunction with write through can mitigate this issue.
  • Most data written might never be read, which can be minimized with a TTL.

Write-behind


In write-behind, the application does the following:

  1. Add/update entry in cache
  2. Asynchronously write entry to the data store, improving write performance
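
A write-behind sketch: writes land in the cache immediately and a background worker flushes them to the data store later. Note the data-loss window if the process dies before the flush:

import queue
import threading
import time

cache, db = {}, {}                 # stand-ins for the cache and the data store
pending = queue.Queue()            # writes waiting to be flushed

def flusher():
    while True:
        key, value = pending.get() # blocks until there is something to flush
        time.sleep(0.1)            # simulate slow database I/O
        db[key] = value            # asynchronous write to the data store
        pending.task_done()

threading.Thread(target=flusher, daemon=True).start()

def write_behind(key, value):
    cache[key] = value             # fast path: only the cache is updated synchronously
    pending.put((key, value))      # the DB write happens later, off the request path

write_behind("user:1", {"name": "Ada"})
print("cache:", cache.get("user:1"), "| db before flush:", db.get("user:1"))
pending.join()                     # wait for the background flush (demo only)
print("db after flush:", db.get("user:1"))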

Disadvantage(s): write-behind

  • There could be data loss if the cache goes down prior to its contents hitting the data store.
  • It is more complex to implement write-behind than it is to implement cache-aside or write-through.

Refresh-ahead


You can configure the cache to automatically refresh any recently accessed cache entry prior to its expiration.

Refresh-ahead can result in reduced latency vs read-through if the cache can accurately predict which items are likely to be needed in the future.

Disadvantage(s): refresh-ahead

  • Not accurately predicting which items are likely to be needed in the future can result in worse performance than not using refresh-ahead at all.

Availability vs Consistency

CAP Overview

  • Consistency - A read is guaranteed to return the most recent write for a given client.
  • Availability - A non-failing node will return a reasonable response within a reasonable amount of time (no error or timeout).
  • Partition Tolerance - The system will continue to function when network partitions occur.

Networks aren't reliable, so you'll need to support partition tolerance. You'll need to make a software tradeoff between consistency and availability.

CP - Consistency/Partition Tolerance
Waiting for a response from the partitioned node might result in a timeout error. CP is a good choice if your business needs require atomic reads and writes.

AP - Availability and Partition Tolerance
Responses return the most readily available version of the data available on any node, which might not be the latest. Writes might take some time to propagate when the partition is resolved.

AP is a good choice if the business needs allow for eventual consistency or when the system needs to continue working despite external errors.

Databases

SQL vs NoSQL


Reasons for SQL

  • Structured data
  • Strict schema
  • Relational data
  • Need for complex joins
  • Transactions
  • Clear patterns for scaling
  • More established: developers, community, code, tools, etc
  • Lookups by index are very fast

Reasons for NoSQL

  • Semi-structured data
  • Dynamic or flexible schema
  • Non-relational data
  • No need for complex joins
  • Store many TB (or PB) of data
  • Very data intensive workload
  • Very high throughput for IOPS

Sample data well-suited for NoSQL:

  • Rapid ingest of clickstream and log data
  • Leaderboard or scoring data
  • Temporary data, such as a shopping cart
  • Frequently accessed ('hot') tables
  • Metadata/lookup tables

SQL

A relational database like SQL is a collection of data items organized in tables.

ACID is a set of properties of relational database transactions.

  • Atomicity - Each transaction is all or nothing
  • Consistency - Any transaction will bring the database from one valid state to another
  • Isolation - Executing transactions concurrently has the same results as if the transactions were executed serially
  • Durability - Once a transaction has been committed, it will remain so
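
A small illustration of atomicity using Python's built-in sqlite3: either both statements of the transfer commit, or neither does:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on exception
        conn.execute("UPDATE accounts SET balance = balance - 150 WHERE id = 1")
        balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
        if balance < 0:
            raise ValueError("insufficient funds")   # triggers the rollback
        conn.execute("UPDATE accounts SET balance = balance + 150 WHERE id = 2")
except ValueError:
    pass

print(conn.execute("SELECT id, balance FROM accounts").fetchall())  # [(1, 100), (2, 0)] - unchanged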

NoSQL

NoSQL is a collection of data items represented in a key-value store, document store, wide column store, or a graph database. Data is denormalized, and joins are generally done in the application code. Most NoSQL stores lack true ACID transactions and favor eventual consistency.

BASE is often used to describe the properties of NoSQL databases. In comparison with the CAP Theorem, BASE chooses availability over consistency.

  • Basically available - the system guarantees availability.
  • Soft state - the state of the system may change over time, even without input.
  • Eventual consistency - the system will become consistent over a period of time, given that the system doesn't receive input during that period.

NoSQL Databases

Key - Value Store

DynamoDB, Redis, Memcached

Abstraction: hash table

A key-value store generally allows for O(1) reads and writes and is often backed by memory or SSD. Data stores can maintain keys in lexicographic order, allowing efficient retrieval of key ranges. Key-value stores can allow for storing of metadata with a value.

Key-value stores provide high performance and are often used for simple data models or for rapidly-changing data, such as an in-memory cache layer. Since they offer only a limited set of operations, complexity is shifted to the application layer if additional operations are needed.

A key-value store is the basis for more complex systems such as a document store, and in some cases, a graph database.

Document Store

MongoDB, DynamoDB

Abstraction: key-value store with documents stored as values

A document store is centered around documents (XML, JSON, binary, etc), where a document stores all information for a given object. Document stores provide APIs or a query language to query based on the internal structure of the document itself. Note, many key-value stores include features for working with a value's metadata, blurring the lines between these two storage types.

Based on the underlying implementation, documents are organized by collections, tags, metadata, or directories. Although documents can be organized or grouped together, documents may have fields that are completely different from each other.

Some document stores like MongoDB and CouchDB also provide a SQL-like language to perform complex queries. DynamoDB supports both key-values and documents.

Document stores provide high flexibility and are often used for working with occasionally changing data.

Wide Column Store

Bigtable, HBase, Cassandra

Abstraction: nested map ColumnFamily<RowKey, Columns<ColKey, Value, Timestamp>>


A wide column store's basic unit of data is a column (name/value pair). A column can be grouped in column families (analogous to a SQL table). Super column families further group column families. You can access each column independently with a row key, and columns with the same row key form a row. Each value contains a timestamp for versioning and for conflict resolution.

Google introduced Bigtable as the first wide column store; it influenced the open-source HBase, often used in the Hadoop ecosystem, and Cassandra from Facebook. Stores such as Bigtable, HBase, and Cassandra maintain keys in lexicographic order, allowing efficient retrieval of selective key ranges.

Wide column stores offer high availability and high scalability. They are often used for very large data sets.

Graph

Neo4j, FlockDB

Abstraction: graph


In a graph database, each node is a record and each arc is a relationship between two nodes. Graph databases are optimized to represent complex relationships with many foreign keys or many-to-many relationships.

Graph databases offer high performance for data models with complex relationships, such as a social network. They are relatively new and not yet widely used; it might be more difficult to find development tools and resources. Many graph databases can only be accessed with REST APIs.

Asynchronism

Asynchronous workflows help reduce request times for expensive operations that would otherwise be performed in-line. They can also help by doing time-consuming work in advance, such as periodic aggregation of data.

Message Brokers

Asynchronous messaging is a messaging scheme where message production by a producer is decoupled from its processing by a consumer. When dealing with messaging systems, we typically identify two main messaging patterns – Message Queuing and Publish/Subscribe.

Message Queueing

Multiple producers can send messages to the same queue; however, when a consumer processes a message, it is locked or removed from the queue and is no longer available. Only a single consumer consumes a specific message.


As a side note, if the consumer fails to process a certain message, the messaging platform typically returns the message to the queue where it is made available for other consumers. Besides temporal decoupling, queues allow us to scale producers and consumers independently as well as providing a degree of fault-tolerance against processing errors.
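
A minimal point-to-point sketch with Python's standard queue module: producers share one queue, and each message is processed by exactly one of the competing consumers:

import queue
import threading

q = queue.Queue()                    # stand-in for a broker queue (RabbitMQ, SQS, ...)

def consumer(name):
    while True:
        msg = q.get()                # each message is delivered to exactly one consumer
        print(f"{name} processed {msg}")
        q.task_done()

for name in ("worker-1", "worker-2"):
    threading.Thread(target=consumer, args=(name,), daemon=True).start()

for i in range(6):                   # producer: enqueue the work
    q.put(f"job-{i}")

q.join()                             # block until every job has been processed once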

Publish -> Subscribe

In the Publish/Subscribe (or Pub/Sub) communication pattern, a single message can be received and processed by multiple subscribers concurrently.


This pattern allows a publisher, for example, to notify all subscribers that something has happened in the system. Many queueing platforms associate pub/sub with the term topics. In RabbitMQ, topics are a specific type of pub/sub implementation (a type of exchange, to be exact), but in these notes, 'topics' refers to pub/sub as a whole.

Generically speaking, there are two types of subscriptions:

  1. An ephemeral subscription, where the subscription is only active as long as the consumer is up and running. Once the consumer shuts down, its subscription and yet-to-be-processed messages are lost.
  2. A durable subscription, where the subscription is maintained as long as it is not explicitly deleted. When the consumer shuts down, the messaging platform maintains the subscription and message processing can be resumed later.
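
A toy in-process pub/sub fan-out to contrast with the queue above: every subscriber to a topic receives its own copy of each published message (ephemeral subscriptions, since everything lives in this process):

from collections import defaultdict

subscribers = defaultdict(list)          # topic -> list of callbacks (toy in-process broker)

def subscribe(topic, callback):
    subscribers[topic].append(callback)  # ephemeral: gone when this process exits

def publish(topic, message):
    for callback in subscribers[topic]:  # fan out: every subscriber gets the message
        callback(message)

subscribe("orders", lambda m: print("billing saw:", m))
subscribe("orders", lambda m: print("shipping saw:", m))
publish("orders", {"order_id": 7})       # both subscribers process the same event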

Advantages

  • Provides communication between services that may not be running at the same time. The producer can send messages regardless of whether the consumer is active; all it needs is a running message broker. The same applies to the consumer.
  • Improves system performance by introducing asynchronous processing. Resource-intensive tasks can be distributed to separate processes, which speeds up your application and improves the user experience.
  • Increases reliability by guaranteeing the transmission of messages. Message brokers offer a redelivery mechanism: in case of consumer failure, the message can be redelivered immediately or after a specified delay. Undelivered messages can also be routed to a dead-letter queue.

Disadvantages

  • Eventual consistency - some components may not have up-to-date data until the messages are propagated and processed.