I Was Thinking in Databases. I Should Have Been Thinking in Networks: A Mental Model Shift for Cloudflare Storage
It took me longer than I’d like to admit to understand Cloudflare’s storage products. It’s not that the docs aren’t sufficient; it’s that I was lacking the right mental model.
Once I started thinking in terms of a global network, it all clicked. Not just how storage products work, but why Cloudflare products are designed the way they are. I hope this article does the same for you.
KV (Key/Value)
Key-value stores have been around since the late 70s. Ken Thompson worked on dbm, which could store a single key and map it to a value. Today we might think of Redis as a popular KV store.
But to think that Cloudflare KV is “basically Redis” would be a mistake. A mistake I have made myself.
You’d use Redis for rate-limiting, for example. You wouldn’t use KV for rate-limiting due to its eventual consistency model.
Storing a value in Cloudflare KV is as easy as:
    import { env } from 'cloudflare:workers';

    await env.KV.put('feature:dark-mode', 'true', {
      expirationTtl: 86400, // 24 hours
    });

This stores the key feature:dark-mode with the string value 'true' in one of Cloudflare’s central stores.
When someone tries to read that value, it is fetched from the central store if the data is not cached (cold). The result is then cached at the edge by the requesting colocation, allowing subsequent reads to be served from the cache (hot), under 50 ms for 95% of the world.
What does that look like? The first request for a key goes out to the nearest central store, and the result is then cached at the requesting colocation. All users close to that location are served that cached value for the remaining TTL.
Aha! Now eventual consistency makes sense. When you write a new value, locations that already cached the old one keep serving it until their cached copy expires, which can take up to 60 seconds or more. It will be consistent across the network, eventually.
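To make the hot/cold read path concrete, here is a toy model of a read-through edge cache in front of a central store. Everything in it (`CentralStore`, `EdgeLocation`, the 60-second cache TTL, the explicit `now` argument) is my own invention for illustration, not how Cloudflare implements KV.

```typescript
// Toy model of a read-through edge cache. All names are hypothetical.

type CacheEntry = { value: string | null; expiresAt: number };

class CentralStore {
  private data = new Map<string, string>();

  put(key: string, value: string): void {
    this.data.set(key, value);
  }

  get(key: string): string | null {
    return this.data.get(key) ?? null;
  }
}

class EdgeLocation {
  private cache = new Map<string, CacheEntry>();
  private central: CentralStore;
  private cacheTtlMs: number;

  constructor(central: CentralStore, cacheTtlMs: number) {
    this.central = central;
    this.cacheTtlMs = cacheTtlMs;
  }

  // `now` is passed in explicitly so the example is deterministic.
  read(key: string, now: number): { value: string | null; source: "hot" | "cold" } {
    const hit = this.cache.get(key);
    if (hit && hit.expiresAt > now) {
      return { value: hit.value, source: "hot" }; // served from the local cache
    }
    const value = this.central.get(key); // cold: go to the central store
    this.cache.set(key, { value, expiresAt: now + this.cacheTtlMs });
    return { value, source: "cold" };
  }
}

const central = new CentralStore();
const amsterdam = new EdgeLocation(central, 60_000);

central.put("feature:dark-mode", "true");
const first = amsterdam.read("feature:dark-mode", 0); // cold read

central.put("feature:dark-mode", "false"); // a write lands in the central store
const second = amsterdam.read("feature:dark-mode", 30_000); // hot read, old value
const third = amsterdam.read("feature:dark-mode", 60_000); // cache expired, fresh value
```

The second read is the interesting one: the write has already happened, yet this location still answers with the old value until its cached copy expires.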
This insight cleared up for me why KV is not the correct product for rate-limiting. Instead, you’d want to use Rate Limiting or Durable Objects.
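A crude way to see the rate-limiting problem is to let two locations count requests against their own cached copy of a counter. This is a sketch under strong assumptions: the class names are hypothetical, caches never refresh during the window, and the last write wins.

```typescript
// Toy model: read-modify-write against a cached counter. Names are hypothetical.

class CentralCounter {
  value = 0;
}

class EdgeCounter {
  private cached: number | null = null;
  private central: CentralCounter;

  constructor(central: CentralCounter) {
    this.central = central;
  }

  // Read the (possibly stale) count, add one, write it back.
  hit(): number {
    if (this.cached === null) {
      this.cached = this.central.value; // cold read from the central store
    }
    const next = this.cached + 1;
    this.cached = next;
    this.central.value = next; // last write wins
    return next;
  }
}

const store = new CentralCounter();
const london = new EdgeCounter(store);
const tokyo = new EdgeCounter(store);

// Alternate five requests to each location: ten requests in total.
for (let i = 0; i < 5; i++) {
  london.hit();
  tokyo.hit();
}

const counted = store.value; // 6, not 10
```

Ten requests came in, but the stored count is 6 because each location kept overwriting the other with numbers derived from its own stale copy. A rate limiter needs one place that sees every increment.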
D1 (SQLite-like)
SQLite is the most used database engine in the world. While D1 isn’t exactly SQLite, it uses the same query engine.
D1 is Cloudflare’s serverless database which allows you to store relational data. It comes with useful features like time-travel and read replication (in beta at time of writing).
There is one important constraint you need to be aware of: D1 databases have a 10GB limit. You can have many smaller databases (up to 50,000, horizontal scale), but they can’t grow past 10GB (vertical scale).
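Scaling horizontally means something has to decide which database a given record lives in. Below is a minimal sketch of one way to do that, assuming four databases exposed under hypothetical binding names (`DB_SHARD_0` to `DB_SHARD_3`) and routing by tenant ID. D1 does not do this for you; the names and the hashing scheme are my own.

```typescript
// Sketch: route each tenant to one of several small databases.
// The shard names are hypothetical binding names.

const SHARDS = ["DB_SHARD_0", "DB_SHARD_1", "DB_SHARD_2", "DB_SHARD_3"] as const;
type ShardName = (typeof SHARDS)[number];

// FNV-1a 32-bit hash: small and deterministic, good enough for illustration.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// The same tenant ID always maps to the same shard.
function shardFor(tenantId: string): ShardName {
  return SHARDS[fnv1a(tenantId) % SHARDS.length];
}

const shardA = shardFor("user-123");
const shardB = shardFor("user-123");
```

In a Worker you would then look up the binding by that name. Note that a plain modulo like this reshuffles tenants when the number of shards changes, so it only suits a fixed shard count.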
For all my personal projects, D1 has been more than enough. Yet if I were building a business today, I’d consider Hyperdrive or Durable Objects instead. More on this later.
When you create a D1 database, it is placed close to you, or close to the location hint you provided. Then you can query it with familiar SQLite syntax.
    import { env } from 'cloudflare:workers';

    // Write: Insert new order
    await env.DB.prepare(
      `INSERT INTO orders (user_id, product) VALUES (?, ?)`
    ).bind('user-123', 'Blue Belt').run();

    // Read: Get order status
    const order = await env.DB.prepare(
      `SELECT * FROM orders WHERE user_id = ?`
    ).bind('user-123').first();

Users all over the world will reach that database for both reads and writes. This will be fast for your users close to the location hint I mentioned earlier, but slower for people on the other side of the planet.
This slowness is addressed by read replication, which keeps read-only replicas of your primary database synchronized across Cloudflare’s edge network, allowing read queries to be served from locations closer to users.
    import { env } from 'cloudflare:workers';

    const session = env.DB.withSession();

    // Write: Insert new order
    await session.prepare(
      `INSERT INTO orders (user_id, product) VALUES (?, ?)`
    ).bind('user-123', 'Blue Belt').run();

    // Read: Get order status
    const order = await session.prepare(
      `SELECT * FROM orders WHERE user_id = ?`
    ).bind('user-123').first();
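Why go through a session at all? With replicas, a read that follows your own write could land on a replica that has not seen that write yet. The toy model below shows the general idea of a session that remembers how far it has seen. The classes and the fall-back-to-primary rule are assumptions of mine, not a description of D1's internals; a real system could just as well wait for the replica to catch up.

```typescript
// Toy model of a session that tracks a bookmark. Names are hypothetical.

class Primary {
  version = 0;
  rows: string[] = [];

  write(row: string): number {
    this.rows.push(row);
    this.version += 1;
    return this.version; // the new bookmark: "state up to this version"
  }
}

class Replica {
  version = 0;
  rows: string[] = [];

  // Replication is asynchronous: the replica only catches up when synced.
  syncFrom(primary: Primary): void {
    this.rows = [...primary.rows];
    this.version = primary.version;
  }
}

class ReadSession {
  private bookmark = 0;
  private primary: Primary;
  private replica: Replica;

  constructor(primary: Primary, replica: Replica) {
    this.primary = primary;
    this.replica = replica;
  }

  write(row: string): void {
    this.bookmark = this.primary.write(row);
  }

  // Use the nearby replica only if it is at least as fresh as the bookmark.
  read(): { rows: string[]; servedBy: "replica" | "primary" } {
    if (this.replica.version >= this.bookmark) {
      return { rows: this.replica.rows, servedBy: "replica" };
    }
    return { rows: this.primary.rows, servedBy: "primary" };
  }
}

const primary = new Primary();
const replica = new Replica();
const orders = new ReadSession(primary, replica);

orders.write("user-123: Blue Belt");
const beforeSync = orders.read(); // replica is behind, so the primary answers

replica.syncFrom(primary);
const afterSync = orders.read(); // replica has caught up and can answer locally
```

Either way, the session never shows you a state older than your own last write.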
I’ve been using D1 for several personal and internal projects successfully with Drizzle ORM. If the database starts growing past 8GB I will migrate to the next product of discussion:
Hyperdrive
One of the most powerful products you might not have heard of is Hyperdrive. It is a great alternative for when you need globally fast applications but D1’s size limit is a deal-breaker.
Like the other products, Hyperdrive makes sense in the context of Cloudflare’s network. Instead of your application directly connecting to your database, Cloudflare will keep a connection pool warm close to your physical database, so you can skip the entire TCP/SSL set-up for every request.
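A back-of-the-envelope calculation shows why skipping connection set-up matters. Every number here is an assumption picked for illustration (an 80 ms round trip, five round trips of set-up, one for the query), and the model ignores everything except round trips.

```typescript
// Rough latency model. All figures are illustrative assumptions, not measurements.

interface Scenario {
  rttMs: number; // round-trip time between the caller and the database
  setupRoundTrips: number; // TCP, TLS, and database authentication handshakes
  queryRoundTrips: number; // round trips for the query itself
}

function totalLatencyMs(scenario: Scenario, reuseConnection: boolean): number {
  // A warm, pooled connection pays no set-up cost.
  const setup = reuseConnection ? 0 : scenario.setupRoundTrips;
  return (setup + scenario.queryRoundTrips) * scenario.rttMs;
}

const crossAtlantic: Scenario = { rttMs: 80, setupRoundTrips: 5, queryRoundTrips: 1 };

const coldMs = totalLatencyMs(crossAtlantic, false); // (5 + 1) * 80 = 480
const warmMs = totalLatencyMs(crossAtlantic, true); // (0 + 1) * 80 = 80
```

Under these assumptions the set-up dominates: the query itself is one sixth of the total. Keeping the pool warm next to the database removes that part from every request.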
Even better is query caching, which leverages the network to cache read queries, with a similar mechanism to KV from earlier. This means that your database in US east can feel local to a user in Europe.
    import { env } from 'cloudflare:workers';
    import postgres from 'postgres';

    const sql = postgres(env.HYPERDRIVE.connectionString);
    const users = await sql`
      SELECT * FROM users WHERE active = true
    `;

How does this look within the network? Even though the database is in Eastern North America, cached read queries from Europe and Asia stay local.
If you don’t have an external database, there is another great alternative for building stateful applications. The illustrious Durable Object.
Durable Objects
An extraordinary data product inside of Cloudflare’s catalog is Durable Objects. There is nothing like it. Because of that, it’s less intuitive to understand.
Like the products we discussed before, Durable Objects make sense when you understand them in the context of Cloudflare’s network.
They allow you to store arbitrary state. Imagine an object in JavaScript that only exists in memory, except it’s persisted to Cloudflare’s edge and has first-class support for WebSockets and Alarms.
Each Durable Object instance exists in a single location at a time, in most cases close to the Worker that created it.
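The part that made it click for me is that a name always leads to the same single instance. Here is a toy, in-memory model of that idea. `ObjectDirectory` and `SeatMap` are hypothetical stand-ins, and nothing here touches the Workers runtime.

```typescript
// Toy model: one object per name, so every request for the same event
// meets the same state. Names are hypothetical.

class SeatMap {
  private seats = new Map<string, string>(); // seatId -> userId

  book(seatId: string, userId: string): boolean {
    if (this.seats.has(seatId)) {
      return false; // someone already holds this seat
    }
    this.seats.set(seatId, userId);
    return true;
  }
}

class ObjectDirectory {
  private instances = new Map<string, SeatMap>();

  // The same name always resolves to the same instance, created on first use.
  get(name: string): SeatMap {
    let instance = this.instances.get(name);
    if (!instance) {
      instance = new SeatMap();
      this.instances.set(name, instance);
    }
    return instance;
  }
}

const events = new ObjectDirectory();

const aliceBooked = events.get("concert-42").book("A1", "alice"); // true
const bobBooked = events.get("concert-42").book("A1", "bob"); // false, same instance
const carolBooked = events.get("concert-43").book("A1", "carol"); // true, different event
```

Because there is exactly one instance per event, there is exactly one place where the "is this seat taken?" check happens, which is what the other storage products cannot give you.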
    import { DurableObject } from "cloudflare:workers";

    export interface Env {
      BOOKING: DurableObjectNamespace<SeatBooking>;
    }

    export class SeatBooking extends DurableObject<Env> {
      constructor(ctx: DurableObjectState, env: Env) {
        super(ctx, env);
        // Assumed schema: the original snippet does not show how the table is created.
        ctx.storage.sql.exec(
          "CREATE TABLE IF NOT EXISTS bookings (seat_id TEXT PRIMARY KEY, user_id TEXT NOT NULL, booked_at INTEGER NOT NULL)"
        );
      }

      async bookSeat(
        seatId: string,
        userId: string
      ): Promise<{ success: boolean; message: string }> {
        // Check whether the seat is already taken.
        const existing = this.ctx.storage.sql
          .exec<{ user_id: string }>(
            "SELECT user_id FROM bookings WHERE seat_id = ?",
            seatId
          )
          .toArray();

        if (existing.length > 0) {
          return { success: false, message: "Seat already booked" };
        }

        this.ctx.storage.sql.exec(
          "INSERT INTO bookings (seat_id, user_id, booked_at) VALUES (?, ?, ?)",
          seatId,
          userId,
          Date.now()
        );

        return { success: true, message: "Seat booked successfully" };
      }
    }

    export default {
      async fetch(request: Request, env: Env): Promise<Response> {
        const url = new URL(request.url);
        const eventId = url.searchParams.get("event") ?? "default";

        // One Durable Object per event: the same name always maps to the same instance.
        const id = env.BOOKING.idFromName(eventId);
        const booking = env.BOOKING.get(id);

        const { seatId, userId } = await request.json<{
          seatId: string;
          userId: string;
        }>();
        const result = await booking.bookSeat(seatId, userId);

        return Response.json(result, {
          status: result.success ? 200 : 409,
        });
      },
    };
It’s important to state how powerful Durable Objects are. D1, Queues, Workflows and Agents are all examples of Cloudflare products built on top of Durable Objects.
R2 Object Storage
Your application has users. In many cases these users need to upload data. Whether it is a profile picture or a PDF, it needs to go somewhere.
Many of us are familiar with S3. Cloudflare R2 is similar, but has no egress fees. That means you don’t pay for people downloading items stored in your bucket.
R2 uses a similar model to D1 with a single write-primary, but data is replicated within the same region for redundancy. When you want to download a file from R2, a metadata request is made to the location of the bucket.
Downloading the content is fast if the content is cached in a tiered read cache.
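For completeness, here is a minimal sketch of storing and reading an upload. It is written against a tiny `BucketLike` interface that I modeled on the `put` and `get` calls of an R2 bucket binding, so treat the exact types as an assumption. `saveUpload`, `readUpload`, the key layout, and the in-memory bucket are hypothetical helpers that let the example run outside a Worker.

```typescript
// Minimal shape modeled on a bucket's put/get calls. All names are hypothetical.

interface StoredObject {
  text(): Promise<string>;
}

interface BucketLike {
  put(key: string, value: string): Promise<unknown>;
  get(key: string): Promise<StoredObject | null>;
}

// Store an upload under a per-user prefix and return the key it was saved at.
async function saveUpload(
  bucket: BucketLike,
  userId: string,
  fileName: string,
  contents: string
): Promise<string> {
  const key = `uploads/${userId}/${fileName}`;
  await bucket.put(key, contents);
  return key;
}

// Read an upload back, or null if nothing is stored under that key.
async function readUpload(bucket: BucketLike, key: string): Promise<string | null> {
  const object = await bucket.get(key);
  return object ? await object.text() : null;
}

// In-memory stand-in so the sketch runs without a Worker.
function createMemoryBucket(): BucketLike {
  const objects = new Map<string, string>();
  return {
    async put(key, value) {
      objects.set(key, value);
    },
    async get(key) {
      const value = objects.get(key);
      return value === undefined ? null : { text: async () => value };
    },
  };
}
```

In a Worker you would pass the bucket binding from `env` instead of the in-memory stand-in.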
Outro
Did it click? Region: earth makes sense when you can visualize the network. I hope this article helped with that.
There are two honorable mentions I didn’t talk about: Vectorize and Analytics Engine. The truth is that I haven’t used them enough yet, but I will.
If you have any questions or want to give feedback, don’t hesitate to reach out to me on Twitter/X.