
I Used DynamoDB for 2 Years. Here’s Why I Recommend SQL.

After two years using DynamoDB, I’ve concluded that the best database for a million users is often the wrong choice for your first ten.

Published: August 15, 2025

I’ve spent a significant amount of time working with DynamoDB, particularly with the single-table design pattern. This approach, which uses generic keys like PK (Partition Key) and SK (Sort Key) along with Global Secondary Indexes (GSIs), offers incredible performance at scale by eliminating the need for joins, a common bottleneck in traditional SQL databases.

However, my journey with DynamoDB has also illuminated some critical drawbacks, especially for early-stage products where agility and adaptability are paramount.

The Challenge of Predefined Access Patterns in Early-Stage Products

DynamoDB’s greatest strength is its ability to provide fast, predictable performance. This is intrinsically tied to a deep, upfront understanding of your application’s data access patterns. To design an efficient DynamoDB table, particularly a single table, you must know precisely how you will read and write data. This knowledge dictates the choice of your partition and sort keys, as well as the structure of your GSIs.

For a mature product with established features and user behaviors, defining these access patterns is a manageable task. However, for a startup or an early-stage product, the landscape is one of constant evolution. It’s nearly impossible to predict all the ways you’ll need to query your data in the future. This is where DynamoDB’s rigidity can become a significant roadblock.

Concrete Example: An E-commerce Platform

Imagine you’re building a new e-commerce platform. Initially, you might define the following access patterns for your products entity:

  • Get a product by its productID.
  • Get all products in a specific category.
  • Get all products by a certain brand.

With these patterns in mind, you could design a single table with a PK of PRODUCT#{productID} and an SK that also holds the productID for direct lookups. To handle the category and brand lookups, you might create a GSI, say GSI1, with a GSI1PK of CATEGORY#{category} and a GSI1SK of BRAND#{brand}. (Strictly speaking, a brand-only lookup across all categories would need its own index, but the simplified design is enough for this example.) This setup would work efficiently for the initial requirements.
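
A minimal sketch of this design using boto3 (the table name, example values, and on-demand billing mode are illustrative choices for this post, not prescriptions):

import boto3

dynamodb = boto3.client("dynamodb")

# One table, generic key names, and a single GSI for the category/brand lookups.
dynamodb.create_table(
    TableName="ecommerce",
    BillingMode="PAY_PER_REQUEST",
    AttributeDefinitions=[
        {"AttributeName": "PK", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
        {"AttributeName": "GSI1PK", "AttributeType": "S"},
        {"AttributeName": "GSI1SK", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "PK", "KeyType": "HASH"},
        {"AttributeName": "SK", "KeyType": "RANGE"},
    ],
    GlobalSecondaryIndexes=[
        {
            "IndexName": "GSI1",
            "KeySchema": [
                {"AttributeName": "GSI1PK", "KeyType": "HASH"},
                {"AttributeName": "GSI1SK", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
)
dynamodb.get_waiter("table_exists").wait(TableName="ecommerce")

# A product item shaped to satisfy all three access patterns.
dynamodb.put_item(
    TableName="ecommerce",
    Item={
        "PK": {"S": "PRODUCT#123"},
        "SK": {"S": "PRODUCT#123"},
        "GSI1PK": {"S": "CATEGORY#electronics"},
        "GSI1SK": {"S": "BRAND#acme"},
        "Name": {"S": "Wireless Mouse"},
        "Price": {"N": "29.99"},
    },
)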

Now, let’s say after a few months and acquiring some users, you discover a crucial new feature requirement: the ability to filter products by price range within a specific category. Your current GSI1 doesn’t support this efficiently. You can query by category, but you’d have to fetch all products in that category and then filter by price on the client-side or in your application logic. This is inefficient and can become costly in terms of read capacity units as your product catalog grows.
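
For illustration, here is roughly what the workaround looks like against the existing index (same illustrative table and attribute names as the sketch above). The key point is that FilterExpression runs after the read, so every product in the category is still read and billed:

import boto3

dynamodb = boto3.client("dynamodb")

# Query GSI1 by category, then filter by price. The filter is applied after the
# items are read, so RCUs are consumed for the whole category regardless.
resp = dynamodb.query(
    TableName="ecommerce",
    IndexName="GSI1",
    KeyConditionExpression="GSI1PK = :cat",
    FilterExpression="Price BETWEEN :lo AND :hi",
    ExpressionAttributeValues={
        ":cat": {"S": "CATEGORY#electronics"},
        ":lo": {"N": "10"},
        ":hi": {"N": "50"},
    },
)
products_in_range = resp["Items"]
# Results are paginated; a real implementation would loop on LastEvaluatedKey.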

To address this new access pattern, you would need to either:

  1. Create a new GSI: You could add a GSI2 with a GSI2PK of CATEGORY#{category} and a GSI2SK that represents the product’s price. This would allow for efficient querying of products by price within a category.
  2. Backfill data for the new GSI: After creating the new GSI, you would need to run a script to update all existing product items with the necessary attributes to populate this new index (both steps are sketched below).
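
A rough sketch of both steps with boto3, assuming the illustrative table from earlier (category already stored in GSI1PK, price in a numeric Price attribute):

import boto3

dynamodb = boto3.client("dynamodb")

# Step 1: add the new index. Note the extra attribute definitions for its keys.
dynamodb.update_table(
    TableName="ecommerce",
    AttributeDefinitions=[
        {"AttributeName": "GSI2PK", "AttributeType": "S"},
        {"AttributeName": "GSI2SK", "AttributeType": "N"},
    ],
    GlobalSecondaryIndexUpdates=[
        {
            "Create": {
                "IndexName": "GSI2",
                "KeySchema": [
                    {"AttributeName": "GSI2PK", "KeyType": "HASH"},
                    {"AttributeName": "GSI2SK", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        }
    ],
)

# Step 2: backfill. Scan every item and copy category/price into the new keys so
# existing products show up in GSI2.
paginator = dynamodb.get_paginator("scan")
for page in paginator.paginate(TableName="ecommerce"):
    for item in page["Items"]:
        if not item["PK"]["S"].startswith("PRODUCT#"):
            continue  # skip non-product items
        dynamodb.update_item(
            TableName="ecommerce",
            Key={"PK": item["PK"], "SK": item["SK"]},
            UpdateExpression="SET GSI2PK = :pk, GSI2SK = :price",
            ExpressionAttributeValues={
                ":pk": item["GSI1PK"],    # CATEGORY#{category}
                ":price": item["Price"],  # numeric price becomes the sort key
            },
        )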

In a relational database, this new requirement would be a simple matter of adding an index to the price and category columns in your products table. The flexibility of SQL allows for ad-hoc queries, making it much more forgiving when access patterns evolve. With DynamoDB, what would be a minor change in a SQL database becomes a more involved data migration task. This friction can significantly slow down development velocity at a stage when speed is critical.
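
For contrast, a sketch of the relational equivalent, assuming PostgreSQL, the psycopg2 driver, and a conventional products table (the connection string and index name are placeholders):

import psycopg2

conn = psycopg2.connect("dbname=app user=app")  # placeholder connection details
with conn, conn.cursor() as cur:
    # One statement; existing rows are indexed automatically, no backfill script.
    cur.execute(
        "CREATE INDEX idx_products_category_price ON products (category, price);"
    )
conn.close()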

The Burden of Schema Enforcement and Boilerplate Code

DynamoDB is often described as “schemaless,” which offers flexibility in storing diverse data within a single table. However, this “flexibility” comes at a cost: the responsibility of maintaining data consistency and a coherent schema shifts from the database to your application code.

While you don’t need to define a rigid schema upfront at the database level (other than the primary key attributes), your application still needs to work with predictable data structures. This means you have to implement robust validation and data transformation logic to ensure that the data being written to and read from DynamoDB conforms to the expected format of your Data Transfer Objects (DTOs).

Concrete Example: User Profile Management

Consider a user profile in your application. In a SQL database, you would have a users table with clearly defined columns and data types (e.g., username as VARCHAR, email as VARCHAR, is_premium_member as BOOLEAN). The database itself enforces these constraints.
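
A minimal sketch of that relational side, again assuming PostgreSQL and psycopg2 (column names mirror the prose; the connection string is a placeholder):

import psycopg2

conn = psycopg2.connect("dbname=app user=app")  # placeholder connection details
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE users (
            id                SERIAL PRIMARY KEY,
            username          VARCHAR(50)  NOT NULL,
            email             VARCHAR(255) NOT NULL UNIQUE,
            is_premium_member BOOLEAN      NOT NULL DEFAULT FALSE
        );
    """)
    # The database itself now rejects rows that violate these types and constraints.
conn.close()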

In DynamoDB, a user item might look like this:

{
  "PK": "USER#123",
  "SK": "PROFILE",
  "Username": "john_doe",
  "Email": "[email protected]",
  "IsPremium": true
}

Now, imagine a new developer on your team accidentally introduces a change where the IsPremium attribute is saved as a string "true" instead of a boolean true. DynamoDB will happily accept this, as it doesn’t enforce a schema on non-key attributes.

This seemingly minor inconsistency can lead to subtle and hard-to-debug issues in your application. For instance, a part of your code that expects a boolean for IsPremium might fail unexpectedly. To prevent such errors, you need to write extra “boilerplate” code in your application layer:

  • Validation Logic: Before writing any data to DynamoDB, you need code that validates the data against a predefined schema (e.g., using a library like Zod or Pydantic; a minimal Pydantic sketch follows this list).
  • Data Transformation: When reading data, you may need to transform it to ensure it matches your application’s internal data models, handling cases where attributes might be missing or in an unexpected format.
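
A minimal sketch of that boilerplate using Pydantic (the model, field, and helper names are illustrative; the item shape matches the user profile example above):

from pydantic import BaseModel, StrictBool

class UserProfile(BaseModel):
    user_id: str
    username: str
    email: str
    is_premium: StrictBool  # rejects the string "true" instead of silently coercing it

def to_item(user: UserProfile) -> dict:
    """Validate and shape a profile before writing it to DynamoDB."""
    return {
        "PK": f"USER#{user.user_id}",
        "SK": "PROFILE",
        "Username": user.username,
        "Email": user.email,
        "IsPremium": user.is_premium,
    }

def from_item(item: dict) -> UserProfile:
    """Re-validate an already-deserialized item (e.g., from the boto3 Table resource) on read."""
    return UserProfile(
        user_id=item["PK"].removeprefix("USER#"),
        username=item["Username"],
        email=item["Email"],
        is_premium=item["IsPremium"],
    )

If from_item raises, the bad shape is caught at the storage boundary instead of surfacing later as a confusing bug deep inside business logic.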

This additional layer of code adds complexity and can increase the surface area for bugs. In contrast, a SQL database handles much of this schema enforcement automatically, allowing developers to focus more on business logic.

Lack of Aggregation and Tooling

A frequent and often underestimated challenge with DynamoDB is its near-total lack of built-in aggregation functions. In the world of SQL, functions like COUNT(), SUM(), AVG(), and GROUP BY are fundamental tools for generating business intelligence and analytics. DynamoDB, being a key-value store optimized for transactional (OLTP) workloads, offers none of these out of the box.

This means that any and all aggregation must be done in your application code. This approach is not only cumbersome but also has serious performance implications, as application-level code is far less optimized for data-set operations than a mature database engine.

Concrete Example: A Simple Sales Dashboard

Imagine your e-commerce platform’s leadership wants a simple, real-time dashboard answering two basic questions:

  1. What is the total sales revenue for today?
  2. What is the average order value per product category for this month?

In a SQL database, the queries are trivial:

  1. SELECT SUM(order_total) FROM orders WHERE order_date = CURRENT_DATE;
  2. SELECT category, AVG(order_total) FROM orders WHERE date_trunc('month', order_date) = date_trunc('month', CURRENT_DATE) GROUP BY category;

These queries are executed efficiently within the database engine, returning only the small, aggregated results you need.

In DynamoDB, answering the same questions is a performance nightmare. You must:

  1. Fetch all the data: You would have to Query or Scan your table to pull every single order item from the relevant time period into your application’s memory. For a moderately successful store, this could be thousands or tens of thousands of items.
  2. Pay the cost: This operation consumes a large number of Read Capacity Units (RCUs), which can be expensive. More importantly, it involves transferring a potentially huge amount of data over the network.
  3. Process in your application: You then have to write code to iterate over this entire list of orders, manually calculating the sums, counts, and averages (a rough sketch of this follows the list).
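
A rough sketch of what question 1 turns into when the database cannot aggregate for you (boto3; the ORDER#{orderID} key pattern and the OrderDate and OrderTotal attributes are illustrative assumptions):

import datetime
from decimal import Decimal

import boto3

dynamodb = boto3.client("dynamodb")
today = datetime.date.today().isoformat()

total = Decimal("0")
orders_seen = 0

# Pull every matching order into the application, page by page. The filter is
# applied after the read, so RCUs are consumed for every item scanned, not just
# the orders that end up in the result.
paginator = dynamodb.get_paginator("scan")
for page in paginator.paginate(
    TableName="ecommerce",
    FilterExpression="begins_with(PK, :order) AND OrderDate = :today",
    ExpressionAttributeValues={
        ":order": {"S": "ORDER#"},
        ":today": {"S": today},
    },
):
    for item in page["Items"]:
        total += Decimal(item["OrderTotal"]["N"])
        orders_seen += 1

print(f"Today's revenue: {total} (computed in the application from {orders_seen} orders)")

Question 2 needs the same loop plus a dictionary keyed by category; the database contributes nothing beyond the raw items.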

This process is incredibly inefficient. A task that takes milliseconds in SQL can take many seconds and consume significant memory and CPU in your application when using DynamoDB. For a startup trying to understand its own business metrics, this limitation forces a difficult choice: either build slow, expensive analytics features or invest early in a complex data pipeline to offload data to a proper analytics service like Redshift or BigQuery, adding significant architectural overhead.

Conclusion: The Right Tool for the Right Job

DynamoDB, with its single-table design, is an incredibly powerful tool for applications that have well-understood access patterns and require massive scalability. Its performance characteristics are hard to beat in those scenarios.

However, for early-stage products, where the path forward is uncertain and the ability to pivot and iterate quickly is paramount, the rigidity of DynamoDB’s access pattern requirements and the overhead of application-side schema enforcement can be significant hindrances.

A traditional SQL database, with its flexible querying capabilities and built-in schema enforcement, is often a more pragmatic choice in the initial phases of a product’s lifecycle. The focus should be on rapid feature development and discovering the product’s market fit. Once the product matures, usage grows, and performance bottlenecks in the SQL database start to appear, migrating the well-understood, high-traffic parts of your application to a finely-tuned DynamoDB single-table design can be a very effective scaling strategy.

💡 Need a Developer Who Gets It Done?

If this post helped solve your problem, imagine what we could build together! I'm a full-stack developer with expertise in Python, Django, TypeScript, and modern web technologies. I specialize in turning complex ideas into clean, efficient solutions.