Title | The DynamoDB Book |
---|---|
Author | Alex DeBrie |
Pages | 448 |
The DynamoDB Book Alex DeBrie
Version 1.0.1, 2020-04-16
Table of Contents
Preface
Foreword
1. What is DynamoDB?
1.1. Key Properties of DynamoDB
1.2. When to use DynamoDB
1.3. Comparisons to other databases
2. Core Concepts in DynamoDB
2.1. Basic Vocabulary
2.2. A Deeper Look: Primary keys and secondary indexes
2.3. The importance of item collections
2.4. Conclusion
3. Advanced Concepts
3.1. DynamoDB Streams
3.2. Time-to-live (TTL)
3.3. Partitions
3.4. Consistency
3.5. DynamoDB Limits
3.6. Overloading keys and indexes
3.7. Conclusion
4. The Three API Action Types
4.1. Item-based actions
4.2. Query
4.3. Scan
4.4. How DynamoDB enforces efficiency
4.5. Conclusion
5. Using the DynamoDB API
5.1. Learn how expression names and values work
5.2. Don’t use an ORM
5.3. Understand the optional properties on individual requests
5.4. Summary
6. Expressions
6.1. Key Condition Expressions
6.2. Filter Expressions
6.3. Projection expressions
6.4. Condition Expressions
6.5. Update Expressions
6.6. Summary
7. How to approach data modeling in DynamoDB
7.1. Differences with relational databases
7.2. Steps for Modeling with DynamoDB
7.3. Conclusion
8. The What, Why, and When of Single-Table Design in DynamoDB
8.1. What is single-table design
8.2. Downsides of a single-table design
8.3. When not to use single-table design
8.4. Conclusion
9. From modeling to implementation
9.1. Separate application attributes from your indexing attributes
9.2. Implement your data model at the very boundary of your application
9.3. Don’t reuse attributes across multiple indexes
9.4. Add a 'Type' attribute to every item
9.5. Write scripts to help debug access patterns
9.6. Shorten attribute names to save storage
9.7. Conclusion
10. The Importance of Strategies
11. Strategies for one-to-many relationships
11.1. Denormalization by using a complex attribute
11.2. Denormalization by duplicating data
11.3. Composite primary key + the Query API action
11.4. Secondary index + the Query API action
11.5. Composite sort keys with hierarchical data
11.6. Summary of one-to-many relationship strategies
12. Strategies for many-to-many relationships
12.1. Shallow duplication
12.2. Adjacency list
12.3. Materialized graph
12.4. Normalization and multiple requests
12.5. Conclusion
13. Strategies for filtering
13.1. Filtering with the partition key
13.2. Filtering with the sort key
13.3. Composite sort key
13.4. Sparse indexes
13.5. Filter Expressions
13.6. Client-side filtering
13.7. Conclusion
14. Strategies for sorting
14.1. Basics of sorting
14.2. Sorting on changing attributes
14.3. Ascending vs. descending
14.4. Two relational access patterns in a single item collection
14.5. Zero-padding with numbers
14.6. Faking ascending order
14.7. Conclusion
15. Strategies for Migrations
15.1. Adding new attributes to an existing entity
15.2. Adding a new entity type without relations
15.3. Adding a new entity type into an existing item collection
15.4. Adding a new entity type into a new item collection
15.5. Joining existing items into a new item collection
15.6. Using parallel scans
15.7. Conclusion
16. Additional strategies
16.1. Ensuring uniqueness on two or more attributes
16.2. Handling sequential IDs
16.3. Pagination
16.4. Singleton items
16.5. Reference counts
16.6. Conclusion
17. Data modeling examples
17.1. Notes on the data modeling examples
17.2. Conclusion
18. Building a Session Store
18.1. Introduction
18.2. ERD and Access Patterns
18.3. Data modeling walkthrough
18.4. Conclusion
19. Building an e-commerce application
19.1. Introduction
19.2. ERD and Access Patterns
19.3. Data modeling walkthrough
19.4. Conclusion
20. Building Big Time Deals
20.1. Introduction
20.2. ERD & Access Patterns
20.3. Data modeling walkthrough
20.4. Conclusion
21. Recreating GitHub’s Backend
21.1. Introduction
21.2. ERD & Access Patterns
21.3. Data modeling walkthrough
21.4. Conclusion
22. Handling Migrations in our GitHub example
22.1. Introduction
22.2. ERD & Access Patterns
22.3. Data modeling walkthrough
22.4. Conclusion
Preface

My DynamoDB story begins with an abundance of unfounded confidence, slowly beaten out of me by the club of experience.

I first used DynamoDB in 2015 while I was building an internal application for the engineering team I was on. Whenever there was a friction point in the development process, an engineer could type a quick command in Slack. That message would be sent to my webhook and stored for posterity. Occasionally, we would pull all the data out of my table, look it over, nod approvingly, and move along.

I was proud of my application. But it’s a good thing it never had to scale past single-digit requests per day.

At my next job, I helped implement a few different data models. I was all-in on the serverless ecosystem at this point, and DynamoDB was the en vogue database for serverless applications. I read what I could and thought I was pretty good at DynamoDB as I implemented an RDBMS data model on top of DynamoDB.

In December of 2017, I listened to podcasts of breakout sessions from AWS re:Invent on my commute to and from work. One morning, I stumbled upon a talk on Advanced Design Patterns with DynamoDB by some guy named Rick Houlihan. I expected a tidy review of the impressive DynamoDB knowledge I had already gathered and stored.

Wrong.

That talk changed my career. I couldn’t believe the witchcraft I was hearing. Modeling with NoSQL databases is nothing like modeling with relational databases! Storage is cheap; it’s compute that’s sacred! Put all your data into a single table!

When I got home that night, I watched the video recording of the same session. Sure enough, my ears weren’t deceiving me. Rick Houlihan was an assassin, and DynamoDB was his weapon of choice.

Over the next month, I spent my free time trying to decipher the ins and outs of Rick’s talk. I compared my notes against DynamoDB documentation and tried replicating his models in my own examples. The more I read, the more excited I became. My Christmas holiday was focused on sharing this knowledge with others, and in January 2018, I published DynamoDBGuide.com, a website aimed at sharing my newfound love for DynamoDB.

In the time since hitting publish on that site, I’ve learned much more about DynamoDB. DynamoDBGuide.com will get you started, but it’s not going to take you to the top of the mountain.

This is the book I wish I had when I started with DynamoDB. The first few chapters will warm you up with the basics on DynamoDB features and characteristics. But to paraphrase James Carville, "It’s the data model, stupid!" The hard part about DynamoDB is making the shift from an RDBMS mindset to a NoSQL mindset. We’ll go deep on DynamoDB data modeling in this book, from discussion of the DynamoDB API, to various strategies to use when using DynamoDB, to five full-length walkthroughs. There are some things you can only learn by doing, but it doesn’t hurt to have a guide along the way.

And that Rick Houlihan guy? He now has the most popular re:Invent session year after year, enjoys a cult following on Twitter, and somehow agreed to write the foreword to this book.
Acknowledgements

This book would not have been possible without help from so many people. I’m sure to leave out many of them.

Thanks to Rick Houlihan for introducing me to DynamoDB, for answering my questions over the years, for taking it easy on me in a live debate in early 2020, and for writing the foreword to this book.

Thanks to many other folks at AWS for building a great service, helping increase my engagement with DynamoDB, and teaching me. There are too many to name here, but Seayoung Rhee, Lizzy Nguyen, Pete Naylor, Edin Zulich, and Colin Lazier have all been great helps.

This book has been significantly improved by reviews, discussions, and encouragement from a number of people in the community. Special thanks to Paul Swail, Jeremy Daly, Jared Short, Corey Quinn, and Shawn Wang (aka Swyx) for encouragement and help with the draft, and thanks to Chris Biscardi for consistent support and encouragement. Also, thanks to so many of you who sent me messages or responded to emails and gave feedback on preview chapters.

Thanks to Ryan Hinojosa for great design help (a perennial weakness for me), including detailed feedback on fonts and PowerPoint slides. He’s the reason your eyes aren’t bleeding as you read this. Thanks, also, to Daniel Vassallo for sharing not only his experience self-publishing a technical book but also his AsciiDoc config without any expectation of return. I spent way too much time trying to configure AsciiDoc myself.

Thanks to David Wells for sharing his marketing knowledge with me and for the breakdown of my landing page. Thanks to Andrea Passwater for assistance with copy on the DynamoDB Book website and for teaching me a ton about writing when we worked together.
Thanks to my parents, siblings, in-laws, and extended family members that stood by me as I moved from lawyer to developer to "self-employed" (unemployed?) author. I’m grateful for the support from all of you.

Finally, thanks to my wonderful wife, Elsie. For unrelenting support. For reading every single word (sometimes multiple times!) in a 450-page book about freaking DynamoDB. For wrangling our kids as I worked early mornings, late evenings, and way too many weekends. DynamoDB changed my career, but you changed my life. I love you!
Foreword

For some, data modeling is a passion. Identifying relationships, structuring the data, and designing optimal queries is like solving a complex puzzle. Like a builder when a job is done and a structure is standing where an empty lot used to be, the feelings are the same: satisfaction and achievement, validation of skill, and pride in the product of hard work.

Data has been my life for a long time. Throughout a career that has spanned three decades, data modeling has been a constant. I cannot remember working on a project where I did not have a hand in the data layer implementation, and if there is one place in the stack I am most comfortable, it is t...