The DynamoDB Book PDF

Title The DynamoDB Book
Author Alex DeBrie
Pages 448
File Size 11.6 MB
File Type PDF



Description

The DynamoDB Book Alex DeBrie

Version 1.0.1, 2020-04-16

Table of Contents

Preface

Foreword

1. What is DynamoDB?

1.1. Key Properties of DynamoDB

1.2. When to use DynamoDB

1.3. Comparisons to other databases

2. Core Concepts in DynamoDB

2.1. Basic Vocabulary

2.2. A Deeper Look: Primary keys and secondary indexes

2.3. The importance of item collections

2.4. Conclusion

3. Advanced Concepts

3.1. DynamoDB Streams

3.2. Time-to-live (TTL)

3.3. Partitions

3.4. Consistency

3.5. DynamoDB Limits

3.6. Overloading keys and indexes

3.7. Conclusion

4. The Three API Action Types

4.1. Item-based actions

4.2. Query

4.3. Scan

4.4. How DynamoDB enforces efficiency

4.5. Conclusion

5. Using the DynamoDB API

5.1. Learn how expression names and values work

5.2. Don’t use an ORM

5.3. Understand the optional properties on individual requests

5.4. Summary

6. Expressions

6.1. Key Condition Expressions

6.2. Filter Expressions

6.3. Projection expressions

6.4. Condition Expressions

6.5. Update Expressions

6.6. Summary

7. How to approach data modeling in DynamoDB

7.1. Differences with relational databases

7.2. Steps for Modeling with DynamoDB

7.3. Conclusion

8. The What, Why, and When of Single-Table Design in DynamoDB

8.1. What is single-table design

8.2. Downsides of a single-table design

8.3. When not to use single-table design

8.4. Conclusion

9. From modeling to implementation

9.1. Separate application attributes from your indexing attributes

9.2. Implement your data model at the very boundary of your application

9.3. Don’t reuse attributes across multiple indexes

9.4. Add a 'Type' attribute to every item

9.5. Write scripts to help debug access patterns

9.6. Shorten attribute names to save storage

9.7. Conclusion

10. The Importance of Strategies

11. Strategies for one-to-many relationships

11.1. Denormalization by using a complex attribute

11.2. Denormalization by duplicating data

11.3. Composite primary key + the Query API action

11.4. Secondary index + the Query API action

11.5. Composite sort keys with hierarchical data

11.6. Summary of one-to-many relationship strategies

12. Strategies for many-to-many relationships

12.1. Shallow duplication

12.2. Adjacency list

12.3. Materialized graph

12.4. Normalization and multiple requests

12.5. Conclusion

13. Strategies for filtering

13.1. Filtering with the partition key

13.2. Filtering with the sort key

13.3. Composite sort key

13.4. Sparse indexes

13.5. Filter Expressions

13.6. Client-side filtering

13.7. Conclusion

14. Strategies for sorting

14.1. Basics of sorting

14.2. Sorting on changing attributes

14.3. Ascending vs. descending

14.4. Two relational access patterns in a single item collection

14.5. Zero-padding with numbers

14.6. Faking ascending order

14.7. Conclusion

15. Strategies for Migrations

15.1. Adding new attributes to an existing entity

15.2. Adding a new entity type without relations

15.3. Adding a new entity type into an existing item collection

15.4. Adding a new entity type into a new item collection

15.5. Joining existing items into a new item collection

15.6. Using parallel scans

15.7. Conclusion

16. Additional strategies

16.1. Ensuring uniqueness on two or more attributes

16.2. Handling sequential IDs

16.3. Pagination

16.4. Singleton items

16.5. Reference counts

16.6. Conclusion

17. Data modeling examples

17.1. Notes on the data modeling examples

17.2. Conclusion

18. Building a Session Store

18.1. Introduction

18.2. ERD and Access Patterns

18.3. Data modeling walkthrough

18.4. Conclusion

19. Building an e-commerce application

19.1. Introduction

19.2. ERD and Access Patterns

19.3. Data modeling walkthrough

19.4. Conclusion

20. Building Big Time Deals

20.1. Introduction

20.2. ERD & Access Patterns

20.3. Data modeling walkthrough

20.4. Conclusion

21. Recreating GitHub’s Backend

21.1. Introduction

21.2. ERD & Access Patterns

21.3. Data modeling walkthrough

21.4. Conclusion

22. Handling Migrations in our GitHub example

22.1. Introduction

22.2. ERD & Access Patterns

22.3. Data modeling walkthrough

22.4. Conclusion

Preface

My DynamoDB story begins with an abundance of unfounded confidence, slowly beaten out of me by the club of experience.

I first used DynamoDB in 2015 while I was building an internal application for the engineering team I was on. Whenever there was a friction point in the development process, an engineer could type a quick command in Slack. That message would be sent to my webhook and stored for posterity. Occasionally, we would pull all the data out of my table, look it over, nod approvingly, and move along.

I was proud of my application. But it’s a good thing it never had to scale past single-digit requests per day.

At my next job, I helped implement a few different data models. I was all-in on the serverless ecosystem at this point, and DynamoDB was the en vogue database for serverless applications. I read what I could and thought I was pretty good at DynamoDB as I implemented an RDBMS data model on top of DynamoDB.

In December of 2017, I listened to podcasts of breakout sessions from AWS re:Invent on my commute to and from work. One morning, I stumbled upon a talk on Advanced Design Patterns with DynamoDB by some guy named Rick Houlihan. I expected a tidy review of the impressive DynamoDB knowledge I had already gathered and stored. Wrong.

That talk changed my career. I couldn’t believe the witchcraft I was hearing. Modeling with NoSQL databases is nothing like modeling with relational databases! Storage is cheap; it’s compute that’s sacred! Put all your data into a single table!

When I got home that night, I watched the video recording of the same session. Sure enough, my ears weren’t deceiving me. Rick Houlihan was an assassin, and DynamoDB was his weapon of choice.

Over the next month, I spent my free time trying to decipher the ins and outs of Rick’s talk. I compared my notes against DynamoDB documentation and tried replicating his models in my own examples. The more I read, the more excited I became. My Christmas holiday was focused on sharing this knowledge with others, and in January 2018, I published DynamoDBGuide.com, a website aimed at sharing my newfound love for DynamoDB.

In the time since hitting publish on that site, I’ve learned much more about DynamoDB. DynamoDBGuide.com will get you started, but it’s not going to take you to the top of the mountain.

This is the book I wish I had when I started with DynamoDB. The first few chapters will warm you up with the basics on DynamoDB features and characteristics. But to paraphrase James Carville, "It’s the data model, stupid!" The hard part about DynamoDB is making the shift from an RDBMS mindset to a NoSQL mindset. We’ll go deep on DynamoDB data modeling in this book, from discussion of the DynamoDB API, to various strategies to use when using DynamoDB, to five full-length walkthroughs. There are some things you can only learn by doing, but it doesn’t hurt to have a guide along the way.

And that Rick Houlihan guy? He now has the most popular re:Invent session year after year, enjoys a cult following on Twitter, and somehow agreed to write the foreword to this book.

Acknowledgements

This book would not have been possible without help from so many people. I’m sure to leave out many of them.

Thanks to Rick Houlihan for introducing me to DynamoDB, for answering my questions over the years, for taking it easy on me in a live debate in early 2020, and for writing the foreword to this book.

Thanks to many other folks at AWS for building a great service, helping increase my engagement with DynamoDB, and teaching me. There are too many to name here, but Seayoung Rhee, Lizzy Nguyen, Pete Naylor, Edin Zulich, and Colin Lazier have all been great helps.

This book has been significantly improved by reviews, discussions, and encouragement from a number of people in the community. Special thanks to Paul Swail, Jeremy Daly, Jared Short, Corey Quinn, and Shawn Wang (aka Swyx) for encouragement and help with the draft, and thanks to Chris Biscardi for consistent support and encouragement. Also, thanks to so many of you that sent me messages or responded to emails and gave feedback on preview chapters.

Thanks to Ryan Hinojosa for great design help (a perennial weakness for me) including detailed feedback on fonts and PowerPoint slides. He’s the reason your eyes aren’t bleeding as you read this.

Thanks, also, to Daniel Vassallo for sharing not only his experience self-publishing a technical book but also his AsciiDoc config without any expectation of return. I spent way too much time trying to configure AsciiDoc myself.

Thanks to David Wells for sharing his marketing knowledge with me and for the breakdown of my landing page.

Thanks to Andrea Passwater for assistance with copy on the DynamoDB Book website and for teaching me a ton about writing when we worked together.

Thanks to my parents, siblings, in-laws, and extended family members that stood by me as I moved from lawyer to developer to "self-employed" (unemployed?) author. I’m grateful for the support from all of you.

Finally, thanks to my wonderful wife, Elsie. For unrelenting support. For reading every single word (sometimes multiple times!) in a 450-page book about freaking DynamoDB. For wrangling our kids as I worked early mornings, late evenings, and way too many weekends. DynamoDB changed my career, but you changed my life. I love you!

Foreword

For some, data modeling is a passion. Identifying relationships, structuring the data, and designing optimal queries is like solving a complex puzzle. Like a builder when a job is done and a structure is standing where an empty lot used to be, the feelings are the same: satisfaction and achievement, validation of skill, and pride in the product of hard work.

Data has been my life for a long time. Throughout a career that has spanned three decades, data modeling has been a constant. I cannot remember working on a project where I did not have a hand in the data layer implementation, and if there is one place in the stack I am most comfortable, it is t...

