9.10 Hash table resizing PDF

Title 9.10 Hash table resizing
Course Data Abstraction and Structures
Institution De Anza College
Pages 7
File Size 319.2 KB
File Type PDF
Total Downloads 62
Total Views 157

Summary

This lecture on the hash table resizing. The number of buckets in a hash table can be increased while all remaining elements are preserved. The next prime number N * 2 is widely used to resize a hash table of N buckets. The resize operation's time complexity is O since a new array is assigned and al...


Description

9.10 Hash table resizing Class: CIS22C - Lecture Notes Lecture: 9. Date: 05/15/2021 Description: This lecture on the hash table resizing. The number of buckets in a hash table can be increased while all remaining elements are preserved. The next prime number N * 2 is widely used to resize a hash table of N buckets. The resize operation's time complexity is O since a new array is assigned and all objects from the old array are re-inserted into it N. Resize operation A hash tableresizeoperation increases the number of buckets, while preserving all existing items. A hash table with N buckets is commonly resized to the next prime number ≥ N * 2. A new array is allocated, and all items from the old array are re-inserted into the new array, making the resize operation's time complexityON.

HashResize(hashTable, currentSize) { newSize = nextPrime(currentSize * 2) newArray = Allocate new array of size newSize Set all entries in newArray to EmptySinceStart bucket = 0 while (bucket < currentSize) { if (hashTable[bucket] is not Empty) { key = hashTable[bucket] HashInsert(newArray, key) } bucket = bucket + 1 } return newArray } Hash(key, tableSize) {

return key % tableSize }

 When resizing a hash table with 11 buckets and 7 items, the new size is computed as the next prime number >= 22, which is 23.  A new array is allocated with 23 buckets for the resized hash table.  When rehashing 88, the bucket index is computed as 88 % 23 = 19. newArray[19] is assigned with 88.  The key from each of hashTable's non-empty buckets is rehashed and inserted into newArray.  newArray is returned and is the resized hash table.

Q&A Suppose the hash table below is resized. The hash function used both before and after resizing is: hash(key) = key % N, where N is the table size.

q) What is the most likely allocated size for the resized hash table? a: Resizing will use the next prime number >= 14, which is 17. q) How many elements are in the hash table after resizing? a: The 4 existing elements are preserved during the resize. q) At what index does 99 reside in the resized table? a: 99 is rehashed using the new table size, and is inserted at index 99 % 17 = 14.

When to resize A hash table'sload factoris the number of items in the hash table divided by the number of buckets. Ex: A hash table with 18 items and 31 buckets has a load factor of18/310.58. The load factor may be used to decide when to resize the hash table. An implementation may choose to resize the hash table when one or more of the following values exceeds a certain threshold: Load factor When using open-addressing, the number of collisions during an insertion When using chaining, the size of a bucket's linked-list

 A hash table with chaining will inevitably have large linked-lists after inserting many items.  The largest bucket length can be used as resizing criteria. Ex: The hash table is resized when a bucket length is >= 4.

 A hash table with 2 items and 5 buckets has a load factor of 2 / 5 = 0.4.  Inserting 22 increases the load factor to 3 / 5 = 0.6.  An implementation may choose to resize the hash table whenever the load factor is ≥ 0.6.

1. Inserting 38 into the hash table with linear probing encounters 5 collisions before placing 38 in bucket 10. 2. If the hash table's resize criteria were to resize after encountering⌊11/3⌋ 3collisions, then the insertion would cause a resize.

Q&A q) A hash table implementation must use only one criteria for resizing.

a: Multiple criteria can be used together. Ex: Resizing could occur when either the load factor exceeds 0.5 or the number of items in a bucket exceeds 11. q) In a hash table using open addressing, the load factor cannot exceed 1.0. a: With open addressing, one bucket holds at most one item. So a table with N buckets can have at most N items, making the maximum possible load factor N / N = 1.0. q) In a hash table using chaining, the load factor cannot exceed 1.0. a: The load factor is the number of items divided by the number of buckets. Multiple items can be placed in one bucket when using chaining. So the hash table could have more items than buckets, making the load factor > 1.0. q) When resizing to a larger size, the load factor is guaranteed to decrease. a: The load factor is computed as X / Y, where X is the number of items and Y the number of buckets. Resizing will not change X. When Y increases, the fraction X / Y decreases. Therefore, the load factor always decreases when resizing to a larger size. q) If the hash table was using chaining, the load factor could be ≤ 0.1, but an individual bucket could still contain 10 items. a: In the worst case scenario, the 10 items are the only items in the table and are all added to the same bucket. The load factor, 10 / 101 = 0.099, is ≤ 0.1. q) If the hash table was using open addressing, a load factor < 0.25 guarantees that no more than 25 collisions will occur during insertion. a: A load factor < 0.25 means fewer than 25% of the 101 buckets are in use. Therefore, at most 25 buckets are in use. An insertion only collides with nonempty buckets, so no more than 25 collisions will occur during insertion. q) If the hash table was using open addressing, a load factor > 0.9 guarantees a collision during insertion. a: Unless the load factor is 1.0, empty buckets exist in the table. So if 0.9 < load factor < 1.0, a key's hash may lead to an empty bucket, allowing insertion without collision....


Similar Free PDFs