# Chapter 5: Hash Tables

**i**nput Key; Output Value

* Python has hash tables; they’re called *dictionaries*.
* *DNS resolution* :mapping a web address to an IP address
* **Using hash tables as a cache**
  * You get the web page a lot faster
  * do less work
  * Caching is a common way to make things faster. All big websites use caching. And that data is cached in a hash
    * Ex:

      ```
      cache ={}
      ​
      def get_page(url):
        if cache.get(url):
          return cache[url]
        else:
          data = get_data_from_server(url)
          cache[url] = data
          return data
      ```

      * This is called a *collision*: two keys have been assigned the same slot.
        * *Your hash function is really important.* Your hash function mapped all the keys to a single slot. Ideally, your hash function would map keys evenly all over the hash.
        * If those linked lists get long, it slows down your hash table a lot. But they won’t get long if you *use a good hash function*!
      * To avoid collisions
        * **Load factor**
          * Calculation: Load factor = number of items in hash table / total number of slots
          * Load factor measures how many empty slots remain in your hash table.
          * Once the load factor starts to grow, you need to add more slots to your hash table. This is called *resizing*.
          * With a lower load factor, you’ll have fewer collisions, and your table will perform better.
        * **A good hash function**
          * what is a good hash function? look up the SHA function
      * To recap, hashes are good for
        * Modeling relationships from one thing to another thing
        * Filtering out duplicates
        * Caching/memorizing data instead of making your server do work
