DSA Hash Maps

A hash map (also called a hash table) is a data structure that stores data in key-value pairs. It provides extremely fast access to values by using a technique called hashing to directly calculate where a value is stored in memory, rather than searching through every element.

In Python, the built-in dictionary (dict) is a hash map implementation. It is one of the most frequently used data structures in real-world Python programming.

How Hashing Works

When a key-value pair is inserted, a special function called a hash function takes the key and converts it into a number — the hash code. That number determines the position (index) in an internal array where the value is stored.

When retrieving a value, the same hash function is applied to the key to instantly calculate the position. This is why lookup is O(1) — no searching required.


# Python's built-in hash function
print(hash("apple"))    # Output: some integer (varies per session)
print(hash("banana"))   # A different integer

Using Python Dictionaries as Hash Maps

Creating a Dictionary


# Basic key-value storage
student = {
    "name": "Riya",
    "age": 21,
    "grade": "A"
}

print(student["name"])    # Output: Riya
print(student["age"])     # Output: 21

Adding and Updating Entries — O(1)


inventory = {}

# Add new key-value pairs
inventory["apples"] = 50
inventory["bananas"] = 30
inventory["oranges"] = 20

print(inventory)
# Output: {'apples': 50, 'bananas': 30, 'oranges': 20}

# Update an existing key
inventory["apples"] = 65
print(inventory["apples"])   # Output: 65

Accessing Values Safely — O(1)


scores = {"Alice": 90, "Bob": 85}

# Direct access — raises KeyError if key doesn't exist
# print(scores["Carol"])   # Would raise KeyError

# Safe access using .get() — returns None or a default value
print(scores.get("Carol"))           # Output: None
print(scores.get("Carol", 0))        # Output: 0 (default)
print(scores.get("Alice", 0))        # Output: 90

Deleting an Entry — O(1)


config = {"theme": "dark", "language": "Python", "debug": True}

del config["debug"]
print(config)   # Output: {'theme': 'dark', 'language': 'Python'}

# Or use pop() to get the removed value
removed = config.pop("theme")
print("Removed:", removed)   # Output: Removed: dark

Checking if a Key Exists — O(1)


phonebook = {"Alice": "9876543210", "Bob": "9123456780"}

if "Alice" in phonebook:
    print("Found:", phonebook["Alice"])   # Output: Found: 9876543210

if "Carol" not in phonebook:
    print("Carol not in phonebook")

Iterating Over a Dictionary


country_capitals = {
    "India": "New Delhi",
    "France": "Paris",
    "Japan": "Tokyo"
}

# Iterate over keys
for country in country_capitals:
    print(country)

# Iterate over values
for capital in country_capitals.values():
    print(capital)

# Iterate over key-value pairs
for country, capital in country_capitals.items():
    print(f"{country} --> {capital}")

Handling Hash Collisions

A collision occurs when two different keys produce the same hash code and are mapped to the same position. Python handles this internally using a technique called open addressing, so collisions are managed automatically. However, understanding that collisions exist helps explain why hash maps occasionally have O(n) worst-case performance.

Practical Use Cases of Hash Maps

1. Frequency Counter — Count Occurrences


def count_characters(text):
    frequency = {}
    for char in text:
        frequency[char] = frequency.get(char, 0) + 1
    return frequency

result = count_characters("mississippi")
print(result)
# Output: {'m': 1, 'i': 4, 's': 4, 'p': 2}

2. Two Sum Problem (Classic Interview Question)

Given a list of numbers, find two numbers that add up to a target.


def two_sum(nums, target):
    seen = {}   # Stores: value -> index
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i
    return []

print(two_sum([2, 7, 11, 15], 9))   # Output: [0, 1]
# 2 + 7 = 9 — indices 0 and 1

3. Caching / Memoization


cache = {}

def expensive_calculation(n):
    if n in cache:
        return cache[n]
    result = n * n * n   # Simulate expensive work
    cache[n] = result
    return result

print(expensive_calculation(5))   # Calculated: 125
print(expensive_calculation(5))   # Returned from cache: 125

4. Grouping Data (Anagram Grouping)


def group_anagrams(words):
    groups = {}
    for word in words:
        key = "".join(sorted(word))   # Sorted letters as key
        if key not in groups:
            groups[key] = []
        groups[key].append(word)
    return list(groups.values())

words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# Output: [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Big O Summary for Hash Maps

  • Insert — O(1) average, O(n) worst case
  • Access / Lookup — O(1) average, O(n) worst case
  • Delete — O(1) average
  • Search — O(1) average
  • Space — O(n)

Key Points

  • A hash map stores key-value pairs and uses a hash function for direct O(1) access.
  • Python's dict is a built-in, highly optimized hash map.
  • Use .get() for safe access that avoids KeyError.
  • Collisions are handled automatically in Python but can degrade performance in the worst case.
  • Hash maps are ideal for frequency counting, caching, grouping, and fast lookups.
  • The Two Sum problem is a classic example of how hash maps dramatically reduce time complexity from O(n²) to O(n).

Leave a Comment