DSA Hash Maps
A hash map (also called a hash table) is a data structure that stores data in key-value pairs. It provides extremely fast access to values by using a technique called hashing to directly calculate where a value is stored in memory, rather than searching through every element.
In Python, the built-in dictionary (dict) is a hash map implementation. It is one of the most frequently used data structures in real-world Python programming.
How Hashing Works
When a key-value pair is inserted, a special function called a hash function takes the key and converts it into a number — the hash code. That number determines the position (index) in an internal array where the value is stored.
When retrieving a value, the same hash function is applied to the key to instantly calculate the position. This is why lookup is O(1) — no searching required.
# Python's built-in hash function
print(hash("apple")) # Output: some integer (varies per session)
print(hash("banana")) # A different integer
Using Python Dictionaries as Hash Maps
Creating a Dictionary
# Basic key-value storage
student = {
"name": "Riya",
"age": 21,
"grade": "A"
}
print(student["name"]) # Output: Riya
print(student["age"]) # Output: 21
Adding and Updating Entries — O(1)
inventory = {}
# Add new key-value pairs
inventory["apples"] = 50
inventory["bananas"] = 30
inventory["oranges"] = 20
print(inventory)
# Output: {'apples': 50, 'bananas': 30, 'oranges': 20}
# Update an existing key
inventory["apples"] = 65
print(inventory["apples"]) # Output: 65
Accessing Values Safely — O(1)
scores = {"Alice": 90, "Bob": 85}
# Direct access — raises KeyError if key doesn't exist
# print(scores["Carol"]) # Would raise KeyError
# Safe access using .get() — returns None or a default value
print(scores.get("Carol")) # Output: None
print(scores.get("Carol", 0)) # Output: 0 (default)
print(scores.get("Alice", 0)) # Output: 90
Deleting an Entry — O(1)
config = {"theme": "dark", "language": "Python", "debug": True}
del config["debug"]
print(config) # Output: {'theme': 'dark', 'language': 'Python'}
# Or use pop() to get the removed value
removed = config.pop("theme")
print("Removed:", removed) # Output: Removed: dark
Checking if a Key Exists — O(1)
phonebook = {"Alice": "9876543210", "Bob": "9123456780"}
if "Alice" in phonebook:
print("Found:", phonebook["Alice"]) # Output: Found: 9876543210
if "Carol" not in phonebook:
print("Carol not in phonebook")
Iterating Over a Dictionary
country_capitals = {
"India": "New Delhi",
"France": "Paris",
"Japan": "Tokyo"
}
# Iterate over keys
for country in country_capitals:
print(country)
# Iterate over values
for capital in country_capitals.values():
print(capital)
# Iterate over key-value pairs
for country, capital in country_capitals.items():
print(f"{country} --> {capital}")
Handling Hash Collisions
A collision occurs when two different keys produce the same hash code and are mapped to the same position. Python handles this internally using a technique called open addressing, so collisions are managed automatically. However, understanding that collisions exist helps explain why hash maps occasionally have O(n) worst-case performance.
Practical Use Cases of Hash Maps
1. Frequency Counter — Count Occurrences
def count_characters(text):
frequency = {}
for char in text:
frequency[char] = frequency.get(char, 0) + 1
return frequency
result = count_characters("mississippi")
print(result)
# Output: {'m': 1, 'i': 4, 's': 4, 'p': 2}
2. Two Sum Problem (Classic Interview Question)
Given a list of numbers, find two numbers that add up to a target.
def two_sum(nums, target):
seen = {} # Stores: value -> index
for i, num in enumerate(nums):
complement = target - num
if complement in seen:
return [seen[complement], i]
seen[num] = i
return []
print(two_sum([2, 7, 11, 15], 9)) # Output: [0, 1]
# 2 + 7 = 9 — indices 0 and 1
3. Caching / Memoization
cache = {}
def expensive_calculation(n):
if n in cache:
return cache[n]
result = n * n * n # Simulate expensive work
cache[n] = result
return result
print(expensive_calculation(5)) # Calculated: 125
print(expensive_calculation(5)) # Returned from cache: 125
4. Grouping Data (Anagram Grouping)
def group_anagrams(words):
groups = {}
for word in words:
key = "".join(sorted(word)) # Sorted letters as key
if key not in groups:
groups[key] = []
groups[key].append(word)
return list(groups.values())
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# Output: [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
Big O Summary for Hash Maps
- Insert — O(1) average, O(n) worst case
- Access / Lookup — O(1) average, O(n) worst case
- Delete — O(1) average
- Search — O(1) average
- Space — O(n)
Key Points
- A hash map stores key-value pairs and uses a hash function for direct O(1) access.
- Python's
dictis a built-in, highly optimized hash map. - Use
.get()for safe access that avoidsKeyError. - Collisions are handled automatically in Python but can degrade performance in the worst case.
- Hash maps are ideal for frequency counting, caching, grouping, and fast lookups.
- The Two Sum problem is a classic example of how hash maps dramatically reduce time complexity from O(n²) to O(n).
