DSA Graphs
A graph is a non-linear data structure that consists of a set of vertices (also called nodes) and a set of edges that connect pairs of vertices. Graphs are one of the most versatile and powerful data structures, used to model real-world systems such as social networks, road maps, web page links, flight routes, and dependency systems.
Core Graph Terminology
- Vertex (Node) — A point in the graph representing an entity.
- Edge — A connection between two vertices.
- Adjacent — Two vertices are adjacent if directly connected by an edge.
- Degree — The number of edges connected to a vertex.
- Path — A sequence of vertices connected by edges.
- Cycle — A path that starts and ends at the same vertex.
- Connected Graph — Every vertex has a path to every other vertex.
- Weight — A value assigned to an edge (e.g., distance, cost).
Types of Graphs
Undirected Graph
Edges have no direction — if A connects to B, then B also connects to A. Like a two-way road.
Directed Graph (Digraph)
Edges have direction — if A connects to B, it does not mean B connects to A. Like a one-way street.
Weighted Graph
Each edge has an associated weight or cost. Used in shortest path problems like GPS navigation.
Unweighted Graph
All edges are treated equally — no weights. Used to determine connectivity.
Graph Representation in Python
1. Adjacency List (Most Common)
Each vertex stores a list of its neighboring vertices. Memory-efficient for sparse graphs (few edges).
# Undirected graph as adjacency list
graph = {
"A": ["B", "C"],
"B": ["A", "D", "E"],
"C": ["A", "F"],
"D": ["B"],
"E": ["B", "F"],
"F": ["C", "E"]
}
print(graph["B"]) # Output: ['A', 'D', 'E'] — neighbors of B
2. Adjacency Matrix
A 2D grid where each cell [i][j] indicates whether an edge exists between vertex i and vertex j. Uses more memory but allows O(1) edge lookup.
# 4 vertices: 0, 1, 2, 3
# 1 = edge exists, 0 = no edge
adj_matrix = [
[0, 1, 1, 0],
[1, 0, 0, 1],
[1, 0, 0, 1],
[0, 1, 1, 0]
]
# Check if edge exists between vertex 0 and 2
print(adj_matrix[0][2]) # Output: 1 (edge exists)
Graph Traversal
Traversal means visiting every vertex in the graph. There are two fundamental approaches.
1. Breadth-First Search (BFS)
BFS explores all vertices at the current depth before moving to the next level. It uses a queue and is ideal for finding the shortest path in unweighted graphs.
from collections import deque
def bfs(graph, start):
visited = set()
queue = deque([start])
visited.add(start)
order = []
while queue:
vertex = queue.popleft()
order.append(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
return order
graph = {
"A": ["B", "C"],
"B": ["A", "D", "E"],
"C": ["A", "F"],
"D": ["B"],
"E": ["B", "F"],
"F": ["C", "E"]
}
print(bfs(graph, "A"))
# Output: ['A', 'B', 'C', 'D', 'E', 'F']
2. Depth-First Search (DFS)
DFS goes as deep as possible along each branch before backtracking. It uses a stack (or recursion) and is useful for cycle detection, topological sorting, and connected components.
def dfs(graph, start, visited=None):
if visited is None:
visited = set()
visited.add(start)
print(start, end=" ")
for neighbor in graph[start]:
if neighbor not in visited:
dfs(graph, neighbor, visited)
dfs(graph, "A")
# Output: A B D E F C (order may vary by neighbor order)
Detecting a Cycle in a Graph
def has_cycle(graph):
visited = set()
def dfs_cycle(vertex, parent):
visited.add(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
if dfs_cycle(neighbor, vertex):
return True
elif neighbor != parent:
return True # Back edge found — cycle exists
return False
for vertex in graph:
if vertex not in visited:
if dfs_cycle(vertex, None):
return True
return False
cyclic_graph = {"A": ["B"], "B": ["C"], "C": ["A"]}
print("Cycle:", has_cycle(cyclic_graph)) # Output: Cycle: True
Weighted Graph and Shortest Path
Representing a Weighted Graph
# Weighted adjacency list: {vertex: [(neighbor, weight), ...]}
weighted_graph = {
"A": [("B", 4), ("C", 2)],
"B": [("D", 5)],
"C": [("B", 1), ("D", 8)],
"D": []
}
Dijkstra's Algorithm (Shortest Path)
Dijkstra's algorithm finds the shortest path from a source vertex to all other vertices in a weighted graph with non-negative weights. It uses a min heap (priority queue) for efficiency.
import heapq
def dijkstra(graph, start):
distances = {vertex: float('inf') for vertex in graph}
distances[start] = 0
heap = [(0, start)] # (cost, vertex)
while heap:
cost, vertex = heapq.heappop(heap)
if cost > distances[vertex]:
continue # Already found a shorter path
for neighbor, weight in graph[vertex]:
new_dist = cost + weight
if new_dist < distances[neighbor]:
distances[neighbor] = new_dist
heapq.heappush(heap, (new_dist, neighbor))
return distances
weighted_graph = {
"A": [("B", 4), ("C", 2)],
"B": [("D", 5)],
"C": [("B", 1), ("D", 8)],
"D": []
}
print(dijkstra(weighted_graph, "A"))
# Output: {'A': 0, 'B': 3, 'C': 2, 'D': 8}
# Shortest path A → B = 2 + 1 = 3 (via C)
Big O Summary for Graph Operations
- BFS / DFS — O(V + E), where V = vertices, E = edges
- Add vertex/edge (adjacency list) — O(1)
- Edge lookup (adjacency matrix) — O(1)
- Edge lookup (adjacency list) — O(degree)
- Dijkstra's algorithm — O((V + E) log V)
- Space (adjacency list) — O(V + E)
Key Points
- A graph consists of vertices and edges and models connected relationships.
- Graphs can be directed or undirected, weighted or unweighted.
- Adjacency lists are memory-efficient for sparse graphs; adjacency matrices allow O(1) edge lookup.
- BFS uses a queue and finds shortest paths in unweighted graphs.
- DFS uses recursion/stack and is used for cycle detection, path finding, and topological sorting.
- Dijkstra's algorithm efficiently finds shortest paths in weighted graphs using a min heap.
