SciPy CSGraph – Compressed Sparse Graph

Graphs represent relationships in data, such as social networks or transportation systems. When a graph is large and mostly empty (i.e., sparse), storing it as a dense matrix wastes memory. Compressed Sparse Graphs (CSGraph) avoid this by storing only the actual connections (non-zero values) in a sparse matrix format, which saves memory and speeds up graph operations.

SciPy’s scipy.sparse.csgraph module provides tools and algorithms to work with these sparse graph structures using formats like CSR (Compressed Sparse Row) or CSC (Compressed Sparse Column).

Key Functionalities of scipy.sparse.csgraph

The **scipy.sparse.csgraph** subpackage offers a wide range of functions and algorithms for efficient graph analysis. The main ones are:

* **Shortest Paths:** Efficiently compute shortest distances using Dijkstra, Bellman-Ford or Floyd-Warshall.
* **Graph Traversal:** Perform BFS (breadth_first_order) and DFS (depth_first_order) on sparse graphs.
* **Connected Components:** Identify connected or strongly connected parts of the graph.
* **Minimum Spanning Tree (MST):** Find the subset of edges that connects all nodes with minimum total weight.
* **Maximum Flow:** Calculate how much "flow" can go from one node (source) to another (sink).
* **Graph Conversion:** Convert dense arrays, edge lists or masked arrays to graph format using csgraph_from_dense and related functions.
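Connected components appear in the list above but have no dedicated example later on. The following is a minimal sketch using `connected_components`, assuming a small hypothetical 5-node graph made up for illustration:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# Hypothetical graph: nodes 0-1-2 form one group, nodes 3-4 form another
adj = csr_matrix(np.array([
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]))

# directed=False treats every stored edge as usable in both directions
n_components, labels = connected_components(adj, directed=False)

print("Number of components:", n_components)
print("Component label per node:", labels)
```

Nodes that share a label belong to the same component, so here nodes 0, 1 and 2 get one label and nodes 3 and 4 get another.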

Creating CSGraphs in SciPy

To use SciPy’s graph algorithms, the graph must be represented as a matrix, ideally in a compressed sparse format. Define the graph with an adjacency matrix or edge list, then either build a sparse matrix directly (for example with csr_matrix) or pass a dense array to csgraph_from_dense(), which returns the graph in CSR format.

The examples below show how to do this in different ways.
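As a quick orientation, the sketch below checks what csgraph_from_dense() actually returns. The 3-node weighted matrix is a hypothetical example, and `csgraph_to_dense` is used only to show the round trip back to a dense array:

```python
import numpy as np
from scipy.sparse import issparse
from scipy.sparse.csgraph import csgraph_from_dense, csgraph_to_dense

# Hypothetical weighted graph: edge 0 -> 1 (weight 2) and edge 1 -> 2 (weight 5)
dense = np.array([[0, 2, 0],
                  [0, 0, 5],
                  [0, 0, 0]])

# By default, zeros in the dense array are treated as "no edge"
graph = csgraph_from_dense(dense)

print("Is sparse:", issparse(graph))
print("Format:", graph.format)
print("Stored edges:", graph.nnz)

# Convert back to a dense array to confirm nothing was lost
round_trip = csgraph_to_dense(graph)
print(round_trip)
```

Only the two real edges are stored, which is the point of the compressed representation.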

Example 1: Convert an Empty Sparse Matrix to a Graph

This example creates a CSGraph from an empty sparse matrix. It shows the basic structure of a graph with no edges and helps illustrate how CSGraphs are initialized.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import csgraph_from_dense

# Create an empty 3 x 3 sparse matrix and convert it to a dense array
sparseMatrix = csr_matrix((3, 3), dtype=np.int8).toarray()

# Convert the dense array to a graph
graph = csgraph_from_dense(sparseMatrix)
print(graph.toarray())
```

**Output**

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

**Explanation:**

* **csr_matrix((3, 3)):** creates an empty 3x3 sparse matrix.
* **dtype=np.int8:** uses 8-bit integers to save memory.
* **.toarray():** converts the sparse matrix to a regular NumPy array.
* **csgraph_from_dense():** converts the dense array into a sparse graph format for SciPy algorithms. The result is stored as floats, which is why the output shows 0. rather than 0.

Example 2: Directed Graph from Adjacency Matrix

A directed graph has edges that point from one node to another. An adjacency matrix represents this as a 2D grid: if entry [i][j] = 1, there is an edge from node i to node j. This is a clear way to define a graph structure for use in SciPy, as the example below shows:

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import csgraph_from_dense

# Define the adjacency matrix for a directed graph
adjacency_matrix = [[0, 1, 0, 1],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1],
                    [0, 0, 0, 0]]

# Build a CSR matrix, then convert it back to a dense array
graph_sparse = csr_matrix(adjacency_matrix).toarray()

# Convert the dense array to a sparse graph representation
graph = csgraph_from_dense(graph_sparse)
print(graph)
```

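**Output**

Coords Values
(0, 1) 1.0
(0, 3) 1.0
(1, 2) 1.0
(2, 3) 1.0

**Explanation:**

* **Adjacency matrix:** defines the directed edges (a 1 at [i][j] means an edge from node i to node j).
* **csr_matrix().toarray():** builds a memory-efficient sparse matrix, then converts it back to a dense array, since csgraph_from_dense() takes dense input.
* **csgraph_from_dense():** turns the dense array into a sparse graph structure for SciPy algorithms.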

Example 3: Create Graph from Edge List

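An edge list represents a graph as a set of node-to-node connections. It is simple and efficient for sparse graphs and can easily be converted to a CSGraph for fast analysis in SciPy. The code below starts from an empty matrix in COO (coordinate) format, which stores each edge as a (row, column, value) entry: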

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import csgraph_from_dense

# Create an empty 3 x 3 edge list (COO matrix) and convert it to a dense array
edgeList = coo_matrix((3, 3), dtype=np.int8).toarray()

# Convert the dense array to a graph
graph = csgraph_from_dense(edgeList)
print(graph.toarray())
```

**Output**

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

**Explanation:**

* **coo_matrix((3, 3)):** creates an empty 3x3 edge list (a COO matrix with no stored edges) of integer type.
* **.toarray():** converts it to a dense array.
* **csgraph_from_dense():** builds a graph from the dense matrix.
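Because the example above uses an empty matrix, it may help to see a populated edge list. The sketch below assumes a hypothetical list of (source, destination, weight) triples and builds the sparse graph directly with `coo_matrix`, without going through a dense array:

```python
from scipy.sparse import coo_matrix

# Hypothetical weighted edge list: (source, destination, weight)
edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)]
n_nodes = 4

# Split the triples into separate row, column and value sequences
src, dst, weight = zip(*edges)

# COO takes (values, (rows, cols)); convert to CSR for use with csgraph routines
graph = coo_matrix((weight, (src, dst)), shape=(n_nodes, n_nodes)).tocsr()

print(graph.toarray())
```

Building the sparse matrix directly avoids allocating the full dense array, which matters once the number of nodes grows large.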

Example 4: Create Undirected Graph from Symmetric Matrix

An undirected graph has edges with no direction, meaning connections go both ways. It is represented by a symmetric matrix, where the value at [i][j] equals the value at [j][i]. The example below shows this:

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import csgraph_from_dense

# Define the adjacency matrix for an undirected graph
# Here 1 represents the edge weight between source and destination
adjacency_matrix = [[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]]

# Make the matrix symmetric
n = len(adjacency_matrix)
adjacency_matrix = [[max(adjacency_matrix[i][j], adjacency_matrix[j][i])
                     for j in range(n)]
                    for i in range(n)]

# Build a CSR matrix, convert it back to a dense array, then create the graph
graph_sparse = csr_matrix(adjacency_matrix).toarray()
graph = csgraph_from_dense(graph_sparse)
print(graph)
```

**Output**

Coords Values
(0, 1) 1.0
(0, 3) 1.0
(1, 0) 1.0
(1, 2) 1.0
(2, 1) 1.0
(2, 3) 1.0
(3, 0) 1.0
(3, 2) 1.0

**Explanation:**

* **adjacency_matrix = [[...]]:** creates a 4x4 matrix where 1 means an edge exists between two nodes.
* **max() loop:** ensures the matrix is symmetric, as an undirected graph requires.
* **csr_matrix().toarray():** converts the matrix to sparse format, then back to a dense array for csgraph_from_dense().
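The symmetrization step can also be done on the sparse matrix itself. The sketch below assumes a hypothetical input in which each undirected edge is listed only once (upper triangle) and mirrors it with the sparse `maximum` method:

```python
from scipy.sparse import csr_matrix

# Hypothetical input: each undirected edge is stored in one direction only
upper = csr_matrix([[0, 1, 0, 1],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1],
                    [0, 0, 0, 0]])

# Element-wise maximum with the transpose mirrors every edge
undirected = upper.maximum(upper.T)

print(undirected.toarray())

# A symmetric matrix has no entries that differ from its transpose
print("Symmetric:", (undirected != undirected.T).nnz == 0)
```

This gives the same symmetric adjacency matrix as the nested max() loop, without a dense Python list in between.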

Applying Graph Algorithms Using SciPy CSGraph

Once a CSGraph is created, SciPy lets you apply graph algorithms to it efficiently, including BFS, DFS, shortest paths and MST. The examples below cover several of them:

Example 1: Breadth-First Search (BFS)

BFS (Breadth-First Search) visits nodes level by level and is useful for finding the shortest path in unweighted graphs. The code below performs BFS on a directed graph starting from node 0 and prints the traversal order.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import breadth_first_order

# Weighted adjacency matrix of a directed graph
adjMat = [[0, 1, 2, 0],
          [0, 0, 0, 1],
          [2, 0, 0, 3],
          [0, 0, 0, 0]]

graph = csr_matrix(adjMat)
print(graph)

# BFS from node 0, returning only the visit order
bfs = breadth_first_order(graph, 0, return_predecessors=False)
print("Breadth-first travelling order:", bfs)
```

**Output**

Coords Values
(0, 1) 1
(0, 2) 2
(1, 3) 1
(2, 0) 2
(2, 3) 3
Breadth-first travelling order: [0 1 2 3]

**Explanation:**

* **breadth_first_order():** performs BFS starting from **node 0**.
* **return_predecessors=False:** makes the function return only the visit order, not the predecessor of each node.
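When the route matters and not just the visit order, the predecessors can be requested as well. The sketch below reuses the same adjacency matrix; `path_to` is a hypothetical helper written for this illustration, not part of SciPy:

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import breadth_first_order

graph = csr_matrix([[0, 1, 2, 0],
                    [0, 0, 0, 1],
                    [2, 0, 0, 3],
                    [0, 0, 0, 0]])

# With return_predecessors=True the function returns (order, predecessors)
order, predecessors = breadth_first_order(graph, 0, return_predecessors=True)

def path_to(target, predecessors):
    """Walk the predecessor array back from target to the start node.

    SciPy marks "no predecessor" with a negative sentinel, so the walk
    stops at the start node (or immediately, if target is unreachable).
    """
    path = [int(target)]
    while predecessors[path[-1]] >= 0:
        path.append(int(predecessors[path[-1]]))
    return path[::-1]

path = path_to(3, predecessors)
print("BFS path from 0 to 3:", path)
```

BFS ignores edge weights, so the path found has the fewest edges, not necessarily the lowest total weight.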

Example 2: Depth-First Search (DFS)

DFS (Depth-First Search) explores as far as possible along each branch before backtracking. The code below performs DFS on a directed graph starting from node 1 and prints the traversal order.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import depth_first_order

# Weighted adjacency matrix of a directed graph
adjMat = [[0, 1, 2, 0],
          [0, 0, 0, 1],
          [2, 0, 0, 3],
          [0, 0, 0, 0]]

graph = csr_matrix(adjMat)
print(graph)

# DFS from node 1, returning only the visit order
dfs = depth_first_order(graph, i_start=1, return_predecessors=False)
print("Depth First Travelling order:", dfs)
```

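**Output**

Coords Values
(0, 1) 1
(0, 2) 2
(1, 3) 1
(2, 0) 2
(2, 3) 3
Depth First Travelling order: [1 3]

**Explanation:** depth_first_order() starts the DFS traversal from node 1 without returning the predecessors. Only nodes 1 and 3 appear because, following edge directions, node 3 is the only node reachable from node 1.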
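The short traversal above is a consequence of edge direction. As a sketch of the difference, the code below runs the same DFS twice, once with the default directed behaviour and once with `directed=False`, which lets the traversal follow edges in either direction:

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import depth_first_order

graph = csr_matrix([[0, 1, 2, 0],
                    [0, 0, 0, 1],
                    [2, 0, 0, 3],
                    [0, 0, 0, 0]])

# Directed traversal: only nodes reachable along edge directions are visited
directed_order = depth_first_order(graph, i_start=1, return_predecessors=False)

# Undirected traversal: every edge can be followed both ways
undirected_order = depth_first_order(graph, i_start=1, directed=False,
                                     return_predecessors=False)

print("Reached (directed):  ", sorted(directed_order.tolist()))
print("Reached (undirected):", sorted(undirected_order.tolist()))
```

The visited sets are sorted before printing because the exact visit order in the undirected case depends on how neighbours are enumerated.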

Example 3: Shortest Path

A shortest-path computation finds the minimum distance between nodes. The code below computes shortest distances from a single source node and between all pairs of nodes using SciPy’s **shortest_path**.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

adjacency_matrix = [[0, 1, 2, 0],
                    [0, 0, 0, 1],
                    [2, 0, 0, 3],
                    [0, 0, 0, 0]]

graph = csr_matrix(adjacency_matrix)

# Shortest path from node 1
source = 1
dist1 = shortest_path(csgraph=graph, method="auto", directed=False, indices=source)
print("Distance from Node 1 to other Nodes:", dist1)

# All-pairs shortest distances
dist_matrix = shortest_path(csgraph=graph, method='FW', directed=False)
print("All-pairs shortest distances:\n", dist_matrix)
```

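**Output**

Distance from Node 1 to other Nodes: [1. 0. 3. 1.]
All-pairs shortest distances:
[[0. 1. 2. 2.]
 [1. 0. 3. 1.]
 [2. 3. 0. 3.]
 [2. 1. 3. 0.]]

**Explanation:** shortest_path(...) calculates shortest distances, either from one source node (via indices) or between all pairs of nodes. Here method='FW' selects Floyd-Warshall, and directed=False allows every edge to be used in both directions.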
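The distances alone do not say which route achieves them. The sketch below asks `shortest_path` for predecessors as well and walks them back to recover one route; the source node 1 and target node 2 are chosen only for illustration:

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

graph = csr_matrix([[0, 1, 2, 0],
                    [0, 0, 0, 1],
                    [2, 0, 0, 3],
                    [0, 0, 0, 0]])

source, target = 1, 2

# With a single source index, both results are 1-D arrays of length n_nodes
dist, pred = shortest_path(graph, directed=False, indices=source,
                           return_predecessors=True)

# Walk back from the target; a negative entry means "no predecessor"
path = [target]
while pred[path[-1]] >= 0:
    path.append(int(pred[path[-1]]))
path.reverse()

print("Path:", path)
print("Distance:", dist[target])
```

The recovered route goes through node 0, with total weight 1 + 2 = 3, which matches the distance reported for node 2 in the output above.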

Example 4: Minimum Spanning Tree (MST)

A Minimum Spanning Tree connects all nodes in a graph with the least total edge weight. The code below finds the MST using SciPy’s **minimum_spanning_tree**, which yields a cycle-free set of minimum-cost connections.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

# Weighted graph; each undirected edge is stored once in the upper triangle
X = csr_matrix([[0, 8, 0, 3],
                [0, 0, 2, 5],
                [0, 0, 0, 6],
                [0, 0, 0, 0]])

Tcsr = minimum_spanning_tree(X)
print(Tcsr.toarray())
```

**Output**

[[0. 0. 0. 3.]
 [0. 0. 2. 5.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

**Explanation:**

* **minimum_spanning_tree(X):** computes the MST using Kruskal’s algorithm.
* **Tcsr.toarray():** converts the result to a dense array so the MST can be printed.
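To read the result as a list of edges instead of a matrix, the MST can be converted to COO format. The sketch below, using the same input graph, lists the selected edges and sums their weights:

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

X = csr_matrix([[0, 8, 0, 3],
                [0, 0, 2, 5],
                [0, 0, 0, 6],
                [0, 0, 0, 0]])

mst = minimum_spanning_tree(X)

# COO format exposes parallel row, col and data arrays, one entry per edge
coo = mst.tocoo()
edges = sorted((int(i), int(j), float(w))
               for i, j, w in zip(coo.row, coo.col, coo.data))

total_weight = float(mst.sum())

print("MST edges (u, v, weight):", edges)
print("Total weight:", total_weight)
```

A spanning tree on 4 nodes has 3 edges; here they are 0-3, 1-2 and 1-3 with a total weight of 3 + 2 + 5 = 10, while the heavier edges 0-1 (8) and 2-3 (6) are dropped.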

Example 5: Maximum Flow

Maximum Flow finds the greatest amount of flow that can be pushed from a source node to a sink node in a network. The code below computes the maximum flow and the flow distribution using SciPy’s **maximum_flow** function.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_flow

# Entry [i][j] is the integer capacity of the edge from node i to node j
adjacency_matrix = [[0, 16, 13, 0, 0, 0],
                    [0, 0, 0, 12, 0, 0],
                    [0, 4, 0, 0, 14, 0],
                    [0, 0, 9, 0, 0, 20],
                    [0, 0, 0, 7, 0, 4],
                    [0, 0, 0, 0, 0, 0]]

# Source is node 0, sink is node 5
graph_sparse = csr_matrix(adjacency_matrix)
flow_dict = maximum_flow(graph_sparse, 0, 5)

print("Maximum Flow Value:", flow_dict.flow_value)
print("Flow Distribution:\n", flow_dict.flow.toarray())
```
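A maximum-flow result can be sanity-checked against basic flow properties. The sketch below is one way to do that on the same network. It assumes a SciPy release in which the result exposes a `flow` attribute, as the code above does; the checks themselves are illustrative and not part of the SciPy API:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_flow

capacity = np.array([[0, 16, 13, 0, 0, 0],
                     [0, 0, 0, 12, 0, 0],
                     [0, 4, 0, 0, 14, 0],
                     [0, 0, 9, 0, 0, 20],
                     [0, 0, 0, 7, 0, 4],
                     [0, 0, 0, 0, 0, 0]])

source, sink = 0, 5
result = maximum_flow(csr_matrix(capacity), source, sink)
flow = result.flow.toarray()

# Everything leaving the source must arrive at the sink
out_of_source = int(flow[source].sum())
into_sink = int(flow[:, sink].sum())

# No edge may carry more than its capacity
within_capacity = bool((flow <= capacity).all())

print("Flow value:", result.flow_value)
print("Out of source:", out_of_source, "| Into sink:", into_sink)
print("Within capacity:", within_capacity)
```

For this network the flow value is 23, which equals the capacity of the cut formed by edges 1→3, 4→3 and 4→5 (12 + 7 + 4), so no larger flow is possible.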