Frequent Pattern Growth Algorithm

Last Updated : 2 May, 2026

The FP-Growth (Frequent Pattern Growth) algorithm efficiently mines frequent itemsets from large transactional datasets. The Apriori algorithm suffers from high computational cost because it generates candidate itemsets and scans the database many times. FP-Growth avoids both inefficiencies by compressing the data into an FP-Tree (Frequent Pattern Tree) and extracting patterns directly from it.

Frequent Pattern Growth

How FP-Growth Works

  1. **Data Compression:** First, FP-Growth compresses the dataset into a smaller structure called the Frequent Pattern Tree (FP-Tree). This tree stores itemsets (collections of items) and their frequencies without needing to generate candidate sets as Apriori does.
  2. **Mining the Tree:** The algorithm then examines this tree to identify patterns whose frequency meets a minimum support threshold. It does this by breaking the tree down into smaller "conditional" trees, one per item, so each sub-problem only looks at the transactions relevant to that item.
  3. **Generating Patterns:** Once the tree is built and analyzed, the algorithm outputs the frequent patterns (itemsets). These can then be used to derive rules that describe relationships between items.

Imagine you’re organizing a party and want to know popular food combinations without asking every guest repeatedly.

  1. List the food items each guest brought (the transactions).
  2. Count the items and remove the infrequent ones (filter by support).
  3. Order each guest's items by popularity and build a tree in which paths represent common combinations.
  4. Instead of repeatedly asking guests, explore this tree to discover patterns. For example, you might find that pizza and pasta often come together or that cake and pasta are also a common pair.

This is exactly how FP-Growth finds frequent patterns efficiently.

Working of the FP-Growth Algorithm

Problem Statement: Consider a small grocery store transaction dataset. Each entry shows the set of items purchased together by a customer:

Dataset

We apply the FP-Growth algorithm to identify frequent itemsets (groups of items frequently bought together), using a minimum support count of 2.

Step 1: Compute Item Frequencies

Scan the entire dataset one time to determine how often each item appears.

Frequency

All items meet the minimum support threshold (≥ 2), so none are removed.
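The counting pass can be sketched in a few lines of Python. The transactions below are hypothetical and chosen only for illustration (they are not the article's dataset), and the names `transactions`, `item_counts` and `frequent_items` are assumptions of this sketch.

```python
from collections import Counter

# Hypothetical transactions, for illustration only.
transactions = [
    ["Bread", "Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Butter"],
    ["Bread", "Milk"],
    ["Bread", "Milk"],
    ["Bread"],
]
min_support = 2  # minimum support count, as in the walkthrough

# One pass over the data: count the transactions that contain each item.
# set(t) makes sure an item repeated inside one transaction is counted once.
item_counts = Counter(item for t in transactions for item in set(t))

# Keep only the items that reach the minimum support count.
frequent_items = {
    item: count for item, count in item_counts.items() if count >= min_support
}

for item, count in sorted(frequent_items.items(), key=lambda kv: (-kv[1], kv[0])):
    print(item, count)
```

With this sample data every item reaches the threshold, so nothing is filtered out.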

Step 2: Order Items in Each Transaction by Frequency

Next, sort every transaction in descending order based on how frequently the items appear overall.

Sorted Dataset

Sorting by frequency helps group common items together, allowing the FP-tree to be compressed efficiently.
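A minimal sketch of the reordering step, again on hypothetical transactions. The helper name `sort_transaction` and the alphabetical tie-break are assumptions made here so that the order is deterministic; the article does not specify how ties are broken.

```python
from collections import Counter

# Hypothetical transactions in arbitrary item order, for illustration only.
transactions = [
    ["Butter", "Milk", "Bread"],
    ["Butter", "Bread"],
    ["Butter", "Milk"],
    ["Milk", "Bread"],
    ["Milk", "Bread"],
    ["Bread"],
]
min_support = 2

counts = Counter(item for t in transactions for item in set(t))


def sort_transaction(transaction, counts, min_support):
    """Drop infrequent items, then order by descending global frequency."""
    kept = [item for item in set(transaction) if counts[item] >= min_support]
    # Ties are broken alphabetically (an assumption of this sketch).
    return sorted(kept, key=lambda item: (-counts[item], item))


sorted_transactions = [
    sort_transaction(t, counts, min_support) for t in transactions
]

for t in sorted_transactions:
    print(t)
```

Because every transaction now lists its items in the same global order, transactions that share their most frequent items also share a prefix, which is what lets the tree merge them.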

Step 3: Construct the FP-Tree

Insert the sorted transactions one by one into the FP-tree, merging any shared prefixes.

FP-Tree Insertions:

**Visual FP-Tree:**

FP-Tree

Each node shows: Item (Count)
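The insertion procedure can be sketched as follows. This is a minimal illustration rather than a reference implementation: the class `FPNode`, the functions `build_fp_tree` and `render`, and the sample transactions are all assumptions of this sketch.

```python
class FPNode:
    """One node of the tree: an item, its count, and its children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode


def build_fp_tree(sorted_transactions):
    """Insert each frequency-sorted transaction, merging shared prefixes."""
    root = FPNode(None)
    for transaction in sorted_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:
                # No shared prefix from here on: start a new branch.
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def render(node, depth=0):
    """Return the tree as indented 'Item (Count)' lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item} ({child.count})")
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical, already frequency-sorted transactions (illustration only).
sorted_transactions = [
    ["Bread", "Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Butter"],
    ["Bread", "Milk"],
    ["Bread", "Milk"],
    ["Bread"],
]

tree = build_fp_tree(sorted_transactions)
tree_lines = render(tree)
print("\n".join(tree_lines))
```

Five of the six sample transactions start with Bread, so they all pass through a single Bread node instead of being stored five times. This sharing is where the compression comes from.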

Step 4: Determine Conditional Pattern Bases

A conditional pattern base contains all prefix paths leading to a specific item. Let’s examine the paths ending with Butter.

Paths that end with Butter:

Thus, the conditional pattern base for Butter is:

[ (Bread, Milk): 1, (Bread): 1, (Milk): 1 ]
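One way to collect these prefix paths is to remember every node that holds a given item and climb from each one back to the root. The sketch below does this with a simple header dictionary; the names `FPNode`, `build_tree` and `conditional_pattern_base` are hypothetical, and the sample transactions are illustrative data chosen so that the result matches the pattern base shown above.

```python
from collections import defaultdict


class FPNode:
    def __init__(self, item, parent=None):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(sorted_transactions):
    """Build the tree and a header dict: item -> list of nodes for that item."""
    root = FPNode(None)
    header = defaultdict(list)
    for transaction in sorted_transactions:
        node = root
        for item in transaction:
            if item not in node.children:
                node.children[item] = FPNode(item, parent=node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += 1
    return root, header


def conditional_pattern_base(header, item):
    """Return (prefix_path, count) pairs for every node holding `item`."""
    base = []
    for node in header[item]:
        path = []
        parent = node.parent
        # Climb towards the root; the root itself has item None.
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((tuple(reversed(path)), node.count))
    return base


# Hypothetical, already frequency-sorted transactions (illustration only).
sorted_transactions = [
    ["Bread", "Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Butter"],
    ["Bread", "Milk"],
    ["Bread", "Milk"],
    ["Bread"],
]

root, header = build_tree(sorted_transactions)
butter_base = conditional_pattern_base(header, "Butter")
print(butter_base)
```

Each prefix path carries the count of the Butter node it leads to, not the count of the nodes along the path, because only that many transactions contained the path together with Butter.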

Step 5: Build Conditional FP-Trees

Using the conditional pattern base, construct a smaller FP-tree for each item to identify frequent patterns involving that item. Butter’s conditional FP-tree input:

**Count all items:** Bread: 2, Milk: 2

Since both meet the support threshold (≥ 2), we can now generate frequent patterns:

Repeat the process for Milk and Bread as needed.
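The counting for Butter can be checked with a short sketch that starts from the conditional pattern base given in Step 4. The variable names are assumptions of this sketch, and it only goes one level deep: it pairs each frequent item with Butter and does not recurse into larger combinations.

```python
from collections import Counter

# Conditional pattern base for Butter, taken from Step 4.
butter_base = [(("Bread", "Milk"), 1), (("Bread",), 1), (("Milk",), 1)]
min_support = 2

# Add up each item's count across all prefix paths.
counts = Counter()
for path, count in butter_base:
    for item in path:
        counts[item] += count

# Items that stay frequent in the context of Butter.
frequent = {item: c for item, c in counts.items() if c >= min_support}

# Each surviving item combined with Butter is a frequent pattern.
patterns = {frozenset([item, "Butter"]): c for item, c in frequent.items()}

for itemset, support in patterns.items():
    print(sorted(itemset), support)
```

The path (Bread, Milk) occurs only once in the pattern base, so the three-item combination of Bread, Milk and Butter stays below the threshold even though both pairs reach it.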

From the FP-tree and conditional trees, we get these frequent itemsets:

All of these appear at least 2 times in the transactions.
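For an end-to-end check, the whole procedure can be condensed into a short recursive sketch. As a simplification, it recurses directly on conditional pattern bases, represented as lists of (items, count) pairs, and skips the explicit tree, so it shows the pattern-growth logic but not the memory savings of the FP-tree. The function name `fp_growth` and the sample transactions are hypothetical and are not the article's dataset.

```python
from collections import Counter


def fp_growth(weighted_transactions, min_support, suffix=()):
    """Yield (itemset, support) pairs by recursing on conditional pattern bases."""
    # Count items, weighting each transaction by its count.
    counts = Counter()
    for items, weight in weighted_transactions:
        for item in set(items):
            counts[item] += weight
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    # Global order: descending support, ties broken alphabetically.
    order = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(order)}
    sorted_tx = [
        (sorted((i for i in set(items) if i in rank), key=rank.get), weight)
        for items, weight in weighted_transactions
    ]

    # Process the least frequent item first, as in the walkthrough.
    for item in reversed(order):
        pattern = (item,) + suffix
        yield pattern, frequent[item]
        # Conditional pattern base: the prefix that precedes `item`.
        base = [
            (t[: t.index(item)], weight)
            for t, weight in sorted_tx
            if item in t and t.index(item) > 0
        ]
        yield from fp_growth(base, min_support, pattern)


# Hypothetical transactions, for illustration only.
transactions = [
    ["Bread", "Milk", "Butter"],
    ["Bread", "Butter"],
    ["Milk", "Butter"],
    ["Bread", "Milk"],
    ["Bread", "Milk"],
    ["Bread"],
]

result = {
    frozenset(pattern): support
    for pattern, support in fp_growth([(t, 1) for t in transactions], 2)
}

for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

On this sample data the sketch finds three single items and three pairs, each with a support count of at least 2.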

Summary Table

Why FP-Growth is Efficient

FP-Growth vs. Apriori Algorithm

| Feature | FP-Growth | Apriori |
| --- | --- | --- |
| Candidate generation | Not required | Needed |
| Data scans | 2 | Several |
| Speed | Efficient for large datasets | Slower because of repeated scans |
| Memory use | Higher (requires a tree structure) | Lower |
| Complexity | More difficult to implement | Simpler to grasp and build |

Applications

Advantages

Limitations