Christian Borgelt's Web Pages
Eclat/LCM - Frequent Item Set Mining
Download
| File | Size | Description |
|---|---|---|
| eclat | (534 kb) | GNU/Linux executable |
| eclat.exe | (369 kb) | Windows console executable |
| eclat.zip | (278 kb) | C sources, version 5.24 (2022.11.22) |
| eclat.tar.gz | (255 kb) | C sources, version 5.24 (2022.11.22) |
| pyfim.py | (13 kb) | Python source of a simplified version |
| census.zip | (382 kb) | census data set (UCI ML repository) |
| census | (2 kb) | shell script used for the conversion |
Description
Eclat is a program to find frequent item sets (also closed and maximal item sets as well as generators) with the Eclat algorithm [Zaki et al. 1997], which carries out a depth-first search on the subset lattice and determines the support of item sets by intersecting transaction lists. This implementation also supports diffsets [Zaki and Gouda 2003] and several other algorithm variants, including certain variants of LCM (Linear-time Closed itemset Miner) [Uno et al. 2003], [Uno et al. 2004], [Uno et al. 2005], which employs an occurrence deliver scheme to determine the support of item sets. Since version 5.0, the program made available above can also be used to find association rules.
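The two support-counting schemes named above can be illustrated with a minimal Python sketch. It is not the code of the C program or of pyfim.py; the function names `vertical`, `eclat` and `eclat_diffsets`, the absolute `min_support` threshold and the item ordering are assumptions chosen for illustration. The first function intersects transaction id sets during a depth-first search; the second stores, for each extension, only the transaction ids that are lost relative to the prefix (a diffset) and subtracts their number from the support of the prefix.

```python
from typing import Dict, FrozenSet, Hashable, List, Sequence, Set, Tuple

ItemSets = Dict[FrozenSet[Hashable], int]


def vertical(transactions: Sequence[Sequence[Hashable]]) -> Dict[Hashable, Set[int]]:
    """Map each item to the set of ids of the transactions that contain it."""
    tidlists: Dict[Hashable, Set[int]] = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidlists.setdefault(item, set()).add(tid)
    return tidlists


def eclat(transactions: Sequence[Sequence[Hashable]], min_support: int) -> ItemSets:
    """Frequent item sets via intersection of transaction id sets."""
    found: ItemSets = {}

    def search(prefix: FrozenSet[Hashable],
               candidates: List[Tuple[Hashable, Set[int]]]) -> None:
        # Each candidate is (item, ids of transactions containing prefix + item).
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            found[itemset] = len(tids)
            extensions = []
            # Only combine with later items, so each item set is visited once.
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_support:
                    extensions.append((other, common))
            if extensions:
                search(itemset, extensions)

    start = sorted(
        ((item, tids) for item, tids in vertical(transactions).items()
         if len(tids) >= min_support),
        key=lambda pair: (len(pair[1]), repr(pair[0])),  # illustrative order
    )
    search(frozenset(), start)
    return found


def eclat_diffsets(transactions: Sequence[Sequence[Hashable]],
                   min_support: int) -> ItemSets:
    """Same result as eclat(), but computed with diffsets."""
    all_tids = set(range(len(transactions)))
    found: ItemSets = {}

    def search(prefix: FrozenSet[Hashable],
               candidates: List[Tuple[Hashable, Set[int], int]]) -> None:
        # Each candidate is (item, diffset relative to prefix, support).
        for i, (item, diff, support) in enumerate(candidates):
            itemset = prefix | {item}
            found[itemset] = support
            extensions = []
            for other, other_diff, _ in candidates[i + 1:]:
                # d(PXY) = d(PY) - d(PX); support(PXY) = support(PX) - |d(PXY)|
                new_diff = other_diff - diff
                new_support = support - len(new_diff)
                if new_support >= min_support:
                    extensions.append((other, new_diff, new_support))
            if extensions:
                search(itemset, extensions)

    start = sorted(
        ((item, all_tids - tids, len(tids))
         for item, tids in vertical(transactions).items()
         if len(tids) >= min_support),
        key=lambda triple: (triple[2], repr(triple[0])),
    )
    search(frozenset(), start)
    return found


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"]]
    for itemset, support in sorted(eclat(data, 3).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), support)
```

Both functions take a list of transactions (each a sequence of hashable items) and an absolute minimum support, and return a dictionary from item sets to support values. The sketch omits everything that makes the C implementation fast, such as item recoding, transaction reduction and the filtering for closed or maximal item sets.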
The Python version is simplified, much slower, and supports far fewer program options. It may be useful as an illustrative implementation that demonstrates some core features of the Eclat algorithm. However, if you want to use frequent item set mining in Python for actual mining tasks, it is recommended to use the PyFIM library, which provides a Python interface to the C implementation.
Full description of the Eclat program (included in the source package).
If you have trouble executing the program on Microsoft Windows, check whether you have the Microsoft Visual C++ Redistributable for Visual Studio 2022 (see under "Other Tools and Frameworks") installed, as the program was compiled with Microsoft Visual Studio 2022.
Papers that describe this algorithm and some implementation aspects of the C version (note that due to a complete redesign in version 3.0, the description in [Borgelt 2003] does not fit the program anymore):
- Frequent Item Set Mining
Christian Borgelt
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2(6):437-456
J. Wiley & Sons, Chichester, United Kingdom 2012
doi:10.1002/widm.1074 wiley.com (20 pages)
- Efficient Implementations of Apriori and Eclat
Christian Borgelt
Workshop of Frequent Item Set Mining Implementations (FIMI 2003, Melbourne, FL, USA)
fimi_03.pdf (304 kb) fimi_03.ps.gz (197 kb) (9 pages)
Some other references:
- New Algorithms for Fast Discovery of Association Rules
M.J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li
Proc. 3rd ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'97, Newport Beach, CA), 283-296
AAAI Press, Menlo Park, CA, USA 1997
- Fast Vertical Mining Using Diffsets
M.J. Zaki and K. Gouda
Proc. 9th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD 2003, Washington, DC), 326-335
ACM Press, New York, NY, USA 2003
- LCM: An Efficient Algorithm for Enumerating Frequent Closed Item Sets
T. Uno, T. Asai, Y. Uchida, and H. Arimura
Proc. Workshop on Frequent Item Set Mining Implementations (FIMI 2003, Melbourne, FL)
CEUR Workshop Proceedings 90, TU Aachen, Germany 2003
- LCM ver. 2: Efficient Mining Algorithms for Frequent/Closed/Maximal Itemsets
T. Uno, M. Kiyomi, and H. Arimura
Proc. Workshop Frequent Item Set Mining Implementations (FIMI 2004, Brighton, UK)
CEUR Workshop Proceedings 126, TU Aachen, Germany 2004
- LCM ver. 3: Collaboration of Array, Bitmap and Prefix Tree for Frequent Itemset Mining
T. Uno, M. Kiyomi, and H. Arimura
Proc. 1st Open Source Data Mining Workshop on Frequent Pattern Mining Implementations (OSDM 2005, Chicago, IL), 77-86
ACM Press, New York, NY, USA 2005
More information about frequent item set mining, implementations of other algorithms, and test data sets can be found at the Frequent Itemset Mining Implementations Repository.