Supervised Machine Learning Approaches: A Survey
Abstract
One of the core objectives of machine learning is to instruct computers to use data or past experience to solve a given problem. Many successful applications of machine learning already exist, including classifiers trained on email messages to distinguish spam from non-spam, systems that analyze past sales data to predict customer buying behavior, and fraud detection. Machine learning can be applied through supervised learning, unsupervised learning, and reinforcement learning, but this study focuses on the strengths and weaknesses of supervised classification algorithms. The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to testing instances where the values of the predictor features are known but the value of the class label is unknown. We are optimistic that this study will help new researchers find promising research directions and compare the strengths and weaknesses of supervised learning algorithms.
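As a concrete illustration of the workflow the abstract describes, the minimal sketch below (our own, using scikit-learn and a decision tree as illustrative choices rather than code from the paper) trains a classifier on labeled instances and then assigns labels to unseen test instances:

```python
# Minimal supervised-learning workflow: fit a model on labeled training
# data, then predict the unknown class labels of held-out test instances.
# The dataset and classifier choice are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)            # predictor features, class labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)                    # model label distribution from features

y_pred = clf.predict(X_test)                 # assign labels to unseen instances
print(f"test accuracy: {accuracy_score(y_test, y_pred):.3f}")
```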
Key takeaways
- Supervised learning aims to develop models predicting class labels from predictor features in labeled datasets.
- The study explores strengths and weaknesses of supervised learning algorithms, aiding new research directions.
- Key algorithms discussed include Decision Trees, Naive Bayes, Support Vector Machines, and k-Nearest Neighbors.
- Data preprocessing and feature selection significantly influence the effectiveness of supervised learning models (see the sketch after this list).
- Deep learning has gained traction for applications in computer vision, speech recognition, and natural language processing.
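
To illustrate the preprocessing point above, here is a small sketch (the dataset, scaler, and feature selector are our own illustrative assumptions, not taken from the survey) comparing a k-NN classifier with and without scaling and feature selection:

```python
# How preprocessing and feature selection can change results: k-NN is
# distance-based and therefore sensitive to feature scales, so scaling
# plus selecting informative features often improves its accuracy.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

raw = KNeighborsClassifier()
prepped = make_pipeline(StandardScaler(),              # put features on one scale
                        SelectKBest(f_classif, k=10),  # keep the 10 best features
                        KNeighborsClassifier())

for name, model in [("raw k-NN", raw), ("scaled + selected k-NN", prepped)]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")
```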

FAQs
What explains the effectiveness of SVMs in high-dimensional spaces?
The research indicates that SVMs effectively utilize a subset of training points, known as support vectors, which enhances their efficiency in high-dimensional spaces.
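A quick way to see this (a sketch with an assumed synthetic dataset, not code from the paper): fit an SVM and inspect how few of the training points end up as support vectors.

```python
# An SVM's decision function depends only on its support vectors, a
# subset of the training points. Dataset and parameters are assumptions.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=50, random_state=0)
clf = SVC(kernel="rbf").fit(X, y)

# Only these points determine the learned decision boundary.
print("training points:", X.shape[0])
print("support vectors:", clf.support_vectors_.shape[0])
```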
How does decision tree induction compare to other supervised methods?
Decision trees like C4.5 are favored for their interpretability and scalability, demonstrating considerable predictive accuracy, particularly in high-throughput datasets.
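For example, a fitted tree can be printed as human-readable rules. The sketch below uses scikit-learn's CART-style DecisionTreeClassifier as a stand-in for C4.5, which has no scikit-learn implementation:

```python
# Decision-tree interpretability: the fitted model can be rendered as
# plain if/else rules over the input features.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)
print(export_text(tree, feature_names=list(iris.feature_names)))
```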
What are the challenges associated with Bayesian networks in classification?
Bayesian networks struggle with datasets that contain numerous features, as constructing large networks is computationally intensive and often impractical.
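One way to see why structure learning is hard: the number of candidate directed acyclic graphs grows super-exponentially with the number of features. The sketch below computes this count using Robinson's recurrence, a standard combinatorial result rather than a formula from the survey:

```python
# Counting labeled DAGs on n nodes via Robinson's recurrence:
#   a(n) = sum_{k=1}^{n} (-1)^(k+1) * C(n,k) * 2^(k(n-k)) * a(n-k),  a(0) = 1
# The blow-up shows why exhaustive Bayesian-network structure search
# over many features is impractical.
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n: int) -> int:
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k)
               * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))

for n in (3, 5, 10):
    print(f"{n} features -> {num_dags(n):,} candidate network structures")
```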
When is k-fold cross-validation preferred over leave-one-out methods?
The study reveals that k-fold cross-validation, particularly with k values of 10 or 20, is typically preferred for computational efficiency.
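The trade-off is easy to demonstrate: on n samples, leave-one-out requires n model fits while 10-fold requires only 10. A sketch (dataset and classifier are illustrative assumptions):

```python
# 10-fold cross-validation vs. leave-one-out: both estimate
# generalization accuracy, but k-fold needs far fewer model fits.
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
clf = GaussianNB()

kfold = cross_val_score(clf, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=0))
loo = cross_val_score(clf, X, y, cv=LeaveOneOut())

print(f"10-fold:        {kfold.mean():.3f} ({len(kfold)} fits)")
print(f"leave-one-out:  {loo.mean():.3f} ({len(loo)} fits)")
```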
What are the key characteristics of Naive Bayes classifiers?
Naive Bayes classifiers assume independence among features, simplifying probability estimation and making them efficient for classification tasks.
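Concretely, a Gaussian Naive Bayes model estimates one mean and variance per (class, feature) pair, reflecting the factorized likelihood P(x|c) = Π_i P(x_i|c). A sketch with toy data (illustrative assumptions throughout):

```python
# The Naive Bayes independence assumption: the joint likelihood
# factorizes into per-feature terms, so each feature's distribution
# is estimated separately for each class.
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[1.0, 5.0], [1.2, 4.8], [3.0, 1.0], [3.1, 0.9]])
y = np.array([0, 0, 1, 1])

clf = GaussianNB().fit(X, y)
# theta_ / var_ hold one Gaussian mean and variance per (class, feature):
# the independently estimated per-feature distributions.
print("per-class feature means:\n", clf.theta_)
print("per-class feature variances:\n", clf.var_)
print("predicted class for [1.1, 5.0]:", clf.predict([[1.1, 5.0]]))
```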