SVM Research Papers - Academia.edu

Twitter is a popular, widely used social networking platform because it lets users express their thoughts and opinions about any item and post comments or messages all around the world. Sentiment analysis techniques are used to study and analyze these reviews and opinions. Sentiment analysis is an NLP technique that classifies opinions into different sentiments such as positive, negative, and neutral. In this paper, we take an airline dataset from Twitter and perform sentiment analysis on it using machine learning algorithms such as SVM, Naïve Bayes, and Random Forest. Sentiments are expressed in three categories: positive, negative, and neutral. Our dataset contains 11533 tweets and is not balanced. The performance of the various machine learning algorithms is discussed in this paper.

Breast cancer is considered to be the second most common type of cancer affecting the female population worldwide. It is estimated that more than 508 000 women died in 2011 as a result of breast cancer. Survival rates are lower in less developed countries, mainly because the absence of early detection methods results in a large percentage of women presenting with late-stage disease. Early detection and medical diagnosis are known to be the most effective way to minimize the risk of tumor development and progression. There are different methods for early detection of breast cancer, including screening tests and clinical breast exams performed by a well-trained health professional. Due to cost and a lack of facilities, many women in less developed countries may not be able to use these methods. The objective of this research was to achieve an affordable and cost-effective prediction model of breast cancer based on anthropometric data and parameters that can easily be collected in a routine blood test. For each of the 166 individuals, a number of clinical features such as age, Body Mass Index (BMI), serum glucose levels, plasma levels of insulin, etc. were measured and observed. Various learning algorithms, including Support Vector Machines (SVM), K-Nearest Neighbors (K-NN), and logistic regression (LR), have been applied and compared with one another. The results show that the SVM and K-NN models perform well and allow prediction of breast cancer in women with an accuracy of more than 78%, sensitivities of 78% and 79%, and specificities of 77% and 79%, respectively.
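
The accuracy, sensitivity, and specificity figures quoted above all derive from a binary confusion matrix. The following is a minimal sketch of how such figures can be computed; the function name, the label encoding (1 for cancer, 0 for healthy), and the sample labels are assumptions made for illustration and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for binary labels.

    `positive` is the label treated as the disease class (an assumed
    encoding; the paper does not state how its labels are coded).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Undefined ratios (for example, no positive cases) become NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
    }


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters for screening tasks, because a missed cancer and a false alarm carry different costs.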

Fault detection and prediction in the semiconductor industry will enable organizations to optimize their resources and maintain a high production yield for semiconductors. However, the sensors generate an abundance of data that is kept in the repository, and it is difficult for process engineers to analyze and make use of the dataset to predict and detect faults in a timely and accurate way. Hence, this research will focus on developing a predictive model to predict and detect faulty equipment in the semiconductor manufacturing process. The quality of the data is enhanced through various analyses, feature selection, and dimensionality reduction techniques. Random Forest, Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Artificial Neural Network (ANN) will be the main algorithms used to develop the predictive models, and their respective results will be compared to determine the best model to use.

Training to improve physical performance can cause muscle injury. However, there are currently few ways to assess muscle fatigue. Therefore, in this paper, muscle fatigue is evaluated using surface EMG (electromyogram). The proposed method consists of four parts: measurement, pre-processing, feature extraction, and learning identification. The effectiveness of the proposed method is demonstrated in two ways, classification and regression analysis, and comparative verification is conducted.

Audio is sound within the range audible to humans. Its study covers all mechanical waves in gases, liquids, and solids, including vibration and sound. Audio segmentation is the most important preprocessing step in several audio processing applications, with a significant impact on, for example, speech recognition systems. This paper introduces a framework that analyzes the advantages of different segmentation and classification methods. Different audio segmentation methods are studied, including the Bayesian Information Criterion (BIC), the two-phase audio segmentation method, and the metric-based method. Audio segmentation plays an important role in audio classification: after the audio is segmented, it is classified into different classes. The classification model consists of rule-based and model-based approaches.

We present a new method to detect the presence of hollow heart, an internal disorder of potato tubers, using hyperspectral imaging technology in the infrared region. A set of 468 hyperspectral image cubes was acquired from Agria variety potatoes, which were later cut to check for the presence of a hollow heart. We developed several experiments to recognize hollow heart potatoes using different Artificial Intelligence and Image Processing techniques. The results show that Support Vector Machines (SVM) achieve a correct classification accuracy of 89.1%. This is an automatic and non-destructive approach, and it could be integrated into other machine vision developments.

Oral cancer is one of the most dangerous cancers; it originates from and affects the oral cavity and neck. Overuse of tobacco and cigarette smoking are the primary risk factors for developing oral cancer. This technique derives a group of features that help the classifiers identify the image state automatically. Various machine learning methods are applied to the datasets and their performance is analyzed. The derived features were classified using a CNN, which is compared against standard classification approaches such as SVM and Naive Bayes. The results show that the different stages of oral cancer can be classified effectively. Hence, the classification of various oral cancers can be achieved more efficiently by means of a CNN.

Recent advances in neuroimaging demonstrate the potential of functional near-infrared spectroscopy (fNIRS) for use in brain–computer interfaces (BCIs). fNIRS uses light in the near-infrared range to measure brain surface haemoglobin concentrations and thus determine human neural activity. Our primary goal in this study is to analyse brain haemodynamic responses for application in a BCI. Specifically, we develop an efficient signal processing algorithm to extract important mental-task-relevant neural features and ...

Internet usage has become intensive during the last few decades; this has given rise to the use of email, which is one of the fastest yet cheapest modes of communication. The growing demand for email communication has given rise to spam email, also known as unsolicited mail. In this paper we propose an ensemble model that uses majority voting on top of several classifiers to detect spam. The classification algorithms used for this purpose are Naïve Bayesian, Support Vector Machines, Random Forest, Decision Stump, and k-Nearest Neighbor. Majority voting generates the final decision of the ensemble by taking the label that receives the most votes from the classifiers. The sample dataset used for this task is taken from UCI, and the tool RapidMiner is used for the validation of the results.
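
Majority voting as described above can be sketched in a few lines. This is an illustrative sketch rather than the authors' RapidMiner workflow: the function name, the "spam"/"ham" labels, and the per-classifier predictions are hypothetical and stand in for the outputs of the five classifiers named in the abstract.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one ensemble decision.

    `predictions` holds one list of labels per classifier, all covering
    the same messages in the same order.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_messages = len(predictions[0])
    if any(len(p) != n_messages for p in predictions):
        raise ValueError("all classifiers must label the same messages")

    decisions = []
    for votes in zip(*predictions):        # votes cast for one message
        label, _count = Counter(votes).most_common(1)[0]
        decisions.append(label)
    return decisions


# Hypothetical predictions for three messages from five classifiers.
naive_bayes    = ["spam", "ham",  "spam"]
svm            = ["spam", "ham",  "ham"]
random_forest  = ["ham",  "ham",  "spam"]
decision_stump = ["spam", "spam", "ham"]
knn            = ["ham",  "ham",  "spam"]

ensemble = majority_vote(
    [naive_bayes, svm, random_forest, decision_stump, knn]
)
```

With an odd number of classifiers and two classes, a tie cannot occur. With other configurations a tie-breaking rule would be needed; this sketch simply keeps the tied label that was seen first.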

A comparative study of algorithms is very much required before implementing them for the needs of any organization. The comparison of algorithms depends on various parameters such as data frequency, types of data, and the relationships among the attributes in a given data set. A number of learning and classification algorithms are available to analyse data, learn patterns, and categorize data. The problem is to find the best algorithm for the given problem and the desired output. The desired result has always been higher accuracy in predicting future values or events from the given dataset. The algorithms taken for the comparison study are Neural net, SVM, Naïve Bayes, BFT, and Decision stump. These algorithms are among the most influential data mining algorithms in the research community and have been widely considered and used in the field of knowledge discovery and data mining.

In this paper, we present a vehicle detection system by employing Histogram of Oriented Gradients (HOG) for feature extraction and linear SVM for classification. We study the influence of the color space on the performance of the detector, concluding that decorrelated and perceptual color spaces give the best results. An in-depth analysis is carried out on the effects of the HOG and SVM parameters, the threshold for the distance between features and the SVM classifying plane, and the non-maximum suppression (NMS) threshold on the performance of the detector, and we propose values that illustrate good performance for vehicle detection on images. We also discuss the issues of the approach and the reasons for its mediocre performance on videos. Finally, we address these issues by presenting ideas that can be considered for improving the system.
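
The two thresholds mentioned above, one on the SVM score and one for non-maximum suppression, interact in the post-processing step of a sliding-window detector. The following is a minimal sketch of greedy NMS. It assumes boxes given as (x1, y1, x2, y2), scores equal to the signed distance from the SVM hyperplane, and intersection-over-union as the overlap measure; the abstract does not state which overlap measure or threshold values the authors used, so those are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, score_threshold=0.0, iou_threshold=0.5):
    """Return indices of the boxes kept after greedy NMS.

    Windows scoring below `score_threshold` are dropped first; the rest
    are visited from highest to lowest score, and a window is kept only
    if it does not overlap an already kept window by more than
    `iou_threshold`. Both default values are illustrative assumptions.
    """
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)

    keep = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping windows on one vehicle,
# one separate window, and one window on the wrong side of the plane.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, -0.5]
kept = non_max_suppression(boxes, scores)
```

Raising the score threshold trades recall for precision, while lowering the overlap threshold merges nearby detections more aggressively.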

Machine learning is an aspect of artificial intelligence that allows the development of computer systems able to learn from experience without needing to be programmed for every instance. Machine learning is badly needed today to reduce human effort and achieve higher automation with fewer errors. This paper reviews early diabetes detection using machine learning techniques, along with detection of the disorders that frequently occur with it, mainly diabetic retinopathy and diabetic neuropathy. The data set employed in most of the literature concerned is the Pima Indian Diabetic Data Set. Early diabetes detection is significant because it helps reduce the fatal effects of diabetes. Various machine learning techniques such as artificial neural networks, principal component analysis, decision trees, genetic algorithms, and fuzzy logic have been discussed and compared. This paper first introduces the basic notions of diabetes and then describes the various techniques used to detect it. An extensive literature survey is then presented, and relevant conclusions and future scope are discussed with analysis.

The dramatic rise in the size of image databases has spurred the development of powerful and efficient retrieval systems. These systems initially retrieved images using textual annotations but later introduced retrieval based on content. This came to be known as Content-Based Image Retrieval, or CBIR. Systems using CBIR retrieve images based on visual features such as texture, colour, and shape, rather than relying on image descriptions or textual indexing. In the proposed work we will use various types of image features, such as colour, texture, shape, energy, amplitude, and cluster distance, to classify the images according to the query image. We will use a multi-SVM technique along with a clustering technique to compare the features of the query image with those of the image dataset and extract the images most similar to the query image.

Face recognition and identification are essential for security and investigation processes. A facial recognition system can identify or verify a person from a digital picture or a video frame. Facial recognition is used in real-time applications such as attendance systems, unlocking mobile phones, tagging others on social media, payments, advertising, and diagnosing diseases. The major requirements of real-time applications using face recognition include a high recognition rate and low training time. The proposed face recognition model combines FaceNet with SVM for face embedding feature extraction and classification, respectively. To reduce training time and increase the recognition rate, transfer learning is used. A Multi-Task Cascaded Convolutional Neural Network (MTCNN) model is used to extract the 5-point landmarks on face frames; the extracted face frame is sent to FaceNet to extract the embedding, which is then classified using a Support Vector Machine (SVM) model. The LFW dataset was used to pre-train the FaceNet model, whereas the 5 Celebrity Face dataset was used for training and validation of the system. MTCNN with SVM performs strongly in the detection and recognition of real-time faces. The proposed system is capable of recognizing the face with accuracies of 99.85%, 99.85%, and 100% when the face is straight or slightly turned to the left or right. The proposed model is capable of identifying real-time objects by detecting the facial regions using MTCNN, extracting features based on Euclidean distance, and training the SVM to identify and classify the object. MTCNN with SVM improves the recognition rate, overcoming the angle, tilt, and intensity of the image.

Fruit disease causes serious problems and results in losses for the economy and the agricultural industry. Earlier, detection of infected fruit was done manually, but with the advancement of technology, image processing techniques have been developed. There are two phases: one for training and one for testing. In the training phase, all the data related to infected and non-infected fruit is stored; in the testing phase, it is determined whether the fruit is infected and, if so, by which disease. In this paper, the different existing techniques used to detect infected fruit are discussed. These techniques have proved beneficial for farmers, as they help detect fruit disease in its early stages.

Text mining is drawing enormous attention in this era because a huge amount of text data is being generated, and this data must be managed carefully to get the maximum benefit out of it. Text classification is an essential sub-part of text mining in which text data is assigned to a particular predefined category. In our study, we discuss different classifier techniques that have been popular in recent years. Different classifiers such as SVM, Naïve Bayes, and neural networks are compared in tabular form in this paper.

Globalization and the growth in Internet users demand that almost all Internet-based applications support local languages. Support for local languages can be provided in all Internet-based applications by means of machine transliteration and machine translation. This paper provides a thorough survey of machine transliteration models and the machine learning approaches used for machine transliteration over a period of more than two decades, for internationally used languages as well as Indian languages. The survey shows that the linguistic approach provides better results for closely related languages, and probability-based statistical approaches are good when one of the languages is phonetic and the other is non-phonetic. Better accuracy can be achieved only by using hybrid and combined models.

Breast cancer has become common nowadays. Despite this, not all general hospitals have the facilities to diagnose breast cancer through mammograms. Waiting a long time for a breast cancer diagnosis may increase the possibility of the cancer spreading. Therefore, computerized breast cancer diagnosis has been developed to reduce the time taken to diagnose breast cancer and to reduce the death rate. This paper summarizes a survey of breast cancer diagnosis using various machine learning algorithms and methods, which are used to improve the accuracy of predicting cancer. This survey can also help us learn about the number of papers that have implemented breast cancer diagnosis.