Somjit Arch-int | Khon Kaen University

Papers by Somjit Arch-int

Customer Behavior Analysis Using Data Mining Techniques

2018 International Seminar on Application for Technology of Information and Communication, 2018

Buying behavior plays a critical role in organizations' marketing management, and analyzing it accurately helps entrepreneurs plan production and marketing effectively. This research proposes an enhanced approach to analyzing the buying behavior of targeted customers from high-dimensional data on product purchase patterns. The method is divided into three stages. In the first stage, product purchases are clustered by customer type using K-Means, with the appropriate clusters selected using the Elbow method; the outcome is groups of similar purchased product items, individually categorized. In the second stage, buying behavior is analyzed using the Apriori algorithm: ARA-1 specifies two threshold values, support and confidence of not less than 10% and 70%, respectively, while ARA-2 additionally requires a lift of not less than 1. The outcome of this stage is the buying patterns of each group. In the third stage, the accuracy of the resulting buying patterns was evaluated by experts. The approach was trialed on purchasing data from a retail store in Thailand, and the results indicate that it can effectively analyze buying behavior in high-dimensional data. The accuracy of the ARA-2 analysis exceeded 88% and was 38% higher than that of ARA-1. Interestingly, the approach revealed buying behavior not previously reported.
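
As a rough illustration of the ARA-2 stage, the Python sketch below mines association rules with support of at least 10%, confidence of at least 70%, and lift above 1 from a one-hot encoded basket for a single K-Means cluster. The pandas/mlxtend calls are standard, but the basket data and item names are hypothetical, not the paper's.

```python
# Sketch of the ARA-2 rule-mining step, assuming transactions have already
# been grouped by K-Means. Basket contents and item names are hypothetical.
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# One-hot encoded basket for one customer cluster (toy data).
basket = pd.DataFrame(
    [[1, 1, 0], [1, 1, 0], [0, 0, 1], [1, 1, 1]],
    columns=["rice", "fish_sauce", "chili"],
).astype(bool)

frequent = apriori(basket, min_support=0.10, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.70)
ara2_rules = rules[rules["lift"] > 1.0]  # ARA-2 adds the lift threshold
print(ara2_rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```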

Sentiment Analysis Process for Product's Customer Reviews Using Ontology-Based Approach

2018 International Conference on System Science and Engineering (ICSSE), 2018

Today, data in a vast number of social networks are abundantly utilized to help consumers make decisions when selecting products. While companies endeavor to analyze and interpret the multitude of customer opinions and sentiments, accurate assessment is problematic. Many studies encounter semantic conflicts between words or synonymous words, and errors occur within the SentiWordNet algorithm when assessing both positive and negative words in some sentences. The present study therefore aims to solve these problems through DBpedia, addressing differences in word meanings, and to create a user interface for retrieving products by keyword, helping consumers make decisions when selecting products. The measured efficiency of sentiment analysis in the present study was 94%.
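
For context, the word-level polarity lookup that underlies SentiWordNet-style scoring can be illustrated with NLTK's SentiWordNet corpus reader, as in the sketch below; the paper's DBpedia-based disambiguation layer is not reproduced here.

```python
# Word-level polarity lookup with NLTK's SentiWordNet reader -- the kind of
# scoring the paper builds on. Requires one-time corpus downloads.
import nltk
nltk.download("wordnet", quiet=True)
nltk.download("sentiwordnet", quiet=True)
from nltk.corpus import sentiwordnet as swn

# First adjective sense of "good": positive / negative / objective scores.
sense = list(swn.senti_synsets("good", "a"))[0]
print(sense.pos_score(), sense.neg_score(), sense.obj_score())
```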

Detecting cluster numbers based on density changes using density-index enhanced Scale-invariant density-based clustering initialization algorithm

2017 9th International Conference on Information Technology and Electrical Engineering (ICITEE), 2017

Despite high accuracy, K-means relies mainly on determining a suitable number of clusters. To cope with this, it is hypothesized that a high-density region in a dataset tends to be a cluster. The present study is based on Scale-invariant density-based clustering initialization, in which the number of clusters is derived from density change analysis or density distribution analysis. However, the density calculation under this approach is based on the number and volume of data points, which may result in inaccurate cluster detection. Thus, the objective of this study was to improve the performance of Scale-invariant density-based clustering initialization in detecting the appropriate number of clusters and the initial cluster centers. We propose a density calculation based on data distance. The density value obtained from the calculation is used as a condition for data division and data merging during cluster detection. In the experiments, compared to Scale-invariant density-based clustering initialization, the proposed method detected cluster numbers and initial cluster centers equal or closer to the actual number of clusters. In addition, its clustering accuracy was higher than that of its counterpart.
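
The paper's own distance-based density index is not given in the abstract; as a loose stand-in, the sketch below estimates local density from the mean k-nearest-neighbor distance, which conveys the general flavor of a "density calculation based on data distance". High-density points could then seed initial cluster centers.

```python
# Loose illustration (not the paper's algorithm): distance-based local
# density, estimated as the inverse mean distance to the k nearest neighbors.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_density(X: np.ndarray, k: int = 10) -> np.ndarray:
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)          # dist[:, 0] is the point itself (0)
    return 1.0 / dist[:, 1:].mean(axis=1)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(6, 1, (100, 2))])
dens = local_density(X)
print("densest point (candidate center):", X[dens.argmax()])
```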

Measures of dependency in metric decision systems and databases

2017 International Conference on Soft Computing, Intelligent System and Information Technology (ICSIIT), 2017

Attribute dependencies play an essential role in the problem of attribute reduction in decision systems. Data dependencies, including Matching Dependencies (MDs) and Metric Functional Dependencies (MFDs), have been applied to data cleaning and to duplicate and violation detection in data quality problems. Approximation measures are used to loosen the strictness of dependencies for better adaptation to real-world data. This paper therefore introduces metric rough sets and dependency measures for metric decision systems. These rough sets make a connection between metric decision systems and databases that allows us to apply the dependency measures to MDs and MFDs. These results are important for developing applications of rough sets and constructing algorithms for attribute reduction and for the discovery of MDs and MFDs in metric decision systems and databases.
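
For readers new to rough sets, the classical (Pawlak) dependency degree that such measures generalize is, for a decision system with universe U, condition attributes C and decision attributes D:

```latex
% Classical rough-set dependency degree, which the paper's metric rough
% sets generalize: D depends on C to the degree
\gamma_C(D) \;=\; \frac{\lvert \mathrm{POS}_C(D) \rvert}{\lvert U \rvert},
\qquad
\mathrm{POS}_C(D) \;=\; \bigcup_{X \in U/D} \underline{C}(X)
% where POS_C(D) is the positive region: the union of the C-lower
% approximations of the decision classes in U/D.
```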

Introducing Fuzzy Temporal Description Logic

Proceedings of the 3rd International Conference on Industrial and Business Engineering - ICIBE 2017, 2017

Description logics are considered the logical infrastructure of knowledge representation on the semantic Web as well as in information systems. To deal with imprecise temporal knowledge and its applications, we introduce a fuzzy temporal description logic (FTDL) in which such temporal knowledge is characterized within a classic description logic. The syntax and semantics of the fuzzy temporal description logic are formally defined, and the forms of axioms and assertions are specified. Furthermore, we show how FTDL is able to capture various modeling evolution constraints in a concise way and to perform reasoning under incomplete temporal information.

SEMAG: A Novel Semantic-Agent Learning Recommendation Mechanism for Enhancing Learner-System Interaction

Computing and Informatics, 2017

In this paper, we present SEMAG, a novel semantic-agent learning recommendation mechanism which utilizes the advantages of instructional Semantic Web rules and multi-agent technology in order to build a competitive and interactive learning environment. Specifically, the recommendation-making process is contingent upon chapter-quiz results, as usual, but it also checks students' understanding at the topic level through personalized questions generated instantly and dynamically by a knowledge-based algorithm. The learning space extends to the social network, with the aim of increasing interaction between students and the intelligent tutoring system. A field experiment was conducted in which the experimental group showed significant achievement gains, supporting the use of SEMAG.

A rough set approach for approximating differential dependencies

Expert Systems with Applications, 2018

Data dependencies in databases and attribute dependencies in decision systems are important when addressing problems concerning data quality and attribute reduction, in which measures play a significant role in approximating these dependencies to achieve better adaptation to uncertain data. This paper proposes a differential-relation-based rough set model from the perspective of relational databases to express the dependency degree, error measures, confidence, information granulation and differential class distance for differential dependencies (DDs), and the relationships among them, in a unified framework. The error measure g3 has been widely studied and applied for data dependencies; however, computing g3 for DDs is NP-complete. Therefore, based on the proposed rough set, we introduce a new method that computes an approximation of the error measure g3 in polynomial time. This study demonstrates that our approach provides a substantially better approximation, that is, one closer to the optimal solution g3, compared to the existing greedy method. We also introduce the differential-relation-based rough set from the perspective of information systems and connect it to rough sets induced by non-equivalence relations. The two views of the differential-relation-based rough sets form an essential bridge between DDs in databases and attribute dependencies in differential decision systems (DDSs).
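
For reference, the g3 measure mentioned above is, in its classical form for functional dependencies (Kivinen and Mannila), the minimum fraction of tuples that must be removed from a relation so that the dependency holds exactly; the paper studies its analogue for DDs, where exact computation is NP-complete:

```latex
% The g3 error measure for a dependency X -> Y on a relation r.
g_3(X \rightarrow Y, r) \;=\;
  \frac{\lvert r \rvert \;-\;
        \max\{\, \lvert s \rvert : s \subseteq r,\; s \models X \rightarrow Y \,\}}
       {\lvert r \rvert}
```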

Graph-Based Semantic Web Service Composition for Healthcare Data Integration

Journal of Healthcare Engineering, 2017

Among the numerous and heterogeneous web services offered by different sources, automatic web service composition is the most convenient method for building complex business processes that invoke multiple existing atomic services. Current solutions for functional web service composition lack autonomous querying of semantic matches among web service parameters, which is necessary for composing large numbers of related services. In this paper, we propose a graph-based Semantic Web Services composition system consisting of two subsystems: management time and run time. The management-time subsystem is responsible for dependency graph preparation, in which a dependency graph of related services is generated automatically according to the proposed semantic matchmaking rules. The run-time subsystem is responsible for discovering potential web services and nonredundant web service compositions for a user's query using a graph-based searching algorithm....
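
As a loose illustration (not the paper's system) of composition over a dependency graph, the sketch below forward-chains through services whose input parameters are already satisfied until the requested outputs are produced. All service names and parameters are hypothetical.

```python
# Greedy forward chaining over a service dependency structure: a service is
# invocable once all of its input parameters are available.
from collections import namedtuple

Service = namedtuple("Service", ["name", "inputs", "outputs"])

SERVICES = [
    Service("PatientLookup", {"citizen_id"}, {"patient_id"}),
    Service("LabHistory", {"patient_id"}, {"lab_results"}),
    Service("RiskReport", {"lab_results"}, {"risk_score"}),
]

def compose(provided: set, goal: set) -> list:
    plan, known, changed = [], set(provided), True
    while changed and not goal <= known:
        changed = False
        for s in SERVICES:
            if s not in plan and s.inputs <= known:
                plan.append(s)
                known |= s.outputs
                changed = True
    return plan if goal <= known else []

print([s.name for s in compose({"citizen_id"}, {"risk_score"})])
```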

Improving key concept extraction using word association measurement

2015 7th International Conference on Information Technology and Electrical Engineering (ICITEE), 2015

Ontologies play a very important role in information exchange and sharing, and are typically constructed by human experts. However, this process is very costly in both time and effort. Given this, there is a need for automated ontology construction from various knowledge sources (such as text files). A key challenge of automated ontology learning from text is extracting the key concepts relevant to the domain from the documents. Existing approaches typically require a large set of training data with prior domain-specific knowledge. However, such knowledge and training data sets are not always available. To overcome this issue, we present a method for obtaining key concepts from unstructured texts using a word association measure and statistical knowledge. To demonstrate the efficiency of our method in comparison with a state-of-the-art method, extensive experiments employing two real-world datasets were performed. The obtained results indicate that our method achieves better accuracy than the state-of-the-art method by 3% to 10% when no domain-specific knowledge is available. The results are even stronger when the data sets contain many multi-word noun phrases.
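
One common word association measure of the kind such methods rely on is pointwise mutual information (PMI); the sketch below computes it from toy co-occurrence counts, which are illustrative only.

```python
# Pointwise mutual information between two terms from raw counts:
# PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ). Counts are toy values.
import math

def pmi(count_xy: int, count_x: int, count_y: int, n: int) -> float:
    return math.log2((count_xy / n) / ((count_x / n) * (count_y / n)))

# "ontology" and "learning" co-occur 40 times across 10,000 text windows.
print(round(pmi(40, 200, 300, 10_000), 3))
```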

Multidimensional Assessment of Open-Ended Questions for Enhancing the Quality of Peer Assessment in E-Learning Environments

Handbook of Research on Applied E-Learning in Engineering and Architecture Education

In online learning environments, peer assessment activities lack observation and supervision by the teacher or instructor. Students may therefore not put full effort into assessing their peers. Students hesitate to criticize their peers and to score them honestly, making peer assessment occasionally unreliable and unfair. Existing assessment methods focus only on single-dimensional assessment of content rather than on the activities and collaborations among students, and students have no chance to analyze and comment on their peers' answers. This study explored a multidimensional assessment method for open-ended questions to foster positive attitudes and full effort among students engaging in E-learning environments. The objectives are as follows: 1) To develop a process model for multidimensional assessment (M-DA) to enable effective learning 2) To develop free-text answer assessment by using the vector space model and seman...
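
The vector space model component mentioned in objective 2 can be illustrated as TF-IDF vectors compared by cosine similarity, as below; the texts are hypothetical and the paper's semantic component is not shown.

```python
# Vector-space comparison of a student answer against a reference answer:
# TF-IDF vectors scored by cosine similarity. Texts are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

reference = "photosynthesis converts light energy into chemical energy"
answer = "plants use light to make chemical energy by photosynthesis"

tfidf = TfidfVectorizer().fit_transform([reference, answer])
score = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
print(f"similarity: {score:.2f}")
```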

A novel secure channel selection rule for spatial image steganography

2015 12th International Joint Conference on Computer Science and Software Engineering (JCSSE), 2015

This paper introduces a novel secure channel selection rule for spatial image steganography. In this scheme, two factors are considered to identify a pixel to be modified in data hiding that causes less distortion to the cover image. The first is the average difference between the considered pixel and its neighbors; the second is the value of the considered pixel itself. Experimental results on 10,000 natural images indicate the higher visual quality and security of our new channel selection rule for spatial image steganography when compared with previous approaches.
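
As a hedged sketch of the first factor only, the code below scores each interior pixel by its mean absolute difference from its four neighbors and picks the highest-scoring (most textured) pixel; the paper's exact selection rule, and how it weighs the second factor, are not reproduced here.

```python
# Loose illustration: score interior pixels by mean absolute difference
# from their 4-neighbors; textured pixels make better embedding locations.
import numpy as np

def neighbour_diff(img: np.ndarray) -> np.ndarray:
    img = img.astype(np.int32)
    c = img[1:-1, 1:-1]
    diffs = (np.abs(c - img[:-2, 1:-1]) + np.abs(c - img[2:, 1:-1])
             + np.abs(c - img[1:-1, :-2]) + np.abs(c - img[1:-1, 2:]))
    return diffs / 4.0

rng = np.random.default_rng(1)
cover = rng.integers(0, 256, (8, 8), dtype=np.uint8)
scores = neighbour_diff(cover)
r, c = np.unravel_index(scores.argmax(), scores.shape)
print("candidate embedding pixel (interior coords):", (r + 1, c + 1))
```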

An Adaptive Multi-Layer Block Data-Hiding Algorithm that uses Edge Areas of Gray-Scale Images

International Journal of Security and Its Applications, 2015

Embedding data into smooth regions produces stego-images with poor security and visual quality. Edge adaptive steganography, in which flat regions are not employed to carry a message at low embedding rates, has been proposed. However, at high embedding rates, smooth regions are contaminated to hide a secret message. In this paper, we present an adaptive multi-layer block data-hiding (MBDH) algorithm, in which the embedding regions are adaptively selected according to the number of secret message bits and the texture characteristics of the cover-image. By employing the MBDH algorithm, more secret message bits are embedded into sharp regions, so the smooth regions are not used even at high embedding rates. Furthermore, most edge adaptive steganography algorithms have limited capacity when smooth regions are not employed in data hiding. The proposed scheme solves this issue by embedding more secret bits into the selected regions while the perceptual quality of stego-images is still maintained. The experimental results were evaluated on 10,000 natural gray-scale images. Visual attack, targeted steganalysis, and universal steganalysis were employed to examine the performance of the proposed scheme. The results show that the new scheme significantly outperforms previous edge-based approaches and least significant bit (LSB) based methods in terms of security and visual quality.

A Novel Lightweight Hybrid Intrusion Detection Method Using a Combination of Data Mining Techniques

International Journal of Security and Its Applications, 2015

Hybrid intrusion detection systems that make use of data mining techniques to improve effectiveness have been actively pursued in the last decade. However, building their detection models becomes very expensive when confronted with large-scale datasets, making them unviable for real-time retraining. To overcome this limitation of the conventional hybrid method, we propose a new lightweight hybrid intrusion detection method that combines feature selection, clustering and classification. Based on our hypothesis that attack events have different natures in each network protocol, the proposed method examines data for each network protocol separately, but with identical processing. First, the training dataset is divided into training subsets according to network protocol type. Next, each training subset is reduced dimensionally by eliminating irrelevant and redundant features through feature selection, and then broken down into disjoint regions, according to similar feature values, by K-Means clustering. Lastly, the C4.5 decision tree is used to build multiple misuse detection models for suspicious regions, which deviate from the normal and anomaly regions. As a result, each detection model is built from high-quality data that are less complex and consist of relevant data. To assess the enhanced performance, the proposed method was evaluated through experiments on the NSL-KDD dataset. The experimental results indicate that the proposed method is better in terms of effectiveness (F-value: 0.9957, classification accuracy: 99.52%, false positive rate: 0.26%) and efficiency (its training and testing times are approximately 33% and 25%, respectively, of those of the conventional hybrid method using the same algorithms).
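
A minimal sketch of the pipeline's shape (feature selection, then K-Means partitioning, then one tree per region) on synthetic data is shown below; scikit-learn's CART decision tree stands in for C4.5, and nothing here uses NSL-KDD.

```python
# Shape of the proposed pipeline on synthetic data: select features,
# partition with K-Means, then train one decision tree per region.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                           random_state=0)
X_sel = SelectKBest(f_classif, k=8).fit_transform(X, y)  # drop weak features
regions = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_sel)

models = {}
for r in np.unique(regions):
    mask = regions == r
    models[r] = DecisionTreeClassifier(random_state=0).fit(X_sel[mask], y[mask])
print({r: m.get_depth() for r, m in models.items()})
```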

Rule-based semantic web services annotation for healthcare information integration

Accessing and exchanging data across heterogeneous systems is problematic due to the lack of a uniform system and an accepted standard. In the healthcare domain, each healthcare institute uses a different Electronic Patient Records (EPR) system to manage patient health information. EPR systems are developed proprietarily and often serve only one specific requirement within a healthcare institute. This research proposes a framework for interoperating healthcare systems using semantic Web Services. The framework incorporates several important procedures: Semantic Web services annotation, which copes with semantic service discrepancies with the help of rule-based generation, and ontology mapping, the process of determining correspondences between information concepts, with semantic-bridge descriptions used to construct the semantic rules. As a result, the framework enables semantic interoperability among independently developed healthcare ...

SEMU: A Semantic Knowledge Management System of Musical Information

Entertainment Computing, 2015

The diverse data types of the musical information domain, including binary and text-based structures, create semantic gaps between entities of different data formats. This leads to difficulties in analyzing, capturing and managing entities of the domain. In this paper, we present a semantic knowledge management system, called SEMU, to manage musical information efficiently. We propose the SEMU ontology to capture information extracted from various data types and sources. To extract information from raw data, we use Musical Information Retrieval techniques for audio files and Natural Language Processing techniques for text-based formats. We develop a rule-based solution to enrich the system's knowledge base. Finally, we provide a web application with seamless integration between the SEMU knowledge base and the user interface, enabling users to benefit from the advantages of the SEMU system.

Symbolic Data Conversion Method Using the Knowledge-Based Extraction in Anomaly Intrusion Detection System

In anomaly intrusion detection systems, machine learning algorithms, e.g. KNN, SOM, and SVM, are widely used to construct a model of normal system activity; these algorithms are designed to work with numeric data. Consequently, symbolic data (e.g., TCP, SMTP, FTP, OTH, etc.) need to be converted into numeric data prior to being analyzed. Previous works proposed different methods for handling symbolic data, for example excluding symbolic data, arbitrary assignment, and indicator variables. However, these methods can create a very difficult classification problem, in particular by increasing the dimensionality of the data, which directly affects the computational complexity of the machine learning algorithm. Thus, this paper proposes a new symbolic conversion method that overcomes the limitations of previous works by replacing symbolic data with risk values obtained through knowledge-based extraction. The experiments affirmed that the proposed method was more effective in improving classifier performance than previous approaches, and it did not increase the dimensionality of the data.
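
The core idea, replacing each symbolic value with a single numeric risk score rather than expanding it into indicator columns, can be sketched as below; the risk values here are hypothetical, not the paper's knowledge-extracted scores.

```python
# One risk column replaces the symbolic feature, so dimensionality does not
# grow the way one-hot indicator variables would. Risk scores are made up.
import pandas as pd

risk = {"tcp": 0.7, "udp": 0.4, "icmp": 0.9}  # hypothetical scores
conns = pd.DataFrame({"protocol": ["tcp", "icmp", "udp", "tcp"],
                      "duration": [1.2, 0.1, 3.4, 0.8]})

conns["protocol_risk"] = conns["protocol"].map(risk)  # 1 column, not 3
print(conns.drop(columns="protocol"))
```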

A new mobile phone system architecture for the navigational travelling blind

2012 Ninth International Conference on Computer Science and Software Engineering (JCSSE), 2012

This paper introduces a new mobile phone system architecture to aid the blind in travelling. The architecture has four main units: voice/speech recognition, Global Positioning System (GPS) and map services, ultrasonic sensing, and an image processing service. These four subsystems function simultaneously within a single mobile device using web service interaction. The paper describes the instruments and components of the key design as well as implementation decisions. The feasibility of the architecture was tested on Windows Phone 7 (HTC HD7) in cooperation with blind users. Additionally, we evaluated the performance of the Hough Transform for straight line detection, the main component that intensively uses processing power, and then optimized the Hough Transform to reduce its complexity, leading to lower average response time.
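
Straight-line detection of the kind profiled in the paper is commonly done with a (probabilistic) Hough transform, e.g. via OpenCV as below on a synthetic image; this illustrates the standard transform, not the paper's optimized variant.

```python
# Standard probabilistic Hough line detection with OpenCV on a synthetic
# test image containing a single drawn line.
import cv2
import numpy as np

img = np.zeros((100, 100), dtype=np.uint8)
cv2.line(img, (10, 90), (90, 10), 255, 2)  # draw one test line

edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=40,
                        minLineLength=30, maxLineGap=5)
print(None if lines is None else lines[:3])
```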

Towards Construction of Business Components

Practicing Software Engineering in the 21st Century

Global competition among today’s enterprises forces their business processes to evolve constantly, leading to changes in the corresponding Web-based application systems. Most existing approaches that extend traditional software engineering to develop Web-based application systems are based on object-oriented methods. Such methods emphasize modeling individual object behaviors instead of system behavior. This chapter proposes the Business Process-Based Methodology (BPBM) for developing such systems. It uses a business process as a unified conceptual framework for analyzing relationships between a business process and associated business objects, and for identifying business activities and designing object-oriented components called business components. We propose coupling and cohesion measures to ensure that these business components enable potential reusability. These business components can more clearly represent semantic system behaviors than linkages o...

A Secure Steganographic Algorithm Based on Fibonacci Representation Using Cellular Automata

Research Journal of Information Technology, 2014

LDL-Cholesterol Levels Measurement Using Hybrid Genetic Algorithm and Multiple Linear Regression

2013 International Conference on Information Science and Applications (ICISA), 2013

Cholesterol level is a significant factor in cardiovascular disease. The cholesterol types used to measure fat levels are total cholesterol, low density lipoprotein (LDL), high density lipoprotein (HDL) and triglycerides (TG). There are two methods for measuring cholesterol level. The first is direct measurement of the patient's blood, which yields the best accuracy but comes at high cost. The second is the calculation method, which has a lower cost and lower accuracy. High LDL cholesterol levels are an important factor that increases patients' risk of acquiring the disease, and highly accurate detection of LDL cholesterol levels is expensive. To decrease the overall cost, the accuracy of the calculation method must be improved enough to justify switching to it. This study presents a combination of Multiple Linear Regression (MLR) and a Hybrid Genetic Algorithm (HGA) to find an equation that is precise and suitable for estimating LDL cholesterol. In our experiments, we compare the results of the MLR-HGA technique with three other methods, i.e. the Friedewald formula (FF), MLR, and Multiple Linear Regression with a Genetic Algorithm (MLR-GA). The findings show that the MLR-HGA technique achieves higher accuracy than the other three methods.
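
The Friedewald formula (FF) used as a baseline estimates LDL from the other measured quantities, with all concentrations in mg/dL:

```latex
% Friedewald formula for estimating LDL cholesterol (mg/dL):
\mathrm{LDL} \;=\; \mathrm{TC} \;-\; \mathrm{HDL} \;-\; \frac{\mathrm{TG}}{5}
```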
