Husniza Husni | Universiti Utara Malaysia
Papers by Husniza Husni
Journal of Creative Industry and Sustainable Culture
The digital graphic novel (DGN) is an evolution of the comic book that tells stories in a range of genres. Owing to its benefits, the DGN has gained a reputation over the years. This paper explores the theoretical aspects of DGNs that make them valuable resources for research and information, and identifies the benefits for designers, researchers, and DGN readers. The study focuses on the aesthetic values and components involved in designing appealing DGNs. A systematic literature review was performed on 128 sources. The findings classify five benefits of DGNs for education: DGNs can encourage learners to read, improve critical thinking, serve as a new dimension of learning, build visual literacy, and instill positive values in learning. This study summarizes the current literature and identifies six aesthetic elements and fourteen components that affect the design of DGNs. This study presented a research outline with several opp...
INSTITUT TERJEMAHAN & BUKU MALAYSIA BERHAD, 2015
The content of this book takes into account all components of the Bahasa Melayu (Malay language) curriculum for Level 1 (Tahap 1). It adopts the approach of teaching Malay as a first language. This is the mainstream curriculum that all Level 1 pupils are expected to have understood by the time they complete this level. The materials in the Worksheets (Lembaran Kerja, LK) and Information Sheets (Lembaran Maklumat, LM) aim to fulfil the 22 KSSR objectives for the Malay language subject, with an emphasis on learning through fun (didik hibur) to draw out pupils' varied potential. This book can help teachers and parents guide pupils with dyslexia in learning to read and write. It is unique compared with other reading materials because it uses a dyslexia-friendly approach. As we know, children are fond of cartoons and comics. From my observation, the book was produced through extensive reading and diligent research by the authors. The cartoon approach and a relaxed yet knowledgeable and informative writing style can attract pupils with dyslexia to learn to read.
Industry Revolution 4.0 (IR 4.0) focuses on transforming all individual processes in computing. Data digitalization in IR 4.0 allows production to monitor the performance, workflow, and management of all machines, and to access company records remotely using a variety of electronic and smart devices such as smartphones, tablets, laptops, TVs, and smart watches. Yet preparing an IR 4.0 platform in an industry is very challenging and requires management to prepare all nine IR 4.0 components, including Big Data. Big Data results from the tremendous growth of data generated by the increasing number of communication devices every day. This data carries meaningful information that industry can use to improve product quality, sales, and services. Big Data is mostly in unstructured format and requires expertise and powerful facilities to decode. Hence, a questionnaire was distributed to 29 respondents representing IT educators at PTSS to measure their readi...
A systematic review of usability quality attributes for the evaluation of mobile learning applications for children
THE 4TH INNOVATION AND ANALYTICS CONFERENCE & EXHIBITION (IACE 2019), 2019
One of the most important techniques for increasing a mobile robot's independence is the ability to construct a map of an unknown environment from an unknown initial position and, using onboard sensors, to localize itself within this map. This technique is called simultaneous localization and mapping, or SLAM. Over the last 30 years, numerous new and interesting questions have been raised, alongside improvements in techniques, computational tools, and sensors. However, the major challenges facing mobile robots in the next decade, such as autonomous urban vehicles, require extended representations that go beyond the traditional mapping found in classical SLAM systems, i.e. the so-called semantic representation. The main goal of a SLAM system with semantic concepts is to expand mobile robots' services and strengthen human-robot interaction. The related work reviewed shows that visual SLAM (VSLAM) has received a great deal of interest in the last decade. This is because visual sensors provide information about the scene while being cheaper, smaller, and lighter than other sensors. Unlike metric representation, semantic mapping is still immature and lacks a robust formulation. This paper systematically reviews recent research on semantic VSLAM, including its types, approaches, and challenges. The paper also covers the classical SLAM system, giving an overview of the necessary background before going into detail. The review also helps new researchers in the SLAM domain to understand the anatomy of the modern VSLAM generation, discover recent algorithms, and grasp some open challenges.
International Journal of Data Mining, Modelling and Management, 2016
Conventional clustering techniques require a predetermined number of clusters; unfortunately, missing information about real-world problems makes this requirement hard to satisfy. A new direction in data clustering is to cluster a given set of items automatically by identifying the appropriate number of clusters and the optimal centre for each cluster. In this paper, we present the WFA_selection algorithm, which originates from the weight-based firefly algorithm (WFA). The newly proposed WFA_selection merges selected clusters in order to produce better-quality clusters. Experiments using the WFA and WFA_selection algorithms were conducted on the 20Newsgroups and Reuters-21578 benchmark datasets, and the output was compared against bisect K-means and the general stochastic clustering method (GSCM). Results demonstrate that WFA_selection generates more robust and compact clusters than WFA, bisect K-means, and GSCM.
Global Journal on Technology, Dec 21, 2012
This paper proposes automatic speech transcription for dyslexic children's read speech using a speech recognition engine trained on lexical and language models constructed specifically from their recorded readings. Automatic transcription of recorded speech facilitates speech processing for researchers. It can increase the efficiency of the process by eliminating manual transcription, which is usually very time-consuming and subject to human error. To automate the process, an automatic speech recognition engine is required to label and transcribe speech into its corresponding phonemic and text representations, producing an individual phoneme file and text file for each transcribed reading. For this purpose, a total of 6112 speech samples were recorded from ten dyslexic children reading aloud prompted isolated words in Malay. The speech samples are used to perform the automatic transcription, and the accuracy is measured. An evaluation is conducted to determine whether the automatic transcription accuracy is acceptable for use in speech processing research. The findings reveal that automatic transcription is promising for speech processing. It reduces time and effort and simplifies the manual phonetic labelling and transcription process. Keywords: automatic speech transcription; speech processing; read speech; automatic speech recognition
AIP Conference Proceedings, 2015
Document clustering plays a significant role in Information Retrieval (IR), where it organizes documents prior to the retrieval process. To date, various clustering algorithms have been proposed, including K-means and Particle Swarm Optimization (PSO). Even though these algorithms have been widely applied in many disciplines due to their simplicity, they tend to become trapped in a local minimum during the search for an optimal solution. To address this shortcoming, this paper proposes a Basic Firefly (Basic FA) algorithm to cluster text documents. The algorithm employs the Average Distance to Document Centroid (ADDC) as the objective function of the search. Experiments using the proposed algorithm were conducted on the 20Newsgroups benchmark dataset. Results demonstrate that the Basic FA generates more robust and compact clusters than those produced by K-means and PSO.
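The ADDC objective named in this abstract can be sketched as follows. This is a minimal illustration, assuming Euclidean distance and the commonly used definition of ADDC as the mean, over clusters, of the mean document-to-centroid distance; the function name `addc` and the toy vectors are hypothetical, and the paper's exact formulation may differ.

```python
import math

def addc(clusters):
    """Average distance of documents to their cluster centroid.

    `clusters` maps a centroid (tuple of floats) to the list of document
    vectors assigned to it. Euclidean distance is an assumption here.
    """
    per_cluster = []
    for centroid, docs in clusters.items():
        if not docs:
            continue  # skip empty clusters to avoid division by zero
        total = sum(math.dist(centroid, doc) for doc in docs)
        per_cluster.append(total / len(docs))
    if not per_cluster:
        raise ValueError("no non-empty clusters")
    return sum(per_cluster) / len(per_cluster)

# Hypothetical toy data: two clusters in a 2-D feature space.
toy = {
    (0.0, 0.0): [(3.0, 4.0), (0.0, 0.0)],  # distances 5 and 0, mean 2.5
    (1.0, 1.0): [(1.0, 2.0)],              # distance 1, mean 1.0
}
print(addc(toy))
```

A lower value indicates more compact clusters, which is why a search procedure would aim to minimize it.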
Lecture Notes in Computer Science, 2015
Text mining, in particular clustering, is widely used by search engines to increase the recall and precision of a search query. The content of online websites (text, blogs, chats, news, etc.) is dynamically updated, yet relevant information on the changes made is not available. Such a scenario requires a dynamic text clustering method that operates without initial knowledge of a data collection. In this paper, a dynamic text clustering method that uses the Firefly algorithm is introduced. The proposed clustering algorithm, aFAmerge, automatically groups text documents into the appropriate number of clusters based on firefly behavior and a cluster merging process. Experiments using aFAmerge were conducted on two datasets: 20Newsgroups and Reuter's news collection. Results indicate that aFAmerge generates more robust and compact clusters than those produced by Bisect K-means and the practical General Stochastic Clustering Method (pGSCM).
Web services are changing the way online business operates, especially in the tourism domain. Typically, existing Web services are built individually as atomic services. The rapid growth of Web services has created the need for Web service composition so that clients can compose atomic services to achieve more complex tasks. Automation is important to ease this process; it means that service composition is done with little or no user intervention. Hence, we propose a framework to compose Web services automatically using the SHOP2 planner. SHOP2 is a planner that implements an AI planning technique called Hierarchical Task Network (HTN). We propose and implement a framework to compose services available from the Australian Tourism Data Warehouse (ATDW) and present example execution results. We also outline some drawbacks of our approach, identify open problems, and suggest future work to improve the framework.
Proceedings of the International Conference on Advances in Image Processing and Computation Techniques, 2012
The rapid development of the internet has increased the number of internet users, triggering the need for an intelligent search engine that can narrow the search on the World Wide Web (WWW) and find relevant information as requested. To address both issues, this paper proposes a search engine specifically designed and built using RSS syndication and fuzzy parameters to search for information contained in blogs. The blog search engine consists of three main phases: 1) crawling using an RSS feeds algorithm; 2) indexing using a weblog indexing algorithm; and 3) searching using fuzzy logic. In the RSS crawling process, the RSS feeds are gathered to extract useful information such as title, links, time published, and description. Next, weblog indexing uses the links to retrieve the blog sites for text processing and for constructing the indexing database. To retrieve the information requested or queried by a user, an interface is provided that enables blog search based on keywords with an associated degree of importance. The keyword density is then computed from the indexing database, and the rank of the pages is computed using a fuzzy weighted average. The experiment resulted in a mean average precision of 81.7% for total system performance.
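The ranking step described in this abstract can be illustrated with a small sketch. It assumes a simplified, crisp form of the weighted average, in which each query keyword carries an importance degree in [0, 1] and each page has a keyword density; the function names, the dictionary layout, and the sample values are hypothetical, and the paper's actual fuzzy formulation is not given in the abstract.

```python
def fuzzy_weighted_average(densities, weights):
    """Weighted average of keyword densities, weighted by importance degree."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(w * densities.get(term, 0.0) for term, w in weights.items())
    return weighted / total_weight

def rank_pages(pages, weights):
    """Return page ids sorted by descending score."""
    scores = {pid: fuzzy_weighted_average(d, weights) for pid, d in pages.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical query: "firefly" matters fully, "clustering" half as much.
weights = {"firefly": 1.0, "clustering": 0.5}
pages = {
    "blog_a": {"firefly": 0.02, "clustering": 0.08},
    "blog_b": {"firefly": 0.09, "clustering": 0.00},
}
print(rank_pages(pages, weights))
```

Under these assumptions, a page with a high density for an important keyword outranks a page with a high density for a less important one.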
Lecture Notes in Computer Science, 2014
Document clustering is an important technique that has been widely employed in Information Retrieval (IR). Various clustering techniques have been reported, but the effectiveness of most of them relies on the initial value of k clusters. Such an approach may not be suitable because prior knowledge of the document collection may not be available. Various swarm-based clustering techniques have been proposed to address this problem, including this paper, which explores the adaptation of the Firefly Algorithm (FA) to document clustering. We extend the work on the Gravitation Firefly Algorithm (GFA) by introducing a relocate mechanism that relocates assigned documents if necessary. The newly proposed clustering algorithm, known as GFA_R, is then tested on a benchmark dataset obtained from 20Newsgroups. Experimental results on external and relative quality metrics for GFA_R are compared against those obtained using the standard GFA and Bisect K-means. The results show that extending GFA to GFA_R yields better-quality clustering.
Advances in Intelligent Systems and Computing, 2014
This paper studies two clustering algorithms based on the Firefly Algorithm (FA), a recent swarm intelligence approach. We perform experiments using Newton's Universal Gravitation Inspired Firefly Algorithm (GFA) and the Weight-Based Firefly Algorithm (WFA) on the 20_newsgroups dataset. The analysis covers two parameters: the first is the alpha (α) value in the Firefly algorithms, and the second is the threshold value required during the clustering process. Results show that the Weight-Based Firefly Algorithm performs better than Newton's Universal Gravitation Inspired Firefly Algorithm.
Lecture Notes in Electrical Engineering, 2014
Divisive clustering has the advantage of building a hierarchical structure that represents documents in search engines more efficiently. Its operation employs one of the partition clustering algorithms, which can become trapped in a local optimum. This paper proposes a Firefly algorithm based on Newton's law of universal gravitation, known as the Gravitation Firefly Algorithm (GFA), for document clustering. GFA finds cluster centers using an objective function that maximizes the force between each document and an initial center. Once a center is identified, the algorithm locates documents that are similar to the center using the cosine similarity function. The process of finding centers for new clusters continues by sorting the light intensity values of the remaining documents. Experimental results on Reuters datasets show that the proposed Newton-inspired Firefly algorithm is suitable for document clustering in text mining.
Lecture Notes in Electrical Engineering, 2013
Existing clustering techniques have many drawbacks, including becoming trapped in a local optimum. In this paper, we introduce a new meta-heuristic algorithm, the Firefly algorithm (FA), to increase solution diversity. FA is a nature-inspired algorithm that is used in many optimization problems. Here, FA is applied to document clustering by executing it on the Reuters-21578 database. The algorithm identifies the document with the highest light intensity in the search space and uses it as a centroid. It then recognizes similar documents using the cosine similarity function. Documents that are similar to the centroid are placed in one cluster and dissimilar documents in another. Experiments performed on the chosen dataset produce high values of Purity and F-measure, suggesting that the proposed Firefly algorithm is a possible approach to document clustering.
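The assignment step described in this abstract, grouping documents by their cosine similarity to a chosen centroid, can be sketched as below. This is a minimal illustration under stated assumptions: documents are term-weight dictionaries, the 0.5 threshold is hypothetical, and the light-intensity computation that selects the centroid is not reproduced because the abstract does not define it.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def split_by_centroid(centroid, docs, threshold=0.5):
    """Split documents into those similar to the centroid and the rest."""
    similar, dissimilar = [], []
    for name, vector in docs.items():
        if cosine_similarity(centroid, vector) >= threshold:
            similar.append(name)
        else:
            dissimilar.append(name)
    return similar, dissimilar

# Hypothetical term weights for a centroid and two documents.
centroid = {"oil": 2.0, "price": 1.0}
docs = {
    "d1": {"oil": 1.0, "price": 1.0},
    "d2": {"wheat": 3.0, "export": 1.0},
}
print(split_by_centroid(centroid, docs))
```

Documents that share no terms with the centroid have similarity 0 and fall into the dissimilar group regardless of the threshold chosen.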
Children are among the most frequent and important users of the Internet. Children can search for any type of data in any digital form in digital libraries, web directories, or many other media repositories. However, one possible limitation of searching these digital ...
AIP Conference Proceedings, 2017
Highly phonetically similar reading mistakes often occur when dyslexic children read. For automatic speech transcription, these mistakes are challenging, as they are even for manual transcription. They are difficult to recognize, let alone to segment and label for processing prior to training an automatic speech recognition (ASR) engine. The need to automate segmentation and labelling arises especially when building an ASR engine to assist dyslexic children's reading. Hence, the aim of this paper is to investigate the effects that highly phonetically similar errors have on transcription and segmentation accuracy. A total of 585 speech files are used to produce manual transcription, forced alignment, and training. The ASR engine using automatic transcription and phonetic labelling obtained 76.04% recognition accuracy, with a 23.9% word error rate and an 18.1% false alarm rate. The results are almost the same as those of its manual counterpart, with 76.26% accuracy, a 23.7% word error rate, and a 17.9% false alarm rate.
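For readers unfamiliar with the word error rate reported in this abstract, the sketch below computes it using the standard definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The abstract does not state how the authors computed their figures, so this is an assumption, and the sample word lists are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical prompt list and recognized output: one substitution
# ("buku" read as "duku") and one deletion ("bapa" missing).
print(word_error_rate("bola buku baju bapa", "bola duku baju"))
```

Because insertions are counted, the word error rate can exceed 100% when the hypothesis is much longer than the reference.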
As the internet is overloaded with information, various knowledge-based systems are now equipped with data analytics features that facilitate knowledge discovery. These include optimization algorithms that mimic the behavior of insects or animals. This paper presents an experiment on document clustering using the Gravitation Firefly Algorithm (GFA). The advantage of GFA is that clustering can be performed without a pre-defined value of k clusters. GFA determines the centers of clusters by identifying documents with high force. Once the centers are identified, clusters are created based on cosine similarity. Experimental results demonstrate that GFA with random positioning of documents outperforms existing clustering algorithms such as Particle Swarm Optimization (PSO) and K-means.
Journal of Creative Industry and Sustainable Culture
The digital graphic novel (DGN) is a comic book evolution that tells stories in a range of genres... more The digital graphic novel (DGN) is a comic book evolution that tells stories in a range of genres. Due to its benefits, DGN has gained a reputation over the years. This paper explores the theoretical parts of DGNs that make them valuable resources for research and infor-mation, identifies the benefits for designers, researchers, and DGN readers. Nonetheless, this study focuses on the aesthetic values and components in designing appealing DGNs. A systematic literature review has been performed on 128 sources. In general, findings from this review have classified five benefits of DGN that impact the educational areas; DGN can encourage learners to read, able to improve critical thinking, serve as a new dimension of learning, build visualliteracy, and able to give positive values in learning. This study sum-marizes the current literature and identified six aesthetic elements and fourteen components that affect the design of DGNs. This study presented a research outline with several opp...
INSTITUT TERJEMAHAN & BUKU MALAYSIA BERHAD, 2015
Kandungan buku ini mengambil kira semua komponen dalam kurikulum mata pelajaran Bahasa Melayu bag... more Kandungan buku ini mengambil kira semua komponen dalam kurikulum mata pelajaran Bahasa Melayu bagi Tahap 1.Ia mengambil kira pendekatan pengajaran Bahasa Melayu sebagai bahasa pertama. Kurikulum ini adalah kurikulum perdana yang sepatutnya difahami oleh semua murid Tahap 1 selepas mereka menamatkan pelajaran di peringkat ini.Bahan dalam Lembaran Kerja (LK) dan Lembaran Maklumat (LM) bermatlamat memenuhi 22 objektif KSSR bagi mata pelajaran Bahasa Melayu dengan memberi penekanan kepada pembelajaran secara didik hibur bagi mencungkil pelbagai potensi murid.Buku ini dapat membantu guru dan ibu bapa dalam membimbing murid disleksia untuk belajar membaca dan menulis. Ia merupakan buku yang unik berbanding dengan bahan bacaan yang lain kerana menggunakan pendekatan mesra disleksia.Sebagaimana yang kita tahu, kanak-kanak memang menggemari kartun dan komik.Melalui penelitian saya proses penghasilannya dibuat melalui pembacaan yang luas dan penyelidikan yang tekun daripada para penulis.Pendekatan kartun dan penulisan santai tetapi berilmu lagi bermaklumat mampu menarik minat murid disleksia untuk belajar membaca.
Industry revolution 4.0 (IR 4.0) is focus on transforming all individual processes in computing. ... more Industry revolution 4.0 (IR 4.0) is focus on transforming all individual processes in computing. Data digitalization in IR 4.0 allows production to monitor performance, workflow, management of all machines and access company record remotely using diversity of electronic and smart devices like smart phones, tablets, laptops, TVs and smart watches. Yet, preparing IR 4.0 platform in an industry is very challenging and requires management to prepare all the nine IR 4.0 components including Big Data. Big Data has been leads from tremendous growth of data directed from the increasing number of communication devices every day. This data perceives meaningful information that can be utilized by the industry to improve their product quality, sales and services. Mostly, Big Data was aligned in unstructured format and entails expertise and powerful facilities to decode them. Hence, a questionnaire has been distributed to 29 respondents which represent IT educators at PTSS to measure their readi...
A systematic review of usability quality attributes for the evaluation of mobile learning applica... more A systematic review of usability quality attributes for the evaluation of mobile learning applications for children
THE 4TH INNOVATION AND ANALYTICS CONFERENCE & EXHIBITION (IACE 2019), 2019
One of most important techniques that plays a key role in elevating a mobile robot’s independence... more One of most important techniques that plays a key role in elevating a mobile robot’s independence is its ability to construct a map from an unknown surrounding in an unknown initial position, and with the use of onboard sensors, localize itself in this map. This technique is called simultaneous localization and mapping or SLAM. Over the last 30 years, numerous new and interesting inquiries have been raised, with the improvement of new techniques, new computational instruments, and new sensors. However, the big challenges facing mobile robots in the next decade, as in the autonomous urban vehicles, require extended representations that exceed traditional mapping found in classical SLAM systems, i.e. the so-called semantic representation. The main goal of a SLAM system with semantic concepts is to expand mobile robots’ services and strengthen human-robot interaction. Related works reviewed show that the visual-based SLAM or VSLAM has received a great deal of interest in the last decade. This is due to the visual sensors’ capability to provide information of the scene whereas they are low-priced, smaller and lighter than other sensors. Unlike the metric representation, semantic mapping is still immature, and it comes up short on durable formulation. This paper aims to systematically review recent researches related to the semantic VSLAM, including its types, approaches, and challenges. The paper also deals with the classical SLAM system by giving an overview of necessary information before getting into detail. This review also provides new researches in the SLAM domain facilities to further understand the anatomy of modern VSLAM generation, discover recent algorithms, and apprehend some open challenges.
International Journal of Data Mining, Modelling and Management, 2016
Existing conventional clustering techniques require a predetermined number of clusters, unluckily... more Existing conventional clustering techniques require a predetermined number of clusters, unluckily; missing information about real world problem makes it a hard challenge. A new orientation in data clustering is to automatically cluster a given set of items by identifying the appropriate number of clusters and the optimal centre for each cluster. In this paper, we present the WFA_selection algorithm that originates from weight-based firefly algorithm. The newly proposed WFA_selection merges selected clusters in order to produce a better quality of clusters. Experiments utilising the WFA and WFA_selection algorithms were conducted on the 20Newsgroups and Reuters-21578 benchmark dataset and the output were compared against bisect K-means and general stochastic clustering method (GSCM). Results demonstrate that the WFA_selection generates a more robust and compact clusters as compared to the WFA, bisect K-means and GSCM.
Global Journal on Technology, Dec 21, 2012
This paper proposes an automatic speech transcription for dyslexic children’s read speech using a... more This paper proposes an automatic speech transcription for dyslexic children’s read speech using a speech recognition engine trained on lexical and language models specifically constructed based on their recorded readings. Automatic transcription of recorded speech is useful to facilitate speech processing for researchers. It could effectively increase efficiency of the process by eliminating manual transcription that is usually very time-consuming and subject to human errors. To automate the process, automatic speech recognition engine is required to perform the task of automatically labeling and transcribing speech into its corresponding phonemic and text representations, resulting in an individual phoneme and text files for each read speech transcribed. For this purpose, a total of 6112 speech samples were recorded from a number of ten dyslexic children reading aloud prompted isolated words in Malay. The speech samples are used to perform the automatic transcription and the accuracy is measured. Evaluation is conducted to determine whether or not the automatic transcription accuracy is acceptable for use in speech processing research. The findings reveal that automatic transcription is promising to be used for speech processing. It reduces time and effort as well as simplifies the manual phonetic labelling and transcription process. Keywords: Automatic speech transcription; speech processing; read speech; automatic speech recognition;
AIP Conference Proceedings, 2015
The Document clustering plays significant role in Information Retrieval (IR) where it organizes d... more The Document clustering plays significant role in Information Retrieval (IR) where it organizes documents prior to the retrieval process. To date, various clustering algorithms have been proposed and this includes the K-means and Particle Swarm Optimization. Even though these algorithms have been widely applied in many disciplines due to its simplicity, such an approach tends to be trapped in a local minimum during its search for an optimal solution. To address the shortcoming, this paper proposes a Basic Firefly (Basic FA) algorithm to cluster text documents. The algorithm employs the Average Distance to Document Centroid (ADDC) as the objective function of the search. Experiments utilizing the proposed algorithm were conducted on the 20Newsgroups benchmark dataset. Results demonstrate that the Basic FA generates a more robust and compact clusters than the ones produced by K-means and Particle Swarm Optimization (PSO).
Lecture Notes in Computer Science, 2015
Text mining, in particular the clustering is mostly used by search engines to increase the recall... more Text mining, in particular the clustering is mostly used by search engines to increase the recall and precision of a search query. The content of online websites (text, blogs, chats, news, etc.) are dynamically updated, nevertheless relevant information on the changes made are not present. Such a scenario requires a dynamic text clustering method that operates without initial knowledge on a data collection. In this paper, a dynamic text clustering that utilizes Firefly algorithm is introduced. The proposed, aFAmerge, clustering algorithm automatically groups text documents into the appropriate number of clusters based on the behavior of firefly and cluster merging process. Experiments utilizing the proposed aFAmerge were conducted on two datasets; 20Newsgroups and Reuter’s news collection. Results indicate that the aFAmerge generates a more robust and compact clusters than the ones produced by Bisect K-means and practical General Stochastic Clustering Method (pGSCM).
Web services are changing the way how online business operates, especially in tourism domain. Typ... more Web services are changing the way how online business operates, especially in tourism domain. Typically, existing Web services are built individually as atomic services. The rapid growth of Web services has created the need for Web service composition so that clients can compose atomic services to achieve more complex tasks. Thus, to ease the process, automation is important. Automation means that the service composition is done with less or no user interference. Hence, we propose a framework to automatically compose Web services using SHOP2 planner. SHOP2 is a planner that implements AI planning technique, called Hierarchical Task Network (HTN). We propose and implement a framework to compose services available from the Australian Tourism Data Warehouse (ATDW) and present the example execution results. We also outline some drawbacks of our approach, identify open problems, and suggest future work to improve the framework.
Proceedings of the International Conference on Advances in Image Processing and Compuation Techniques, 2012
The rapid development of the internet eventually increases the number of internet users triggerin... more The rapid development of the internet eventually increases the number of internet users triggering the need for an intelligent search engine that is able to minimize the search on world wide web (WWW) and find relevant information as requested. To overcome the issue of finding relevant information as well as minimizing the search on WWW, this paper proposes a search engine that is specifically designed and built using RSS syndication and fuzzy Parameters to search for information contained in blogs. The blogs search engine consists of three main phases: 1) crawling using RSS feeds algorithm; 2) indexing weblogs algorithm; and 3) searching technique using fuzzy logic. In RSS crawling process, the RSS feeds need to be gathered to extract useful information such as title, links, time published, and description. Next, indexing weblogs uses the links to retrieve the blog sites for text processing and for constructing the indexing database. In order to retrieve such information requested or queried by any user, an interface is provided to enable the blog search based on keyword with associated degree of importance. The density of keyword is then computed from the indexing database. The rank of the pages is computed by using fuzzy weighted average. The experiment resulted in mean average precision of 81.7% of total system performance.
Lecture Notes in Computer Science, 2014
Document clustering is an important technique that has been widely employed in Information Retrieval (IR). Various clustering techniques have been reported, but the effectiveness of most of them relies on the initial value of k clusters. Such an approach may not be suitable when there is no prior knowledge of the document collection. Various swarm-based clustering techniques have been proposed to address this problem, including the one in this paper, which explores the adaptation of the Firefly Algorithm (FA) to document clustering. We extend the work on the Gravitation Firefly Algorithm (GFA) by introducing a relocate mechanism that reassigns documents to other clusters when necessary. The newly proposed clustering algorithm, known as GFA_R, is then tested on a benchmark dataset obtained from 20Newsgroups. Experimental results on external and relative quality metrics for GFA_R are compared against those obtained using the standard GFA and Bisect K-means. The results show that extending GFA to GFA_R yields better-quality clustering.
Advances in Intelligent Systems and Computing, 2014
This paper studies two clustering algorithms based on the Firefly Algorithm (FA), a recent swarm intelligence approach. We perform experiments using Newton's Universal Gravitation Inspired Firefly Algorithm (GFA) and the Weight-Based Firefly Algorithm (WFA) on the 20_newsgroups dataset. The analysis is undertaken on two parameters: the first is the alpha (α) value in the Firefly algorithms, and the second is the threshold value required during the clustering process. Results show that the Weight-Based Firefly Algorithm performs better than Newton's Universal Gravitation Inspired Firefly Algorithm.
Lecture Notes in Electrical Engineering, 2014
Divisive clustering has the advantage of building a hierarchical structure that represents documents in search engines more efficiently. Its operation employs one of the partitional clustering algorithms, which can lead to being trapped in a local optimum. This paper proposes a Firefly algorithm based on Newton's law of universal gravitation, known as the Gravitation Firefly Algorithm (GFA), for document clustering. GFA is used to find cluster centers based on an objective function that maximizes the force between each document and an initial center. Upon identification of a center, the algorithm locates documents that are similar to the center using the cosine similarity function. The process of finding centers for new clusters continues by sorting the light intensity values of the remaining documents. Experimental results on Reuters datasets show that the proposed Newton-inspired Firefly algorithm is suitable for document clustering in text mining.
Lecture Notes in Electrical Engineering, 2013
Existing clustering techniques have many drawbacks, including being trapped in a local optimum. In this paper, we introduce the use of a new meta-heuristic algorithm, namely the Firefly Algorithm (FA), to increase solution diversity. FA is a nature-inspired algorithm that is used in many optimization problems. Here, FA is applied to document clustering by executing it on the Reuters-21578 database. The algorithm identifies the document that has the highest light intensity in the search space and uses it as a centroid. Similar documents are then recognized using the cosine similarity function. Documents that are similar to the centroid are placed in one cluster and dissimilar documents in the other. Experiments performed on the chosen dataset produce high values of Purity and F-measure, suggesting that the proposed Firefly algorithm is a viable approach to document clustering.
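The centroid selection and similarity split described above can be sketched as follows. This is an illustrative sketch only: it assumes the light intensity of every document is already available as an input (the abstract does not say how it is computed), and the function names, the threshold value, and the toy term vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length term vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_by_brightest(vectors, intensities, threshold=0.5):
    """Pick the brightest document as centroid, then split by similarity.

    Returns (centroid_index, similar_indices, dissimilar_indices).
    The threshold is an assumed parameter, not a value from the paper.
    """
    centroid_idx = max(range(len(vectors)), key=lambda i: intensities[i])
    centroid = vectors[centroid_idx]
    similar, dissimilar = [], []
    for i, vec in enumerate(vectors):
        if cosine_similarity(vec, centroid) >= threshold:
            similar.append(i)
        else:
            dissimilar.append(i)
    return centroid_idx, similar, dissimilar


# Toy term-frequency vectors and made-up light intensities.
vectors = [[1, 0, 0], [2, 0, 0.1], [0, 1, 1]]
intensities = [0.2, 0.9, 0.5]
result = split_by_brightest(vectors, intensities)
```

In this toy input the second document is the brightest, the first document points in nearly the same direction and joins its cluster, and the third falls into the other group.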
Children are among the most frequent and important users of the Internet. They can search for any type of data in any digital form in digital libraries, web directories, or many other media repositories. However, one possible limitation of searching these digital ...
AIP Conference Proceedings, 2017
Highly phonetically similar reading mistakes often occur when dyslexic children read. These mistakes are challenging for automatic speech transcription, and even for manual transcription. They are difficult to recognize, let alone to segment and label for processing prior to training an automatic speech recognition (ASR) system. The need to automate segmentation and labelling arises especially when building an ASR system to assist dyslexic children's reading. Hence, the aim of this paper is to investigate the effects that highly phonetically similar errors have on transcription and segmentation accuracy. A total of 585 speech files are used for manual transcription, forced alignment, and training. Recognition by the ASR engine using automatic transcription and phonetic labelling obtained 76.04% accuracy, with a 23.9% word error rate and an 18.1% false alarm rate. These results are close to those of its manual counterpart, which obtained 76.26% accuracy, a 23.7% word error rate, and a 17.9% false alarm rate.
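For readers unfamiliar with the word error rate reported above, the following sketch shows the standard definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The scoring tool actually used in the paper is not stated in the abstract, and the sample sentences are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one phonetically similar substitution in four words.
wer = word_error_rate("the cat sat down", "the cat sad down")
```

A single substituted word out of four gives a word error rate of 0.25, which shows why phonetically similar misreadings directly raise the reported error figures.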
As the internet is overloaded with information, various knowledge-based systems are now equipped with data analytics features that facilitate knowledge discovery. These include optimization algorithms that mimic the behavior of insects or animals. This paper presents an experiment on document clustering using the Gravitation Firefly Algorithm (GFA). The advantage of GFA is that clustering can be performed without a pre-defined value of k clusters. GFA determines the centers of clusters by identifying documents with high force. Upon identification of the centers, clusters are created based on cosine similarity measurement. Experimental results demonstrate that GFA with random positioning of documents outperforms existing clustering algorithms such as Particle Swarm Optimization (PSO) and K-means.