Fakhri Karray - Academia.edu
Papers by Fakhri Karray
arXiv (Cornell University), May 7, 2019
Pattern analysis often requires a pre-processing stage for extracting or selecting features in order to help the classification, prediction, or clustering stage discriminate or represent the data in a better way. The reason for this requirement is that the raw data are complex and difficult to process without extracting or selecting appropriate features beforehand. This paper reviews the theory and motivation of different common methods of feature selection and extraction and introduces some of their applications. Some numerical implementations are also shown for these methods. Finally, the methods in feature selection and extraction are compared.
Proceedings of the International AAAI Conference on Web and Social Media
Detecting topics in Twitter streams has been gaining an increasing amount of attention. It can be of great support for communities struck by natural disasters, and could help companies and political parties understand users' opinions and needs. Traditional approaches for topic detection, which focus on representing topics using terms, are negatively affected by the length limitation and the lack of context associated with tweets. In this work, we propose an exemplar-based approach for topic detection, in which detected topics are represented using a few selected tweets. Using exemplar tweets instead of a set of keywords allows for an easy interpretation of the meaning of the detected topics. Experimental evaluation on benchmark Twitter datasets shows that the proposed topic detection approach achieves the best term precision. It does this while maintaining good topic recall and running time compared to other approaches.
2002 IEEE World Congress on Computational Intelligence. 2002 IEEE International Conference on Fuzzy Systems. FUZZ-IEEE'02. Proceedings (Cat. No.02CH37291)
... Mouton de Gruyter, 1991. [20] Mangu, L., Brill, E. & Stolcke, A., Finding Consensus in Speech Recognition: Word Error Minimization and Other Applications of Confusion Networks, Computer Speech and Language, 14(4): pp. 373-400, 2000. ...
International Conference on Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003
Grammar-based speech recognition systems exhibit performance degradation as their vocabulary sizes increase. Data clustering is expected to mitigate this problem. We introduce an approach to data clustering for automatic speech recognition systems using a Kohonen self-organizing map. The clustering results are then used to build a language model for each of the clusters using the CMU-Cambridge toolkit. The approach was implemented as a prototype for a large-vocabulary continuous speech recognition system, and a performance improvement of about 8% was achieved in comparison with the performance achieved using the language model and dictionary provided by Sphinx3. In this paper we present the experimental results along with discussion, analysis, and potential future directions.
The 14th IEEE International Conference on Fuzzy Systems, 2005. FUZZ '05.
One of the many issues that confront traditional statistical approaches to natural language understanding (NLU) is how to overcome the insufficient co-occurrence information caused by the limited boundary of statistical approaches. Researchers have long imparted human knowledge into statistical approaches, including definitions of rules and collections of concept hierarchies. However, these are difficult to define
Proceedings of the 2003 IEEE International Symposium on Intelligent Control ISIC-03, 2003
This paper describes an improvement on the interline method in order to include feature extraction using a Finite State Machine (FSM) on the directions followed by this algorithm. Normally, feature extraction is based on edge or corner detection. Differential operators, such as the Sobel operator, are used to detect edges and the SUSAN operator can be used to detect corners,
Lecture Notes in Computer Science, 2013
This paper presents a new approach for driver's eye tracking, based on an improved version of a particle filter. We use two different state transition models and two different observation models in order to adapt the tracking depending on the situation. The first state transition model is based on an autoregressive model and is robust to face rotation. The second one is based on head motion and is efficient despite rapid head movements. The first observation model is based on pupil detection. Although very accurate, it is not especially robust to head rotations. For that reason, we add a second observation model, based on the similarity between eye candidates and a subspace trained offline. This approach is robust to large face rotations and partial occlusion. Evaluation was done with an infrared camera on different people executing a challenging sequence of movements. Results show that our method is robust to face rotation, partial occlusion, and illumination variation.
Lecture Notes in Computer Science, 2010
The optic disc (OD) is an important anatomical feature in retinal images, and its detection is vital for developing automated screening programs. In this paper we propose a method to automatically detect the OD in fundus images using two steps: OD vessel candidate ...
2010 International Conference on Autonomous and Intelligent Systems, AIS 2010, 2010
Optic Disc and Fovea Detection via Multi-scale Matched Filters and a Vessels' Directional Matched Filter. Bob Zhang, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada N2L 3G1, yibo@pami.uwaterloo.ca ...
This study presents the development of connectionist or artificial neural network (ANN) models of a crude oil distillation column that can be utilised for real time optimization (RTO). The column is an actual distillation tower in operation in a refinery in Malaysia. Connectionist models developed for RTO differ from those for process control applications because they are steady state, multivariable models. Training data for the network models were generated using a reconciled steady state process model simulated in the Aspen Plus process simulator. All ANN models were coded and simulated in MATLAB. Two types of feedforward network models were developed and compared: multi-layer perceptron (MLP) with adaptive learning rates and radial basis function networks (RBFN). The RBFN models were found to yield better and more consistent predictions with shorter training times than the MLP models. Grouping suitable output variables in a network model was found to give better predictions and to make the complex, multivariable model of the crude tower more manageable.
2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), 2008
This paper aims at presenting different strategies for the construction of Beta Basis Function (BBF) Fuzzy Neural Network. These strategies lead to the determination of the network architecture by determining the structure of the hidden layer and parameters of its centers ...
2012 IEEE International Conference on Fuzzy Systems, 2012
This work intends to identify common performance metrics for task-oriented human-robot interaction. We present a methodology to assess the system performance of a human-robot team in achieving collective tasks. We propose a systematic approach that addresses the performance of both the human user and the robotic agent as a team. Toward this end, we attempt to determine the true time that an operator has to dedicate to a robot in action. We define the robot attention demand (RAD) as a function of both direct interaction time (DIT) and indirect interaction time (IIT), where the IIT is a direct consequence of the human trust in automation. We propose a two-level fuzzy temporal model to evaluate the human trust in automation while interacting with robots. Another fuzzy temporal model is presented to evaluate the human reliability during interaction time. The model is then generalized to accommodate multi-robot scenarios. Sequential and parallel robot cooperation schemes with varying levels of task dependency are considered. The fuzzy knowledge bases are further updated by implementing an application robotic platform where robots and users interact naturally to complete tasks with varying levels of complexity. User feedback is noted and used to tune the knowledge base rules where needed, to better represent a human expert's knowledge.
Lecture Notes in Computer Science, 2004
Nowadays, there is a plethora of robotic systems from different vendors and with different characteristics that work on specific tasks. Unfortunately, most robotic operating systems come with a closed control architecture. This makes it challenging to integrate these systems with other robotic components, such as vision systems or other types of robots. In this paper, we propose
2012 IEEE 13th International Conference on Information Reuse & Integration (IRI), 2012
The incorporation of soft human-generated data into the fusion process is an emerging trend in the data fusion community. This paper describes an extension of our original Random Set (RS) theoretic soft/hard data fusion system from the single-target to the multi-target tracking case. Leveraging recent developments in the RS theoretic data fusion community, we propose a novel soft measurement-to-track association algorithm. Based on this algorithm, we describe a multi-target tracking system capable of processing soft human-generated data. Our preliminary experiments demonstrate the advantages of the proposed soft data association algorithm (SDAA) in achieving substantial improvement of tracking performance relative to a baseline algorithm that relies solely on human opinions for solving the data association problem.
2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, 2009
Due to the rapid growth of network technologies and substantial improvement in attack tools and techniques, a distributed Intrusion Detection System (dIDS) is required to allocate multiple IDSs across a network to monitor security events and to collect data. However, dIDS architectures suffer from many limitations such as the lack of a central analyzer and a heavy network load. In this paper, we propose a new architecture for dIDS, called a Collaborative architecture for dIDS (C-dIDS), to overcome these limitations. The C-dIDS is a one-level hierarchy dIDS with a non-central analyzer. To make the detection decision for a specific IDS module in the system, this IDS module needs to collaborate with the IDS in the lower level of the hierarchy. Cooperating with the lower-level IDS module improves the system accuracy with less network load (just one bit of information). Moreover, by using one hierarchy level, there is no central management and processing of data, so there is no single point of failure. We have examined the feasibility of our dIDS architecture by conducting several experiments using the DARPA dataset. The experimental results indicate that the proposed architecture can deliver satisfactory system performance with less network load.
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011
2010 International Conference on Autonomous and Intelligent Systems, AIS 2010, 2010
Nowadays, large populations worldwide suffer from eye conditions such as astigmatism, myopia, and hyperopia, which are caused by ophthalmological refractive errors. This paper presents an effective approach to computer aided diagnosis of such ...
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, 2007
Most text categorization techniques are based on word and/or phrase analysis of the text. Statistical analysis of term frequency captures the importance of the term within a document only. However, two terms can have the same frequency in their documents, but one term contributes more to the meaning of its sentences than the other term. Thus, the underlying model