Younès Bennani | Université Sorbonne Paris Nord / Sorbonne Paris Nord University

Papers by Younès Bennani

Topographic connectionist unsupervised learning for RFID behavior data mining

Proceedings of the 2nd …, 2008

... The new unsupervised clustering method (DS2L-SOM) used in this article is a very efficient data ... very simple and effective visualization with a non-linear projection of the data structure on ... Here, we were able to highlight the characteristics of spatial organization in ant colonies. ...

SYSTEM FOR SEARCHING VISUAL INFORMATION. WO/2010/066774 - PCT/EP2009/066702

Connectionist and conventional models for free-text talker identification tasks

Combining structural and statistical features for the recognition of handwritten characters

Proceedings of 13th International Conference on Pattern Recognition, 1996

The authors present a feature vector for the recognition of handwritten characters which combines the strengths of both statistical and structural feature extractors. Thanks to a combination of seven complementary families of features (ranging from purely structural to purely statistical and including both local and global features), a complete description of the characters can be achieved, thus providing a wide range of identification clues. The recognition system has been tested on three categories of handwritten characters: well-segmented handwritten digits extracted from the NIST Database, uppercase letters collected from US dead letter envelopes, and graphemes generated by a handwritten cursive word segmentation performed on US address word images. We thus demonstrate in this paper that high recognition rates with very low substitution rates can be achieved by means of the same general-purpose structural/statistical feature-based vector.

Hybrid Unsupervised Learning to Uncover Discourse Structure

Lecture Notes in Computer Science, 2009

... Clément, F-93430 Villetaneuse, France {Catherine.Recanati,Nicoleta.Rogovski,Younes.Bennani}@lipn.univ-paris13.fr Abstract. ... Little causality, background, little impact, no goal or alternative No causality, background, no impact, many goals and alternatives ...

LEA2C: Low Energy Adaptive Connectionist Clustering for Wireless Sensor Networks

Lecture Notes in Computer Science, 2005

The use of wireless sensor networks (WSNs) is expected to increase in various fields (scientific, logistics, military, health, etc.). However, the sensor's size is an important limitation in terms of energy autonomy, and thus of life span, because the battery must be very small. This is why research today focuses mainly on energy management in WSNs, essentially taking communications into account. In this context, we propose an adaptive routing algorithm based on clustering that we named LEA2C. This algorithm relies on connectionist learning techniques, and more precisely on topological self-organizing maps (SOMs). New rules for the choice of the cluster heads have also been added. By comparing the results obtained by our protocol with those of other clustering methods used in WSNs, such as LEACH and LEACH-C, we obtain important gains in terms of energy and thus of network lifetime.

MULTI-EXPERT AND HYBRID CONNECTIONIST APPROACH FOR PATTERN RECOGNITION: SPEAKER IDENTIFICATION TASK

International Journal of Neural Systems, 1994

This paper presents and evaluates a modular/hybrid connectionist system for speaker identification. Modularity has emerged as a powerful technique for reducing the complexity of connectionist systems, allowing a priori knowledge to be incorporated into their design. In problems where training data are scarce, such modular systems are likely to generalize significantly better than a monolithic connectionist system. In addition, modules are not restricted to be connectionist: hybrid systems, with e.g. Hidden Markov Models (HMMs), can be designed, combining the advantages of connectionist and non-connectionist approaches. Text-independent speaker identification is an inherently complex task where the amount of training data is often limited. It thus provides an ideal domain to test the validity of the modular/hybrid connectionist approach. An architecture is developed in this paper which achieves this identification, based upon the cooperation of several connectionist modules, together with an HMM module. When tested on a population of 102 speakers extracted from the DARPA-TIMIT database, perfect identification was obtained. Overall, our recognition results are among the best for any text-independent speaker identification system handling this population size. In a specific comparison with a system based on multivariate auto-regressive models, the modular/hybrid connectionist approach was found to be significantly better in terms of both accuracy and speed. Our design also allows for easy incorporation of new speakers.

Une mesure de pertinence pour la sélection de variables dans les perceptrons multicouches [A relevance measure for variable selection in multilayer perceptrons]

Revue d'intelligence artificielle, 2001

A New Energy Model for the Hidden Markov Random Fields

Lecture Notes in Computer Science, 2014

In this article we propose a modification to the HMRF-EM framework applied to image segmentation. To do so, we introduce a new model for the neighborhood energy function of the Hidden Markov Random Fields model, based on the Hidden Markov Model formalism. With this new energy model, we aim to (1) avoid the use of a key, empirically chosen parameter on which the results of current models rely heavily, and (2) provide an information-rich model of neighborhood relationships.

Semi-structured document categorization with a semantic kernel

Pattern Recognition, 2009

Over the past decade, text categorization has become an active field of research in the machine learning community. Most approaches are based on term occurrence frequency. The performance of such surface-based methods can decrease when the texts are too complex, i.e., ambiguous. One alternative is to use semantic-based approaches to process textual documents according to their meaning. Furthermore, research in text categorization has mainly focused on “flat texts”, whereas many documents are now semi-structured, especially in XML format. In this paper, we propose a semantic kernel for semi-structured biomedical documents. The semantic meanings of words are extracted using the Unified Medical Language System (UMLS) framework. The kernel, with an SVM classifier, has been applied to a text categorization task on a medical corpus of free-text documents. The results show that the semantic kernel outperforms the linear kernel and the naive Bayes classifier. Moreover, this kernel was ranked in the top 10 among 44 classification methods at the 2007 Computational Medicine Center (CMC) Medical NLP International Challenge.

Connectionist and ethological approaches for discovering salient facial movements features in human gender recognition

28th International Conference on Information Technology Interfaces, 2006

Individual facial movements signal various kinds of social information to other people, such as the gender of the sender. We used an ethological approach and a connectionist approach to detect these movements and their characteristics in men and women. Behavioural results indicate both qualitative and quantitative differences between men and women. The connectionist approach leads to similar and complementary conclusions. Like the connectionist approach, the ethological study focused on the main movement differences, but the connectionist approach also revealed important differences between men and women in motionless events. These pilot results lead to a re-examination of behavioural events and a check of the lateralization of movements correlated with gender.

Visualization and Analysis of Web Navigation Data

Lecture Notes in Computer Science, 2002

In this paper, we present two new approaches for the analysis of website user behavior. The first is a synthetic visualization of log file data and the second is a coding of sequence-based data. This coding allows us to carry out a vector quantization, and thus to find meaningful prototypes of the data set. To do this, the set of sessions is first partitioned and a prototype is then extracted from each of the resulting classes. This analytic process allows us to categorize the different behaviors of website users according to their interest in a set of page categories on a commercial site.
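
As a rough illustration of the coding-then-quantization idea in the abstract above, the following sketch codes each session as a vector of page-category proportions, partitions the sessions with plain k-means, and takes each cluster mean as its prototype. The category names, the proportion coding, the use of k-means, and all function names are assumptions made for illustration; the abstract does not specify the paper's actual coding or quantizer.

```python
import random

# Hypothetical page categories; the paper's real categories are not given.
CATEGORIES = ["home", "catalog", "cart", "help"]


def encode_session(pages):
    """Code a session (a list of category names) as category proportions."""
    counts = [pages.count(c) for c in CATEGORIES]
    total = sum(counts) or 1
    return [n / total for n in counts]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means: returns (prototypes, cluster index of each vector)."""
    rng = random.Random(seed)
    prototypes = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each session to its nearest prototype.
        labels = [
            min(range(k), key=lambda j: squared_distance(v, prototypes[j]))
            for v in vectors
        ]
        # Update step: move each prototype to the mean of its sessions.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                prototypes[j] = mean_vector(members)
    return prototypes, labels


# Toy sessions (invented): two "shopping" visits and two "help" visits.
sessions = [
    ["home", "catalog", "catalog", "cart"],
    ["home", "catalog", "cart", "cart"],
    ["help", "help", "home"],
    ["help", "home", "help", "help"],
]
vectors = [encode_session(s) for s in sessions]
prototypes, labels = kmeans(vectors, k=2)
```

Each entry of `prototypes` can then be read as the typical category mix of one group of users, which is the kind of behavior summary the abstract describes.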

A weighted Self-Organizing Map for mixed continuous and categorical data

Classification topographique à deux niveaux simultanés à base de modes de densité [Simultaneous two-level topographic clustering based on density modes]

One of the difficulties encountered in unsupervised classification (clustering), also known as the "model selection problem", is determining an appropriate number of clusters. Without prior knowledge, there is no simple way to determine this number. In this article we propose a new two-level clustering algorithm called DS2L-SOM (Density-based Simultaneous Two-Level Self-Organizing Map). This algorithm simultaneously performs dimensionality reduction and automatic clustering of the data. The approach uses both distance and density information about the data, which allows it to detect the number of clusters automatically; that is, no a priori assumption about the number of clusters is required. The main advantage of the proposed algorithm over classical clustering methods is that it can identify clusters of non-convex shape and even partially overlapping clusters. Experimental validation of the algorithm on a set of fundamental clustering problems shows its superiority over standard clustering methods.
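
For readers unfamiliar with the self-organizing map that DS2L-SOM builds on, here is a minimal sketch of a plain one-dimensional SOM. It is not DS2L-SOM: the density-based second level and the automatic detection of the number of clusters described in the abstract above are not shown. The learning-rate and radius schedules, the toy data, and the function names are assumptions chosen for illustration.

```python
import math
import random


def train_som(data, n_units=5, epochs=50, seed=0):
    """Train a 1-D self-organizing map (a chain of prototype vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise prototypes at randomly chosen data points.
    units = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both shrink over time.
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)
        radius = max(n_units / 2.0 * (1.0 - frac), 0.5)
        for x in rng.sample(data, len(data)):
            # Best-matching unit: the prototype closest to the sample.
            bmu = min(range(n_units), key=lambda i: math.dist(units[i], x))
            for i in range(n_units):
                # Gaussian neighbourhood measured along the chain.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    units[i][d] += rate * h * (x[d] - units[i][d])
    return units


def quantization_error(data, units):
    """Mean distance from each sample to its best-matching unit."""
    return sum(min(math.dist(u, x) for u in units) for x in data) / len(data)


# Toy 2-D data (invented): two small groups of points in the unit square.
data = [
    (0.10, 0.10), (0.15, 0.20), (0.20, 0.10),
    (0.80, 0.90), (0.90, 0.80), (0.85, 0.85),
]
units = train_som(data, n_units=4, epochs=30)
error = quantization_error(data, units)
```

Because neighbouring units on the chain are updated together, nearby prototypes end up representing nearby regions of the data, which is the topological ordering a second clustering level can then exploit.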

Extending SOM with Efficient Estimation of the Number of Clusters

Dimensionality reduction for binary data

In this paper we propose a new automatic learning model which allows simultaneous topological clustering and feature selection for quantitative datasets. We explore a new topological organization algorithm for categorical data clustering and visualization named RTC (Relational Topological Clustering). Generally, it is more difficult to perform clustering on categorical data than on numerical data because categorical values have no natural ordering. The proposed approach is based on the self-organization principle of Kohonen's model and uses the Relational Analysis formalism by optimizing a cost function defined as a modified Condorcet criterion. We propose an iterative algorithm which scales linearly to large datasets, provides a natural identification of clusters, and allows the clustering result to be visualized on a two-dimensional grid. Thereafter, the statistical ScreeTest is used to detect relevant and correlated features (or modalities) for each prototype. This test detects the most important variables automatically, without setting any parameters. The proposed approach was validated on various real datasets, and the experimental results show the effectiveness of the proposed procedure.

Simultaneous Topological Categorical Data Clustering and Cluster Characterization

Traitement Numérique des Données [Numerical Data Processing]

POWER CONTROL AND CLUSTERING IN

Un nouveau modèle d’énergie pour les champs aléatoires de Markov cachés [A new energy model for hidden Markov random fields]
