Annie Morin - Academia.edu
Papers by Annie Morin
The Statistics Education Research Journal is published by the International Association for Statistical Education and the International Statistical Institute to: encourage research activity in statistics education; advance knowledge about students' attitudes, conceptions, and difficulties as regards stochastic knowledge; and improve the teaching of statistics at all educational levels. The Journal encourages the submission of quality papers, including research reports, theoretical or methodological analyses, and integrative literature surveys, that can advance scholarly knowledge, research methods, and educational practice in any of the broad areas related to statistical education or the learning of statistics and probability at all educational levels and in all educational contexts. Contributions in English are preferred. Contributions in French and Spanish will also be considered. All papers are blind-refereed by at least two experts in the field. Submissions: Manuscripts should be ...
Problem of Incomplete Record Classification by Diagnostic Modules in Thyroid Laboratory Information System
While searching the web, users are often confronted by a great number of results, generally sorted by how well they match the query and displayed as an ordered list. Given the limits of this approach, we propose to explore new organizations and presentations of search results, as well as new types of interaction with the results, to make their exploration more intuitive and efficient. Although relevance depends on the quality of the results returned by the search engine, presenting those results effectively is an alternative way to improve relevance for the user. This paper focuses primarily on 3D metaphors for visualizing search results. We propose to exploit a 3D metaphor based on the concept of a city to improve the effectiveness of result presentation. An evaluation of this approach is also proposed. Displaying results in a 3D environment enlarges the representation space, but the success of such approaches depends on careful consideration of a set of criteria specific to the type of interface and application targeted. These main criteria are presented in this paper.
Exploration interactive de résultats d'arbre de décision
Visualisation globale de collections de documents sous forme d'hypercube
Redundant thyroid laboratory diagnostic modules in laboratory information system--a way to improve the performance
Studies in health technology and informatics, 2000
In this paper, we introduce the idea of using redundant thyroid laboratory diagnostic modules integrated in a laboratory information system. The first module was based on a decision tree produced by applying the Assistant algorithm to thyroid laboratory test results. Instead of improving the decision rules, a "second opinion" module was designed based on the Spad-S software. Diagnoses obtained with both modules were compared with the results obtained before the "second opinion" module was introduced. The first results made it clear that introducing the "second opinion" module decreased the number of records misclassified by the first module. With three or more modules, the final diagnosis could be obtained by voting or by more complex procedures.
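The voting idea mentioned at the end of this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `vote`, the label strings, and the tie-handling rule (return a fallback label when no single diagnosis wins) are assumptions made purely for illustration.

```python
from collections import Counter


def vote(diagnoses, fallback="undetermined"):
    """Combine the labels proposed by several diagnostic modules.

    Returns the label proposed by the largest number of modules, or
    `fallback` when the input is empty or the top two labels are tied.
    The tie rule is an assumption, not taken from the paper.
    """
    ranked = Counter(diagnoses).most_common()
    if not ranked:
        return fallback
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return fallback  # no strict winner: leave the record for review
    return ranked[0][0]


# Hypothetical outputs of three modules for one laboratory record.
final = vote(["hypothyroid", "euthyroid", "hypothyroid"])
```

With only two modules a disagreement always produces a tie, which is consistent with the abstract's remark that voting becomes an option once three or more modules are available.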
Multimedia indexing and retrieval with features association rules mining
IEEE International Conference on Multimedia and Expo, 2004
The administration of very large collections of images accentuates the classical problems of indexing and efficiently querying information. This paper describes a new method, applied to very large still-image databases, that combines two data mining techniques, clustering and association rule mining, in order to better organize image collections and to improve the performance of queries. The objective of our ...
New Results - New techniques for language processing and applications
Ninth International Conference on Information Visualisation (IV'05)
While searching the Web, the user is often confronted by a great number of results, generally sorted by their rank. These results are then displayed as a succession of ordered lists. Given the limits of this approach, we propose a prototype to explore new organizations and presentations of search results, as well as new types of interaction with the results, in order to make their exploration more intuitive and efficient. The main topic of this paper is the processing of the results coming from an information retrieval system. Although relevance depends on the quality of the results, processing the results effectively is an alternative way to improve relevance for the user. Given current expectations, this processing consists of an organization step and a visualization step. The proposed prototype organizes the results according to their meaning using a Kohonen self-organizing map, and visualizes them in a 3D scene to increase the representation space. The 3D metaphor proposed here is a city.
Proceedings of the 2nd international workshop on Computer vision meets databases - CVDB '05, 2005
The typical mode for querying in an image content-based information system is query-by-example, which allows the user to provide an image as a query and to search for similar images (i.e., the nearest neighbors) based on one or a combination of low-level multidimensional features of the query example. Off-line, this requires the time-consuming pre-computation of the whole set of visual descriptors over the image database. On-line, one major drawback is that multidimensional sequential NN-search is usually exhaustive over the whole image set, while the user has very limited patience. In this paper, we propose a technique for improving the performance of image query-by-example execution strategies over multiple visual features. This includes, first, the pre-clustering of the large image database and then the scheduling of the processing of the feature clusters so that query results are provided progressively (i.e., intermediate results are sent continuously before the end of the exhaustive scan over the whole database). A cluster eligibility criterion and two filtering rules are proposed to select the clusters most relevant to a query-by-example. Experiments over more than 110 000 images and five MPEG-7 global features show that our approach significantly reduces the query time in two experimental cases: the query time is divided by 4.8 for 100 clusters per descriptor type and by 7 for 200 clusters per descriptor type, compared to a "blind" sequential NN-search, while keeping the same final query result. This constitutes a promising perspective for optimizing image query-by-example execution.
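As a rough illustration of why pre-clustering can shorten a nearest-neighbor query without changing its result, here is a minimal sketch. The abstract does not spell out the paper's eligibility criterion or its two filtering rules, so this sketch swaps in a standard triangle-inequality bound instead; the names `build_cluster` and `knn_with_pruning`, the single-feature setting, and the toy 2D points are all assumptions. It also returns only the final result, not the progressive intermediate results the paper describes.

```python
import heapq
import math


def build_cluster(points):
    """Summarize a cluster by its centroid and covering radius."""
    dim = len(points[0])
    centroid = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
    radius = max(math.dist(centroid, p) for p in points)
    return {"centroid": centroid, "radius": radius, "points": points}


def knn_with_pruning(query, clusters, k):
    """Return the k nearest points and the number of clusters scanned.

    For any member m of a cluster with centroid c and radius r,
    dist(query, m) >= dist(query, c) - r. Clusters are visited by
    increasing lower bound; once the bound reaches the current k-th
    best distance, no remaining cluster can improve the result.
    """
    ordered = sorted(
        (max(0.0, math.dist(query, c["centroid"]) - c["radius"]), i)
        for i, c in enumerate(clusters)
    )
    best = []  # max-heap of the k best so far, stored as (-distance, point)
    scanned = 0
    for bound, i in ordered:
        if len(best) == k and bound >= -best[0][0]:
            break  # every remaining cluster is at least this far away
        for p in clusters[i]["points"]:
            d = math.dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
        scanned += 1
    neighbors = [p for _, p in sorted((-nd, p) for nd, p in best)]
    return neighbors, scanned


# Toy data: three well-separated clusters of 2D feature vectors.
clusters = [
    build_cluster([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]),
    build_cluster([(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]),
    build_cluster([(-10.0, -10.0), (-11.0, -10.0)]),
]
neighbors, scanned = knn_with_pruning((0.2, 0.1), clusters, k=2)
```

On this toy data only the cluster around the origin is scanned, and the two neighbors returned match what a sequential scan over all eight points would give.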
While searching the web, the user is often confronted by a great number of results, generally displayed in a list sorted according to the relevance of the results. Given the limits of this approach, we propose to explore new organizations and presentations of search results, as well as new types of interaction with the results, to make their exploration more intuitive and efficient. The main topic of this paper is the processing of the results coming from an information retrieval system. Although relevance depends on the quality of the results, processing the results effectively is an alternative way to improve relevance for the user. Given current expectations, this processing consists of an organization step and a visualization step. The proposed approach organizes the results according to their meaning using a Kohonen Self-Organizing Map (SOM), and visualizes them in a 3D scene to increase the representation space. The 3D metaphor proposed here is a city.
Etude des résumés en français des rapports de recherche d'un institut d' …
JADT'2000
We focus here on the French summaries of the internal reports published by INRIA from 1989 to 1998. Our goal is to study the scientific topics of the reports and to bring to the fore a typology of the themes. To that end, we use factorial correspondence analysis and the software ...
Visualization of temporal text collections based on Correspondence Analysis
Expert Systems with Applications, 2012
In this paper, we present CatViz (Temporally-Sliced Correspondence Analysis Visualization). This novel method visualizes relationships through time and is suitable for large-scale temporal multivariate data. We couple CatViz with clustering methods, whereupon we introduce the concept of final centroid transfer, which enables the correspondence of clusters in time. Although CatViz can be used on any type of temporal data, we show how it can be applied to the task of exploratory visual analysis of text collections. We present a successful concept of employing feature-type filtering to present different aspects of textual data. We performed case studies on large collections of French and English news articles. In addition, we conducted a user study that confirms the usefulness of our method. We present typical tasks of exploratory text analysis and discuss application procedures that an analyst might perform. We believe that CatViz is general and highly applicable to large data sets because of its intuitiveness, effectiveness, and robustness. We expect that it will enable a better understanding of texts in huge historical archives.
Utilisation de l'analyse factorielle des correspondances pour la recherche d'images à grande échelle
Extraction et Gestion des Connaissances, 2009
In most countries, the statistics curriculum at the secondary school level is part of the mathematics curriculum. If we look at the papers on statistical education at the college or university level published ten years ago, we can see that the requirements are practically adaptable to the current secondary level. With the changes occurring in mathematical education at the secondary school level, the development of interdisciplinary class projects, especially for higher grades (9-12), and the increasing availability of computers at school, the teaching of statistics has changed. But first, we have to define the objective, or more precisely the objectives, then the ways to reach them, and conclude with the limits of the approach and the reasons for them.
Interactive Exploration of Decision Tree Results
Our investigation aims at interactively exploring the decision tree results obtained by machine-learning algorithms such as C4.5. We propose an interactive graphical environment that uses a new radial tree layout, zoom/pan techniques, and some existing visualization methods (explorer-like and hierarchical visualization, interactive techniques) to represent large decision trees in a graphical mode that is more intuitive than the usual output of decision tree algorithms. On the one hand, the interactive exploration system preserves the global view in a large representation through the radial layout and zoom/pan techniques; on the other hand, it performs very well on a sub-tree of interest in the explorer-like view, in terms of simplicity, speed of task completion, ease of use, and user understanding. The user can easily extract inductive rules and prune the tree in the post-processing stage, and gains a better understanding of the obtained decision tree models. The numerical test resul...
During the last ten years, many tools have been produced for self-learning in statistics. It began with distance teaching using paper and pencil, video, or television; then specialised software appeared, the first packages following a book model and later ones using multimedia resources. Recent developments now rely on the Internet. This paper aims to classify the available products and analyse future trends.