Benedicte Le Grand - Profile on Academia.edu
Papers by Benedicte Le Grand
2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops)
This paper considers generalizing context reasoning capabilities through a context mining facility offered to all Information System applications. This facility requires mining context data at the system scale, which raises several challenges for Machine Learning approaches used for such mining. Through a detailed literature review, we analyze these approaches with regard to the requirements of such a context mining facility at the Information System level, pointing to the potential and to the challenges raised by this perspective.
Proceedings of the International AAAI Conference on Web and Social Media
Citation cascades in blog networks are often considered as traces of information spreading on this social medium. In this work, we question this point of view using both a structural and a semantic analysis of five months of activity of the most representative blogs of the French-speaking community. Statistical measures reveal that our dataset shares many features with those found in the literature, suggesting the existence of an identical underlying process. However, a closer analysis of the post content indicates that the popular epidemic-like descriptions of cascades are misleading in this context. A basic model, taking into account only the behavior of bloggers and their restricted social network, accounts for several important statistical features of the data. These arguments support the idea that the primary goal of citations may not be information spreading on the blogosphere.
2019 IEEE 23rd International Enterprise Distributed Object Computing Workshop (EDOCW)
Processes where knowledge is a key characteristic are called knowledge intensive processes (KIP). A successful KIP has to adapt to the situation and treat each customer's request as unique rather than follow some predefined sequence of actions. The discipline of Business Process Management (BPM) defines solutions for modeling, development, analysis and improvement of processes with a predefined flow of activities. From the traditional, activity-centered point of view, KIP are challenging to automate, to control and to test for compliance. In this article we present an overview of recent works that address these challenges and explore different ideas, including extension of BPM, theoretical foundations for KIP management and execution support for KIP. We also outline some research perspectives in KIP management and discuss one particular idea that exploits the data-centered point of view on KIP.
Natural Language Processing and Information Systems, 2017
Social content generated by users' interactions in social networks is a knowledge source that may enhance user profile modeling by providing information on users' activities and interests over time. The aim of this article is to propose several original strategies for modeling the profiles of social network users, taking into account social information and its temporal evolution. We illustrate our approach on the Twitter network. We distinguish interactive and thematic temporal profiles, and we study profile similarities by applying various clustering algorithms, paying special attention to overlapping clusters. We compare the different types of profiles obtained and show how they can be relevant for the recommendation of hashtags and users to follow.
2016 IEEE Tenth International Conference on Research Challenges in Information Science (RCIS), 2016
This paper presents a new method for automatically extracting smartphone users' contextual behaviors from the digital traces collected during their interactions with their devices. Our goal is in particular to understand the impact of users' context (e.g., location, time, environment) on the applications they run on their smartphones. We propose a methodology to analyze digital traces and to automatically identify the significant information that characterizes users' behaviors. In earlier work, we used Formal Concept Analysis and Galois lattices to extract relevant knowledge from heterogeneous and complex contextual data; however, the interpretation of the obtained Galois lattices was performed manually. In this article, we aim at automating this interpretation process through the provision of original metrics. Our methodology therefore returns relevant information without requiring any expertise in data analysis. We illustrate our contribution on real data collected from volunteer users.
Business Process Management Workshops, 2016
HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
2013 17th International Conference on Information Visualisation, 2013
Platforms which combine data mining algorithms and interactive visualizations play a key role in the discovery process from complex network data, e.g. Web and Online Social Network data. Here we illustrate the use of Gephi, an open source software for the visual exploration of networks, for the visual analysis of Business Intelligence data modeled as complex networks.
IEEE 7th International Conference on Research Challenges in Information Science (RCIS), 2013
Monitoring the evolution of user-system interactions is of high importance for complex systems and for information systems in particular, especially to raise alerts automatically when abnormal behaviors occur. However, current methods fail to capture the intrinsic dynamics of the system, and focus on evolution due to exogenous factors like day-night patterns. In order to capture the intrinsic dynamics of user-system interactions, we propose an innovative graph-based approach relying on a novel concept of time. We apply our method to two large real-world systems (the Github.com social network and the eDonkey peer-to-peer system) to automatically detect statistically significant events in a real-time fashion. We finally validate our results with the successful interpretation of the detected events.
Proceedings of the 5th Annual ACM Web Science Conference, 2013
How should we characterize the dynamics of the Web? Whereas network maps have contributed to a redefinition of distances and space in information networks, current studies still use a traditional time unit, the second, to understand the temporality of the Web. This unit leads to the observation of exogenous phenomena like day-night patterns. In order to capture the intrinsic dynamics of the network, we introduce an innovative yet simple concept of time which relies on the measure of changes in the network space. We demonstrate its practical interest on the evolution of the Github social network.
2014 International Conference on Data Science and Advanced Analytics (DSAA), 2014
In large-scale online complex networks (Wikipedia, Facebook, Twitter, etc.), finding nodes related to a specific topic is a strategic research subject. This article focuses on two central notions in this context: communities (groups of highly connected nodes) and proximity measures (indicating whether nodes are topologically close). We propose a parameterized proximity measure which, given a set of nodes belonging to a community, learns the optimal parameters and identifies the other nodes of this community, called a multi-ego-centered community as it is centered on a set of nodes. We validate our results on a large dataset of categorized Wikipedia pages and on benchmarks, and we also show that our approach performs better than existing ones. Our main contributions are (i) a new ergonomic parameterized proximity measure, (ii) the automatic tuning of the proximity's parameters and (iii) the unsupervised detection of community boundaries.
A user-centric vision of service-oriented Pervasive Information Systems
2014 IEEE Eighth International Conference on Research Challenges in Information Science (RCIS), 2014
Web Usage Mining for Ontology Management
Handbook of Research on Text and Web Mining Technologies
We present a new method for the exploratory analysis of large link streams, which we apply to the detection of significant events in more than 2 million interactions (over 4 months) between users of the online social network Github. We combine Outskewer, a statistical method for automatically detecting events in a time series, with a graph visualization system. Outskewer identifies moments in the evolution of the interaction graph that deserve study, and an analyst can validate and interpret these events by visualizing abnormal patterns in the corresponding subgraphs. We show through multiple examples that this approach 1) makes it possible to detect relevant events and to reject irrelevant ones, and 2) suits an exploratory process because it requires no prior knowledge of the data.
This article presents an original approach for the analysis of context information in ubiquitous environments. Large volumes of heterogeneous data are now collected, such as location and temperature. This "environmental" context may be enriched by data related to users, e.g., their activities or applications. We propose a unified analysis and correlation of all these dimensions of context in order to measure their impact on user activities. Formal Concept Analysis and association rules are used to discover non-trivial relationships between context elements and activities which could otherwise seem independent. Our goal is to make optimal use of available data in order to understand user behavior and eventually make recommendations. In this paper, we describe our general methodology for context analysis and we illustrate it on an experiment conducted on real data collected by a capture system. Thanks to this methodology, it is possible to identify correlations between context elements and user applications, making it possible to recommend such applications to users in similar situations.
Monitoring the evolution of user-system interactions is of high importance for complex systems and for information systems in particular, especially to raise alerts automatically when abnormal behaviors occur. However, current methods fail to capture the intrinsic dynamics of the system, and focus on evolution due to exogenous factors like day-night patterns. In order to capture the intrinsic dynamics of user-system interactions, we propose an innovative graph-based approach relying on a novel concept of time. We apply our method to a large real-world system (the Github.com social network) to automatically detect statistically significant events in a real-time fashion. We finally validate our results with the successful interpretation of the detected events.
Systèmes d'Information Pervasifs et Espaces de Services: Définition d'un cadre conceptuel (Pervasive Information Systems and Service Spaces: Defining a Conceptual Framework)
[...] differ from traditional Information Systems (IS) through the heterogeneity of pervasive environments and of the pervasive systems themselves, owing to their integration into the enterprise and their need to align with its strategy. In this context, the service space is an abstract concept that hides the true nature of the elements making up the Pervasive Information System (PIS). It is a formal tool for better managing the heterogeneity of pervasive environments within an Information System.
International Journal of Conceptual Structures and Smart Applications, 2013
Concept lattices have been widely used for various purposes in many different applications since the 1980s. Recent applications of Formal Concept Analysis (FCA) include extensions of traditional FCA applications such as data and text mining, machine learning and knowledge management. Progress has also recently been made in software engineering, Semantic Web and databases. New applications have also emerged in the fields of healthcare, ecology, biology, agronomy, business and social networks. This article presents examples of successful applications of FCA for Social Network Analysis. We show the benefit of FCA solutions, as well as their combination with semantics and topology-based approaches. We conclude by presenting FCA-based visualization solutions and open challenges for FCA in the context of large and dynamic data.
Conceptual and Spatial Footprints for Complex Systems Analysis: Application to the Semantic Web
Lecture Notes in Computer Science, 2009