Giuseppe Polese - Academia.edu
Papers by Giuseppe Polese
IEEE Access, 2020
Nowadays, new laws and regulations, such as the European General Data Protection Regulation (GDPR), require companies to define privacy policies complying with the preferences of their users. The regulation prescribes expensive penalties for companies that cause the disclosure of their users' sensitive data, even if this occurs accidentally. Thus, it is necessary to devise methods supporting companies in the identification of privacy threats during advanced data manipulation activities. To this end, in this paper, we propose a methodology exploiting relaxed functional dependencies (RFDs) to automatically identify data that could imply the values of sensitive ones, which makes it possible to increase the confidentiality of a dataset while reducing the number of values to be obscured. An experimental evaluation demonstrates the effectiveness of the proposed methodology in increasing compliance with GDPR data privacy requirements, while reducing the set of values to be partially masked, hence enhancing data usage. Index terms: data privacy, confidentiality, data dependencies.
In this paper we propose a visual-language-based framework to effectively tackle the problem of software-based structural analysis in different application domains. In particular, the framework includes grammar-based parser generation modules to easily adapt structural analysis software packages to evolving standards of specific application domains. Moreover, it includes visual analytics paradigms to enhance the software-based structural analysis processes. To demonstrate the feasibility of the proposed framework, we have implemented some of its modules and instantiated them in the context of the evaluation of earthquake-resistant masonry buildings.
We present an approach to integrate a visual authorization policy management system based on RBAC and XACML in the ADAMS (ADvanced Artifact Management System) Process Support System. ADAMS is a Web-based system that integrates project management features, such as resource allocation and process control, and artifact management features, such as coordination of cooperative workers and artifact versioning, as well as context-awareness. We propose a hierarchy of visual languages aiming to support project managers and security administrators in modeling RBAC-based access policies in ADAMS. The visual sentences are translated into XACML and stored in a Policy Repository. In this way the Policy Management System is able to process XACML requests and compare them against the defined access policies.
Journal of Big Data
Social networks are a vast source of information, and they have an increasing impact on people’s daily lives. They permit us to share emotions, passions, and interactions with other people around the world. While enabling people to exhibit their lives, social networks guarantee their privacy. The definitions of privacy requirements and default policies for safeguarding people’s data are the most difficult challenges that social networks have to deal with. In this work, we have collected data concerning people who have different social network profiles, aiming to analyse privacy requirements offered by social networks. In particular, we have built a tool exploiting image-recognition techniques to recognise a user from his/her picture, aiming to collect his/her personal data accessible through social networks where s/he has a profile. We have composed a dataset of 5000 users by combining data available from several social networks; we compared social network data mandatory in the re...
The Indiana MAS project, funded by the Italian Ministry of Education, University and Research "Futuro in Ricerca 2010" program, aims at providing a framework for the digital protection and conservation of rock art natural and cultural heritage sites, by storing, organizing and presenting information about them in such a way as to encourage scientific research and to raise interest in and awareness of them among the general public. The project involves two research units, namely Genova (Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi) and Salerno (Dipartimento di Matematica e Informatica), for a period of 36 months, starting from March 8th, 2012. The technologies adopted in the project range from agents to ontologies, as required by the complex nature of the platform, where each module is devoted to a specific task: sketch and symbol recognition, semantic interpretation of complex visual scenes, multi-language text understanding, and storing, classification and indexing of multimedia and heterogeneous digital objects. All of them should cooperate and coordinate in order to enable higher-level components to reason on them and to detect relationships among different digital objects, hence providing new hypotheses based on such relationships.
Relaxed functional dependencies (rfds) are properties expressing important relationships among data. Thanks to the introduction of approximations in data comparison and/or validity, they can capture constraints useful for several purposes, such as the identification of data inconsistencies or patterns of semantically related data. Nevertheless, rfds can provide benefits only if they can be automatically discovered from data. In this discussion paper we present an rfd discovery algorithm relying on a lattice-structured search space, and a new candidate rfd validation method. An experimental evaluation demonstrates the discovery performance of the proposed algorithm on real datasets.
The quality of search engine results often does not meet users’ expectations. In this paper we propose to implicitly infer visitors’ feedback from the actions they perform while reading a web document. In particular, we propose a new model to interpret mouse cursor actions, such as scrolling, movement, and text selection, while reading web documents, aiming to infer a relevance value indicating how useful the user found the document for his/her search purposes. We have implemented the proposed model through lightweight components, which can be easily installed within major web browsers as a plug-in. The components capture mouse cursor actions without disrupting user browsing activities, which enabled us to easily collect experimental data to validate the proposed model. The experimental results demonstrate that the proposed model is able to predict user feedback with an acceptable level of accuracy.
Proceedings of the 27th International Conference on Distributed Multimedia Systems, 2021
Cardiac arrhythmia is an alteration of the heart rhythm, for which the heartbeat is irregular. Based on the severity of this condition, an arrhythmia could represent a serious danger for a patient. An ECG is a graphic representation of a heart rhythm, which provides an overview of the heart's condition over a specific time interval. ECG signal analysis is entrusted to trained clinicians, although complex and frantic environments, such as emergency settings, can make it hard to delegate continuous monitoring to the medical personnel. In such scenarios, an automatic detection methodology could provide crucial support in promptly alerting clinicians to a potential deterioration of a patient's condition. To this end, we propose a heartbeat classification module capable of capturing the semantics of visual information of ECG signals provided by video frames. The module relies on feature extraction techniques derived from video projected images resulting in ECG data, which are then classified by means of deep-learning models. It can be used to support the early detection of some arrhythmias in critical contexts, such as emergency rooms. We show how the proposed module can be used to support clinicians in this context, and discuss an experimental evaluation performed over ground-truth datasets.
Multimedia Tools and Applications, 2022
Proceedings of the Second International Conference on Software and Data Technologies, 2007
The construction of spatial databases often requires considerable computing and storage resources, due to the inherent complexity of spatial data and their manipulation. Thus, it would be desirable to devise methods enabling a designer to estimate the performance of a spatial database from its early design stages. We present a method for estimating both the size of data and the cost of operations based on the conceptual schema of the spatial database. We also show the application of the method to the design of a spatial database concerning botanic data.
Mashup editors enable end-users to mix the functionalities of several applications to derive a new one. However, when the end-user faces the development of a new mashup application s/he has to cope with the abundance of services and information sources available on the Web, and with complex operations like filtering and joining. Thus, even a simple-to-use mashup editor is not capable of providing adequate support, unless it embeds intelligent methods to process the semantics of available mashups and rank them based on how well they meet user needs. Most existing mashup editors process either semantic or statistical information to derive recommendations for the mashups considered suitable to user needs. However, none of them uses both strategies in a synergistic way. In this paper we present a new mashup advisory approach and a system that combines the statistical and semantic-based approaches, by using collaborative filtering techniques and semantic tagging, in order to rank mashups...
We discuss the results of experiments on spatial databases, aiming to empirically derive parameters for estimating disk occupancy and performance from the conceptual stages of the design process. This opens the way to the definition of an estimation methodology, which should let a designer evaluate the quality of alternative design choices based on their expected performance.
IEEE Access, 2020
IoT data analytics can potentially bring benefits to several critical application domains, especially in healthcare. In fact, in emergency rooms the detection of critical patients can be a critical task when the number of patients to be monitored is high with respect to the available medical personnel. However, it is also necessary to pay attention to ethics, privacy, and security issues, aiming to prevent attacks and unauthorized access to sensitive data of patients, guaranteeing the correct functioning of the system in a secure environment. To this end, this article presents a knowledge representation framework enabling the intelligent video surveillance of patients, which can be used in combination with IoT-based systems to enhance the detection of critical patients in emergency rooms, while dealing with ethics, privacy, and security issues. These are guaranteed by means of an event-based visual access control specification method, constraining the access to both devices and users. We also describe a clinical scenario related to the early treatment of sepsis in an emergency room, showing how the proposed framework can enhance the detection of such a critical disease while guaranteeing ethics, privacy, and security. Index terms: IoT data analytics, video surveillance, role-based and event-based access control, event modeling, ICU, knowledge representation.
Failing queries are database queries returning few or no results. It might be useful to reformulate them in order to retrieve results that are close to those intended with the original queries. In this paper, we introduce an approach for rewriting failing queries that are in disjunctive normal form. In particular, the approach prescribes replacing some of the attributes of the failing queries with attributes semantically related to them by means of Relaxed Functional Dependencies (rfds), which can be automatically discovered from data. The semantics of automatically discovered rfds allow us to rank them so as to provide an application order during the query rewriting process. Experiments show that such an application order of rfds yields a ranking of the approximate query answers meeting the expectations of the user.
Big Data Research, 2021
Data stream profiling concerns the automatic extraction of metadata from a data stream, without having the possibility to store it. Among the metadata of interest, functional dependencies (fds), and their extensions, relaxed functional dependencies (rfds), represent an important semantic property of data. Nowadays, there are many algorithms for automatically discovering them from static datasets, and some are being proposed for data streams. However, one of the main problems is that the stream nature of data requires a different paradigm of monitoring, since the “big” number of (r)fds that might hold on a given dataset continuously changes as new data are read from the stream. In this paper, we present a tool for visualizing rfds discovered from a data stream. The tool makes it possible to explore results for different types of rfds, and uses quantitative measures to monitor how discovery results evolve. Moreover, the tool enables the comparison among rfds discovered across several executions, also providing visual manipulation operators to dynamically compose and filter results. A user study has been conducted to assess the effectiveness of the proposed visualization tool.
2018 IEEE International Conference on Big Data (Big Data), 2018
Nowadays, human influence often depends on the number of followers that an individual has in his/her own social media context. In this context, the presence of fake accounts is one of the most relevant problems and can potentially have a big impact on many real-life and business activities. Fake followers are dangerous for social platforms, since they may alter concepts like popularity and influence, which might have a strong impact on the economy, politics, and society. Thus, it is necessary to devise new methodologies making it possible to identify and characterize fake accounts. This work presents a novel technique to discriminate real accounts on social networks from fake ones. The technique exploits knowledge automatically extracted from big data to characterize typical patterns of fake accounts. We empirically evaluated the proposed technique on the Twitter social network, and achieved significant results in terms of discrimination capabilities.
Proceedings of the 20th International Database Engineering & Applications Symposium on - IDEAS '16, 2016
Approximate functional dependencies are used in many emerging application domains, such as the identification of data inconsistencies or patterns of semantically related data, query rewriting, and so forth. They can approximate the canonical definition of functional dependency (fd) by relaxing on the data comparison (i.e., by considering data similarity rather than equality), on the extent (i.e., by admitting the possibility that the dependency holds on a subset of data), or both. Approximate fds are difficult to identify at design time, as is usually done with fds. In this paper, we propose a genetic algorithm to discover approximate fds from data. An empirical evaluation demonstrates the effectiveness of the algorithm.
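The two relaxations described in this abstract (on data comparison and on extent) can be illustrated with a minimal Python sketch. It is not the genetic algorithm proposed in the paper, only a check of a single candidate dependency on numeric columns. The function name `holds_approximately`, the tolerance parameters, and the pair-based satisfaction ratio are assumptions introduced here for illustration; the paper may use different similarity functions and coverage measures.

```python
from itertools import combinations


def holds_approximately(rows, lhs, rhs, lhs_tol, rhs_tol, min_ratio=1.0):
    """Check a candidate dependency lhs -> rhs on numeric columns.

    Relaxation on comparison: two values count as "similar" when their
    absolute difference is within a tolerance (hypothetical choice).
    Relaxation on extent: the dependency only needs to hold for a fraction
    `min_ratio` of the row pairs that are similar on lhs (hypothetical measure).
    """
    relevant = 0   # pairs of rows that are similar on lhs
    satisfied = 0  # of those, pairs that are also similar on rhs
    for a, b in combinations(rows, 2):
        if abs(a[lhs] - b[lhs]) <= lhs_tol:
            relevant += 1
            if abs(a[rhs] - b[rhs]) <= rhs_tol:
                satisfied += 1
    # With no relevant pair, the dependency holds vacuously.
    return relevant == 0 or satisfied / relevant >= min_ratio


# Toy data, invented for this example.
rows = [
    {"height": 170, "shoe": 41},
    {"height": 171, "shoe": 42},
    {"height": 185, "shoe": 45},
    {"height": 186, "shoe": 39},
]

strict = holds_approximately(rows, "height", "shoe", lhs_tol=2, rhs_tol=1)
relaxed = holds_approximately(rows, "height", "shoe", lhs_tol=2, rhs_tol=1,
                              min_ratio=0.5)
```

In this toy data, two pairs of rows are similar on `height`, and only one of them is also similar on `shoe`, so the dependency fails when required on every pair and holds once the extent threshold is lowered to one half.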