Garima Agrawal - Academia.edu (original) (raw)
Uploads
Papers by Garima Agrawal
European Conference on Social Media, May 21, 2024
ACM transactions on the internet of things, Sep 6, 2022
Information
Knowledge graphs are graph-based data models which can represent real-time data that is constantl... more Knowledge graphs are graph-based data models which can represent real-time data that is constantly growing with the addition of new information. The question-answering systems over knowledge graphs (KGQA) retrieve answers to a natural language question from the knowledge graph. Most existing KGQA systems use static knowledge bases for offline training. After deployment, they fail to learn from unseen new entities added to the graph. There is a need for dynamic algorithms which can adapt to the evolving graphs and give interpretable results. In this research work, we propose using new auction algorithms for question answering over knowledge graphs. These algorithms can adapt to changing environments in real-time, making them suitable for offline and online training. An auction algorithm computes paths connecting an origin node to one or more destination nodes in a directed graph and uses node prices to guide the search for the path. The prices are initially assigned arbitrarily and u...
Information
Knowledge graphs gained popularity in recent years and have been useful for concept visualization... more Knowledge graphs gained popularity in recent years and have been useful for concept visualization and contextual information retrieval in various applications. However, constructing a knowledge graph by scraping long and complex unstructured texts for a new domain in the absence of a well-defined ontology or an existing labeled entity-relation dataset is difficult. Domains such as cybersecurity education can harness knowledge graphs to create a student-focused interactive and learning environment to teach cybersecurity. Learning cybersecurity involves gaining the knowledge of different attack and defense techniques, system setup and solving multi-facet complex real-world challenges that demand adaptive learning strategies and cognitive engagement. However, there are no standard datasets for the cybersecurity education domain. In this research work, we present a bottom-up approach to curate entity-relation pairs and construct knowledge graphs and question-answering models for cyberse...
Deployable Machine Learning for Security Defense, 2020
Machine learning is being embraced by information security researchers and organizations alike fo... more Machine learning is being embraced by information security researchers and organizations alike for its potential in detecting attacks that an organization faces, specifically attacks that go undetected by traditional signature-based intrusion detection systems. Along with the ability to process large amounts of data, machine learning brings the potential to detect contextual and collective anomalies, an essential attribute of an ideal threat detection system. Datasets play a vital role in developing machine learning models that are capable of detecting complex and sophisticated threats like Advanced Persistent Threats (APT). However, there is currently no APT-dataset that can be used for modeling and detecting APT attacks. Characterized by the sophistication involved and the determined nature of the APT attackers, these threats are not only difficult to detect but also to model. Generic intrusion datasets have three key limitations-(1) They capture attack traffic at the external endpoints, limiting their usefulness in the context of APTs which comprise of attack vectors within the internal network as well (2) The difference between normal and anomalous behavior is quiet distinguishable in these datasets and thus fails to represent the sophisticated attackers' of APT attacks (3) The data imbalance in existing datasets do not reflect the real-world settings rendering themselves as a benchmark for supervised models and falling short of semi-supervised learning. To address these concerns, in this paper, we propose a dataset DAPT 2020 which consists of attacks that are part of Advanced Persistent Threats (APT). These attacks (1) are hard to distinguish from normal traffic flows but investigate the raw feature space and (2) comprise of traffic on both public-to-private interface and the internal (private) network. Due to the existence of severe class imbalance, we benchmark DAPT 2020 dataset on semi-supervised models and show that they perform poorly trying to detect attack traffic in the various stages of an APT.
European Conference on Social Media, May 21, 2024
ACM transactions on the internet of things, Sep 6, 2022
Information
Knowledge graphs are graph-based data models which can represent real-time data that is constantl... more Knowledge graphs are graph-based data models which can represent real-time data that is constantly growing with the addition of new information. The question-answering systems over knowledge graphs (KGQA) retrieve answers to a natural language question from the knowledge graph. Most existing KGQA systems use static knowledge bases for offline training. After deployment, they fail to learn from unseen new entities added to the graph. There is a need for dynamic algorithms which can adapt to the evolving graphs and give interpretable results. In this research work, we propose using new auction algorithms for question answering over knowledge graphs. These algorithms can adapt to changing environments in real-time, making them suitable for offline and online training. An auction algorithm computes paths connecting an origin node to one or more destination nodes in a directed graph and uses node prices to guide the search for the path. The prices are initially assigned arbitrarily and u...
Information
Knowledge graphs gained popularity in recent years and have been useful for concept visualization... more Knowledge graphs gained popularity in recent years and have been useful for concept visualization and contextual information retrieval in various applications. However, constructing a knowledge graph by scraping long and complex unstructured texts for a new domain in the absence of a well-defined ontology or an existing labeled entity-relation dataset is difficult. Domains such as cybersecurity education can harness knowledge graphs to create a student-focused interactive and learning environment to teach cybersecurity. Learning cybersecurity involves gaining the knowledge of different attack and defense techniques, system setup and solving multi-facet complex real-world challenges that demand adaptive learning strategies and cognitive engagement. However, there are no standard datasets for the cybersecurity education domain. In this research work, we present a bottom-up approach to curate entity-relation pairs and construct knowledge graphs and question-answering models for cyberse...
Deployable Machine Learning for Security Defense, 2020
Machine learning is being embraced by information security researchers and organizations alike fo... more Machine learning is being embraced by information security researchers and organizations alike for its potential in detecting attacks that an organization faces, specifically attacks that go undetected by traditional signature-based intrusion detection systems. Along with the ability to process large amounts of data, machine learning brings the potential to detect contextual and collective anomalies, an essential attribute of an ideal threat detection system. Datasets play a vital role in developing machine learning models that are capable of detecting complex and sophisticated threats like Advanced Persistent Threats (APT). However, there is currently no APT-dataset that can be used for modeling and detecting APT attacks. Characterized by the sophistication involved and the determined nature of the APT attackers, these threats are not only difficult to detect but also to model. Generic intrusion datasets have three key limitations-(1) They capture attack traffic at the external endpoints, limiting their usefulness in the context of APTs which comprise of attack vectors within the internal network as well (2) The difference between normal and anomalous behavior is quiet distinguishable in these datasets and thus fails to represent the sophisticated attackers' of APT attacks (3) The data imbalance in existing datasets do not reflect the real-world settings rendering themselves as a benchmark for supervised models and falling short of semi-supervised learning. To address these concerns, in this paper, we propose a dataset DAPT 2020 which consists of attacks that are part of Advanced Persistent Threats (APT). These attacks (1) are hard to distinguish from normal traffic flows but investigate the raw feature space and (2) comprise of traffic on both public-to-private interface and the internal (private) network. Due to the existence of severe class imbalance, we benchmark DAPT 2020 dataset on semi-supervised models and show that they perform poorly trying to detect attack traffic in the various stages of an APT.