Chengguang Tang - Academia.edu

Papers by Chengguang Tang

xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark

arXiv (Cornell University), Oct 12, 2023

TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models

arXiv (Cornell University), May 30, 2024

Overview of Robust and Multilingual Automatic Evaluation Metrics for Open-Domain Dialogue Systems at DSTC 11 Track 4

arXiv (Cornell University), Jun 22, 2023

The advent and fast development of neural networks have revolutionized research on dialogue systems and subsequently triggered various challenges regarding their automatic evaluation. Automatic evaluation of open-domain dialogue systems is an open challenge that has been at the center of attention for many researchers. Despite consistent efforts to improve automatic metrics' correlations with human evaluation, there have been very few attempts to assess their robustness over multiple domains and dimensions, and their focus is mainly on the English language. All of these challenges prompt the development of automatic evaluation metrics that are reliable across domains, dimensions, and languages. This track in the 11th Dialogue System Technology Challenge (DSTC11) is part of the ongoing effort to promote robust and multilingual automatic evaluation metrics. This article describes the datasets and baselines provided to participants and discusses the submission and result details of the two proposed subtasks.
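
Evaluating a metric's "correlation with human evaluation," as discussed above, typically means computing a rank correlation between the metric's scores and human ratings over the same responses. A minimal sketch, using hypothetical scores and a stdlib-only Spearman implementation (Pearson correlation of the ranks, with average ranks for ties):

```python
def _ranks(values):
    """Assign average ranks to values (ties get the mean of their ranks)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(metric_scores, human_scores):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = _ranks(metric_scores), _ranks(human_scores)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-response scores from an automatic metric and human raters.
metric = [0.91, 0.40, 0.77, 0.15, 0.66]
human = [4.5, 2.0, 4.0, 1.0, 3.5]
print(round(spearman(metric, human), 3))  # rankings agree perfectly -> 1.0
```

Robustness assessment then repeats this computation per domain, per evaluation dimension, and per language, rather than reporting a single aggregate number.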

Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration

Cornell University - arXiv, Jul 14, 2022

Building document-grounded dialogue systems has received growing interest, as documents convey a wealth of human knowledge and commonly exist in enterprises. Within this area, how to comprehend and retrieve information from documents is a challenging research problem. Previous work ignores the visual properties of documents and treats them as plain text, resulting in an incomplete modality. In this paper, we propose a Layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents (VRDs), so as to generate accurate responses in dialogue systems. LIE contains 62k annotations for three extraction tasks drawn from 4,061 pages of product and official documents, making it, to the best of our knowledge, the largest VRD-based information extraction dataset. We also develop benchmark methods that extend a token-based language model to consider layout features, as humans do. Empirical results show that layout is critical for VRD-based extraction, and a system demonstration verifies that the extracted knowledge can help locate the answers that users care about.
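
One common way to "extend a token-based language model to consider layout features" is to fuse each token's embedding with embeddings of its bucketed page position. The sketch below is an illustration of that idea, not the paper's method; the vocabulary, grid size, and dimensions are all made up:

```python
import random

random.seed(0)
DIM, GRID = 8, 10  # embedding size and layout grid resolution (assumptions)

# Randomly initialized lookup tables, standing in for learned embeddings.
token_emb = {t: [random.uniform(-1, 1) for _ in range(DIM)]
             for t in ["Price:", "$9.99"]}
x_emb = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(GRID)]
y_emb = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(GRID)]

def layout_aware(token, bbox, page_w, page_h):
    """Token representation = token embedding + embeddings of the bucketed
    (x, y) center of its bounding box, so spatially close tokens share a
    layout signal. A simplified form of layout-feature injection."""
    x0, y0, x1, y1 = bbox
    gx = min(int((x0 + x1) / 2 / page_w * GRID), GRID - 1)
    gy = min(int((y0 + y1) / 2 / page_h * GRID), GRID - 1)
    return [t + x_emb[gx][i] + y_emb[gy][i]
            for i, t in enumerate(token_emb[token])]

vec = layout_aware("$9.99", bbox=(120, 40, 180, 55), page_w=600, page_h=800)
print(len(vec))  # DIM-dimensional fused representation
```

In a real system the fused vectors would feed the transformer's input layer, so attention can exploit both textual and spatial proximity.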

Combining CALIPSO and AERONET data to classify aerosols globally

IEEE Transactions on Geoscience and Remote Sensing, 2021

Ångström exponent (AE) and aerosol optical depth (AOD) obtained from the Aerosol Robotic Network (AERONET), and volume depolarization ratio (VDR) obtained from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP), from March 2018 to February 2019, were used in this study. The data are direct observations, avoiding the limitations and uncertainties of the inversion process and providing accurate information about aerosol properties. Observations from the two instruments within a colocation criterion of a 40-km radius and ±2 h were defined as coincident cases. Six aerosol types were differentiated using a threshold method based on the AE, AOD, and VDR data. Discussion of the aerosol classification yielded the following results: 1) clean marine aerosols were the most abundant and widely distributed, followed by other aerosol types (33.2%), polluted dust aerosols (26.8%), natural dust aerosols (2.3%), biomass burning aerosols (1.8%), and clean continental aerosols (1.1%); 2) clean marine aerosols were mainly distributed in North America and Europe, and polluted dust aerosols frequently appeared on the edges or downwind of deserts; and 3) aerosols controlled by natural conditions (e.g., natural dust) were sensitive to seasonal variations, whereas those controlled by anthropogenic activities (e.g., polluted dust) were not. This study provides a new method for the collaborative observation of aerosol types with ground-based and satellite data. Few studies provide an annual global distribution of aerosol types and their seasonal variations; these results offer a reference for understanding the global aerosol distribution.
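
The classification scheme above reduces to a decision tree over three observables. The sketch below illustrates the shape of such a threshold classifier plus the stated colocation criterion; the numeric thresholds are placeholders for illustration, not the values used in the paper:

```python
def classify_aerosol(ae, aod, vdr):
    """Threshold-based aerosol typing from Angstrom exponent (AE), aerosol
    optical depth (AOD), and volume depolarization ratio (VDR).
    All thresholds below are illustrative placeholders, NOT the paper's values."""
    if vdr > 0.3:  # strongly depolarizing particles -> dust-like
        return "natural dust" if ae < 0.5 else "polluted dust"
    if aod < 0.1:  # optically thin background aerosol
        return "clean marine" if ae < 1.0 else "clean continental"
    if ae > 1.5:  # thick load of fine particles
        return "biomass burning"
    return "other"

def coincident(dist_km, dt_hours):
    """Colocation criterion from the study: within 40 km and +/-2 h."""
    return dist_km <= 40 and abs(dt_hours) <= 2

print(classify_aerosol(ae=0.3, aod=0.4, vdr=0.35))  # -> natural dust
print(coincident(dist_km=25, dt_hours=-1.5))        # -> True
```

Each coincident AERONET/CALIOP pair would be run through such a function, and the resulting labels aggregated by region and season.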

Layout-Aware Information Extraction for Document-Grounded Dialogue

Proceedings of the 30th ACM International Conference on Multimedia

Building document-grounded dialogue systems has received growing interest, as documents convey a wealth of human knowledge and commonly exist in enterprises. Within this area, how to comprehend and retrieve information from documents is a challenging research problem. Previous work ignores the visual properties of documents and treats them as plain text, resulting in an incomplete modality. In this paper, we propose a Layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents (VRDs), so as to generate accurate responses in dialogue systems. LIE contains 62k annotations for three extraction tasks drawn from 4,061 pages of product and official documents, making it, to the best of our knowledge, the largest VRD-based information extraction dataset. We also develop benchmark methods that extend a token-based language model to consider layout features, as humans do. Empirical results show that layout is critical for VRD-based extraction, and a system demonstration verifies that the extracted knowledge can help locate the answers that users care about.

Path-Enhanced Multi-Relational Question Answering with Knowledge Graph Embeddings

ArXiv, 2021

Multi-relational Knowledge Base Question Answering (KBQA) systems perform multi-hop reasoning over the knowledge graph (KG) to arrive at the answer. Recent approaches attempt to introduce knowledge graph embedding (KGE) techniques to handle KG incompleteness, but they only consider the triple facts and neglect the significant semantic correlation between paths and multi-relational questions. In this paper, we propose a Path and Knowledge Embedding-Enhanced multi-relational Question Answering model (PKEEQA), which leverages multi-hop paths between entities in the KG to evaluate the ambipolar correlation between a path embedding and a multi-relational question embedding via a customizable path representation mechanism, helping achieve more accurate answers from the perspective of both the triple facts and the extra paths. Experimental results illustrate that PKEEQA improves KBQA models' performance on multi-relational question answering, with a degree of explainability…
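
The core idea of scoring a path against a question can be sketched in a few lines: compose the relation embeddings along a multi-hop path into one vector, then compare it with the question embedding. Element-wise averaging and cosine similarity below are simple stand-ins for the paper's customizable path representation and learned correlation, and all embeddings are hypothetical:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def path_embedding(relation_embs):
    """Compose a multi-hop path into one vector by element-wise averaging --
    one simple choice of path representation, not the paper's mechanism."""
    n = len(relation_embs)
    return [sum(r[i] for r in relation_embs) / n
            for i in range(len(relation_embs[0]))]

# Hypothetical 4-d embeddings for the relations on a 2-hop path and a question.
born_in = [0.9, 0.1, 0.0, 0.2]
capital = [0.8, 0.2, 0.1, 0.1]
question = [0.85, 0.15, 0.05, 0.15]

score = cosine(path_embedding([born_in, capital]), question)
print(round(score, 3))  # a high score means the path supports this question
```

Paths scoring highly against the question embedding would then be used alongside plain triple facts when ranking candidate answers.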

DKPLM: Decomposable Knowledge-Enhanced Pre-trained Language Model for Natural Language Understanding

Proceedings of the AAAI Conference on Artificial Intelligence

Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models into which relation triples from knowledge graphs are injected to improve language understanding abilities. To guarantee effective knowledge injection, previous studies integrate models with knowledge encoders for representing knowledge retrieved from knowledge graphs. The operations for knowledge retrieval and encoding bring significant computational burdens, restricting the usage of such models in real-world applications that require high inference speed. In this paper, we propose a novel KEPLM named DKPLM that decomposes the knowledge injection process of pre-trained language models across the pre-training, fine-tuning, and inference stages, which facilitates the application of KEPLMs in real-world scenarios. Experiments show that our model significantly outperforms other KEPLMs on zero-shot knowledge probing tasks and multiple knowledge-aware language understanding tasks. Specifically, we first detect knowledge-aware…

A Survey on Dialog Management: Recent Advances and Challenges

ArXiv, 2020

Dialog management (DM) is a crucial component of a task-oriented dialog system. Given the dialog history, DM predicts the dialog state and decides the next action the dialog agent should take. Recently, dialog policy learning has been widely formulated as a Reinforcement Learning (RL) problem, and more work has focused on the applicability of DM. In this paper, we survey recent advances and challenges within three critical topics for DM: (1) improving model scalability to facilitate dialog system modeling in new scenarios, (2) dealing with the data scarcity problem in dialog policy learning, and (3) enhancing training efficiency to achieve better task-completion performance. We believe this survey can shed light on future research in dialog management.
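
The RL formulation mentioned above treats the dialog state as the RL state, the agent's next dialog act as the RL action, and task completion as reward. A toy tabular Q-learning sketch, with a three-stage slot-filling environment and action set invented purely for illustration:

```python
import random

random.seed(1)
ACTIONS = ["ask_slot", "confirm", "book"]  # toy dialog-act set (illustrative)

# Q-table over (dialog_state, action); states are coarse slot-filling stages.
Q = {(s, a): 0.0 for s in ["empty", "partial", "complete"] for a in ACTIONS}

def step(state, action):
    """Toy environment: reward task completion, penalize every extra turn."""
    if state == "complete" and action == "book":
        return None, 10.0  # episode ends with a successful booking
    nxt = {"empty": "partial", "partial": "complete"}.get(state, state)
    return (nxt if action == "ask_slot" else state), -1.0

alpha, gamma, eps = 0.5, 0.9, 0.2
for _ in range(500):  # epsilon-greedy Q-learning episodes
    state = "empty"
    while state is not None:
        a = random.choice(ACTIONS) if random.random() < eps else \
            max(ACTIONS, key=lambda x: Q[(state, x)])
        nxt, r = step(state, a)
        target = r if nxt is None else r + gamma * max(Q[(nxt, x)] for x in ACTIONS)
        Q[(state, a)] += alpha * (target - Q[(state, a)])
        state = nxt

print(max(ACTIONS, key=lambda a: Q[("complete", a)]))  # learned policy: book
```

Real dialog policy learning replaces the toy environment with a user simulator or logged dialogs, and the Q-table with a neural policy, but the learning loop has this same shape.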

A Survey on Complex Question Answering over Knowledge Base: Recent Advances and Challenges

ArXiv, 2020

Question Answering (QA) over Knowledge Base (KB) aims to automatically answer natural language questions via the well-structured relation information between entities stored in knowledge bases. To make KBQA more applicable in real-world scenarios, researchers have shifted their attention from simple questions to complex questions, which require more KB triples and constraint inference. In this paper, we introduce recent advances in complex QA. Besides traditional methods relying on templates and rules, the research is categorized into a taxonomy with two main branches: Information Retrieval-based and Neural Semantic Parsing-based. After describing the methods of these branches, we analyze directions for future research and introduce the models proposed by the Alime team.

Unsupervised Learning of Deterministic Dialogue Structure with Edge-Enhanced Graph Auto-Encoder

It is important for task-oriented dialogue systems to discover the dialogue structure (i.e., the general dialogue flow) from dialogue corpora automatically. Previous work models dialogue structure by first extracting latent states for each utterance and then calculating the transition probabilities among states. These two-stage methods ignore contextual information when calculating the probabilities, which makes the transitions between states ambiguous. This paper proposes a conversational graph (CG) to represent deterministic dialogue structure, where nodes and edges represent utterance and context information, respectively. An unsupervised Edge-Enhanced Graph Auto-Encoder (EGAE) architecture is designed to model local-contextual and global-structural information for conversational graph learning. Furthermore, a self-supervised objective is introduced with the response selection task to guide the unsupervised learning of the dialogue structure. Experimental results on several…
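
The two-stage baseline the paper argues against is easy to make concrete: once each utterance has a latent state, the structure is just empirical transition counts. A minimal sketch over hypothetical state sequences:

```python
from collections import Counter, defaultdict

def transition_probs(state_sequences):
    """Second stage of the two-stage baseline: given latent-state sequences
    (one per dialogue), estimate P(next_state | state) by counting. Note the
    counts ignore all context beyond the current state, which is exactly the
    ambiguity the EGAE approach is designed to avoid."""
    counts = defaultdict(Counter)
    for seq in state_sequences:
        for s, nxt in zip(seq, seq[1:]):
            counts[s][nxt] += 1
    return {s: {n: c / sum(ctr.values()) for n, c in ctr.items()}
            for s, ctr in counts.items()}

# Hypothetical latent states for three task-oriented dialogues.
dialogs = [
    ["greet", "ask_slot", "ask_slot", "confirm", "bye"],
    ["greet", "ask_slot", "confirm", "bye"],
    ["greet", "confirm", "bye"],
]
probs = transition_probs(dialogs)
print(probs["greet"])  # P(ask_slot|greet) = 2/3, P(confirm|greet) = 1/3
```

The conversational graph replaces this context-free transition table with nodes for utterance information and edges that carry the surrounding context.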

Relational Learning with Gated and Attentive Neighbor Aggregator for Few-Shot Knowledge Graph Completion

Aiming to expand the coverage of few-shot relations in knowledge graphs (KGs), few-shot knowledge graph completion (FKGC) has recently gained increasing research interest. Some existing models employ a few-shot relation's multi-hop neighbor information to enhance its semantic representation. However, noisy neighbor information may be amplified when the neighborhood is excessively sparse and no neighbor is available to represent the few-shot relation. Moreover, modeling and inferring complex relations of one-to-many (1-N), many-to-one (N-1), and many-to-many (N-N) with previous knowledge graph completion approaches requires high model complexity and a large number of training instances. Thus, inferring complex relations in the few-shot scenario is difficult for FKGC models due to limited training instances. In this paper, we propose a few-shot relational learning model with a global-local framework to address the above issues. At the global stage, a novel gated and attentive neighbor aggregator…
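
The "gated and attentive" aggregation idea can be sketched directly: attend over neighbor embeddings, then gate the aggregate against the entity's own embedding so a sparse or noisy neighborhood cannot dominate. The dot-product attention and fixed scalar gate below are simplifications of the paper's learned aggregator, with made-up embeddings:

```python
import math

def gated_attention_aggregate(query, neighbors, gate=0.8):
    """Attention-weighted sum of neighbor embeddings, gated against the
    query itself. With no neighbors, the gate falls back to the entity's
    own embedding -- the failure mode the paper highlights for sparse
    neighborhoods. Illustrative simplification, not the paper's model."""
    if not neighbors:
        return query
    # Dot-product attention scores, numerically stable softmax.
    scores = [sum(q * n for q, n in zip(query, nb)) for nb in neighbors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Attention-weighted sum of neighbors, then gated mix with the query.
    agg = [sum(a * nb[i] for a, nb in zip(attn, neighbors))
           for i in range(len(query))]
    return [gate * g + (1 - gate) * q for g, q in zip(agg, query)]

entity = [1.0, 0.0, 0.5]
out = gated_attention_aggregate(entity, [[0.9, 0.1, 0.4], [0.1, 0.8, 0.2]])
print(len(out))  # same dimensionality as the entity embedding
```

In the paper's setting the gate itself is learned, so the model can decide per entity how much to trust the neighborhood.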

HORNET

Proceedings of the 30th ACM International Conference on Information & Knowledge Management

Knowledge-Enhanced Pre-trained Language Models (KEPLMs) improve the language understanding abilities of deep language models by leveraging rich semantic knowledge from knowledge graphs beyond plain pre-training texts. However, previous efforts mostly use homogeneous knowledge (especially structured relation triples in knowledge graphs) to enhance the context-aware representations of entity mentions, and their performance may be limited by the coverage of the knowledge graphs. It is also unclear whether these KEPLMs truly understand the injected semantic knowledge, due to their "black-box" training mechanism. In this paper, we propose a novel KEPLM named HORNET, which integrates Heterogeneous knOwledge from various structured and unstructured sources into the Roberta NETwork and hence takes full advantage of both linguistic and factual knowledge simultaneously. Specifically, we design a hybrid attention heterogeneous graph convolution network (HaHGCN) to learn heterogeneous knowledge representations based on the structured relation triples from knowledge graphs and the unstructured entity description texts. Meanwhile, we propose explicit dual knowledge understanding tasks to induce a more effective infusion of the heterogeneous knowledge, helping our model learn the complicated mappings from the knowledge graph embedding space to the deep context-aware embedding space and vice versa. Experiments show that our HORNET model outperforms various KEPLM baselines on knowledge-aware tasks including knowledge probing, entity typing, and relation extraction. Our model also achieves substantial improvements on several GLUE benchmark datasets compared to other KEPLMs.

When Few-Shot Learning Meets Large-Scale Knowledge-Enhanced Pre-training: Alibaba at FewCLUE

Natural Language Processing and Chinese Computing

Learning Low-Resource End-To-End Goal-Oriented Dialog for Fast and Reliable System Deployment

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics