Mingkai Deng - Academia.edu

Papers by Mingkai Deng

RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning

Prompting has shown impressive success in enabling large pretrained language models (LMs) to perform diverse NLP tasks, especially when only a few downstream examples are available. Automatically finding the optimal prompt for each task, however, is challenging. Most existing work resorts to tuning soft prompts (e.g., embeddings), which fall short on interpretability, reusability across LMs, and applicability when gradients are not accessible. Discrete prompts, on the other hand, are difficult to optimize and are often created by "enumeration (e.g., paraphrasing)-then-selection" heuristics that do not explore the prompt space systematically. This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL). RLPrompt formulates a parameter-efficient policy network that generates the desired discrete prompt after training with reward. To overcome the complexity and stochasticity of the reward signals from the large LM environment, we incorporate effective reward stabilization that substantially enhances training efficiency. RLPrompt is flexibly applicable to different types of LMs, such as masked (e.g., BERT) and left-to-right models (e.g., GPTs), for both classification and generation tasks. Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning and prompting methods. Interestingly, the resulting optimized prompts are often ungrammatical gibberish text; and surprisingly, those gibberish prompts are transferable between different LMs and retain significant performance, indicating that LM prompting may not follow human language patterns.
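The core loop described above can be sketched in miniature: a small categorical policy samples discrete prompt tokens, a downstream reward scores them, and the policy is updated by REINFORCE with a normalized (stabilized) reward. This is a toy sketch, not the paper's implementation; the vocabulary, prompt length, and reward function here are all hypothetical stand-ins for the frozen-LM reward used in RLPrompt.

```python
import math
import random

random.seed(0)

# Hypothetical toy setup: a tiny vocabulary and a hand-written reward
# that pretends the downstream LM prefers prompts mentioning
# "classify" and "sentiment". RLPrompt instead queries a real frozen LM.
VOCAB = ["great", "terrible", "movie", "review", "classify", "sentiment"]
PROMPT_LEN = 2
LR = 0.5

def reward(prompt_tokens):
    return sum(tok in ("classify", "sentiment") for tok in prompt_tokens)

# One logit vector per prompt position (a stand-in for the policy network).
logits = [[0.0] * len(VOCAB) for _ in range(PROMPT_LEN)]

def sample(pos):
    """Sample a token index from the softmax over logits at `pos`."""
    exps = [math.exp(l) for l in logits[pos]]
    z = sum(exps)
    r, acc = random.random() * z, 0.0
    for i, e in enumerate(exps):
        acc += e
        if r <= acc:
            return i
    return len(exps) - 1

history = []
for step in range(500):
    idxs = [sample(p) for p in range(PROMPT_LEN)]
    r = reward([VOCAB[i] for i in idxs])
    history.append(r)
    # Simple reward stabilization: z-score against the running history.
    mean = sum(history) / len(history)
    std = (sum((x - mean) ** 2 for x in history) / len(history)) ** 0.5 or 1.0
    adv = (r - mean) / std
    # REINFORCE update: grad of log softmax is (one-hot - probabilities).
    for pos, i in enumerate(idxs):
        exps = [math.exp(l) for l in logits[pos]]
        z = sum(exps)
        for j in range(len(VOCAB)):
            grad = (1.0 if j == i else 0.0) - exps[j] / z
            logits[pos][j] += LR * adv * grad

best = [VOCAB[max(range(len(VOCAB)), key=lambda j: logits[p][j])]
        for p in range(PROMPT_LEN)]
print(best)
```

The z-score normalization is a minimal stand-in for the paper's reward-stabilization machinery; its role here is the same, keeping the policy-gradient signal well-scaled despite noisy rewards.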

Tunneling through high energy barriers in simulated quantum annealing

We analyze the performance of simulated quantum annealing (SQA) on an optimization problem for which simulated classical annealing (SA) is provably inefficient because of a high energy barrier. We present evidence that SQA can pass through this barrier to find the global minimum efficiently. This demonstrates the potential for SQA to inherit some of the advantages of quantum annealing (QA), since this problem has been previously shown to be efficiently solvable by quantum adiabatic optimization.
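To see why a high barrier is the obstacle for classical annealing, consider a minimal illustration (not the paper's actual problem instance): Metropolis-style simulated annealing on a 1-D double-well landscape. A thermal move over a barrier of height B is accepted with probability roughly exp(-B/T), so as T cools the walker tends to get trapped in whichever well it occupies; the paper's point is that SQA can effectively tunnel through such barriers rather than climb them.

```python
import math
import random

random.seed(0)

def energy(x):
    # Toy double well: minima near x = -1 and x = +1, barrier near x = 0.
    # The 0.3*x tilt makes the well at x = -1 the global minimum.
    return (x * x - 1.0) ** 2 + 0.3 * x

def simulated_annealing(x, t_start=2.0, t_end=0.01, steps=20000):
    """Metropolis SA with geometric cooling from t_start to t_end."""
    for k in range(steps):
        t = t_start * (t_end / t_start) ** (k / steps)
        x_new = x + random.gauss(0.0, 0.2)
        d_e = energy(x_new) - energy(x)
        # Accept downhill moves always; uphill moves with prob exp(-dE/T).
        if d_e <= 0 or random.random() < math.exp(-d_e / t):
            x = x_new
    return x

# Start in the shallower (local) well at x = +1.
x_final = simulated_annealing(1.0)
print(round(x_final, 2))
```

With this low barrier the walker crosses easily at the initial high temperature; the hard regime the paper studies is when the barrier is tall enough that the expected crossing time exp(B/T) blows up before cooling completes.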

Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Natural language generation (NLG) spans a broad range of tasks, each of which serves specific objectives and calls for different properties of the generated text. This complexity makes automatic evaluation of NLG particularly challenging. Previous work has typically focused on a single task and developed individual evaluation metrics based on specific intuitions. In this paper, we propose a unifying perspective that facilitates the design of metrics for a wide range of language generation tasks and quality aspects. Based on the nature of information change from input to output, we classify NLG tasks into compression (e.g., summarization), transduction (e.g., text rewriting), and creation (e.g., dialog). The information alignment, or overlap, between input, context, and output text plays a common central role in characterizing the generation. Using this uniform concept of information alignment, we develop a family of interpretable metrics for various NLG tasks and aspects, often without the need for gold reference data. To operationalize the metrics, we train self-supervised models to approximate information alignment as a prediction task. Experiments show that the uniformly designed metrics achieve stronger or comparable correlations with human judgment compared to state-of-the-art metrics on each of a diverse set of tasks, including text summarization, style transfer, and knowledge-grounded dialog. With information alignment as the intermediate representation, we deliver a composable library for easy NLG evaluation and future metric design.
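The alignment idea above can be sketched with a deliberately crude approximation: the paper trains self-supervised models to score alignment, but even exact token matching conveys the shape of the metrics. The function names and the combination rule below are hypothetical illustrations, not the paper's API; for a compression task like summarization, consistency asks that every output token be grounded in the input.

```python
# Toy sketch of alignment-based NLG evaluation in the spirit of the
# compression/transduction/creation framework. Hypothetical scorer:
# alignment is approximated here by exact lowercase token matching,
# whereas the paper uses trained self-supervised alignment models.

def align(a_tokens, b_tokens):
    """Fraction of tokens in `a` that find a match in `b`."""
    b = set(b_tokens)
    return sum(t in b for t in a_tokens) / max(len(a_tokens), 1)

def consistency(output, source):
    # Compression aspect: every output token should be grounded
    # in the source (alignment from output to input).
    return align(output.lower().split(), source.lower().split())

def relevance(output, source, reference):
    # Illustrative combination: coverage of the reference by the
    # output, weighted by how well the output is grounded in the source.
    ref_coverage = align(reference.lower().split(), output.lower().split())
    return ref_coverage * consistency(output, source)

source = "the cat sat on the mat and purred loudly"
faithful = "the cat sat on the mat"
hallucinated = "the dog barked at the mailman"
print(consistency(faithful, source))      # fully grounded
print(consistency(hallucinated, source))  # mostly ungrounded
```

Because the scores are built from a single alignment primitive, swapping in a learned alignment model upgrades every metric at once, which is the composability the abstract refers to.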
