Cheng Niu - Academia.edu
Papers by Cheng Niu
2018 IEEE International Conference on Data Mining (ICDM)
Collaborative filtering is an effective and widely used recommendation approach that applies the user-item rating matrix to make recommendations, but it usually suffers from cold-start and sparsity problems. To address these problems, hybrid methods incorporate auxiliary information such as user/item profiles into collaborative filtering models, while cross-domain recommendation systems add a new dimension by leveraging ratings from other domains to improve recommendation performance. Among these methods, deep neural network based recommendation systems achieve excellent performance because of their ability to learn powerful representations. However, cross-domain recommendation systems based on deep neural networks rarely consider the uncertainty of weights. As a result, they may lack calibrated probabilistic predictions and make overly confident decisions. Along this line, we propose a general cross-domain recommendation framework via Bayesian neural networks to incorporate auxiliary information, which takes advantage of both the hybrid recommendation methods and the cross-domain recommendation systems. Specifically, our framework consists of two kinds of neural networks: one learns a low-dimensional representation from the one-hot codings of users/items, while the other projects the auxiliary information of users/items into another latent space. The final rating is produced by integrating the latent representations of the one-hot codings of users/items and the auxiliary information of users/items. The latent representations of users learnt from ratings and auxiliary information are shared across different domains for knowledge transfer. Moreover, we capture the uncertainty in all weights by representing weights with Gaussian distributions to make calibrated probabilistic predictions. We have done extensive experiments on real-world data sets to verify the effectiveness of our framework.
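As a rough illustration of what "representing weights with Gaussian distributions" can mean in practice, the toy sketch below draws each weight from its own Gaussian and averages many sampled predictions to obtain a rating estimate together with a spread. It is not the paper's architecture: the single linear layer, the softplus parameterization of the standard deviation, and every name (`sample_weight`, `predict_rating`, `weights_mu`, `weights_rho`) are assumptions made for the example.

```python
import math
import random


def softplus(x):
    # Numerically stable softplus; maps any real rho to a positive sigma.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sample_weight(mu, rho, rng):
    # One draw from N(mu, sigma^2) with sigma = softplus(rho) (assumed parameterization).
    return mu + softplus(rho) * rng.gauss(0.0, 1.0)


def predict_rating(user_vec, item_vec, weights_mu, weights_rho, rng, n_samples=200):
    # Toy model: rating = w . [user_vec, item_vec], with every weight Gaussian.
    features = list(user_vec) + list(item_vec)
    draws = []
    for _ in range(n_samples):
        w = [sample_weight(m, r, rng) for m, r in zip(weights_mu, weights_rho)]
        draws.append(sum(wi * xi for wi, xi in zip(w, features)))
    mean = sum(draws) / n_samples
    var = sum((d - mean) ** 2 for d in draws) / (n_samples - 1)
    # The standard deviation is a simple proxy for predictive uncertainty.
    return mean, math.sqrt(var)


rng = random.Random(0)
mean, std = predict_rating(
    [0.5, -0.2], [0.1, 0.8],          # hypothetical user/item latent vectors
    [1.0, 0.5, -0.3, 2.0],            # hypothetical weight means
    [-1.0, -1.0, -1.0, -1.0],         # hypothetical rho values
    rng,
)
print(f"predicted rating {mean:.2f} +/- {std:.2f}")
```

When the weight variances shrink toward zero, the sampled predictions collapse onto the ordinary deterministic dot product, which is the sense in which a point-estimate network is a special case.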
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Neural network-based sequence-to-sequence (seq2seq) models strongly suffer from the low-diversity problem when it comes to open-domain dialogue generation. As bland and generic utterances usually dominate the frequency distribution in our daily chitchat, avoiding them to generate more interesting responses requires complex data filtering, sampling techniques or modifying the training objective. In this paper, we propose a new perspective to diversify dialogue generation by leveraging non-conversational text. Compared with bilateral conversations, non-conversational text is easier to obtain, more diverse and covers a much broader range of topics. We collect a large-scale non-conversational corpus from multiple sources including forum comments, idioms and book snippets. We further present a training paradigm to effectively incorporate this text via iterative back translation. The resulting model is tested on two conversational datasets and is shown to produce significantly more diverse responses without sacrificing relevance to the context. Example from the paper: for the conversational context "暗恋的人却不喜欢我" (The one I have a crush on doesn't like me.), a typical response is "摸摸头" (Head pat.), whereas a non-conversational forum comment reads "暗恋这碗酒,谁喝都会醉啊" (Crush is an alcoholic drink; whoever drinks it will get intoxicated.).
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Rhetoric is a vital element in modern poetry, and plays an essential role in improving its aesthetics. However, to date, it has not been considered in research on automatic poetry generation. In this paper, we propose a rhetorically controlled encoder-decoder for modern Chinese poetry generation. Our model relies on a continuous latent variable as a rhetoric controller to capture various rhetorical patterns in an encoder, and then incorporates rhetoric-based mixtures while generating modern Chinese poetry. For metaphor and personification, an automated evaluation shows that our model outperforms state-of-the-art baselines by a substantial margin, while a human evaluation shows that our model generates better poems than baseline methods in terms of fluency, coherence, meaningfulness, and rhetorical aesthetics.
Traditionally, word sense disambiguation (WSD) involves a different context model for each individual word. This paper presents a new approach to WSD using weakly supervised learning. Statistical models are not trained for the contexts of each individual word, but for the similarities between context pairs at category level. The insight is that the correlation regularity between the sense distinction and the context distinction can be captured at category level, independent of individual words. This approach only requires a limited amount of existing annotated training corpus in order to disambiguate the entire vocabulary. A context clustering scheme is developed within the Bayesian framework. A maximum entropy model is then trained to represent the generative probability distribution of context similarities based on heterogeneous features, including trigger words and parsing structures. Statistical annealing is applied to derive the final context clusters by globally fitting the pairwise context similarity distribution. Benchmarking shows that this new approach significantly outperforms the existing WSD systems in the unsupervised category, and rivals supervised WSD systems.
Encyclopedia of Database Systems, 2009
Encyclopedia of Database Systems, 2009
Encyclopedia of Database Systems, 2009
Chinese Journal of Physics- Taipei-
The theory of the equation of motion for the nonequilibrium Green function, developed by the present authors using the Schwinger-Keldysh formalism, is adopted to treat the problem of photon-assisted tunneling through nanostructures. A quantum wire modeled as a two-level system and quantum dots with strong electron-electron interactions are considered. The density of states, electron occupation probability, tunneling current and conductivity are calculated for different cases with both diagonal and off-diagonal matrix elements of the interaction included. Electron population inversion is found to arise from the off-diagonal matrix elements over a wide range of incident light frequencies, suggesting a new mechanism for optical pumping. Negative resistance and other novel features of the tunneling current are also discussed.
It is now well known that transport phenomena can be treated by the nonequilibrium Green function proposed by Keldysh (L. V. Keldysh, JETP 20, 1018 (1965)). While the graphic technique of the perturbation method has been well developed, the method of equation of motion has never been applied to nonequilibrium Green functions. On the basis of the Keldysh formalism of perturbation expansion, we derive the equation of motion for the nonequilibrium Green function, which incorporates both quantum mechanical and nonequilibrium statistical information. The nonperturbative solution of the equation is obtained self-consistently and is applied to investigate photoassisted tunneling in quantum well systems with intersubband transitions. The nonequilibrium electron distribution is obtained as a function of time. The time-dependent current and its dc component are also calculated. Results from this calculation are presented and discussed.
Proceedings of the sixth conference on Applied natural language processing -, 2000
This paper presents a hybrid approach for named entity (NE) tagging which combines Maximum Entropy Model (MaxEnt), Hidden Markov Model (HMM) and handcrafted grammatical rules. Each has innate strengths and weaknesses; the combination results in a very high-precision tagger. MaxEnt includes external gazetteers in the system. Sub-category generation is also discussed.
Lecture Notes in Computer Science, 2009
Proceedings of the ACL 2003 workshop on Multilingual summarization and question answering -, 2003
Physical Review B - PHYS REV B, 1999
The band gap in photonic materials with periodic spatial modulation of refractive index greater than unity can be regarded as a potential barrier for photons. By analogy with semiconductor quantum well systems, which arise from an electronic band-gap mismatch, a photonic quantum well can be constructed by sandwiching a uniform medium between two photonic barriers. The transmission and reflection coefficients of light through the photonic quantum well are calculated by a modal expansion method with an R-matrix propagation algorithm. Resonance tunneling through the photonic quantum well structure is observed by varying either the well width or the frequency of incident light. Resonance peaks are found within the band-gap region, and indicate the existence of photon virtual states in the well.
Physical Review B, 1999
... IV. RESULTS AND DISCUSSION In the numerical calculation, we choose parameters for the 2D photonic crystal ... is taken to be 1.87 mm, and the side of the cylinder's square cross section ... work indicates that the transmission coefficient is vanishingly small for seven-layer barriers ...
Proceedings of the Ninth Conference …, 2005
Traditionally, word sense disambiguation (WSD) involves a different context classification model for each individual word. This paper presents a weakly supervised learning approach to WSD based on learning a word-independent context pair classification model. Statistical models are not trained for classifying the word contexts, but for classifying a pair of contexts, i.e. determining whether a pair of contexts of the same ambiguous word refers to the same or different senses. Using this approach, the annotated corpus of a target word A can be exploited to disambiguate senses of a different word B. Hence, only a limited amount of existing annotated corpus is required in order to disambiguate the entire vocabulary. In this research, maximum entropy modeling is used to train the word-independent context pair classification model. Then, based on the context pair classification results, clustering is performed on word mentions extracted from a large raw corpus. The resulting context clusters are mapped onto the external thesaurus WordNet. This approach shows great flexibility to efficiently integrate heterogeneous knowledge sources, e.g. trigger words and parsing structures. Based on Senseval-3 Lexical Sample standards, this approach achieves state-of-the-art performance in the unsupervised learning category, and performs comparably with the supervised Naïve Bayes system.
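The two-stage idea in this abstract (score context pairs first, then cluster mentions from those scores) can be sketched in a few lines. The sketch below is only a stand-in under stated assumptions: `same_sense_probability` uses plain word overlap in place of the trained maximum entropy model, and `cluster_contexts` links pairs above a hypothetical threshold with union-find, since the abstract does not specify the clustering algorithm.

```python
def same_sense_probability(ctx_a, ctx_b):
    # Stand-in pair scorer (Jaccard word overlap), NOT the paper's MaxEnt model.
    a, b = set(ctx_a), set(ctx_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_contexts(contexts, pair_score, threshold=0.2):
    # Link any two contexts whose pair score reaches the (assumed) threshold,
    # then return the connected components as clusters of context indices.
    parent = list(range(len(contexts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(contexts)):
        for j in range(i + 1, len(contexts)):
            if pair_score(contexts[i], contexts[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(contexts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Four hypothetical contexts of the ambiguous word "bank".
contexts = [
    ["deposit", "money", "account"],
    ["loan", "money", "account"],
    ["river", "water", "shore"],
    ["river", "fishing", "shore"],
]
print(cluster_contexts(contexts, same_sense_probability))  # [[0, 1], [2, 3]]
```

Because the pair scorer never looks at the identity of the ambiguous word, the same scorer can be reused for any word, which is the property the abstract relies on to limit the amount of annotated data needed.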
… of the 41st Annual Meeting on …, 2003