Trung Nguyen Mai - Profile on Academia.edu

Papers by Trung Nguyen Mai

Star2vec: From Subspace Embedding to Whole-Space Embedding for Intelligent Recommendation System (Extended Abstract)

Lecture Notes in Computer Science, 2019

Recommendation systems are powerful tools that can alleviate system overload problems by recommending the most relevant items (contents) to users. Recommendation systems allow users to find useful, interesting items from a very large space and also enhance the user’s browsing experience. Relevant items are determined by predicting the user’s ratings on different items. Two traditional techniques used in recommendation systems are content-based filtering and collaborative filtering. Content-based filtering uses the content of items that the user has engaged with in the past to discover items that the user might be interested in. On the other hand, collaborative filtering determines the similarity between users and recommends items chosen by similar users.
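The collaborative-filtering idea described above can be illustrated with a minimal sketch. This is not the Star2vec method; it is a generic user-based variant, assuming cosine similarity over co-rated items and a similarity-weighted average as the predictor. The `ratings` data and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dicts, over co-rated items only."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict_rating(ratings, user, item):
    """Predict a rating as the similarity-weighted average of other users' ratings."""
    num = 0.0
    den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    # None signals that no other user has rated the item (or all similarities are zero).
    return num / den if den > 0 else None

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"a": 5, "b": 3},
    "bob": {"a": 5, "b": 3, "c": 4},
    "carol": {"a": 1, "b": 5, "c": 2},
}
prediction = predict_rating(ratings, "alice", "c")
```

Because "alice" agrees with "bob" more closely than with "carol", the predicted rating for item "c" is pulled toward bob's rating.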

Balancing stability and plasticity when learning topic models from short and noisy text streams

Neurocomputing, Sep 1, 2022

arXiv (Cornell University), May 1, 2019

Most of the information on the Internet is represented in the form of microtexts, which are short text snippets such as news headlines or tweets. These sources of information are abundant, and mining these data could uncover meaningful insights. Topic modeling is one of the popular methods to extract knowledge from a collection of documents; however, conventional topic models such as latent Dirichlet allocation (LDA) are unable to perform well on short documents, mostly due to the scarcity of word co-occurrence statistics embedded in the data. The objective of our research is to create a topic model that performs well on microtexts while requiring a small runtime for scalability to large datasets. To compensate for the lack of information in microtexts, we allow our method to take advantage of word embeddings for additional knowledge of relationships between words. For speed and scalability, we apply autoencoding variational Bayes, an algorithm that can perform efficient black-box inference in probabilistic models. The result of our work is a novel topic model called the nested variational autoencoder, which is a distribution that takes into account word vectors and is parameterized by a neural network architecture. For optimization, the model is trained to approximate the posterior distribution of the original LDA model. Experiments show the improvements of our model on microtexts as well as its runtime advantage.
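For readers unfamiliar with autoencoding variational Bayes, the two building blocks it usually relies on can be sketched as follows. This is a generic illustration for a diagonal Gaussian posterior with a standard normal prior, not the nested variational autoencoder itself, whose prior and architecture are not specified in the abstract. The function names are hypothetical.

```python
import math
import random

def reparameterize(mu, log_var, rng=random):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), elementwise.

    Writing the sample this way keeps it a deterministic function of
    (mu, log_var) given eps, which is what allows gradient-based training.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), with log_var = log(sigma^2)."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))

# Assumed example values for a 2-dimensional latent variable.
z = reparameterize([0.0, 1.0], [0.0, 0.0], random.Random(0))
kl = kl_to_standard_normal([1.0], [0.0])
```

In a full model, the KL term would be added to a reconstruction term to form the training objective; that part depends on the decoder and is omitted here.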

Lecture Notes in Computer Science, 2017

The emerging technique of deep learning has been widely applied in many different areas. However, when adopted in a specific domain, this technique should be combined with domain knowledge to improve efficiency and accuracy. In particular, when analyzing the applications of deep learning in sentiment analysis, we found that current approaches suffer from the following drawbacks: (i) existing works have not paid much attention to the importance of different types of sentiment terms, which is an important concept in this area; and (ii) the loss function currently employed does not reflect well the degree of error of sentiment misclassification. To overcome these problems, we propose to combine domain knowledge with deep learning. Our proposal includes using sentiment scores, learnt by quadratic programming, to augment training data, and introducing a penalty matrix to enhance the cross-entropy loss function. In experiments, we achieved a significant improvement in classification results.
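One way a penalty matrix could modulate cross entropy is sketched below. The abstract does not give the paper's formula, so this is only an assumed form: the usual cross-entropy term is scaled by one plus the expected penalty under the predicted distribution. The matrix values, class order, and function name are hypothetical.

```python
import math

# Hypothetical penalty matrix for classes (negative, neutral, positive):
# PENALTY[t][j] is the cost of predicting class j when the true class is t.
# Confusing negative with positive costs more than confusing either with neutral.
PENALTY = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]

def penalized_cross_entropy(probs, true_class, penalty=PENALTY):
    """Cross entropy scaled by (1 + expected misclassification penalty)."""
    # Clamp to avoid log(0) when the true class gets zero probability.
    ce = -math.log(max(probs[true_class], 1e-12))
    expected_penalty = sum(penalty[true_class][j] * p for j, p in enumerate(probs))
    return ce * (1.0 + expected_penalty)

# True class is "positive" (index 2); both predictions give it probability 0.3.
loss_near = penalized_cross_entropy([0.1, 0.6, 0.3], 2)  # mass mostly on neutral
loss_far = penalized_cross_entropy([0.6, 0.1, 0.3], 2)   # mass mostly on negative
```

Plain cross entropy would score both predictions identically; under this assumed form, the prediction that leans toward the opposite polarity receives the larger loss.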

Context Graph Alignment Using Adversarial Learning for Air Pollution Detection on IoT Sensor Systems

Springer eBooks, 2022

Expert Systems, Oct 8, 2020

Most of the information on the Internet is represented in the form of microtexts, which are short text snippets like news headlines or tweets. These sources of information are abundant, and mining this data could uncover meaningful insights. Topic modeling is one of the popular methods to extract knowledge from a collection of documents; nevertheless, conventional topic models such as Latent Dirichlet Allocation (LDA) are unable to perform well on short documents, mostly due to the scarcity of word co-occurrence statistics embedded in the data. The objective of our research is to create a topic model that performs well on microtexts while requiring a small runtime for scalability to large datasets. To compensate for the lack of information in microtexts, we allow our method to take advantage of word embeddings for additional knowledge of relationships between words. For speed and scalability, we apply Auto-Encoding Variational Bayes, an algorithm that can perform efficient black-box inference in probabilistic models. The result of our work is a novel topic model called Nested Variational Autoencoder which is a distri...

Helianthus annuus

CRootbox parameter file for Helianthus annuus. From: Pages L, Xie J, Serra V. Potential and actual root growth variations in root systems: modeling them with a two-step stochastic approach. Plant and Soil, 373, 723-735, 2013.

USE OF BROMELAIN ISOLATED FROM PINEAPPLE (Ananas comosus) SHOOTS IN EXPERIMENTAL DESIGN FOR PRACTICAL BIOCHEMISTRY TEACHING

Journal of Science, Educational Science, 2017

Lead Engagement by Automated Real Estate Chatbot

2018 5th NAFOSTED Conference on Information and Computer Science (NICS), 2018

Recently, automated chatbots have been increasingly applied in the real estate industry. Even though chatbots cannot fully replace the traditional relationship between agents and home buyers, they can help to engage potential clients (or leads) in meaningful conversations, which is highly useful for lead capture. In this paper, we present an intelligent chatbot for this purpose. Various machine learning techniques, including a multi-task deep learning technique for intent identification and frequent itemsets for conversation elaboration, have been employed in our system. Our chatbot has been deployed by CEO K35 GROUP JSC with daily updated real estate data for Hanoi and Ho Chi Minh City, Vietnam.
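The notion of frequent itemsets mentioned above can be illustrated with a short sketch. How the chatbot uses them for conversation elaboration is not described in the abstract, so the data below (topics raised per conversation) is purely hypothetical, and the brute-force enumeration stands in for a real mining algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Count every itemset up to max_size and keep those seen in at least min_support transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # deduplicate and fix an order for stable keys
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}

# Hypothetical conversations, each reduced to the set of topics the lead asked about.
conversations = [
    ["price", "location", "bedrooms"],
    ["price", "location"],
    ["location", "schools"],
    ["price", "bedrooms"],
]
result = frequent_itemsets(conversations, min_support=2)
```

A system could use a frequent pair such as ("location", "price") to suggest the second topic once a lead has raised the first, though that use is an assumption here.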

Background and aims: Upland rice is often grown where water and phosphorus (P) are limited, and these two factors interact on P bioavailability. To better understand this interaction, mechanistic models representing small-scale nutrient gradients and water dynamics in the rhizosphere of full-grown root systems are needed. Methods: Rice was grown in large columns using a P-deficient soil at three different P supplies in the topsoil (deficient, suboptimal, non-limiting) in combination with two water regimes (field capacity versus drying periods). Root architectural parameters and P uptake were determined. Using a multiscale model of water and nutrient uptake, in-silico experiments were conducted by mimicking similar P and water treatments. First, 3D root systems were reconstructed by calibrating an architecture model with observed phenological root data, such as nodal root number, lateral types, interbranch distance, root diameters, and root biomass allocation along depth. Secondly, the mult...

International Journal of Computational Vision and Robotics, 2019

Sentiment analysis has recently emerged as one of the major natural language processing (NLP) tasks in many applications. This task is increasingly crucial as social media channels (e.g. social networks or forums) have become significant sources for brands to observe user opinions about their products. However, real data obtained from social media contains a high volume of short and informal messages posted by users on those channels. This kind of data is difficult for existing works to handle, especially those using deep learning approaches. In this paper, we propose an approach to handle this problem. This work extends our previous work, in which we proposed to combine the typical deep learning technique of Convolutional Neural Networks with domain knowledge. The combination is used for training data augmentation and a more reasonable loss function. In this work, we further improve our architecture with various substantial enhancements, including negation-based data augmentation, transfer learning for word embeddings, the combination of word-level embeddings and character-level embeddings, and a multitask learning technique for attaching domain knowledge rules in the learning process. Those enhancements, specifically aimed at handling short and informal messages, yield a significant improvement in performance in experiments on real datasets.

Bioresource technology, Jan 7, 2017

A pilot-scale upflow anaerobic sludge blanket (UASB)-downflow hanging sponge (DHS) system combined with an anaerobic baffled reactor (ABR) and a settling tank (ST) was installed in a natural rubber processing factory in South Vietnam, and its process performance was evaluated for 267 days. The UASB reactor achieved a total removal efficiency of 55.6 ± 16.6% for chemical oxygen demand (COD) and 77.8 ± 10.3% for biochemical oxygen demand (BOD) at an organic loading rate of 1.7 ± 0.6 kg-COD·m⁻³·day⁻¹. The final effluent of the proposed system had 140 ± 64 mg·L⁻¹ of total COD, 31 ± 12 mg·L⁻¹ of total BOD, and 58 ± 24 mg-N·L⁻¹ of total nitrogen. Compared with current treatment systems, the system could reduce greenhouse gas emissions by 92% and hydraulic retention time by 80%.

Journal of Advanced Transportation, 2016

Summary: Weaving sections, a common design of motorways, require extensive lane‐change manoeuvres. Numerous studies have found that drivers tend to make their lane changes as soon as they enter the weaving section, as the traffic volume increases. Congestion builds up as a result of this high lane‐changing concentration. Importantly, such congestion also limits the use of existing infrastructure, the weaving section downstream. This behaviour thus affects both safety and operational aspects. The potential tool for managing motorways effectively and efficiently is cooperative intelligent transport systems (C‐ITS). This research investigates a lane‐change distribution advisory application based on C‐ITS for weaving vehicles in weaving sections. The objective of this research is to alleviate the lane‐changing concentration problem by coordinating weaving vehicles to ensure that such lane‐changing activities are evenly distributed over the existing weaving length. This is achieved by sendi...

Science Engineering Faculty Smart Transport Research Centre, 2015

Telephone Fraud Detection System

Anticarcinoma antibodies and uses thereof
