Chenyi Zhuang - Academia.edu
Papers by Chenyi Zhuang
IEEE Access, 2022
Actionable knowledge graph (AKG), a specialized version of knowledge graph, was proposed recently to represent, analyze, and predict human action, thus facilitating deeper understanding of human action by robots. However, the automatic construction of AKGs from action-related corpora is still an unexplored problem. In this study, we first propose three unsupervised matrix factorization-based frameworks for AKG generation from three different perspectives: subject, context, and functionality of action, respectively. Further, we propose a hybrid model based on neural network matrix factorization (NNMF) that considers multi-source signals simultaneously. It not only learns the latent action representations, but also learns the optimal learning objective rather than assuming it to be fixed. To quantitatively verify the utility of the constructed AKGs, we introduce a novel application, that is, predicting the most likely missing action records in Wikipedia biographies. Experimental results on a large-scale Wikipedia biography dataset show that the proposed model brings significant improvement over the baselines, which demonstrates the strong expressiveness of our generated AKGs. Index Terms: Actionable knowledge graph, neural network matrix factorization, text mining, web mining. This article has been accepted for publication in IEEE Access.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Given the enormous number of users and items, industrial cascade recommendation systems (RS) are continuously expanded in size and complexity to deliver relevant items, such as news, services, and commodities, to the appropriate users. In a real-world scenario with hundreds of thousands of requests per second, significant computation is required to infer personalized results for each request, resulting in massive energy consumption and carbon emissions that raise concern. This paper proposes GreenFlow, a practical computation allocation framework for RS that considers both accuracy and carbon emission during inference. For each stage (e.g., recall, pre-ranking, ranking, etc.) of a cascade RS, when a user triggers a request, we define two actions that determine the computation: (1) the trained instances of models with different computational complexity; and (2) the number of items to be inferred in the stage. We refer to the combinations of actions in all stages as action chains. A r...
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Modeling sequential data is essential to many applications such as natural language processing, recommendation systems, time series prediction, and anomaly detection. When processing sequential data, one of the critical issues is how to capture the temporal correlation among events. Though prevalent and effective in many applications, conventional approaches such as RNNs and Transformers struggle to handle non-stationary characteristics (i.e., the temporal correlation among events changes over time), which are encountered in many real-world scenarios. In this paper, we present a non-stationary time-aware kernelized attention approach for input sequences of neural networks. By constructing the Generalized Spectral Mixture Kernel (GSMK) and integrating it into the attention mechanism, we mathematically reveal its representation capability in terms of the time-dependent temporal correlation. Following that, a novel neural network structure is proposed, which enables us to encode both stationary and non-stationary time event series. Finally, we demonstrate the performance of the proposed method on both synthetic data, which presents the theoretical insights, and a variety of real-world datasets, which show its competitive performance against related work. CCS Concepts: Computing methodologies → Neural networks; Kernel methods.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017
Technologies are increasingly taking advantage of the explosion in the amount of data generated by social multimedia (e.g., web searches, ad targeting, and urban computing). In this paper, we propose a multi-view learning framework for constructing a new urban movement knowledge graph, which could greatly facilitate the research domains mentioned above. In particular, by viewing GPS trajectory data from temporal, spatial, and spatiotemporal points of view, we construct a knowledge graph whose nodes and edges are locations and their relations, respectively. On the knowledge graph, both nodes and edges are represented in latent semantic space. We verify its utility by subsequently applying the knowledge graph to predict the extent of user attention (high or low) paid to different locations in a city. Experimental evaluations and analysis of a real-world dataset show significant improvements in comparison to state-of-the-art methods.
Information Retrieval Technology, 2017
Discovering knowledge from social images available on social network services (SNSs) is in the spotlight. For example, objects that appear frequently in images shot around a certain city may represent its characteristics (local culture, etc.) and may become valuable sightseeing resources for people from other countries or cities. However, due to the diverse quality of social images, it is still not easy to discover such common objects from them with conventional object discovery methods. In this paper, we propose a novel unsupervised method for ranking predicted object bounding boxes to discover common objects from a mixed-class and noisy image dataset. Extensive experiments on standard and extended benchmarks demonstrate the effectiveness of our proposed approach. We also show the usefulness of our method with a real application in which a city’s characteristics (i.e., culture elements) are discovered from a set of images collected there.
2019 IEEE International Conference on Big Data and Smart Computing (BigComp), 2019
Experienced opinions about products and services can guide a potential user toward a better purchase decision. Fine-grained aspect-level opinions embedded within reviews must be explored to discover experienced users' latent opinions about the aspects (i.e., features of products such as cost, value for money, etc.) and their relative importance. In this paper, we present an unsupervised approach for discovering coherent hotel aspects based on user attention. This model effectively integrates techniques such as topic modeling and word embeddings along with frequent noun-adjective co-occurrence statistics to automatically discover coherent hotel aspects. Supervised methods are then used to understand the user's relative emphasis on the aspects and finally rank the hotels. This method does not assume any predefined seed words and discovers coherent level aspects by directly using user attention and word co-occurrence statistics in addition to topic modeling and word embeddings. The method was evaluated on hotel reviews collected from multiple travel websites. Results show that the proposed methods improved the baseline performance up to 90%. The results are promising and indicate that the system is simple, scalable, and, most of all, accurate in ranking hotels based on the latent aspects expressed in the user reviews.
Assessing the quality of sightseeing spots is a key challenge to satisfy the diverse needs of tourists and discover new sightseeing resources (spots). In this paper, we propose an element-oriented method of landscape assessment that analyzes images available on image-sharing web sites. The experimental results demonstrate that our method is superior to the existing ones based on low-level visual features and user behavior analysis.
Travel route recommendation services that recommend a sequence of points-of-interest (POIs) for tourists are very useful in location-based social networks (LBSNs). Currently, most work addressing this task focuses on personalization and POI features, which estimate user-location relations while rarely considering transitions, i.e., the relationships between locations. To this end, we propose a latent factorization model that learns transition patterns with enhanced spatial-temporal features between locations. Furthermore, we recommend travel routes by combining knowledge on locations and transitions. Experimental results with public datasets reveal that our approaches improve upon the performance of conventional methods.
Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020
With the revolution of mobile internet, online finance has grown explosively. In this new area, one challenge of significant importance is how to effectively deliver financial products or services to a set of target users by marketing. Given a product or service to be promoted and a set of users as seeds, audience expansion is a targeting technique that aims to find a potential audience among a large number of users. However, in the context of finance, financial products and services are dynamic in nature as they co-vary with the socio-economic environment. Moreover, marketing campaigns for promoting products or services always consist of different rules of play, even for the same type of products or services. As a result, there is a strong demand for the timeliness of seeds in financial targeting. Conventional one-stage audience expansion methods, which generate expanded users by expanding over seeds, would encounter two problems under this setting: (1) the seeds would inevitably involve a number of users that are not representative for expansion, and direct expansion over these noisy seeds would dramatically deteriorate the performance; (2) one-stage expansion over fixed seeds cannot capture users' preferences over the currently running campaign in a timely and accurate manner due to the lack of timeliness of seeds. To address the above challenges, in this paper, we present a novel two-stage audience expansion system, Hubble. In the first cold-start stage, a reweighting mechanism is devised to suppress the noise within seeds, which is motivated by the observed relationship between golden seeds and their corresponding density in the embedding space. As feedback is incrementally collected from users, we further include this feedback to guide subsequent audience expansion in the second stage. However, the distribution of this feedback is usually biased and cannot fully characterize the distribution of all target audiences. Therefore, we propose a method to incorporate biased feedback with seeds in a meta-learning manner to pan for golden seeds from the noisy seed set. Finally, we conduct extensive experiments on three real datasets and online A/B testing, which demonstrate the effectiveness of the proposed method. In addition, we release two datasets for boosting the study of this new research topic.
Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18, 2018
The problem of extracting meaningful data through graph analysis spans a range of different fields, such as the internet, social networks, biological networks, and many others. The importance of being able to effectively mine and learn from such data continues to grow as more and more structured data become available. In this paper, we present a simple and scalable semi-supervised learning method for graph-structured data in which only a very small portion of the training data are labeled. To sufficiently embed the graph knowledge, our method performs graph convolution from different views of the raw data. In particular, a dual graph convolutional neural network method is devised to jointly consider the two essential assumptions of semi-supervised learning: (1) local consistency and (2) global consistency. Accordingly, two convolutional neural networks are devised to embed the local-consistency-based and global-consistency-based knowledge, respectively. Given the different data transformations from the two networks, we then introduce an unsupervised temporal loss function for the ensemble. In experiments using both unsupervised and supervised loss functions, our method outperforms state-of-the-art techniques on different datasets.
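To make the phrase "graph convolution" in this abstract more concrete, the following is a minimal sketch of a single propagation step. The abstract does not spell out the convolution used, so the sketch assumes the widely used symmetric-normalized rule with self-loops, ReLU(D^{-1/2} (A + I) D^{-1/2} X W). The function names, the toy path graph, and the random weights are all hypothetical and are not taken from the paper.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph."""
    a_hat = np.asarray(adj, dtype=float) + np.eye(len(adj))   # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))             # degrees are >= 1
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(adj, features, weights):
    """One propagation step: ReLU(A_norm @ X @ W) (assumed formulation)."""
    return np.maximum(normalized_adjacency(adj) @ features @ weights, 0.0)


# Path graph 0 - 1 - 2 with one-hot node features and random weights.
rng = np.random.default_rng(0)
adjacency = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
features = np.eye(3)
weights = rng.normal(size=(3, 2))
hidden = graph_conv(adjacency, features, weights)
print(hidden.shape)
```

In a dual-network setting such as the one the abstract describes, a step like this would presumably be applied once per view of the graph, with each view supplying its own propagation matrix; how the two outputs are combined is specific to the paper and is not sketched here.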
Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 2019
We propose a general view that demonstrates the relationship between network embedding approaches and matrix factorization. Unlike previous works that present the equivalence for the approaches from a skip-gram model perspective, we provide a more fundamental connection from an optimization (objective function) perspective. We demonstrate that matrix factorization is equivalent to optimizing two objectives: one is for bringing together the embeddings of similar nodes; the other is for separating the embeddings of distant nodes. The matrix to be factorized has a general form: S - β. The elements of S indicate pairwise node similarities. They can be based on any user-defined similarity/distance measure or learned from random walks on networks. The shift number β is related to a parameter that balances the two objectives. More importantly, the resulting embeddings are sensitive to β and we can improve the embeddings by tuning β. Experiments show that matrix factorization based on a new proposed similarity measure and β-tuning strategy significantly outperforms existing matrix factorization approaches on a range of benchmark networks.
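As a rough illustration of the S - β form described in this abstract, the sketch below builds a similarity matrix for a tiny graph, subtracts a constant shift from every entry, and factorizes the result with a truncated SVD. The cosine similarity of adjacency rows, the value of `beta`, the embedding dimension, and the function name `shifted_mf_embedding` are all assumptions made for illustration; the paper proposes its own similarity measure and β-tuning strategy, which are not reproduced here.

```python
import numpy as np


def shifted_mf_embedding(adj, beta=0.5, dim=2):
    """Embed nodes by factorizing S - beta with a truncated SVD.

    S is the cosine similarity between adjacency rows (an assumed,
    illustrative similarity measure, not the one proposed in the paper).
    """
    adj = np.asarray(adj, dtype=float)
    norms = np.linalg.norm(adj, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0            # avoid division by zero for isolated nodes
    unit = adj / norms
    sim = unit @ unit.T                  # pairwise node similarities S
    shifted = sim - beta                 # the matrix to be factorized: S - beta
    u, s, _ = np.linalg.svd(shifted)
    return u[:, :dim] * np.sqrt(s[:dim])  # rank-`dim` node embeddings


# Two triangles joined by a single edge (between nodes 2 and 3).
adjacency = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
embeddings = shifted_mf_embedding(adjacency, beta=0.5, dim=2)
print(embeddings.shape)
```

Re-running the function with different `beta` values is a simple way to see the sensitivity to the shift that the abstract mentions.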
IEICE Transactions on Information and Systems, 2019
A travel route recommendation service that recommends a sequence of points of interest for tourists traveling in an unfamiliar city is a very useful tool in the field of location-based social networks. Although there are many web services and mobile applications that can help tourists to plan their trips by providing information about sightseeing attractions, travel route recommendation services are still not widely applied. One reason could be that most of the previous studies that addressed this task were based on the orienteering problem model, which mainly focuses on the estimation of a user-location relation (for example, a user preference). This assumes that a user receives a reward by visiting a point of interest and the travel route is recommended by maximizing the total rewards from visiting those locations. However, a location-location relation, which we introduce as a transition pattern in this paper, implies useful information such as visiting order and can help to improve the quality of travel route recommendations. To this end, we propose a travel route recommendation method by combining location and transition knowledge, which assigns rewards for both locations and transitions.
Computer Science and Information Systems, 2019
Graph embedding aims at learning representations of nodes in a low-dimensional vector space. Good embeddings should preserve the graph topological structure. To study how much of this structure can be preserved, we propose evaluation methods from four aspects: 1) how well the graph can be reconstructed based on the embeddings, 2) the divergence between the original link distribution and the embedding-derived distribution, 3) the consistency of communities discovered from the graph and from the embeddings, and 4) to what extent we can employ embeddings to facilitate link prediction. We find that it is insufficient to rely on the embeddings to reconstruct the original graph, to discover communities, and to predict links at a high precision. Thus, the embeddings produced by state-of-the-art approaches can only preserve part of the topological structure.
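The first evaluation aspect, graph reconstruction, can be illustrated with a simple precision-at-k measure: score every node pair from the embeddings, take the k highest-scoring pairs, and check how many are actual edges. This is only a sketch of one common way to measure reconstruction quality; the inner-product scoring, the function name `reconstruction_precision_at_k`, and the toy data are assumptions and may differ from the metric used in the paper.

```python
import numpy as np


def reconstruction_precision_at_k(adj, emb, k):
    """Fraction of the k highest-scoring node pairs that are true edges.

    Pairs are scored by the inner product of their embeddings (an assumed
    scoring function); only unordered pairs i < j are considered.
    """
    adj = np.asarray(adj)
    emb = np.asarray(emb, dtype=float)
    scores = emb @ emb.T
    rows, cols = np.triu_indices(adj.shape[0], k=1)    # all pairs with i < j
    order = np.argsort(-scores[rows, cols], kind="stable")[:k]
    hits = adj[rows[order], cols[order]]
    return float(hits.mean())


# Toy graph with edges 0-1 and 2-3; embeddings place each linked pair together.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])
embeddings = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
])
print(reconstruction_precision_at_k(adjacency, embeddings, k=2))
```

On this toy input the two top-scoring pairs are exactly the two edges, so precision at k=2 is 1.0, and it drops once k exceeds the number of edges.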
Multimedia Tools and Applications, 2019
In this paper, we propose a robust visual object clustering approach based on bounding box ranking to discover the characteristics of objects from real-world datasets containing a large number of noisy images, and apply it to sightseeing spot assessment. The purpose is to develop a diversity of resources for sightseeing from images available on social network services (SNS). Objects appearing frequently in images captured in a certain city may represent a certain characteristic of it (local culture, architecture, and so on). Such knowledge can be used to discover various sightseeing resources from the perspective of the user rather than that of the provider (e.g., a travel agency). However, owing to the variable quality of images on SNS, it is challenging to identify objects common to several images by using conventional object discovery methods, and this is where the proposed approach is useful. Extensive experiments on standard and extended benchmarks verified its effectiveness. We also tested the proposed method on an application where the characteristics of a city (i.e., cultural elements) were discovered from a set of images of it. Moreover, by utilizing the objects discovered from images on SNS, we propose an object-level assessment framework to rank sightseeing spots by assigning scores and verify its performance.
International Journal of Big Data Intelligence, 2018
Multimedia Tools and Applications, 2016
Technologies are increasingly taking advantage of the explosion of social media (e.g., web searches, ad targeting, personalized geo-social recommendations, urban computing). Estimating the characteristics of users, or user profiling, is one of the key challenges for such technologies. This paper focuses on the important problem of automatically estimating social networking service (SNS) user authority with a given city, which can significantly improve location-based services and systems. The "authority" in our work measures a user's familiarity with a particular city. By analyzing users' social, temporal, and spatial behavior, we propose and compare three models for user authority: a social-network-driven model, a time-driven model, and a location-driven model, respectively. Furthermore, we discuss the integration of these three models. Finally, by using these user-profiling models, we propose a new application for geo-social recommendations. In contrast to related studies, which focus on popular and famous points of interest (POIs), our models help discover obscure POIs that are not well known. Experimental evaluations and analysis on a real dataset collected from three cities demonstrate the performance of the proposed user-profiling models. To verify the effect of discovering obscure POIs, the proposed application was implemented to discover and explore obscure POIs in Kyoto, Japan.
2016 IEEE Second International Conference on Multimedia Big Data (BigMM), 2016
Recommendation of points of interest (POIs) is drawing more attention to meet the growing demands of tourists. Thus, a POI's quality (sightseeing value) needs to be estimated. In contrast to conventional studies that rank POIs on the basis of user behavior analysis, this paper presents methods to estimate quality by analyzing geo-social images. Our approach estimates the sightseeing value from two aspects: (1) nature value and (2) culture value. For the nature value, we extract image features that are related to favorable human perception to verify whether a POI would satisfy tourists in terms of environmental psychology. Three criteria are defined accordingly: coherence, image-ability, and visual-scale. For the culture value, we recognize the main cultural element (i.e., architecture) included in a POI. In the experiments, we applied our methods to real POIs and found that our approach assessed sightseeing value effectively.
Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, 2015
In contrast to conventional studies of discovering hot spots, by analyzing geo-tagged images on Flickr, we introduce novel methods to discover obscure sightseeing spots that are less well-known while still worth visiting. To this end, we face two new challenges that the classical authority-analysis-based methods do not encounter: how to discover and rank spots on the basis of 1) popularity (obscurity level) and 2) scenery quality. For the first challenge, we estimate the obscurity level of a spot in accordance with the visiting asymmetry between photographers who are familiar with a target city and those who are not. For the second challenge, the behavior of both the viewers who browsed the images and the photographers is analyzed for each spot. We also develop an application system to help users explore sightseeing spots with different geographical granularities. Experimental evaluations and analysis on a real dataset demonstrate the effectiveness of the proposed methods.
IEEE Access, 2022
Actionable knowledge graph (AKG), a specialized version of knowledge graph, was proposed recently... more Actionable knowledge graph (AKG), a specialized version of knowledge graph, was proposed recently to represent, analyze, and predict human action, thus facilitating deeper understanding of human action by robots. However, the automatic construction of AKGs from action-related corpora is still an unexplored problem. In this study, we first propose three unsupervised matrix factorization-based frameworks for AKG generation from three different perspectives: subject, context and functionality of action, respectively. Further, we propose a hybrid model based on neural network matrix factorization (NNMF) that considers multi-source signals simultaneously. It not only learns the latent action representations, but also learns the optimal learning objective rather than assuming it to be fixed. To quantitatively verify the utility of the constructed AKGs, we introduce a novel application, that is, predicting the most likely missing action records in Wikipedia biographies. Experimental results on a large-scale Wikipedia biography dataset show that the proposed model brings significant improvement over the baselines, which demonstrates the strong expressiveness of our generated AKGs. INDEX TERMS Actionable knowledge graph, neural network matrix factorization, text mining, web mining This article has been accepted for publication in IEEE Access.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Given the enormous number of users and items, industrial cascade recommendation systems (RS) are ... more Given the enormous number of users and items, industrial cascade recommendation systems (RS) are continuously expanded in size and complexity to deliver relevant items, such as news, services, and commodities, to the appropriate users. In a real-world scenario with hundreds of thousands requests per second, significant computation is required to infer personalized results for each request, resulting in a massive energy consumption and carbon emission that raises concern. This paper proposes GreenFlow, a practical computation allocation framework for RS, that considers both accuracy and carbon emission during inference. For each stage (e.g., recall, pre-ranking, ranking, etc.) of a cascade RS, when a user triggers a request, we define two actions that determine the computation: (1) the trained instances of models with different computational complexity; and (2) the number of items to be inferred in the stage. We refer to the combinations of actions in all stages as action chains. A r...
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Modeling sequential data is essential to many applications such as natural language processing, r... more Modeling sequential data is essential to many applications such as natural language processing, recommendation systems, time series predictions, anomaly detection, etc. When processing sequential data, one of the critical issues is how to capture the temporalcorrelation among events. Though prevalent and effective in many applications, conventional approaches such as RNNs and Transformers, struggle with handling the non-stationary characteristics (i.e., such temporal-correlation among events would change over time), which is indeed encountered in many real-world scenarios. In this paper, we present a non-stationary time-aware kernelized attention approach for input sequences of neural networks. By constructing the Generalized Spectral Mixture Kernel (GSMK), and integrating it to the attention mechanism, we mathematically reveal its representation capability in terms of the time-dependent temporal-correlation. Following that, a novel neural network structure is proposed, which would enable us to encode both stationary and non-stationary time event series. Finally, we demonstrate the performance of the proposed method on both synthetic data which presents the theoretical insights, and a variety of real-world datasets which shows its competitive performance against related work. CCS CONCEPTS • Computing methodologies → Neural networks; Kernel methods.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017
Technologies are increasingly taking advantage of the explosion in the amount of data generated b... more Technologies are increasingly taking advantage of the explosion in the amount of data generated by social multimedia (e.g., web searches, ad targeting, and urban computing). In this paper, we propose a multi-view learning framework for presenting the construction of a new urban movement knowledge graph, which could greatly facilitate the research domains mentioned above. In particular, by viewing GPS trajectory data from temporal, spatial, and spatiotemporal points of view, we construct a knowledge graph of which nodes and edges are their locations and relations, respectively. On the knowledge graph, both nodes and edges are represented in latent semantic space. We verify its utility by subsequently applying the knowledge graph to predict the extent of user attention (high or low) paid to different locations in a city. Experimental evaluations and analysis of a real-world dataset show significant improvements in comparison to state-of-the-art methods.
Information Retrieval Technology, 2017
Discovering knowledge from social images available on social network services (SNSs) is in the sp... more Discovering knowledge from social images available on social network services (SNSs) is in the spotlight. For example, objects that appear frequently in images shot around a certain city may represent its characteristics (local culture, etc.) and may become the valuable sightseeing resources for people from other countries or cities. However, due to the diverse quality of social images, it is still not easy to discover such common objects from them with the conventional object discovery methods. In this paper, we propose a novel unsupervised ranking method of predicted object bounding boxes for discovering common objects from a mixed-class and noisy image dataset. Extensive experiments on standard and extended benchmarks demonstrate the effectiveness of our proposed approach. We also show the usefulness of our method with a real application in which a city’s characteristics (i.e., culture elements) are discovered from a set of images collected there.
2019 IEEE International Conference on Big Data and Smart Computing (BigComp), 2019
Experienced opinions about products and services can guide a potential user for a better purchase... more Experienced opinions about products and services can guide a potential user for a better purchase decision. Fine-grained aspect level opinions embedded within reviews must be explored to discover experienced users' latent opinion about the aspects (i.e. features of products like cost, value for money, etc.) and their relative importance. In this paper, we present an unsupervised approach for discovering coherent hotel aspects based on the user attention. This model effectively integrates techniques like topic modeling and word embeddings along with the frequent noun-adjective co-occurrence statistics to automatically discover coherent hotel aspects. Further supervised methods are used to understand the user's relative emphasis on the aspects and finally rank the hotels. This method does not assume any predefined seed words and discovers coherent level aspects by directly using user attention and word co-occurrence statistics in addition to topic modeling and word embeddings. The performance evaluation of this method was done by collecting various hotel reviews from multiple travel websites. Results show that the proposed methods improved the baseline performance up to 90%. Hence, the results thus obtained are very promising and indicate that the system is simple, scalable and most of all accurate in ranking hotels based on the latent aspects expressed in the user reviews.
Assessing the quality of sightseeing spots is a key challenge to satisfy the diverse needs of tou... more Assessing the quality of sightseeing spots is a key challenge to satisfy the diverse needs of tourists and discover new sightseeing resources (spots). In this paper, we propose an element-oriented method of landscape assessment that analyzes images available on image-sharing web sites. The experimental results demonstrate that our method is superior to the existing ones based on low-level visual features and user behavior analysis.
Travel route recommendation services that recommend a sequence of points-of-interest (POIs) for t... more Travel route recommendation services that recommend a sequence of points-of-interest (POIs) for tourists are very useful in location-based social networks (LBSNs). Currently, most of the work that addresses this task are focusing on personalization and POI features, which estimate user-location relations while rarely considering transitions, i.e., the relationships between locations. To this end, we propose a latent factorization model that learns transition patterns with enhanced spatial-temporal features between locations. Furthermore, we recommend travel routes by combining knowledge on locations and transitions. Experimental results with public datasets reveal that our approaches improve upon the performance of conventional methods.
Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020
With the revolution of mobile internet, online finance has grown explosively. In this new area, one challenge of significant importance is how to effectively deliver financial products or services to a set of target users through marketing. Given a product or service to be promoted and a set of users as seeds, audience expansion is a targeting technique that aims to find potential audience among a large number of users. However, in the context of finance, financial products and services are dynamic in nature as they co-vary with the socio-economic environment. Moreover, marketing campaigns for promoting products or services always consist of different rules of play, even for the same type of products or services. As a result, there is a strong demand for the timeliness of seeds in financial targeting. Conventional one-stage audience expansion methods, which generate expanded users by expanding over seeds, encounter two problems under this setting: (1) the seeds inevitably include a number of users that are not representative for expansion, and direct expansion over these noisy seeds dramatically deteriorates performance; (2) one-stage expansion over fixed seeds cannot capture users' preferences over the currently running campaign in a timely and accurate manner due to the lack of timeliness of seeds. To address the above challenges, in this paper, we present a novel two-stage audience expansion system, Hubble. In the first cold-start stage, a reweighting mechanism is devised to suppress the noise within seeds, motivated by the observed relationship between golden seeds and their corresponding density in the embedding space. As feedback from users is incrementally collected, we further use it to guide subsequent audience expansion in the second stage. However, the distribution of this feedback is usually biased and cannot fully characterize the distribution of all target audiences. Therefore, we propose a method to incorporate biased feedback with seeds in a meta-learning manner to pan for golden seeds from the noisy seed set. Finally, we conduct extensive experiments on three real datasets and online A/B testing, which demonstrate the effectiveness of the proposed method. In addition, we release two datasets to support the study of this new research topic.
Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18, 2018
The problem of extracting meaningful data through graph analysis spans a range of different fields, such as the internet, social networks, biological networks, and many others. The importance of being able to effectively mine and learn from such data continues to grow as more and more structured data become available. In this paper, we present a simple and scalable semi-supervised learning method for graph-structured data in which only a very small portion of the training data are labeled. To sufficiently embed the graph knowledge, our method performs graph convolution from different views of the raw data. In particular, a dual graph convolutional neural network method is devised to jointly consider the two essential assumptions of semi-supervised learning: (1) local consistency and (2) global consistency. Accordingly, two convolutional neural networks are devised to embed the local-consistency-based and global-consistency-based knowledge, respectively. Given the different data transformations from the two networks, we then introduce an unsupervised temporal loss function for the ensemble. In experiments using both unsupervised and supervised loss functions, our method outperforms state-of-the-art techniques on different datasets.
Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 2019
We propose a general view that demonstrates the relationship between network embedding approaches and matrix factorization. Unlike previous works that present the equivalence for the approaches from a skip-gram model perspective, we provide a more fundamental connection from an optimization (objective function) perspective. We demonstrate that matrix factorization is equivalent to optimizing two objectives: one is for bringing together the embeddings of similar nodes; the other is for separating the embeddings of distant nodes. The matrix to be factorized has a general form: S - β. The elements of S indicate pairwise node similarities. They can be based on any user-defined similarity/distance measure or learned from random walks on networks. The shift number β is related to a parameter that balances the two objectives. More importantly, the resulting embeddings are sensitive to β and we can improve the embeddings by tuning β. Experiments show that matrix factorization based on a new proposed similarity measure and β-tuning strategy significantly outperforms existing matrix factorization approaches on a range of benchmark networks.
IEICE Transactions on Information and Systems, 2019
A travel route recommendation service that recommends a sequence of points of interest for tourists traveling in an unfamiliar city is a very useful tool in the field of location-based social networks. Although there are many web services and mobile applications that can help tourists to plan their trips by providing information about sightseeing attractions, travel route recommendation services are still not widely applied. One reason could be that most of the previous studies that addressed this task were based on the orienteering problem model, which mainly focuses on the estimation of a user-location relation (for example, a user preference). This assumes that a user receives a reward by visiting a point of interest and the travel route is recommended by maximizing the total rewards from visiting those locations. However, a location-location relation, which we introduce as a transition pattern in this paper, implies useful information such as visiting order and can help to improve the quality of travel route recommendations. To this end, we propose a travel route recommendation method by combining location and transition knowledge, which assigns rewards for both locations and transitions.
Computer Science and Information Systems, 2019
Graph embedding aims at learning representations of nodes in a low-dimensional vector space. Good embeddings should preserve the graph topological structure. To study how much such structure can be preserved, we propose evaluation methods from four aspects: 1) how well the graph can be reconstructed based on the embeddings, 2) the divergence between the original link distribution and the embedding-derived distribution, 3) the consistency of communities discovered from the graph and from the embeddings, and 4) to what extent we can employ embeddings to facilitate link prediction. We find that it is insufficient to rely on the embeddings to reconstruct the original graph, to discover communities, and to predict links at a high precision. Thus, the embeddings produced by the state-of-the-art approaches can only preserve part of the topological structure.
Multimedia Tools and Applications, 2019
In this paper, we propose a robust visual object clustering approach based on bounding box ranking to discover the characteristics of objects from real-world datasets containing a large number of noisy images, and apply it to sightseeing spot assessment. The purpose is to develop a diversity of resources for sightseeing from images available on social network services (SNS). Objects appearing frequently in images captured in a certain city may represent a certain characteristic of it (local culture, architecture, and so on). Such knowledge can be used to discover various sightseeing resources from the perspective of the user rather than that of the provider (e.g., a travel agency). However, owing to the variable quality of images on SNS, it is challenging to identify objects common to several images by using conventional object discovery methods, and this is where the proposed approach is useful. Extensive experiments on standard and extended benchmarks verified its effectiveness. We also tested the proposed method on an application where the characteristics of a city (i.e., cultural elements) were discovered from a set of images of it. Moreover, by utilizing the objects discovered from images on SNS, we propose an object-level assessment framework to rank sightseeing spots by assigning scores and verify its performance.
International Journal of Big Data Intelligence, 2018
Multimedia Tools and Applications, 2016
Technologies are increasingly taking advantage of the explosion of social media (e.g., web searches, ad targeting, personalized geo-social recommendations, urban computing). Estimating the characteristics of users, or user profiling, is one of the key challenges for such technologies. This paper focuses on the important problem of automatically estimating social networking service (SNS) user authority for a given city, which can significantly improve location-based services and systems. The "authority" in our work measures a user's familiarity with a particular city. By analyzing users' social, temporal, and spatial behavior, we propose and compare three models for user authority: a social-network-driven model, a time-driven model, and a location-driven model, respectively. Furthermore, we discuss the integration of these three models. Finally, by using these user-profiling models, we propose a new application for geo-social recommendations. In contrast to related studies, which focus on popular and famous points of interest (POIs), our models help discover obscure POIs that are not well known. Experimental evaluations and analysis on a real dataset collected from three cities demonstrate the performance of the proposed user-profiling models. To verify the effect of discovering obscure POIs, the proposed application was implemented to discover and explore obscure POIs in Kyoto, Japan.
2016 IEEE Second International Conference on Multimedia Big Data (BigMM), 2016
Recommendation of points of interest (POIs) is drawing more attention to meet the growing demands of tourists. Thus, a POI's quality (sightseeing value) needs to be estimated. In contrast to conventional studies that rank POIs on the basis of user behavior analysis, this paper presents methods to estimate quality by analyzing geo-social images. Our approach estimates the sightseeing value from two aspects: (1) nature value and (2) culture value. For the nature value, we extract image features that are related to favorable human perception to verify whether a POI would satisfy tourists in terms of environmental psychology. Three criteria are defined accordingly: coherence, image-ability, and visual-scale. For the culture value, we recognize the main cultural element (i.e., architecture) included in a POI. In the experiments, we applied our methods to real POIs and found that our approach assessed sightseeing value effectively.
Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, 2015
In contrast to conventional studies of discovering hot spots, by analyzing geo-tagged images on Flickr, we introduce novel methods to discover obscure sightseeing spots that are less well-known while still worth visiting. To this end, we face two new challenges that the classical authority-analysis-based methods do not encounter: how to discover and rank spots on the basis of 1) popularity (obscurity level) and 2) scenery quality. For the first challenge, we estimate the obscurity level of a spot in accordance with the visiting asymmetry between photographers who are familiar with a target city and those who are not. For the second challenge, the behavior of both the viewers who browsed the images and the photographers is analyzed for each spot. We also develop an application system to help users explore sightseeing spots at different geographical granularities. Experimental evaluations and analysis on a real dataset demonstrate the effectiveness of the proposed methods.