Baichuan Zhang - Meta | LinkedIn

About

I'm a Staff Software Engineer / Research Scientist at Meta (Facebook) in the Ads org…

Experience & Education

Publications

ASONAM July 1, 2019

Data Science and Engineering Journal May 25, 2019

Network embedding methodologies, which learn a distributed vector representation for each vertex in a network, have attracted considerable interest in recent years. Existing works have demonstrated that vertex representations learned through an embedding method provide superior performance in many real-world applications, such as node classification, link prediction, and community detection. However, most existing methods for network embedding only utilize the topological information of a vertex, ignoring the rich set of nodal attributes (such as user profiles in an online social network, or textual contents in a citation network) that is abundant in real-life networks. A joint network embedding that takes into account both attributional and relational information captures complete network information and could further enrich the learned vector representations. In this work, we present Neural-Brane, a novel Neural Bayesian Personalized Ranking based Attributed Network Embedding. For a given network, Neural-Brane extracts latent feature representations of its vertices using a designed neural network model that unifies network topological information and nodal attributes. It also uses a Bayesian personalized ranking objective, which exploits the proximity ordering between a similar node-pair and a dissimilar node-pair. We evaluate the quality of the vertex embeddings produced by Neural-Brane on node classification and clustering tasks over four real-world datasets. Experimental results demonstrate the superiority of our proposed method over existing state-of-the-art methods.
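
The ranking objective mentioned in the abstract can be illustrated with a minimal sketch. It assumes that node-pair proximity is scored by an inner product of embeddings, which the abstract does not state; the function name `bpr_loss` and the toy vectors are hypothetical and are not taken from the Neural-Brane paper.

```python
import numpy as np

def bpr_loss(z_i, z_j, z_k):
    """Bayesian personalized ranking loss for one triplet.

    Node i should be closer to the similar node j than to the
    dissimilar node k. Scoring by inner product is an assumption.
    """
    margin = z_i @ z_j - z_i @ z_k
    # -log(sigmoid(margin)) written in a numerically stable form
    return np.logaddexp(0.0, -margin)

# Toy embeddings (hypothetical values for illustration only)
z_i = np.array([1.0, 0.0])
z_j = np.array([0.9, 0.1])   # similar to i
z_k = np.array([-1.0, 0.0])  # dissimilar to i

print(bpr_loss(z_i, z_j, z_k))  # correct ordering: small loss
print(bpr_loss(z_i, z_k, z_j))  # reversed ordering: larger loss
```

Minimizing this loss over many sampled triplets pushes similar node-pairs to score higher than dissimilar ones, which is the proximity ordering the abstract refers to.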

The ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2019) March 8, 2019

The name disambiguation task partitions a collection of records pertaining to a given name, such that there is a one-to-one correspondence between the partitions and a group of people, all sharing that given name. Most existing solutions for this task are proposed for static data. However, more realistic scenarios involve records arriving in a streaming fashion, where records may belong to known as well as unknown persons all sharing the same name. This requires a flexible name disambiguation algorithm that can not only classify records of known persons represented in the training data by their existing records but can also identify records of new ambiguous persons with no existing records included in the initial training dataset. Toward this objective, we present a non-parametric Bayesian framework that uses a Dirichlet Process Gaussian Mixture Model (DPGMM) as the core engine for the online name disambiguation task. A Sequential Importance Sampling with Resampling (SISR) technique, also known as particle filtering, is proposed for inference to simultaneously perform online classification and new class discovery. Specifically, for each online record, we approximate its class conditional posterior distribution by a set of particles and their weights, which are updated sequentially without the need to re-access previously observed records. We also propose an interactive version of our online name disambiguation method, which improves the prediction accuracy by exploiting user feedback.

CIKM 2018 August 6, 2018

Job recommendation is an important task for the modern recruitment industry. An excellent job recommender system not only recommends a higher-paying job that is maximally aligned with the skill set of the current job, but also suggests a few additional skills that are required to assume the new position. In this work, we created three types of information networks from the historical job data: (i) job transition network, (ii) job-skill network, and (iii) skill co-occurrence network. We provide a representation learning model which can utilize the information from all three networks to jointly learn the representation of the jobs and skills in a shared k-dimensional latent space. In our experiments, we show that by jointly learning the representation for the jobs and skills, our model provides better recommendations for both jobs and skills. Additionally, we show some case studies which validate our claims.

Social Network Analysis and Mining March 1, 2018

The majority of directed social networks, such as Twitter, Flickr, and Google+, exhibit reciprocal altruism, a social psychology phenomenon that drives a vertex to create a reciprocal link with another vertex that has created a directed link towards it. Existing works have already predicted the possibility of the creation of a reciprocal link, a task known as "reciprocal link prediction". However, an equally important problem is determining the interval time between the creation of the first link (also called the parasocial link) and its corresponding reciprocal link. No existing works have considered solving this problem, which is the focus of this paper. Predicting the reciprocal link interval time is challenging for two reasons. First, there is a lack of effective features: well-known link prediction features are designed for undirected networks and for the binary classification task, so they do not work well for interval time prediction. Second, the presence of ever-waiting links (i.e., parasocial links for which a reciprocal link is not formed within the observation period) makes traditional supervised regression methods unsuitable for such data. In this paper, we propose a solution for the reciprocal link interval time prediction task. We map this problem to a survival analysis task and show through extensive experiments on real-world datasets that survival analysis methods perform better than traditional regression, neural network based models, and support vector regression (SVR) for reciprocal interval time prediction.
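
To make the role of ever-waiting links concrete, here is a minimal sketch of the Kaplan-Meier estimator, a standard survival analysis tool that handles right-censored observations. It is offered only as an illustration of why censoring matters; the abstract does not say which survival models the paper uses, and the function name and toy data below are hypothetical.

```python
import numpy as np

def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve for right-censored data.

    durations: time until the reciprocal link formed, or until the
               observation period ended for ever-waiting links.
    observed:  True if the reciprocal link was actually formed.
    """
    durations = np.asarray(durations, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    event_times = np.unique(durations[observed])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)                 # links still waiting at time t
        events = np.sum((durations == t) & observed)     # links reciprocated at time t
        s *= 1.0 - events / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Toy data (hypothetical): links 2 and 4 are ever-waiting (censored)
times, surv = kaplan_meier([1, 2, 3, 4], [True, False, True, False])
print(times, surv)
```

A plain regression would have to either drop the censored links or treat their observation time as the true interval; the estimator above keeps them in the at-risk set until their censoring time instead.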

Social Network Analysis and Mining December 21, 2017

The smallest eigenvalues and the associated eigenvectors (i.e., eigenpairs) of a graph Laplacian matrix have been widely used for spectral clustering and community detection. However, in real-life applications the number of clusters or communities (say, K) is generally unknown a priori. Consequently, the majority of existing methods either choose K heuristically or repeat the clustering method with different choices of K and accept the best clustering result. The first option often yields suboptimal results, while the second option is computationally expensive. In this work, we propose an incremental method for constructing the eigenspectrum of the graph Laplacian matrix. This method leverages the eigenstructure of the graph Laplacian matrix to obtain the K-th smallest eigenpair of the Laplacian matrix given a collection of all previously computed K − 1 smallest eigenpairs. Our proposed method adapts the Laplacian matrix such that the batch eigenvalue decomposition problem transforms into an efficient sequential leading eigenpair computation problem. As a practical application, we consider user-guided spectral clustering. Specifically, we demonstrate that users can utilize the proposed incremental method for effective eigenpair computation and for determining the desired number of clusters based on multiple clustering metrics.
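
The idea of turning the K-th smallest eigenpair into a leading-eigenpair problem can be sketched with generic spectral deflation and power iteration. This is an illustrative stand-in, not the transformation defined in the paper: the shift `c`, the function name, and the test graph are all assumptions.

```python
import numpy as np

def next_smallest_eigenpair(L, known_vals, known_vecs, iters=2000, seed=0):
    """Return the next-smallest eigenpair of a graph Laplacian L.

    known_vals: the K-1 smallest eigenvalues already computed.
    known_vecs: matching orthonormal eigenvectors as columns, shape (n, K-1).
    """
    n = L.shape[0]
    # Gershgorin bound: the largest Laplacian eigenvalue is at most 2 * max degree
    c = 2.0 * np.max(np.diag(L))
    # Flip the spectrum so small eigenvalues of L become large eigenvalues of M
    M = c * np.eye(n) - L
    # Deflate known eigenpairs: their eigenvalues in M become 0
    for lam, v in zip(known_vals, known_vecs.T):
        M -= (c - lam) * np.outer(v, v)
    # Power iteration for the leading eigenpair of M
    x = np.random.default_rng(seed).standard_normal(n)
    for _ in range(iters):
        x = M @ x
        x /= np.linalg.norm(x)
    return c - x @ M @ x, x

# Path graph on 6 vertices (hypothetical test graph)
A = np.diag(np.ones(5), 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A
vals, vecs = np.linalg.eigh(L)  # reference decomposition, used here only for checking
lam2, v2 = next_smallest_eigenpair(L, vals[:1], vecs[:, :1])
print(lam2, vals[1])
```

Each call reuses the eigenpairs found so far, so a user can increase K one step at a time and stop once the clustering metrics look acceptable.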

Patents

Issued April 23, 2020 US 16528467

Automated systems and methods for determining jobs, skills, and training recommendations are disclosed. An example system includes one or more processors of an employment website entity. The one or more processors are configured to extract job information and skill information from resumes in a resume database, generate a job transition graph based on the job information, generate a job-skill graph based on the job information and the skill information, and generate a skill re-occurrence graph based on the skill information. The one or more processors are configured to determine a jobs matrix by minimizing an objective function based on the job transition graph, the job-skill graph, and the skill re-occurrence graph. The one or more processors are configured to retrieve next-job recommendations from the jobs matrix based on the current job position of a candidate and present, via an employment app, the next-job recommendations for the candidate.

Languages

Professional working proficiency

Native or bilingual proficiency

Recommendations received
