tam le - Academia.edu

Papers by tam le

Tree-Sliced Variants of Wasserstein Distances

Cornell University - arXiv, Feb 1, 2019

Optimal transport (OT) theory defines a powerful set of tools to compare probability distributions. However, OT suffers from a few computational and statistical drawbacks, which have encouraged the proposal of several regularized variants of OT in the recent literature. One of the most notable is the sliced formulation, which exploits the closed-form formula between univariate distributions by projecting high-dimensional measures onto random lines. We consider in this work a more general family of ground metrics, namely tree metrics, which also yield fast closed-form computations and negative definiteness, and of which the sliced-Wasserstein distance is a particular case (the tree is a chain). We propose the tree-sliced Wasserstein distance, computed by averaging the Wasserstein distance between these measures using random tree metrics, built adaptively in either low- or high-dimensional spaces. Exploiting the negative definiteness of that distance, we also propose a positive definite kernel, and test it against other baselines on a few benchmark tasks.
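As an illustration of the closed-form computation mentioned in this abstract, the sketch below evaluates the 1-Wasserstein distance for a tree ground metric as a weighted sum, over edges, of the absolute difference in subtree mass. The function name, the parent-array encoding of the tree, and the node-ordering convention are assumptions made for this example and are not taken from the paper; the random tree construction and the averaging over trees are not shown.

```python
def tree_wasserstein(parent, weight, mu, nu):
    """Closed-form 1-Wasserstein distance for a tree ground metric.

    Hypothetical encoding (an assumption of this sketch):
      parent[v]  parent of node v; node 0 is the root (parent[0] is ignored)
      weight[v]  length of the edge between v and parent[v]
      mu[v], nu[v]  mass each measure places on node v
    Nodes must be numbered so that parent[v] < v for every v > 0.
    """
    n = len(parent)
    # diff[v] accumulates (mu - nu) summed over the subtree rooted at v.
    diff = [mu[v] - nu[v] for v in range(n)]
    total = 0.0
    # Visit children before parents so each subtree sum is complete
    # by the time its edge is charged.
    for v in range(n - 1, 0, -1):
        total += weight[v] * abs(diff[v])
        diff[parent[v]] += diff[v]
    return total


# A chain 0 - 1 - 2 with unit edges: moving a unit mass from node 0 to node 2.
chain = tree_wasserstein([-1, 0, 1], [0.0, 1.0, 1.0], [1, 0, 0], [0, 0, 1])
```

When the tree is a chain, as in this usage example, the sum reduces to the familiar univariate formula, which matches the abstract's remark that the sliced-Wasserstein distance is a particular case.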

Flow-based Alignment Approaches for Probability Measures in Different Spaces

Cornell University - arXiv, Oct 10, 2019

Gromov-Wasserstein (GW) is a powerful tool to compare probability measures whose supports are in different metric spaces. However, GW suffers from a computational drawback, since it requires solving a complex non-convex quadratic program. We consider in this work a specific family of cost metrics, namely tree metrics for a space of supports of each probability measure, and aim to develop efficient and scalable discrepancies between the probability measures. By leveraging a tree structure, we propose to align flows from a root to each support instead of pair-wise tree metrics of supports, i.e., flows from a support to another, in GW. Consequently, we propose a novel discrepancy, named Flow-based Alignment (FlowAlign), by matching the flows of the probability measures. We show that FlowAlign has a structure similar to that of a univariate optimal transport distance. Therefore, FlowAlign is fast to compute and scalable to large-scale applications. By further exploring tree structures, we propose a variant of FlowAlign, named Depth-based Alignment (DepthAlign), by aligning the flows hierarchically along each depth level of the tree structures. Theoretically, we prove that both FlowAlign and DepthAlign are pseudo-distances. Moreover, we also derive tree-sliced variants, computed by averaging the corresponding FlowAlign / DepthAlign using random tree metrics, built adaptively in spaces of supports. Empirically, we test our proposed discrepancies against other baselines on some benchmark tasks.

Subgradient sampling for nonsmooth nonconvex minimization

Cornell University - arXiv, Feb 28, 2022

Risk minimization for nonsmooth nonconvex problems naturally leads to first-order sampling or, by an abuse of terminology, to stochastic subgradient descent. We establish the convergence of this method in the path-differentiable case, and describe more precise results under additional geometric assumptions. We recover and improve results from Ermoliev-Norkin [1] by using a different approach: conservative calculus and the ODE method. In the definable case, we show that first-order subgradient sampling avoids artificial critical points with probability one and, moreover, applies to a large range of risk minimization problems in deep learning, based on the backpropagation oracle. As byproducts of our approach, we obtain several results on integration of independent interest, such as an interchange result for conservative derivatives and integrals, or the definability of set-valued parameterized integrals.

Point-set Distances for Learning Representations of 3D Point Clouds

2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021

Learning an effective representation of 3D point clouds requires a good metric to measure the discrepancy between two 3D point sets, which is non-trivial due to their irregularity. Most previous works resort to using the Chamfer discrepancy or Earth Mover's distance, but those metrics are either ineffective in measuring the differences between point clouds or computationally expensive. In this paper, we conduct a systematic study with extensive experiments on distance metrics for 3D point clouds. From this study, we propose to use sliced Wasserstein distance and its variants for learning representations of 3D point clouds. In addition, we introduce a new algorithm to estimate sliced Wasserstein distance that guarantees that the estimated value is close enough to the true one. Experiments show that the sliced Wasserstein distance and its variants allow the neural network to learn a more efficient representation compared to the Chamfer discrepancy. We demonstrate the efficiency of the sliced Wasserstein metric and its variants on several tasks in 3D computer vision including training a point cloud autoencoder, generative modeling, transfer learning, and point cloud registration.
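For readers unfamiliar with the sliced Wasserstein distance, the sketch below shows the standard Monte Carlo estimate: project both point sets onto random directions, sort the projections, and average the resulting one-dimensional transport costs. It is not the estimation algorithm with guarantees that the abstract introduces; the function name, the fixed number of projections, and the restriction to equal-size, uniformly weighted point sets are assumptions of this example.

```python
import math
import random


def sliced_wasserstein(X, Y, n_projections=100, p=2, seed=0):
    """Plain Monte Carlo estimate of the sliced p-Wasserstein distance.

    X and Y are lists of points (each a sequence of floats) with the same
    length and dimension; every point carries the same weight.
    """
    assert len(X) == len(Y), "this sketch assumes equal-size point sets"
    rng = random.Random(seed)
    dim = len(X[0])
    acc = 0.0
    for _ in range(n_projections):
        # A normalized Gaussian vector is uniform on the unit sphere.
        theta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(t * t for t in theta))
        theta = [t / norm for t in theta]
        # In 1D the optimal coupling matches sorted samples to sorted samples.
        px = sorted(sum(a * t for a, t in zip(x, theta)) for x in X)
        py = sorted(sum(a * t for a, t in zip(y, theta)) for y in Y)
        acc += sum(abs(a - b) ** p for a, b in zip(px, py)) / len(px)
    return (acc / n_projections) ** (1.0 / p)


cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
shifted = [(x + 1.0, y, z) for (x, y, z) in cloud]
estimate = sliced_wasserstein(cloud, shifted, n_projections=2000)
```

Each projection costs one sort per point set, which is the source of the efficiency advantage over solving a full transport problem.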

Fast Tree Variants of Gromov-Wasserstein

Gromov-Wasserstein (GW) is a powerful tool to compare probability measures whose supports are in different metric spaces. However, GW suffers from a computational drawback, since it requires solving a complex non-convex quadratic program. We consider in this work a specific family of ground metrics, namely tree metrics for a space of supports of each probability measure in GW. By leveraging a tree structure, we propose to use flows from a root to each support to represent a probability measure whose supports are in a tree metric space. We consequently propose a novel tree variant of GW, namely flow-based tree GW, by matching the flows of the probability measures. We then show that flow-based tree GW has a structure similar to that of a univariate optimal transport distance. Therefore, it is fast to compute and can scale up to large-scale applications. In order to further explore tree structures, we propose another tree variant of GW, namely depth-based tree GW, by aligning the flows of the probab...

On Scalable Variant of Wasserstein Barycenter

ArXiv, 2019

We study a variant of the Wasserstein barycenter problem, which we refer to as the tree-sliced Wasserstein barycenter, by leveraging the structure of tree metrics for the ground metrics in the formulation of the Wasserstein distance. Drawing on the tree structure, we propose efficient algorithms for solving the unconstrained and constrained versions of the tree-sliced Wasserstein barycenter. The algorithms have fast computational time and efficient memory usage, especially in high-dimensional settings, while demonstrating favorable results when the tree metrics are appropriately constructed. Experimental results on large-scale synthetic and real datasets from Wasserstein barycenter for documents with word embedding, multilevel clustering, and scalable Bayes problems show the advantages of the tree-sliced Wasserstein barycenter over the (Sinkhorn) Wasserstein barycenter.

Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search

ArXiv, 2021

Neural architecture search (NAS) automates the design of deep neural networks. One of the main challenges in searching complex and non-continuous architectures is to compare the similarity of networks that the conventional Euclidean metric may fail to capture. Optimal transport (OT) is resilient to such complex structure by considering the minimal cost for transporting a network into another. However, OT is generally not negative definite, which may limit its ability to build the positive-definite kernels required in many kernel-dependent frameworks. Building upon tree-Wasserstein (TW), which is a negative definite variant of OT, we develop a novel discrepancy for neural architectures, and demonstrate it within a Gaussian process surrogate model for the sequential NAS settings. Furthermore, we derive a novel parallel NAS, using a quality k-determinantal point process on the GP posterior, to select diverse and high-performing architectures from a discrete set of candidates. Empirica...

Entropy Partial Transport with Tree Metrics: Theory and Practice

Optimal transport (OT) theory provides powerful tools to compare probability measures. However, OT is limited to nonnegative measures having the same mass, and suffers from serious computational and statistical drawbacks. This has led to several proposals of regularized variants of OT in the recent literature. In this work, we consider an entropy partial transport (EPT) problem for nonnegative measures on a tree having different masses. The EPT is shown to be equivalent to a standard complete OT problem on a one-node extended tree. We derive its dual formulation, then leverage this to propose a novel regularization for EPT which admits fast computation and negative definiteness. To our knowledge, the proposed regularized EPT is the first approach that yields a closed-form solution among available variants of unbalanced OT for general nonnegative measures. For practical applications without prior knowledge about the tree structure for measures, we propose tree-sliced variants of the ...

Image Categorization Using Hierarchical Spatial Matching Kernel

The Journal of the Institute of Image Electronics Engineers of Japan, 2013

Unsupervised Riemannian Metric Learning for Histograms Using Aitchison Transformations

Many applications in machine learning handle bags of features or histograms rather than simple vectors. In that context, defining a proper geometry to compare histograms can be crucial for many machine learning algorithms. While one might be tempted to use a default metric such as the Euclidean metric, empirical evidence shows this may not be the best choice when dealing with observations that lie in the probability simplex. Additionally, it might be desirable to choose a metric adaptively based on data. We consider in this paper the problem of learning a Riemannian metric on the simplex given unlabeled histogram data. We follow the approach of Lebanon (2006), who proposed to estimate such a metric within a parametric family by maximizing the inverse volume of a given data set of points under that metric. The metrics we consider on the multinomial simplex are pull-back metrics of the Fisher information parameterized by operations within the simplex known as Aitchison (1982) transfor...

A Lightweight Block Validation Method for Resource-Constrained IoT Devices in Blockchain-Based Applications

2019 IEEE 20th International Symposium on "A World of Wireless, Mobile and Multimedia Networks" (WoWMoM), 2019

Secure access control to a wide variety of Internet of Things (IoT) devices has become critical. Blockchain-based access control frameworks are promising technologies to support secure access to IoT devices in pervasive computing applications. However, in most of the proposed solutions, the IoT devices rely on a trusted server to retrieve critical access control data from the blockchains. We propose a method for IoT devices to validate blockchain data without depending solely on a central server. In our approach, several witnesses on the network can be selected randomly by the devices to validate access control information. Our method is aided by Bloom filters, which are shown to be lightweight for resource-constrained devices.
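The abstract does not say how the Bloom filters are laid out or queried, so the following is only a generic sketch of the data structure, meant to show why it suits memory-limited devices: membership of any number of items is summarized in a fixed-size bit array. The class name, the bit-array size, the number of hash functions, and the use of salted SHA-256 are all assumptions of this example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive n_hashes bit positions by salting SHA-256 with an index byte.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter()
bf.add(b"access-rule-1")  # hypothetical record identifier
```

A device holding only the 128-byte array could check whether a record is possibly present without storing the records themselves; a negative answer is always reliable.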

Clinical evidence in the treatment of white spot lesions following fixed orthodontic therapy: a meta-analysis

Australasian Orthodontic Journal, 2021

Objective: This systematic review aims to determine the most effective method of treatment to remineralise post-orthodontic white spot lesions (WSLs). Method: Six databases were accessed and searched for articles. Screening and selection were conducted according to the PRISMA guidelines using predetermined inclusion and exclusion criteria. Two reviewers independently assessed and extracted identified studies, and disagreements on relevance were resolved through consensus. Experimental studies were included that involved (i) patients of any age who had WSLs after the removal of fixed appliances, (ii) any treatment to remineralise the WSLs compared with no treatment or a placebo, and (iii) measurement of the changes in enamel mineralisation status after treatment. Eligible articles were assessed for internal bias and underwent narrative synthesis. A meta-analysis using random-effects modelling was performed to calculate a pooled estimate and assess between-study variability using Cochran’s Q ...

LSMI-Sinkhorn: Semi-supervised Mutual Information Estimation with Optimal Transport

Machine Learning and Knowledge Discovery in Databases. Research Track, 2021

Estimating mutual information is an important statistics and machine learning problem. To estimate the mutual information from data, a common practice is preparing a set of paired samples $\{(x_i, y_i)\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} p(x, y)$. However, in many situations, it is difficult to obtain a large number of data pairs. To address this problem, we propose the semi-supervised Squared-loss Mutual Information (SMI) estimation method using a small number of paired samples and the available unpaired ones. We first represent SMI through the density ratio function, where the expectation is approximated by the samples from marginals and its assignment parameters. The objective is formulated using the optimal transport problem and quadratic programming. Then, we introduce the Least-Squares Mutual Information with Sinkhorn (LSMI-Sinkhorn) algorithm for efficient optimization. Through experiments, we first demonstrate that the proposed method can estimate the SMI without a large number of paired samples. Then, we show the effectiveness of the proposed LSMI-Sinkhorn algorithm on various types of machine learning problems such as image matching and photo album summarization.
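For context, the density-ratio representation mentioned in this abstract can be written out from the standard definition of squared-loss mutual information. The abstract itself does not give the formula, and the symbol $r$ for the density ratio is notation chosen here:

```latex
\mathrm{SMI}(X, Y)
  = \frac{1}{2} \iint p(x)\, p(y)\, \bigl( r(x, y) - 1 \bigr)^{2} \, dx \, dy
  = \frac{1}{2}\, \mathbb{E}_{p(x, y)}\bigl[ r(x, y) \bigr] - \frac{1}{2},
\qquad
r(x, y) = \frac{p(x, y)}{p(x)\, p(y)} .
```

The second equality follows by expanding the square and using $\iint p(x)\,p(y)\,r(x,y)\,dx\,dy = 1$. SMI is zero exactly when $r \equiv 1$, that is, when $X$ and $Y$ are independent.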

CXCL10/CXCR3 signaling contributes to an inflammatory microenvironment and its blockade enhances progression of murine pancreatic precancerous lesions

eLife, 2021

The development of pancreatic cancer requires recruitment and activation of different macrophage populations. However, little is known about how macrophages are attracted to the pancreas after injury or an oncogenic event, and how they crosstalk with lesion cells or other cells of the lesion microenvironment. Here, we delineate the importance of CXCL10/CXCR3 signaling during the early phase of murine pancreatic cancer. We show that CXCL10 is produced by pancreatic precancerous lesion cells in response to IFNγ signaling and that inflammatory macrophages are recipients for this chemokine. CXCL10/CXCR3 signaling in macrophages mediates their chemoattraction to the pancreas, enhances their proliferation, and maintains their inflammatory identity. Blocking of CXCL10/CXCR3 signaling in vivo shifts macrophage populations to a tumor-promoting (Ym1+, Fizz+, Arg1+) phenotype, increases fibrosis, and mediates progression of lesions, highlighting the importance of this pathway in PDA developmen...

Physical Security Model Development of an Electrochemical Facility

AB106. P080. Interferon gamma-inducible protein 10 in pancreatic cancer progression

Annals of Pancreatic Cancer, 2018

A general representation scheme for crystalline solids based on Voronoi-tessellation real feature values and atomic property data

Science and Technology of Advanced Materials, 2018

Increasing attention has been paid to materials informatics approaches that promise efficient and fast discovery and optimization of functional inorganic materials. A technical breakthrough is urgently needed to advance this field, and efforts have been made in the development of materials descriptors to encode or represent characteristics of crystalline solids, such as chemical composition, crystal structure, electronic structure, etc. We propose a general representation scheme for crystalline solids that lifts restrictions on atom ordering, cell periodicity, and system cell size based on structural descriptors of directly binned Voronoi-tessellation real feature values and atomic/chemical descriptors based on the electronegativity of elements in the crystal. A comparison was made with the radial distribution function (RDF) feature vector, in terms of predictive accuracy on density functional theory (DFT) material properties: cohesive energy (CE), density, electronic band gap (BG), and...

Glycosylated RAFT polymers with varying PEG linkers produce different siRNA uptake, gene silencing and toxicity profiles

Biomacromolecules, Jan 23, 2017

Achieving efficient and targeted delivery of short interfering RNA (siRNA) is an important research challenge that must be overcome to make highly promising siRNA therapies clinically successful. Challenges exist in designing synthetic carriers for these RNAi constructs that provide protection against serum degradation, extended blood retention times, effective cellular uptake through a variety of uptake mechanisms, endosomal escape and efficient cargo release. These challenges have resulted in a significant body of research, and led to many important findings about the chemical composition and structural layout of the delivery vector for optimal gene silencing. The challenge of targeted delivery vectors remains, and strategies to take advantage of nature's self-selective cellular uptake mechanisms for specific organ cells, such as the liver, have enabled researchers to step closer to achieving this goal. In this work we report the design, synthesis and biological evaluation of a novel poly...

Patterns of persistent HPV infection after treatment for cervical intraepithelial neoplasia (CIN): A systematic review

International Journal of Cancer, Jul 25, 2017

A systematic review of the literature was conducted to determine the estimates of and definitions for human papillomavirus (HPV) persistence in women following treatment of cervical intra-epithelial neoplasia (CIN). A total of 45 studies presented data on post-treatment HPV persistence among 6,106 women. Most studies assessed HPV persistence after loop excision (42%), followed by conization (7%), cryotherapy (11%), laser treatment (4%), interferon-alpha, therapeutic vaccination, and photodynamic therapy (2% each) and mixed treatment (38%). Baseline HPV testing was conducted before or at treatment for most studies (96%). Follow-up HPV testing ranged from 1.5 to 80 months after baseline. Median HPV persistence tended to decrease with increasing follow-up time, declining from 27% at 3 months after treatment to 21% at 6 months, 15% at 12 months, and 10% at 24 months. Post-treatment HPV persistence estimates varied widely and were influenced by patient age, HPV-type, detection method, tr...

System and method for securing cargo to a load bearing surface

Research paper thumbnail of Tree-Sliced Variants of Wasserstein Distances

Cornell University - arXiv, Feb 1, 2019

Optimal transport (OT) theory defines a powerful set of tools to compare probability distribution... more Optimal transport (OT) theory defines a powerful set of tools to compare probability distributions. OT suffers however from a few drawbacks, computational and statistical, which have encouraged the proposal of several regularized variants of OT in the recent literature, one of the most notable being the sliced formulation, which exploits the closed-form formula between univariate distributions by projecting high-dimensional measures onto random lines. We consider in this work a more general family of ground metrics, namely tree metrics, which also yield fast closed-form computations and negative definite, and of which the sliced-Wasserstein distance is a particular case (the tree is a chain). We propose the tree-sliced Wasserstein distance, computed by averaging the Wasserstein distance between these measures using random tree metrics, built adaptively in either low or high-dimensional spaces. Exploiting the negative definiteness of that distance, we also propose a positive definite kernel, and test it against other baselines on a few benchmark tasks.

Research paper thumbnail of Flow-based Alignment Approaches for Probability Measures in Different Spaces

Cornell University - arXiv, Oct 10, 2019

Gromov-Wasserstein (GW) is a powerful tool to compare probability measures whose supports are in ... more Gromov-Wasserstein (GW) is a powerful tool to compare probability measures whose supports are in different metric spaces. GW suffers however from a computational drawback since it requires to solve a complex non-convex quadratic program. We consider in this work a specific family of cost metrics, namely tree metrics for a space of supports of each probability measure, and aim for developing efficient and scalable discrepancies between the probability measures. By leveraging a tree structure, we propose to align flows from a root to each support instead of pair-wise tree metrics of supports, i.e., flows from a support to another, in GW. Consequently, we propose a novel discrepancy, named Flow-based Alignment (FlowAlign), by matching the flows of the probability measures. We show that FlowAlign shares a similar structure as a univariate optimal transport distance. Therefore, FlowAlign is fast for computation and scalable for large-scale applications. By further exploring tree structures, we propose a variant of FlowAlign, named Depth-based Alignment (DepthAlign), by aligning the flows hierarchically along each depth level of the tree structures. Theoretically, we prove that both FlowAlign and DepthAlign are pseudo-distances. Moreover, we also derive tree-sliced variants, computed by averaging the corresponding FlowAlign / DepthAlign using random tree metrics, built adaptively in spaces of supports. Empirically, we test our proposed discrepancies against other baselines on some benchmark tasks. * Equal contribution. Preprint. Under review.

Research paper thumbnail of Subgradient sampling for nonsmooth nonconvex minimization

Cornell University - arXiv, Feb 28, 2022

Risk minimization for nonsmooth nonconvex problems naturally leads to firstorder sampling or, by ... more Risk minimization for nonsmooth nonconvex problems naturally leads to firstorder sampling or, by an abuse of terminology, to stochastic subgradient descent. We establish the convergence of this method in the path-differentiable case, and describe more precise results under additional geometric assumptions. We recover and improve results from Ermoliev-Norkin [1] by using a different approach: conservative calculus and the ODE method. In the definable case, we show that first-order subgradient sampling avoids artificial critical point with probability one and applies moreover to a large range of risk minimization problems in deep learning, based on the backpropagation oracle. As byproducts of our approach, we obtain several results on integration of independent interest, such as an interchange result for conservative derivatives and integrals, or the definability of set-valued parameterized integrals.

Research paper thumbnail of Point-set Distances for Learning Representations of 3D Point Clouds

2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021

Learning an effective representation of 3D point clouds requires a good metric to measure the dis... more Learning an effective representation of 3D point clouds requires a good metric to measure the discrepancy between two 3D point sets, which is non-trivial due to their irregularity. Most of the previous works resort to using the Chamfer discrepancy or Earth Mover's distance, but those metrics are either ineffective in measuring the differences between point clouds or computationally expensive. In this paper, we conduct a systematic study with extensive experiments on distance metrics for 3D point clouds. From this study, we propose to use sliced Wasserstein distance and its variants for learning representations of 3D point clouds. In addition, we introduce a new algorithm to estimate sliced Wasserstein distance that guarantees that the estimated value is close enough to the true one. Experiments show that the sliced Wasserstein distance and its variants allow the neural network to learn a more efficient representation compared to the Chamfer discrepancy. We demonstrate the efficiency of the sliced Wasserstein metric and its variants on several tasks in 3D computer vision including training a point cloud autoencoder, generative modeling, transfer learning, and point cloud registration.

Research paper thumbnail of Fast Tree Variants of Gromov-Wasserstein

Gromov-Wasserstein (GW) is a powerful tool to compare probability measures whose supports are in ... more Gromov-Wasserstein (GW) is a powerful tool to compare probability measures whose supports are in different metric spaces. GW suffers however from a computational drawback since it requires to solve a complex non-convex quadratic program. We consider in this work a specific family of ground metrics, namely tree metrics for a space of supports of each probability measure in GW. By leveraging a tree structure, we propose to use flows from a root to each support to represent a probability measure whose supports are in a tree metric space. We consequently propose a novel tree variant of GW, namely flow-based tree GW (), by matching the flows of the probability measures. We then show that shares a similar structure as a univariate optimal transport distance. Therefore, is fast for computation and can scale up for large-scale applications. In order to further explore tree structures, we propose another tree variant of GW, namely depth-based tree GW (), by aligning the flows of the probab...

Research paper thumbnail of On Scalable Variant of Wasserstein Barycenter

ArXiv, 2019

We study a variant of Wasserstein barycenter problem, which we refer to as \emph{tree-sliced Wass... more We study a variant of Wasserstein barycenter problem, which we refer to as \emph{tree-sliced Wasserstein barycenter}, by leveraging the structure of tree metrics for the ground metrics in the formulation of Wasserstein distance. Drawing on the tree structure, we propose efficient algorithms for solving the unconstrained and constrained versions of tree-sliced Wasserstein barycenter. The algorithms have fast computational time and efficient memory usage, especially for high dimensional settings while demonstrating favorable results when the tree metrics are appropriately constructed. Experimental results on large-scale synthetic and real datasets from Wasserstein barycenter for documents with word embedding, multilevel clustering, and scalable Bayes problems show the advantages of tree-sliced Wasserstein barycenter over (Sinkhorn) Wasserstein barycenter.

Research paper thumbnail of Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search

ArXiv, 2021

Neural architecture search (NAS) automates the design of deep neural networks. One of the main ch... more Neural architecture search (NAS) automates the design of deep neural networks. One of the main challenges in searching complex and non-continuous architectures is to compare the similarity of networks that the conventional Euclidean metric may fail to capture. Optimal transport (OT) is resilient to such complex structure by considering the minimal cost for transporting a network into another. However, the OT is generally not negative definite which may limit its ability to build the positive-definite kernels required in many kernel-dependent frameworks. Building upon tree-Wasserstein (TW), which is a negative definite variant of OT, we develop a novel discrepancy for neural architectures, and demonstrate it within a Gaussian process surrogate model for the sequential NAS settings. Furthermore, we derive a novel parallel NAS, using quality k-determinantal point process on the GP posterior, to select diverse and high-performing architectures from a discrete set of candidates. Empirica...

Research paper thumbnail of Entropy Partial Transport with Tree Metrics: Theory and Practice

Optimal transport (OT) theory provides powerful tools to compare probability measures. However, O... more Optimal transport (OT) theory provides powerful tools to compare probability measures. However, OT is limited to nonnegative measures having the same mass, and suffers serious drawbacks about its computation and statistics. This leads to several proposals of regularized variants of OT in the recent literature. In this work, we consider an entropy partial transport (EPT) problem for nonnegative measures on a tree having different masses. The EPT is shown to be equivalent to a standard complete OT problem on a one-node extended tree. We derive its dual formulation, then leverage this to propose a novel regularization for EPT which admits fast computation and negative definiteness. To our knowledge, the proposed regularized EPT is the first approach that yields a closed-form solution among available variants of unbalanced OT for general nonnegative measures. For practical applications without prior knowledge about the tree structure for measures, we propose tree-sliced variants of the ...

Research paper thumbnail of Image Categorization Using Hierarchical Spatial Matching Kernel

The Journal of the Institute of Image Electronics Engineers of Japan, 2013

Research paper thumbnail of Unsupervised Riemannian Metric Learning for Histograms Using Aitchison Transformations

Many applications in machine learning handle bags of features or histograms rather than simple ve... more Many applications in machine learning handle bags of features or histograms rather than simple vectors. In that context, defining a proper geometry to compare histograms can be crucial for many machine learning algorithms. While one might be tempted to use a default metric such as the Euclidean metric, empirical evidence shows this may not be the best choice when dealing with observations that lie in the probability simplex. Additionally, it might be desirable to choose a metric adaptively based on data. We consider in this paper the problem of learning a Riemannian metric on the simplex given unlabeled histogram data. We follow the approach of Lebanon (2006), who proposed to estimate such a metric within a parametric family by maximizing the inverse volume of a given data set of points under that metric. The metrics we consider on the multinomial simplex are pull-back metrics of the Fisher information parameterized by operations within the simplex known as Aitchison (1982) transfor...

A Lightweight Block Validation Method for Resource-Constrained IoT Devices in Blockchain-Based Applications

2019 IEEE 20th International Symposium on "A World of Wireless, Mobile and Multimedia Networks" (WoWMoM), 2019

Secure access control to a wide variety of Internet of Things (IoT) devices has become critical. Blockchain-based access control frameworks are promising technologies to support secure access to IoT devices in pervasive computing applications. However, in most of the proposed solutions, the IoT devices rely on a trusted server to retrieve critical access control data from the blockchains. We propose a method for IoT devices to validate blockchain data without depending solely on a central server. In our approach, the devices can randomly select several witnesses on the network to validate access control information. Our method is aided by Bloom filters, which are shown to be lightweight for resource-constrained devices.

Clinical evidence in the treatment of white spot lesions following fixed orthodontic therapy: a meta-analysis

Australasian Orthodontic Journal, 2021

Objective: This systematic review aims to determine the most effective method of treatment to remineralise post-orthodontic white spot lesions (WSLs). Method: Six databases were accessed and searched for articles. Screening and selection were conducted according to the PRISMA guidelines using predetermined inclusion and exclusion criteria. Two reviewers independently assessed identified studies and extracted data from them, and disagreements about relevance were resolved through consensus. Experimental studies were included that involved (i) patients of any age who had WSLs after the removal of fixed appliances, (ii) any treatment to remineralise the WSLs compared with no treatment or a placebo, and (iii) measurement of the changes in enamel mineralisation status after treatment. Eligible articles were assessed for internal bias and underwent narrative synthesis. A meta-analysis using random-effects modelling was performed to calculate a pooled estimate and assess between-study variability using Cochran’s Q ...

LSMI-Sinkhorn: Semi-supervised Mutual Information Estimation with Optimal Transport

Machine Learning and Knowledge Discovery in Databases. Research Track, 2021

Estimating mutual information is an important problem in statistics and machine learning. To estimate the mutual information from data, a common practice is to prepare a set of paired samples $\{(x_i, y_i)\}_{i=1}^{n} \overset{\mathrm{i.i.d.}}{\sim} p(x, y)$. However, in many situations, it is difficult to obtain a large number of data pairs. To address this problem, we propose a semi-supervised Squared-loss Mutual Information (SMI) estimation method using a small number of paired samples and the available unpaired ones. We first represent SMI through the density ratio function, where the expectation is approximated by the samples from marginals and its assignment parameters. The objective is formulated using the optimal transport problem and quadratic programming. Then, we introduce the Least-Squares Mutual Information with Sinkhorn (LSMI-Sinkhorn) algorithm for efficient optimization. Through experiments, we first demonstrate that the proposed method can estimate the SMI without a large number of paired samples. Then, we show the effectiveness of the proposed LSMI-Sinkhorn algorithm on various types of machine learning problems such as image matching and photo album summarization.

CXCL10/CXCR3 signaling contributes to an inflammatory microenvironment and its blockade enhances progression of murine pancreatic precancerous lesions

eLife, 2021

The development of pancreatic cancer requires recruitment and activation of different macrophage populations. However, little is known about how macrophages are attracted to the pancreas after injury or an oncogenic event, and how they crosstalk with lesion cells or other cells of the lesion microenvironment. Here, we delineate the importance of CXCL10/CXCR3 signaling during the early phase of murine pancreatic cancer. We show that CXCL10 is produced by pancreatic precancerous lesion cells in response to IFNγ signaling and that inflammatory macrophages are recipients for this chemokine. CXCL10/CXCR3 signaling in macrophages mediates their chemoattraction to the pancreas, enhances their proliferation, and maintains their inflammatory identity. Blocking of CXCL10/CXCR3 signaling in vivo shifts macrophage populations to a tumor-promoting (Ym1+, Fizz+, Arg1+) phenotype, increases fibrosis, and mediates progression of lesions, highlighting the importance of this pathway in PDA developmen...

Physical Security Model Development of an Electrochemical Facility

AB106. P080. Interferon gamma-inducible protein 10 in pancreatic cancer progression

Annals of Pancreatic Cancer, 2018

A general representation scheme for crystalline solids based on Voronoi-tessellation real feature values and atomic property data

Science and Technology of Advanced Materials, 2018

Increasing attention has been paid to materials informatics approaches that promise efficient and fast discovery and optimization of functional inorganic materials. Technical breakthroughs are urgently needed to advance this field, and efforts have been made to develop materials descriptors that encode or represent characteristics of crystalline solids, such as chemical composition, crystal structure, electronic structure, etc. We propose a general representation scheme for crystalline solids that lifts restrictions on atom ordering, cell periodicity, and system cell size based on structural descriptors of directly binned Voronoi-tessellation real feature values and atomic/chemical descriptors based on the electronegativity of elements in the crystal. The scheme was compared with the radial distribution function (RDF) feature vector in terms of predictive accuracy on density functional theory (DFT) material properties: cohesive energy (CE), density, electronic band gap (BG), and...

Glycosylated RAFT polymers with varying PEG linkers produce different siRNA uptake, gene silencing and toxicity profiles

Biomacromolecules, Jan 23, 2017

Achieving efficient and targeted delivery of short interfering RNA (siRNA) is an important research challenge that must be overcome to render highly promising siRNA therapies clinically successful. Challenges exist in designing synthetic carriers for these RNAi constructs that provide protection against serum degradation, extended blood retention times, effective cellular uptake through a variety of uptake mechanisms, endosomal escape and efficient cargo release. These challenges have resulted in a significant body of research, and led to many important findings about the chemical composition and structural layout of the delivery vector for optimal gene silencing. The challenge of targeted delivery vectors remains, and strategies to take advantage of nature's self-selective cellular uptake mechanisms for specific organ cells, such as the liver, have enabled researchers to step closer to achieving this goal. In this work we report the design, synthesis and biological evaluation of a novel poly...

Patterns of persistent HPV infection after treatment for cervical intraepithelial neoplasia (CIN): A systematic review

International Journal of Cancer, Jul 25, 2017

A systematic review of the literature was conducted to determine the estimates of and definitions for human papillomavirus (HPV) persistence in women following treatment of cervical intra-epithelial neoplasia (CIN). A total of 45 studies presented data on post-treatment HPV persistence among 6,106 women. Most studies assessed HPV persistence after loop excision (42%), followed by conization (7%), cryotherapy (11%), laser treatment (4%), interferon-alpha, therapeutic vaccination, and photodynamic therapy (2% each) and mixed treatment (38%). Baseline HPV testing was conducted before or at treatment for most studies (96%). Follow-up HPV testing ranged from 1.5 to 80 months after baseline. Median HPV persistence tended to decrease with increasing follow-up time, declining from 27% at 3 months after treatment to 21% at 6 months, 15% at 12 months, and 10% at 24 months. Post-treatment HPV persistence estimates varied widely and were influenced by patient age, HPV-type, detection method, tr...

System and method for securing cargo to a load bearing surface