Jiang Hu - Academia.edu (original) (raw)
Related Authors
University of Maryland Baltimore County
Uploads
Papers by Jiang Hu
International Journal on …, 2002
While techniques for evaluating the performance of lower-level document analysis tasks such as op... more While techniques for evaluating the performance of lower-level document analysis tasks such as optical character recognition have gained acceptance in the literature, attempts to formalize the problem for higherlevel algorithms, while receiving a fair amount of attention in terms of theory, have generally been less successful in practice, perhaps owing to their complexity. In this paper, we introduce intuitive, easy-to-implement evaluation schemes for the related problems of table detection and table structure recognition. We also present the results of several small experiments, demonstrating how well the methodologies work and the useful sorts of feedback they provide. We first consider the table detection problem. Here algorithms can yield various classes of errors, including non-table regions improperly labeled as tables (insertion errors), tables missed completely (deletion errors), larger tables broken into a number of smaller ones (splitting errors), and groups of smaller tables combined to form larger ones (merging errors). This leads naturally to the use of an edit distance approach for assessing the results of table detection. Next we address the problem of evaluating table structure recognition. Our model is based on a directed acyclic attribute graph, or table DAG. We describe a new paradigm, "graph probing," for comparing the results returned by the recognition system and the representation created during ground-truthing. Probing is in fact a general concept that could be applied to other document recognition tasks as well.
icdar, 1997
A method for the automatic verification of on-line handwritten signatures using both global and l... more A method for the automatic verification of on-line handwritten signatures using both global and local fea-tures is described. The global and local features cap-ture various aspects of signature shape and dynamics of signature production. We demonstrate that with the ...
AMIA Joint Summits on Translational Science proceedings AMIA Summit on Translational Science, 2015
Personalized predictive models are customized for an individual patient and trained using informa... more Personalized predictive models are customized for an individual patient and trained using information from similar patients. Compared to global models trained on all patients, they have the potential to produce more accurate risk scores and capture more relevant risk factors for individual patients. This paper presents an approach for building personalized predictive models and generating personalized risk factor profiles. A locally supervised metric learning (LSML) similarity measure is trained for diabetes onset and used to find clinically similar patients. Personalized risk profiles are created by analyzing the parameters of the trained personalized logistic regression models. A 15,000 patient data set, derived from electronic health records, is used to evaluate the approach. The predictive results show that the personalized models can outperform the global model. Cluster analysis of the risk profiles show groups of patients with similar risk factors, differences in the top risk ...
International Journal on …, 2002
While techniques for evaluating the performance of lower-level document analysis tasks such as op... more While techniques for evaluating the performance of lower-level document analysis tasks such as optical character recognition have gained acceptance in the literature, attempts to formalize the problem for higherlevel algorithms, while receiving a fair amount of attention in terms of theory, have generally been less successful in practice, perhaps owing to their complexity. In this paper, we introduce intuitive, easy-to-implement evaluation schemes for the related problems of table detection and table structure recognition. We also present the results of several small experiments, demonstrating how well the methodologies work and the useful sorts of feedback they provide. We first consider the table detection problem. Here algorithms can yield various classes of errors, including non-table regions improperly labeled as tables (insertion errors), tables missed completely (deletion errors), larger tables broken into a number of smaller ones (splitting errors), and groups of smaller tables combined to form larger ones (merging errors). This leads naturally to the use of an edit distance approach for assessing the results of table detection. Next we address the problem of evaluating table structure recognition. Our model is based on a directed acyclic attribute graph, or table DAG. We describe a new paradigm, "graph probing," for comparing the results returned by the recognition system and the representation created during ground-truthing. Probing is in fact a general concept that could be applied to other document recognition tasks as well.
icdar, 1997
A method for the automatic verification of on-line handwritten signatures using both global and l... more A method for the automatic verification of on-line handwritten signatures using both global and local fea-tures is described. The global and local features cap-ture various aspects of signature shape and dynamics of signature production. We demonstrate that with the ...
AMIA Joint Summits on Translational Science proceedings AMIA Summit on Translational Science, 2015
Personalized predictive models are customized for an individual patient and trained using informa... more Personalized predictive models are customized for an individual patient and trained using information from similar patients. Compared to global models trained on all patients, they have the potential to produce more accurate risk scores and capture more relevant risk factors for individual patients. This paper presents an approach for building personalized predictive models and generating personalized risk factor profiles. A locally supervised metric learning (LSML) similarity measure is trained for diabetes onset and used to find clinically similar patients. Personalized risk profiles are created by analyzing the parameters of the trained personalized logistic regression models. A 15,000 patient data set, derived from electronic health records, is used to evaluate the approach. The predictive results show that the personalized models can outperform the global model. Cluster analysis of the risk profiles show groups of patients with similar risk factors, differences in the top risk ...