Sally Goldman - Academia.edu

Papers by Sally Goldman

The power of self-directed learning

Machine Learning, 1994

This article studies self-directed learning, a variant of the on-line (or incremental) learning model in which the learner selects the presentation order for the instances. Alternatively, one can view this model as a variation of learning with membership queries in which the learner is only "charged" for membership queries for which it could not predict the outcome. We give tight bounds on the complexity of self-directed learning for the concept classes of monomials, monotone DNF formulas, and axis-parallel rectangles in {0, 1, ..., n-1}^d. These results demonstrate that the number of mistakes under self-directed learning can be surprisingly small. We then show that learning complexity in the model of self-directed learning is less than that of all other commonly studied on-line and query learning models. Next we explore the relationship between the complexity of self-directed learning and the Vapnik-Chervonenkis (VC-)dimension. We show that, in general, the VC-dimension and the self-directed learning complexity are incomparable. However, for some special cases, we show that the VC-dimension gives a lower bound for the self-directed learning complexity. Finally, we explore a relationship between Mitchell's version space algorithm and the existence of self-directed learning algorithms that make few mistakes.

Asking queries to minimize errors

Smartacking: improving TCP performance from the receiving end

Journal of Internet Engineering, Vol. 1, No. 1, January 2007. Daniel K. Blandford, Sally A. Goldman, Sergey Gorinsky, Yan Zhou, and Daniel R. Dooly ...

A non-iterative maximum entropy algorithm

Real-valued multiple-instance learning with queries

Proceedings of the 12th International Conference on Algorithmic Learning Theory, Nov 25, 2001

While there has been a significant amount of theoretical and empirical research on the multiple-instance learning model, most of this research is for concept learning. However, for the important application area of drug discovery, a real-valued classification is preferable. In this paper we initiate a theoretical study of real-valued multiple-instance learning. We prove that the problem of finding a target point consistent with a set of labeled multiple-instance examples (or bags) is NP-complete, and that the problem of learning from real-valued multiple-instance examples is as hard as learning DNF. Another contribution of our work is in defining and studying a multiple-instance membership query (MI-MQ). We give a positive result on exactly learning the target point for a multiple-instance problem in which the learner is provided with an MI-MQ oracle and a single adversarially selected bag.

Content-Based Image Retrieval Using Multiple-Instance Learning

Proceedings of the Nineteenth International Conference on Machine Learning, Jul 8, 2002

Exact identification of circuits using fixed points of amplification functions

SIAM Journal on Computing, 1993

Sally A. Goldman, Michael J. Kearns, Robert E. Schapire. Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139 ...

EM-DD: an improved multi-instance learning technique

A Theoretical and Empirical Study of a Noise-Tolerant Algorithm to Learn Geometric Patterns (Extended Abstract)

A Theoretical and Empirical Study of a Noise-Tolerant Algorithm to Learn Geometric Patterns

Machine Learning, 1999

Developing the ability to recognize a landmark from a visual image of a robot's current location is a fundamental problem in robotics. We describe a way in which the landmark matching problem can be mapped to that of learning a one-dimensional geometric pattern. The first contribution of our work is an efficient noise-tolerant algorithm (designed using the statistical query model) to PAC learn the class of one-dimensional geometric patterns. The second contribution of our work is an empirical study of our algorithm that provides some evidence that statistical query algorithms may be valuable for use in practice for handling noisy data.

Agnostic learning of geometric patterns (extended abstract)

Proceedings of the tenth annual conference on Computational learning theory - COLT '97, 1997

Learning unions of boxes with membership and equivalence queries

Proceedings of the seventh annual conference on Computational learning theory - COLT '94, 1994

Paul W. Goldberg (Department 1423, Sandia National ...), Sally A. Goldman (Dept. of Computer Science), H. David Mathias (Dept. of Computer Science)

Learning k-term DNF formulas with an incomplete membership oracle

Proceedings of the fifth annual workshop on Computational learning theory - COLT '92, 1992

Sally A. Goldman, Department of Computer Science, Washington University, St. Louis, MO 63130, sg@cs.wustl.edu. H. David Mathias, Department of Computer Science, Washington University, St. ...

Making Maximum Entropy Computations Easier By Adding Extra Constraints (Extended Abstract)

Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988

This paper presents a new way to compute the probability distribution with maximum entropy satisfying a set of constraints. Unlike previous approaches, our method is integrated with the planning of data collection and tabulation. We show how adding constraints and performing the associated additional tabulations can substantially speed up computation by replacing the usual iterative techniques with a straightforward computation. These extra constraints are shown to correspond to the intermediate tables used in Cheeseman's method. We also show that the class of constraint graphs that our method handles is a proper generalization of Pearl's singly-connected networks. An open problem is to determine a minimal set of constraints necessary to make a hypergraph acyclic. We conjecture that this problem is NP-complete, and discuss heuristics to approximate the optimal solution.

Exact Identification of Circuits Using Fixed Points of Amplification Functions

COLT Proceedings, 1990

Sally A. Goldman, Michael J. Kearns, Robert E. Schapire. Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139 ...

University of Massachusetts/Hughes: Description of the CIRCUS system as used in MUC5

Message Understanding Conference, 1993

UMass/Hughes

Proceedings of the 5th conference on Message understanding - MUC5 '93, 1993

The primary goal of our effort is the development of robust and portable language processing capabilities for information extraction applications. The system under evaluation here is based on language processing components that have demonstrated strong performance capabilities in previous evaluations [ ]. Having demonstrated the general viability of these techniques, we are now concentrating on the practicality of our technology by creating trainable system components to replace hand-coded data and manually-engineered software.

Meta-Evaluation of Image Segmentation Using Machine Learning

2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 1 (CVPR'06), 2006

Image segmentation is a fundamental step in many computer vision applications. Generally, the choice of a segmentation algorithm, or parameterization of a given algorithm, is made at the application level and fixed for all images within that application. Our goal is to create a stand-alone method to evaluate segmentation quality. Stand-alone methods have the advantage that they do not require a manually-segmented reference image for comparison, and can therefore be used for real-time evaluation. Current stand-alone evaluation methods often work well for some types of images, but poorly for others. We propose a meta-evaluation method in which any set of base evaluation methods is combined by a machine learning algorithm that coalesces their evaluations based on a learned weighting function, which depends upon the image to be segmented. The training data used by the machine learning algorithm can be labeled by a human, based on similarity to a human-generated reference segmentation, or based upon system-level performance. Experimental results demonstrate that our method performs better than the existing stand-alone segmentation evaluation methods.

Computational Learning Theory

Chapman & Hall/CRC Applied Algorithms and Data Structures series, 1998

A Non-Iterative Maximum Entropy Algorithm

Machine Intelligence and Pattern Recognition, 1988
