Trip Denton - Academia.edu
Papers by Trip Denton
A common problem in computer science is how to represent a large dataset in a smaller, more compact form. This thesis describes a generalized framework for selecting canonical subsets of data points that are highly representative of the original larger dataset. The contributions of the work are a formulation of the subset selection problem as an optimization problem, an analysis of the complexity of the problem, the development of approximation algorithms to compute canonical subsets, and a demonstration of the utility of the algorithms in several problem domains.
Traditional model-driven engineering (MDE) techniques rely on a paradigm where systems are developed using tightly coupled, monolithic modeling tools. Such monolithic modeling tools address many concerns, but operate largely in isolation from one another. As systems grow in size and complexity to become ultra-large-scale (ULS) systems, it is becoming clear that no single monolithic modeling tool can capture all the concerns of a ULS system. It is therefore essential that isolated modeling tools collaborate with each other when realizing ULS systems.
Domain-specific modeling languages (DSMLs) are designed to provide precise abstractions of domain-specific constructs. However, models for complex systems typically do not fit neatly within a single domain, and capturing all important aspects of such a system requires developing multiple models using different DSMLs. Combining these models into multi-models presents difficult challenges, most importantly those of integrating the various models and keeping both the models and their associated data synchronized. To this end, we present NAOMI, an experimental platform for enabling multiple models, developed in different DSMLs, to work together. NAOMI analyzes model dependencies to determine the impact of changes to one model on other dependent models and coordinates the propagation of necessary model changes. NAOMI also serves as a useful testbed for exploring how diverse modeling paradigms can be combined.
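As a rough illustration of the dependency analysis described above, the following Python sketch computes which models are affected when one model changes. It is not NAOMI's implementation, which the abstract does not detail; the function `impacted_models`, the `dependents` mapping, and the model names are all hypothetical.

```python
from collections import deque


def impacted_models(dependents, changed):
    """Return the models affected by a change to `changed`, in breadth-first order.

    `dependents` maps a model name to the models that depend on it directly.
    """
    seen = {changed}
    order = []
    queue = deque([changed])
    while queue:
        model = queue.popleft()
        for dep in dependents.get(model, []):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


# Hypothetical multi-model: each key lists the models that depend on it.
dependents = {
    "requirements": ["architecture"],
    "architecture": ["performance", "deployment"],
    "performance": ["cost"],
    "deployment": ["cost"],
}

affected = impacted_models(dependents, "architecture")
```

In this made-up example, a change to the architecture model reaches the performance and deployment models first and the cost model after them; a real coordinator would also have to decide how each affected model is updated.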
Scale space interest points capture important photometric and deep structure information of an image. The information content of such points can be made explicit using image reconstruction. In this paper we consider the problem of combining multiple types of interest points used for image reconstruction. It is shown that ordering the complete set of points by differential (quadratic) TV-norm (which works for single feature types) does not yield optimal results for combined point sets. The paper presents a method to solve this problem using canonical sets of scale space features. Qualitative and quantitative analyses show improved performance over simple ordering of points using the TV-norm.
Journal of Systems and Software, 2005
When large software systems are reverse engineered, one of the views that is produced is the system decomposition hierarchy. This hierarchy shows the system's subsystems, the contents of the subsystems (i.e., modules or other subsystems), and so on. Software clustering tools create the system decomposition automatically or semi-automatically with the aid of the software engineer.
Software applications typically have many features that vary in their similarity. We define a measurement of similarity between pairs of features based on their underlying implementations and use this measurement to compute a set of canonical features. The Canonical Features Set (CFS) consists of a small number of features that are as dissimilar as possible to each other, yet are most representative of the features that are not in the CFS. The members of the CFS are distinguishing features and understanding their implementation provides the engineer with an overview of the system undergoing scrutiny. The members of the CFS can also be used as cluster centroids to partition the entire set of features. Partitioning the set of features can simplify the understanding of large and complex software systems. Additionally, when a specific feature must undergo maintenance, it is helpful to know which features are most closely related to it. We demonstrate the utility of our method through the analysis of the Jext, Firefox, and Gaim software systems.
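The use of CFS members as cluster centroids can be sketched in Python as follows. This is a minimal illustration that assumes a similarity table and a set of canonical features are already available; the function `partition_by_canonical`, the feature names, and the scores are hypothetical and are not taken from the paper.

```python
def partition_by_canonical(similarity, canonical):
    """Assign every non-canonical feature to its most similar canonical feature.

    `similarity[a][b]` is a symmetric score in [0, 1]; higher means more alike.
    """
    clusters = {c: [c] for c in canonical}
    for feature in similarity:
        if feature in clusters:
            continue  # canonical features seed their own cluster
        best = max(canonical, key=lambda c: similarity[feature][c])
        clusters[best].append(feature)
    return clusters


# Hypothetical features and similarity scores, for illustration only.
similarity = {
    "open_file": {"open_file": 1.0, "save_file": 0.8, "find_text": 0.1, "replace_text": 0.2},
    "save_file": {"open_file": 0.8, "save_file": 1.0, "find_text": 0.1, "replace_text": 0.1},
    "find_text": {"open_file": 0.1, "save_file": 0.1, "find_text": 1.0, "replace_text": 0.9},
    "replace_text": {"open_file": 0.2, "save_file": 0.1, "find_text": 0.9, "replace_text": 1.0},
}

clusters = partition_by_canonical(similarity, ["open_file", "find_text"])
```

With these made-up scores, `save_file` joins the `open_file` cluster and `replace_text` joins the `find_text` cluster, which is the kind of grouping that would tell a maintainer which features are most closely related.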
Computer Vision and Image Understanding, 2008
Many object recognition and localization techniques utilize multiple levels of local representations. These local feature representations are common, and one way to improve the efficiency of algorithms that use them is to reduce the size of the local representations. There has been previous work on selecting subsets of image features, but the focus here is on a systematic study of the feature selection problem. We have developed a combinatorial characterization of the feature subset selection problem that leads to a general optimization framework. This framework optimizes multiple objectives and allows the encoding of global constraints. The features selected by this algorithm are able to achieve improved performance on the problem of object localization. We present a dataset of synthetic images, along with ground-truth information, which allows us to precisely measure and compare the performance of feature subset algorithms. Our experiments show that subsets of image features produced by our method, stable bounded canonical sets (SBCS), outperform subsets produced by K-Means clustering and threshold-based methods for the task of object localization under occlusion.
Porous three-dimensional (3D) tissue scaffolds directly influence cell attachment, proliferation, and guidance of new tissue formation. Cells respond to a scaffold's architecture, mechanical properties, and transport properties. Given the number of design constraints, scaffold design must include multiple design parameters. Using a unit-cell based assembly approach, we introduce a method to account for multiple design parameters during scaffold assembly. This paper presents our method for integrating multiple parameters for unit-cell selection.
Given a set of patterns and a similarity measure between them, we will present an optimization framework to approximate a small subset, known as a canonical set, whose members closely resemble the members of the original set. We will present a combinatorial formulation of the canonical set problem as a quadratic integer program, present a relaxation through semidefinite programming, and propose a rounding procedure with bounded performance that computes an approximate solution in polynomial time. Through a set of experiments we will investigate the application of canonical sets for computing a summary of views from a dense set of 2D views computed for a 3D object.
A common approach to the image matching problem is representing images as sets of features in some feature space followed by establishing correspondences among the features. Previous work by Huttenlocher and Ullman [1] shows how a similarity transformation (rotation, translation, and scaling) between two images may be determined assuming that three corresponding image points are known. While robust, such methods suffer from computational inefficiencies for general feature sets. We describe a method whereby the feature sets may be summarized using the stable bounded canonical set (SBCS), thus allowing the efficient computation of point correspondences between large feature sets. We use a notion of stability to influence the set summarization such that stable image features are preferred.
Typical documentation for object-oriented programs includes descriptions of the parameters and return types of each method in a class, but little or no information on valid method invocation sequences. Knowing the sequences in which the methods of a class can be invoked is especially useful for software engineers (e.g., developers, testers) who are actively involved in the maintenance of large software systems. This paper describes a new approach and a tool for generating class usage scenarios (i.e., how a class is used by other classes) from method invocations, which are collected during the execution of the software. Our approach is algorithmic and employs the notion of canonical sets to categorize method sequences into groups of similar sequences, where each group represents a usage scenario for a given class.
Given a collection of sets of 2-D views of 3-D objects and a similarity measure between them, we present a method for summarizing the sets using a small subset called a bounded canonical set (BCS), whose members best represent the members of the original set. This means that members of the BCS are as dissimilar from each other as possible, while at the same time being as similar as possible to the non-BCS members. This paper will extend our earlier work on computing canonical sets in several ways: by omitting the need for a multi-objective optimization, by allowing the imposition of cardinality constraints, and by introducing a total similarity function. We evaluate the applicability of BCS to view selection in a view-based object recognition environment.
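The two goals stated above (members dissimilar from each other, yet similar to non-members) and the cardinality constraint can be made concrete with a small Python sketch. It uses exhaustive search, which is exponential in the set size and only a stand-in for the paper's approximation method. The scoring rule (member-to-non-member similarity minus member-to-member similarity), the function `bounded_canonical_set`, and the similarity values are assumptions made for illustration.

```python
from itertools import combinations


def bounded_canonical_set(sim, k):
    """Exhaustively pick k items that are mutually dissimilar yet close to the rest.

    `sim` is a symmetric matrix (list of lists) of pairwise similarities.
    Score = (similarity from members to non-members) - (similarity among members).
    """
    n = len(sim)
    best_subset, best_score = None, float("-inf")
    for subset in combinations(range(n), k):
        members = set(subset)
        outside = [j for j in range(n) if j not in members]
        cover = sum(sim[i][j] for i in subset for j in outside)
        intra = sum(sim[i][j] for i, j in combinations(subset, 2))
        score = cover - intra
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical similarities (0-10) between four views: views 0 and 1 look alike,
# views 2 and 3 look alike, and the two groups differ from each other.
sim = [
    [10, 9, 1, 1],
    [9, 10, 1, 1],
    [1, 1, 10, 9],
    [1, 1, 9, 10],
]

subset, score = bounded_canonical_set(sim, 2)
```

For this toy matrix the selected pair contains one view from each group, since choosing two views from the same group would leave the other group poorly represented.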
The implementations of software features evolve as an application matures. We define a measure of feature implementation overlap that determines how similar features are in their execution by examining their call graphs. We consider how this measure changes over time, and evaluate the hypothesis that over time and subsequent versions of a software application, the implementations of semantically similar features converge. As the features of an application converge in their implementation, we are able to more effectively determine groups of semantically similar features and to reduce the cost of program comprehension by selecting a few key features that give an overview of the system. We present a case study analyzing the features of the Jext, Firefox, and Gaim software systems to support our hypothesis.
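One simple way to quantify overlap between two features' call graphs is the Jaccard index over the call edges each feature executes. The Python sketch below uses it as an assumed stand-in, since the abstract does not give the paper's actual measure. The function `implementation_overlap` and the per-version call edges are hypothetical.

```python
def implementation_overlap(calls_a, calls_b):
    """Jaccard overlap between the sets of call-graph edges two features execute."""
    a, b = set(calls_a), set(calls_b)
    if not a and not b:
        return 0.0  # no recorded calls: treat as no overlap
    return len(a & b) / len(a | b)


# Hypothetical (caller, callee) edges recorded per feature in two versions.
v1 = {
    "find": {("ui", "search"), ("search", "scan")},
    "replace": {("ui", "edit"), ("edit", "rewrite")},
}
v2 = {
    "find": {("ui", "search"), ("search", "scan")},
    "replace": {("ui", "search"), ("search", "scan"), ("search", "rewrite")},
}

overlap_v1 = implementation_overlap(v1["find"], v1["replace"])
overlap_v2 = implementation_overlap(v2["find"], v2["replace"])
```

In this made-up example the overlap between `find` and `replace` rises from the first version to the second, which is the kind of convergence the hypothesis predicts for semantically similar features.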
Tissue scaffolds must satisfy multiple design constraints, such as geometry, mechanical properties, and connectivity, to yield a functioning heterogeneous tissue. One method that accounts for these multiple constraints is the unit-cell based assembly approach. In this method, the volume that represents the natural tissue is filled with unit-cells that meet the design requirements of the volume. This approach requires data exchanges between several procedures, including design, characterization, assembly, analysis, and fabrication. In this paper, we present a data exchange system to store and retrieve the unit-cell information and customize data migration among applications. We also present our application of this data exchange system to facilitate the management of data flow.