Ee-Peng LIM | Singapore Management University

Papers by Ee-Peng LIM

Fifth ACM International Workshop on Web Information and Data Management (WIDM 2003)

Data & Knowledge Engineering, Sep 1, 2005

Predicting Project Outcome Leveraging Socio-Technical Network Patterns

There are many software projects started daily; some are successful, while others are not. Successful projects get completed, are used by many people, and bring benefits to users. Failed projects do not bring similar benefits. In this work, we are interested in developing an effective machine learning solution that predicts project outcome (i.e., success or failure) from the developer socio-technical network. To do so, we investigate successful and failed projects to find factors that differentiate the two. We analyze the socio-technical aspect of the software development process by focusing on the people who contribute to these projects and the interactions among them. We first form a collaboration graph for each software project. We then create a training set consisting of two graph databases corresponding to successful and failed projects respectively. A new data mining approach is then employed to extract discriminative rich patterns that appear frequently in the successful projects but rarely in the failed projects. We find that these automatically mined patterns are effective features for predicting project outcomes. We evaluate our solution on projects in SourceForge.Net, the largest open source software development portal, and show that under 10-fold cross-validation, our approach could achieve an accuracy of more than 90% and an AUC score of 0.86. We also present and analyze some mined socio-technical patterns.

Reviewers for Modeling Technologies and Intelligent Systems Track

Jangyeon Park Shietung Peng Olivier Perrin Pradeep Ray Masashi Saito

7th International Conference on Parallel and Distributed Systems (ICPADS'00)

Part 1: Data Warehousing and Data Mining-1.2: Data Mining-Session chair: Roger Hsiang-Li Chiang, SINGAPORE-A Heuristic Method for Correlating Attribute Group Pairs in Data Mining

Mining coherent anomaly collections on web data

Abstract: The recent boom of weblogs and social media has attached increasing importance to the identification of suspicious users with unusual behavior, such as spammers or fraudulent reviewers. A typical spamming strategy is to employ multiple dummy accounts to collectively promote a target, be it a URL or a product. Consequently, these suspicious accounts exhibit certain coherent anomalous behavior identifiable as a collection.

On composing a reliable composite web service: a study of dynamic web service selection

Abstract: Dynamic Web service selection refers to determining a subset of component Web services to be invoked so as to orchestrate a composite Web service. Previous work in Web service selection usually assumes the invocations of Web service operations to be independent of one another. This assumption, however, does not hold in practice, as both the composite and component Web services often impose some orderings on the invocation of their operations.

Correction and analysis

Correction and analysis. Ee-Peng Lim. International Conference on Digital Libraries: Proceedings of the 3rd ACM/IEEE-CS Joint Conference on Digital Libraries 2003, 2003. Abstract not available. 80 Computer Applications (General) (CI).

Senior Program Committee

Z. Qian, S. Zhang, K. Yim, S. Lu

Ontology-based web annotation framework for hyperlink structures.

Shakedown and steady-state responses of elastic-plastic solids in large displacements

Elastic-perfectly plastic solids (or structures) subjected to loads quasi-statically varying within a specified domain are addressed in the framework of large displacements and the additive strain decomposition rule.

Trust-oriented composite service selection with QoS constraints

Abstract: In Service-Oriented Computing (SOC) environments, service clients interact with service providers for consuming services. From the viewpoint of service clients, the trust level of a service or a service provider is a critical factor to consider in service selection, particularly when a client is looking for a service from a large set of services or service providers. However, an invoked service may be composed of other services.

DATA AND KNOWLEDGE ENGINEERING

DATA AND KNOWLEDGE ENGINEERING. Volume 54, Issue 3, pp. 277-393 (September 2005). Fifth ACM International Workshop on Web Information and Data Management (WIDM 2003). Pages 277-278. Roger HL Chiang, Alberto HF Laender and Ee-Peng Lim. Special papers. Clustering Web pages based on their structure. Pages 279-299. Valter Crescenzi, Paolo Merialdo and Paolo Missier. Clustering documents into a web directory for bootstrapping a supervised classification. Pages 301-325.

Managing Geospatial and Georeferenced Web Resources

G-Portal [1] is a Web-based digital library that collects metadata of geospatial and georeferenced resources on the Web and provides digital library services to access them. It adopts a map-based interface as its primary point of access to visualize and manipulate the distributed geospatial and georeferenced content. A classification-based interface is also provided to classify and visualize all resources. This interface is supported by a flexible classification language and the backend classification engine.

Framework and knowledge for database integration

Abstract: Traditionally, data integration research has focused primarily on understanding integration issues from the data instance and schema perspectives. However, when the integration of heterogeneous databases is performed without considering the semantics of local databases, an incorrectly integrated database may result. Moreover, most integration tasks must be performed manually.

Combining Multiple Sources of Evidence for Information Retrieval Using Logistic Regression

Y. Pan, Y. Tang, S. Li

Centre for Advanced Information Systems Nanyang Technological University Singapore 639798

Cooperative multi-attribute bilateral online negotiation for e-commerce

Abstract: Currently, fixed-price sale and online auction are the two major sale modes in applied electronic commerce systems. Bilateral negotiation does not yet perform satisfactorily in Internet-based transactions. In this paper, the time-independent feature of online negotiations is emphasized. Correspondingly, a formal mathematical model of online negotiation is established. We also present a flexible and feasible bilateral negotiation protocol which is used in an agent-based cooperative negotiation.
