Joseph Davis | The University of Sydney
Papers by Joseph Davis
With the growing emphasis on metrics such as citation count and h-index for research assessment, several reports of gaming and cartel-like formations for boosting citation statistics have emerged. However, such cartels are extremely difficult to detect. This paper presents a systematic approach to visualizing and computing cliques and other anomalous patterns through ego-centric citation network analysis, drilling down into the details of individual researchers' citations. After grouping the citations into three categories, namely self-citations, co-author citations, and distant citations, we focus our analysis on the outliers with a relatively high proportion of self- and co-author citations. By analyzing the complete co-authorship citation networks of these researchers one at a time along with all the co-authors, and by merging these networks, we were able to isolate and visualize cliques and anomalous citation patterns that suggest plausible collusion. Our exploratory an...
Social Computing and Social Media: Experience Design and Social Network Analysis, 2021
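The three-way grouping of citations described in the abstract above can be illustrated with a minimal sketch. The classification rules here are assumptions made for illustration (a citation counts as "self" when the ego researcher is among the citing authors, and as "co-author" when any prior co-author is); the paper's own definitions may differ. The function names and the input shape are hypothetical.

```python
from collections import Counter


def classify_citation(citing_authors, ego, coauthors):
    """Label one citing paper relative to the ego researcher.

    Assumed rules (not taken from the paper):
    - "self": the ego is one of the citing authors
    - "co-author": at least one citing author is a known co-author of the ego
    - "distant": everything else
    """
    citing = set(citing_authors)
    if ego in citing:
        return "self"
    if citing & set(coauthors):
        return "co-author"
    return "distant"


def citation_profile(ego, coauthors, citing_papers):
    """Return the share of each citation category for one researcher."""
    counts = Counter(
        classify_citation(authors, ego, coauthors) for authors in citing_papers
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total
            for label in ("self", "co-author", "distant")}


# Toy data: each inner list holds the authors of one citing paper.
profile = citation_profile(
    ego="A",
    coauthors={"B", "C"},
    citing_papers=[["A", "B"], ["C"], ["D"], ["B", "E"]],
)
print(profile)
```

Researchers whose combined "self" and "co-author" shares are unusually high would be the outliers that the abstract singles out for closer network analysis.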
ArXiv, 2018
The open source development model represents a paradigm shift from the traditional in-house/closed-source software development model, with many successes. Traditionally, open source projects were characterized essentially by their individual volunteer developers. This tradition has changed significantly, in particular with the participation of many organizations. However, there exists a knowledge gap concerning how open source developer communities evolve. In this paper, we present some observations on open source developer communities. In particular, we analyze git repositories of 20 well-known open source projects, with over 3 million commit activities in total. The analysis has been carried out in three respects: productivity, diversity, and growth, using the Spearman's rank correlation coefficient, a diversity index, and the Gompertz/logistic curves, respectively. We find that (a) the Spearman's rank correlation coefficient between active contributors and commit activities rev...
French Journal of Management Information Systems, 2003
The Open Source Software (OSS) development model has attracted considerable attention in recent years, primarily because it offers a non-proprietary and socially beneficial model of software development backed by a dedicated community of developers and users who share and expand their knowledge and expertise. This research investigates the evolution of open source software using a case study of the Samba project. Through the application of both qualitative and quantitative techniques, Samba's software development and evolution over a seven-year period are tracked and assessed. This assessment and the findings of similar, previously reported studies lead us to propose a general framework for the evolvability and the key drivers of open source software evolution. open-source communities for socially coordinated software development.
Harnessing human computation through crowdsourcing offers a new approach to solving complex problems, especially those that are relatively easy for humans but difficult for computers. Micro-tasking platforms such as Amazon Mechanical Turk have attracted large, on-demand workforces of millions of workers as well as hundreds of thousands of job requesters. Achieving high-quality results and minimizing total task execution time are two of the main goals of these crowdsourcing systems. Drawing on cognitive load theory and usability design principles, we study the effects of different user interface designs on the performance and latency of crowdsourcing systems. Our results indicate that complex and poorly designed user interfaces contributed to lower worker performance and increased task latency.
Proceedings. IEEE International Conference on Web Services, 2004., 2004
2006 IEEE International Conference on Services Computing (SCC'06), 2006
European Journal of Operational Research, 1995
Knowledge sharing and management (KSM) has emerged as an important issue for international management. However, there is considerable confusion as to what constitutes organizational knowledge, whether and how it can be systematically managed, and what are some of the effective organizational and technological mechanisms for facilitating knowledge management. This study seeks to unravel the complexities associated with it, initially by developing a typology of such mechanisms. This will provide a starting point for a detailed field study carried out in a large, multinational company (Du Pont) focusing on the critical issues, concrete practices, bottlenecks, and constraints in knowledge sharing and management in two functional areas, two business units, and four countries. Using the field data, we also elucidate a dynamic model of knowledge with emphasis on ongoing interpretation and contextualization of previously generated knowledge. Design guidelines for implementing computer-based systems to support KSM are also presented.
The project aims to design and build a digital portal which will document, organise, and preserve aspects of Balinese cultural heritage and related knowledge for the benefit of the wider community and the younger generations in particular. We present the details of our research dealing with one aspect of Balinese culture, the Balinese traditional communication system (kulkul), undertaken on the Indonesian island of Bali. This knowledge is held largely in tacit form in the Balinese community and tends to be poorly documented and fragmented. A basic ontology of key kulkul-related concepts and terms and their interrelationships was developed to support the semantic searching and browsing of the online portal and related resources. Much of the content for the portal was acquired through community-based crowdsourcing. We also discuss the procedures employed to evaluate the digital portal prototype.
of universities represents a complex endeavor which involves gathering, weighting, and analyzing diverse data. Emerging semantic technologies enable the Web of Data, a giant graph of interconnected information resources, also known as Linked Data. A recent community effort, the Linking Open Data project, offers the possibility of accessing a large number of semantically described and linked concepts in various domains. In this paper, we propose a novel approach to take advantage of this structured data in the domain of universities to develop proxy measures of their relative standing for ranking purposes. Derived from information theory, our approach of computing the Information Content for universities and ranking them based on these scores achieved results comparable to international ranking systems such as Shanghai Jiao Tong University, Times Higher Education, and QS. The metric we developed can also be used for innovative semantic applications in a range of domains for entity ra...
Linked Data allows structured data to be published in a standard manner so that datasets from diverse domains can be interlinked. By leveraging Semantic Web standards and technologies, a growing amount of semantic content has been published on the Web as Linked Open Data (LOD). The LOD cloud has made available a large volume of structured data in a range of domains via liberal licenses. The semantic content of LOD in conjunction with the advanced searching and querying mechanisms provided by SPARQL has opened up unprecedented opportunities not only for enhancing existing applications, but also for developing new and innovative semantic applications. However, SPARQL is inadequate to deal with functionalities such as comparing, prioritizing, and ranking search results which are fundamental to applications such as recommendation provision, matchmaking, social network analysis, visualization, and data clustering. This paper addresses this problem by developing a systematic measurement model of semantic similarity between resources in Linked Data. By drawing extensively on a feature-based definition of Linked Data, it proposes a generalized information content-based approach that improves on previous methods which are typically restricted to specific knowledge representation models and less relevant in the context of Linked Data. It is validated and evaluated for measuring item similarity in recommender systems. The experimental evaluation of the proposed measure shows that our approach can outperform comparable recommender systems that use conventional similarity measures.
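To make the idea of an information content-based measure over features more concrete, here is a minimal sketch. It uses the standard information-theoretic definition IC(f) = -log2 p(f), where p(f) is the fraction of resources carrying feature f, and it treats a resource as a set of feature strings. The feature encoding, the toy counts, and the choice to score similarity as the information content of the shared features are all assumptions for illustration, not the measure proposed in the paper.

```python
import math


def information_content(features, feature_counts, n_resources):
    """Sum of -log2 p(f) over the given features.

    p(f) is estimated as the share of resources that carry feature f,
    so rare features contribute more information than common ones.
    """
    return sum(
        -math.log2(feature_counts[f] / n_resources) for f in features
    )


def shared_information(features_a, features_b, feature_counts, n_resources):
    """Information content of the features two resources have in common
    (a hypothetical similarity score used only for this sketch)."""
    common = set(features_a) & set(features_b)
    return information_content(common, feature_counts, n_resources)


# Toy dataset of 4 resources; counts say how many resources have each feature.
counts = {"type:University": 4, "country:Australia": 1, "member:Go8": 2}
uni_a = {"type:University", "country:Australia", "member:Go8"}
uni_b = {"type:University", "member:Go8"}

print(information_content(uni_a, counts, 4))        # IC of one resource
print(shared_information(uni_a, uni_b, counts, 4))  # IC of shared features
```

In this toy data the feature shared by every resource contributes nothing, which is the property that lets an information content score favour distinctive features when comparing or ranking resources.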
computer.org
Sheikh Iqbal Ahamed, Marquette University, USA Muhammad Masoon Alam, IMSciences, Pakistan Antonia Albani, University of St. Gallen Müller-Friedberg, Switzerland Uwe Assmann, TU Dresden Bai Xiaoyin, Tsinghua University, China Akhilesh Bajaj, The University of Tulsa, USA Janaka L. Balasooriya, Arizona State University, USA Sujoy Basu, HP Labs Palo Alto, USA Paul Buhler, College of Charleston, USA Jiannong Cao, Hong Kong Polytechnic University, Hong Kong Rong N. Chang, IBM TJ Watson Research Center, USA Sanjay Chaudhary, ...
Ontologies have been created for many different subjects and by independent groups around the world. The nonexistence of a commonly accepted and used general purpose upper-ontology makes it difficult to integrate these ontologies through merge and alignment operations. The majority of the algorithms proposed so far rely on syntactic analysis, disregarding the structural properties of the source ontologies. In our previous work, we proposed an alignment method that considers the structural properties of an upper-ontology constructed using a thesaurus and Formal Concept Analysis technique (FCA). We also analyzed the FCA's lattice structure and proposed a measure of similarity based on Tversky's model, which allowed us to identify closely related concepts in different source ontologies. In this paper, we apply the alignment method to ontologies developed for a completely different domain, and enhance the solution by providing a navigational aid for the lattice. It is well known that one of the main drawbacks of the application of FCA is that the resulting lattice soon becomes cluttered when the number of objects and attributes increases. The proposed solution is based on hyperbolic visualization and on structural elements of the lattice.
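For readers unfamiliar with Tversky's model mentioned in the abstract above, the following sketch shows the commonly used ratio form of the Tversky index on plain attribute sets: S(A, B) = |A ∩ B| / (|A ∩ B| + α|A − B| + β|B − A|). How the paper adapts this to the FCA lattice is not stated in the abstract, so the attribute sets, the weights, and the function name below are illustrative assumptions only.

```python
def tversky_similarity(a, b, alpha=0.5, beta=0.5):
    """Ratio-model Tversky index between two attribute sets.

    alpha weights attributes found only in `a`, beta those found only in `b`.
    alpha = beta = 1 gives the Jaccard index; alpha = beta = 0.5 gives Dice.
    """
    a, b = set(a), set(b)
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    # Convention chosen for this sketch: two empty sets score 0.0.
    return common / denom if denom else 0.0


# Hypothetical concepts from two source ontologies, described by attributes.
concept_x = {"has_wheels", "has_engine", "carries_people"}
concept_y = {"has_engine", "carries_people", "has_wings"}

print(tversky_similarity(concept_x, concept_y))            # symmetric weights
print(tversky_similarity(concept_x, concept_y, 1.0, 1.0))  # Jaccard special case
```

Unequal weights make the measure asymmetric, which can be useful when one concept is treated as the reference against which the other is compared.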