Ali Daud Associate Professor - Academia.edu

Papers by Ali Daud Associate Professor

A Review of Career Selection Models

Researchpedia Journal of Computing, 2020

Career selection plays an important role in every industry. It has immense influence on individuals’ futures. Due to its importance, many researchers have proposed models and guidelines for career selection to help people select careers systematically. This article discusses various career selection models and guidelines. It elaborates the elements and factors that contribute to the career selection process and classifies them according to specific professions. It also highlights the limitations of existing models and suggests solutions and directions for future work. As the studies discussed in this article were conducted in various countries, the findings provide an international perspective that includes implications for the Arab world, China and Europe.

Available from: https://rpjc.researchpedia.info/rpjc_2020_4-a-review-of-career-selection-models/

Using network science to understand the link between subjects and professions

Computers in Human Behavior 106:106228, 2020

Subjects studied by people often impact their careers. The relation between education and careers has been well studied by social scientists; however, limited research on this relation is available in network science. Network science has emerged as a promising field for understanding complex systems. We study the relation between education and careers from a network science perspective. In this research we propose methods from network science to understand the relation between the subjects studied by a person and the impact of these subjects on that person's career. We model the relation between favorite subjects and careers using a network. The model helps in understanding the positive and negative contributions of certain subjects towards other subjects and careers. The results show that mathematics and English are the two basic subjects that are highly related to most professions. The detailed results are of particular significance for people associated with higher education and career development.

Measuring the Impact of Topic Drift in Scholarly Networks

With increasing collaboration among researchers from various disciplines, changing one's research topic or working on multiple topics is not unusual. Several comprehensive efforts have been made to predict, quantify, and study researchers' impact. However, the question of how a change in the field of interest over time, or working on more than one topic, influences scientific impact remains unanswered. In this research, we study the effect of topic drift on the scientific impact of an author. We apply the Author Conference Topic (ACT) model to extract the topic distributions of individual authors, and compare authors who work on multiple topics with authors who work on a single topic. We analyze the productivity of the authors on the basis of publication count, citation count and h-index. We find that authors who stick to one topic produce a higher impact and gain more attention. To further strengthen our results, we gather the h-index of top-ranked authors working on one topic and of top-ranked authors working on multiple topics and examine whether there are similar trends in their progress. The results show evidence of a significant impact of topic drift on the career choices of researchers.
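The three productivity measures named in the abstract above (publication count, citation count and h-index) can be illustrated with a minimal Python sketch. It uses the standard definition of the h-index; the function names and the citation lists are hypothetical and are not taken from the paper.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def productivity_summary(citations):
    """Publication count, total citations and h-index for one author."""
    return {
        "publications": len(citations),
        "citations": sum(citations),
        "h_index": h_index(citations),
    }


# Hypothetical per-paper citation counts for two authors (illustration only).
single_topic_author = [25, 18, 12, 9, 6, 3]
multi_topic_author = [10, 8, 5, 4, 3]

print(productivity_summary(single_topic_author))
print(productivity_summary(multi_topic_author))
```

Comparing such summaries across groups of authors is only the final step of the analysis; assigning authors to the single-topic or multi-topic group relies on the ACT topic model, which is not reproduced here.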

Region-wise Ranking of Sports Players based on Link Fusion

Players are ranked in various sports to show their importance relative to other players. Existing methods only consider intra-type links (e.g., player to player and team to team), but ignore inter-type links (e.g., one type of player to another type of player, such as batsman to bowler, and player to team) based on cognitive aspects. They also ignore the spatiality of the players. There is a strong relationship among players and their teams, which can be represented as a network consisting of multi-type interrelated objects. In this paper, we propose a player ranking method called Region-wise Players Link Fusion (RPLF), which is applied to the sport of cricket. RPLF considers players' region-wise intra-type and inter-type relation-based features to rank the players. Considering multi-type interrelated objects is based on the intuition that a batsman scoring high against the top bowlers of a strong team, or a bowler taking wickets against the top batsmen of a strong team, is a good player. The experimental results show that RPLF provides promising insights into players' rankings. RPLF is a generic method and can be applied to different sports for ranking players.

Predicting Student Performance using Advanced Learning Analytics

Educational Data Mining (EDM) and Learning Analytics (LA) have emerged as interesting areas of research that extract useful knowledge from educational databases for many purposes, such as predicting students' success. The ability to predict a student's performance can inform actions in modern educational systems. Existing methods have used features that are mostly related to academic performance, family income and family assets, while features relating to family expenditures and students' personal information are usually ignored. In this paper, an effort is made to investigate the aforementioned feature sets by collecting data on scholarship-holding students from different universities of Pakistan. Learning analytics, discriminative and generative classification models are applied to predict whether a student will be able to complete their degree or not. Experimental results show that the proposed method significantly outperforms existing methods due to the exploitation of the family expenditure and student personal information feature sets. Outcomes of this EDM/LA research can serve as a policy improvement method in higher education.

Finding Rising Stars in Co-Author Networks via Weighted Mutual Influence

Finding rising stars is a challenging and interesting task that has recently been investigated in co-author networks. Rising stars are authors who have a low research profile at the start of their career but may become experts in the future. This paper introduces a new method, Weighted Mutual Influence Rank (WMIRank), for finding rising stars. WMIRank exploits the influence of co-authors' citations, order of appearance and publication venues. Comprehensive experiments are performed to analyze the performance of WMIRank in comparison to baseline methods, which ignore weighted mutual influence. AMiner data for the years 1995-2000 is used for the experiments. The lists of the top 30 authors produced by the proposed and baseline methods are compared in terms of average number of papers, average number of citations and achievements. Experimental results provide convincing evidence of the effectiveness of the investigated weighted mutual influence.

An Adaptive Method for Clustering by Fast Search-and-Find of Density Peaks [Adaptive-DP]

Clustering by fast search and find of density peaks (DP) is a method in which density peaks are used to select the number of cluster centers. DP has two input parameters: 1) the cutoff distance and 2) the cluster centers. In addition, different methods are used in DP to measure the density of the underlying datasets. To overcome the limitations of DP, an Adaptive-DP method is proposed. In the Adaptive-DP method, heat diffusion is used to estimate density, the cutoff distance is simplified, and a novel method is used to discover the exact number of cluster centers adaptively. To validate the proposed method, we tested it on synthetic and real datasets and compared it with state-of-the-art clustering methods. The experimental results validate the robustness and effectiveness of the proposed method.
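For orientation, the baseline DP procedure that the abstract above starts from can be sketched as follows. This is a minimal illustration of the commonly described baseline only: local density is a count of neighbours within the cutoff distance, and each point's separation is its distance to the nearest point of strictly higher density. It does not reproduce Adaptive-DP (no heat-diffusion density estimate, no adaptive choice of the number of centers). The function name, the toy points, the cutoff value and the choice of two centers are all assumptions made for the example.

```python
import math


def density_peak_scores(points, cutoff):
    """Return (rho, delta) for each point, as in baseline DP.

    rho[i]   = number of other points closer than `cutoff` to point i
    delta[i] = distance to the nearest point with strictly higher rho;
               if no such point exists, the largest distance from point i
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Hypothetical toy data: two well-separated groups of five points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
]
rho, delta = density_peak_scores(points, cutoff=0.8)

# Candidate centers combine high density with large separation.
gamma = [r * d for r, d in zip(rho, delta)]
centers = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)[:2]
print(centers)
```

Both the cutoff and the number of centers are supplied by hand in this sketch, which are the two inputs the abstract identifies as limitations of DP.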

Topic-based heterogeneous rank

Finding the top influential bloggers based on productivity and popularity features

New Review of Hypermedia and Multimedia

A blog acts as a platform for virtual communication to share comments or views about products, events and social issues. Like other social web activities, blogging actions spread to a large number of people. Users influence others in many ways, such as buying a product, holding a particular political or social opinion, or initiating a new activity. Finding the top influential bloggers is an active research domain as it helps in various fields, such as online marketing, e-commerce, product search and e-advertisements. Various models exist to find influential bloggers, but they consider limited features and use a non-modular approach. This paper proposes a new model, the Popularity and Productivity Model (PPM), based on a modular approach to find the top influential bloggers. It consists of popularity and productivity modules that exploit various features. We discuss the role of each proposed and existing feature and evaluate the proposed model against standard baseline models using datasets from real-world blogs. The analysis using standard performance evaluation measures verifies that both the productivity and popularity modules play a vital role in effectively finding influential bloggers in a blogging community.

Standing on the shoulders of giants

Young scholars in academia often seek to work in collaboration with top researchers in their field in pursuit of a successful career. While success in academia can be defined differently, it is widely agreed that training with a well-known researcher can help lead to an efficacious career. This study aims to investigate whether collaborating with established scientists does, in fact, improve junior scholars' chances of success. If not, what makes young scientists soar in their academic careers? We investigate this question by analyzing the effect of collaboration with a known star on the success of a young scholar. The results suggest that working with leading experts can lead to a successful career, but that it is not the only way. Researchers who were not fortunate enough to start their career with an elite researcher could still succeed through hard work and passion. These findings emerged from analyses of the effect of two discrete sets of well-known scholars on the careers of newcomers, suggesting their strength and validity.

Unified Author Ranking based on Integrated Publication and Venue Rank

Author ranking can be used to determine the authority of authors in a particular domain. Several methods for author ranking that focus on the number of publications and the number of citations have been proposed. In this paper, we propose ranking algorithms for publications, conferences, journals and their respective authors. In publication ranking, both incoming and outgoing citations are considered. If a publication appears in a well-reputed venue (conference or journal), it is expected to have a high number of citations. Consequently, due importance is given to venues, and their scores are computed from the popularity of their publications. Both publication rankings and venue scores are used to rank authors, so that authors who have published in well-reputed venues receive added benefit. We use multiple features to rank publications and venues effectively. These scores are then used for ranking authors, instead of using only the number of citations. The results of a comparative study show a significant improvement in author ranking due to the inclusion of the proposed features.

Modelling to identify influential bloggers in the blogosphere: A survey

The user participatory nature of the social web has revolutionized the use of the conventional web. The social web is an integral part of our daily life. Due to the resulting exponential growth of the social web, a number of research domains have emerged, involving research activities that aim to study human nature, to analyse human sentiments and emotions, and to find the impact of various users in the social networks. Recently, the research focus has shifted to identifying a user's influence on other users in a social network. In the recent literature, we find a number of models proposed to find the most influential users in the blogging community. In this paper, we review the models to find these influential bloggers. The existing models are classified into feature-based and network-based categories. The feature-based models consider the salient factors to measure bloggers' influence. The network models, on the other hand, consider the graph-based social network structure of the bloggers to identify those who have the most impact on fellow members. This survey introduces each model with its features, novel aspects, and the datasets used. In addition to the discussion about the model, a comparative analysis of the datasets is presented. We conclude by discussing applications of the relevant literature, exploring open research issues and challenges, and sharing possible future directions in this active area of research.

Urdu language processing: a survey

Far more work has been done on natural language processing tasks for Western languages than for their Eastern counterparts, particularly South Asian languages. Western languages are termed resource-rich languages: core linguistic resources, e.g. corpora, WordNet, dictionaries, gazetteers and associated tools, are customarily available for them. Most South Asian languages are low-resource languages. Urdu, for example, is a South Asian language that is among the most widely spoken languages of the sub-continent, yet due to resource scarcity not enough work has been conducted for it. The core objective of this paper is to present a survey of the linguistic resources that exist for Urdu language processing (ULP), to highlight different tasks in ULP and to discuss the available state-of-the-art techniques. The paper attempts to describe in detail the recent increase in interest and the progress made in ULP research. First, the available datasets for the Urdu language are discussed. The characteristics, resource sharing between Hindi and Urdu, orthography, and morphology of the Urdu language are described. Pre-processing activities such as stop word removal, diacritics removal, normalization and stemming are illustrated. State-of-the-art research on tasks such as tokenization, sentence boundary detection, part-of-speech tagging, named entity recognition, parsing and the development of WordNet is reviewed. In addition, the impact of ULP on application areas such as information retrieval, classification and plagiarism detection is investigated. Finally, open issues and future directions for this new and dynamic area of research are provided. The goal of this paper is to organize the ULP work so that it can provide a platform for future ULP research activities.

Keywords: Urdu language processing (ULP) · Datasets · Characteristics · Natural language processing (NLP) · Part-of-speech (POS) · Named entity recognition (NER) · Sentence boundary detection (SBD)

Expert Ranking using Reputation and Answer Quality of Co-existing Users

Online discussion forums provide knowledge sharing facilities to online communities. Usage of online discussion forums has increased tremendously due to the variety of services they offer and the ability of ordinary users to ask questions and provide answers. Over time, these forums can accumulate a huge amount of content. Some of the posted discussions may not contain quality content and may reflect users' personal opinions about a topic, which may contradict a relevant answer. These low-quality discussions indicate the existence of non-expert users. Therefore, it is imperative to rank experts in online forums. Most existing expert-ranking techniques consider only a user's social network authority and content relevancy features as parameters for evaluating user expertise; user reputation as a member of the group of thread repliers is not considered. In this context, a novel solution for expert ranking in online discussion forums is proposed. We propose two expert ranking techniques: the first is based on the reputation of users and their co-existing users in different threaded discussions, and the second is based on the quality of users' answers and their category specialty features. Furthermore, we extend the ExpertiseRank technique with our proposed feature sets. An experimental study based on a real dataset shows that the proposed techniques perform better than existing techniques.

Author Productivity Indexing via Topic Sensitive Weighted Citations

Different author productivity indexing methods have been proposed in order to rank scientists on the basis of their research work. The methods present in the literature do not consider the topic-based contribution of authors when assigning them weighted citations in a multi-authored paper. This study proposes the TSWC-index, which assigns Topic Sensitive Weighted Citations to the authors of a paper according to their topic relatedness. The topic of each co-author of a paper is checked against that of its first author, and more weight is assigned to co-authors whose topic is the same as the first author's. The results are compared with the h-index and the kth rank index. The proposed method shows a clear and significant difference among an author's full citation score, weighted citation score and topic sensitive weighted citation score.

MIIB: A Metric to Identify Top Influential Bloggers in a Community

PLOS One, 2016

Impact of mutual influence while ranking authors in a co-authorship network

KJS, 2016

Online bibliographic databases provide significant resources for analyzing academic social networks. We believe that the work of an author is always influenced by the work of his or her co-authors. In this study, we investigate the impact of the productivity and quality of work of an author's co-authors on his or her ranking, along with the author's own contribution. We propose a mutual influence (MI) based ranking method, which ranks authors based on (1) the publications of an author, along with the impact of the publications of his or her co-authors, (2) a normalized author-position-based citation weight, which is calculated from the citations received by an author with respect to the position of his or her name in the co-author list, and (3) MINCC, which combines the impact of both factors. A series of experiments has been conducted, and the results show that the proposed approach is able to rank authors in a meaningful way.

A survey on the state-of-the-art machine learning models in the context of NLP

KJS, 2016

Machine learning and statistical techniques are powerful analysis tools that are yet to be fully incorporated in the multidisciplinary field variously termed natural language processing (NLP) or computational linguistics. Linguistic knowledge may be ambiguous; therefore, various NLP tasks are carried out in order to resolve ambiguity in speech and language processing. The currently prevailing techniques for addressing various NLP tasks as supervised learning problems are hidden Markov models (HMM), conditional random fields (CRF), maximum entropy models (MaxEnt), support vector machines (SVM), Naïve Bayes, and deep learning (DL). The goal of this survey paper is to highlight ambiguity in speech and language processing, to provide a brief overview of the basic categories of linguistic knowledge, to discuss existing machine learning models and their classification into different categories, and finally to provide a comprehensive review of state-of-the-art machine learning models so that new researchers can examine these techniques and build more advanced ones on them. In this survey we review how avant-garde machine learning models can help with this dilemma.

A novel framework for social web forums thread ranking based on semantics and post quality features

JOS, 2016

Online discussion forums are a valuable source of knowledge. Users may share or exchange ideas by posting content in the form of questions and answers. With the increasing volume of online content in the form of forums, finding relevant information in forums can be a challenging task, and knowledge management and quality assurance of this content are of critical importance. Although online discussion forums offer search services, in most cases only keyword search is provided. Keyword search techniques, such as cosine similarity, consider the lexical overlap between query and document terms; however, these techniques do not consider the context or meaning of the terms and thus fail to retrieve the relevant documents. Earlier content-based research efforts to improve the performance of thread retrieval were primarily based on the cosine similarity technique. This technique assigns term weights based on term frequency and inverse document frequency; however, it does not consider discussion semantics, which may lead to less effective document retrieval. To address these issues, we propose two thread ranking techniques for online discussion forums: (1) threads are ranked on the basis of a semantic similarity score between posts and (2) threads are ranked based on their participants' reputation and posts' quality. The proposed work provides a performance comparison between semantic similarity techniques and cosine similarity techniques, along with reputation and post quality features, in the thread ranking process. Experimental results obtained using a real online forum dataset demonstrate that the proposed techniques significantly improve thread ranking performance.
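The keyword-matching baseline criticized in the abstract above, cosine similarity over term-frequency and inverse-document-frequency weights, can be sketched in a few lines of Python. This is an illustrative sketch and not the paper's implementation: the smoothed IDF formula is one common variant chosen as an assumption, and the function names, query and thread texts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in tf}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical query and forum threads (illustration only).
query = "laptop battery drains quickly"
thread_same_words = "my laptop battery drains quickly after update"
thread_same_meaning = "notebook power runs out fast"

q_vec, t1_vec, t2_vec = tfidf_vectors(
    [query, thread_same_words, thread_same_meaning]
)
sim_same_words = cosine(q_vec, t1_vec)
sim_same_meaning = cosine(q_vec, t2_vec)
print(sim_same_words, sim_same_meaning)
```

The second thread describes the same problem as the query but shares no terms with it, so its score is zero. That is the lexical-overlap limitation the abstract describes and the motivation for ranking by semantic similarity instead.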

Research paper thumbnail of A Review of Career Selection Models

Researchpedia Journal of Computing, 2020

Career selection plays an important role in every industry. It has immense influence on individua... more Career selection plays an important role in every industry. It has immense influence on individuals’ futures. Due to its importance, many researchers have proposed models and guidelines for career selection to help people select careers systematically. This article discusses various career selection models and guidelines. It elaborates the elements and factors that contribute to the career selection process and classifies them according to specific professions. It also highlights the limitations of existing models and suggests solutions and directions for future work. As the studies discussed in this article were conducted in various countries, the findings provide an international perspective that includes implications for the Arab world, China and Europe.

Available from: https://rpjc.researchpedia.info/rpjc_2020_4-a-review-of-career-selection-models/

Research paper thumbnail of Using network science to understand the link between subjects and professions

Computers in Human Behavior 106:106228, 2020

studied by people often impact their careers. The relation between education and careers has been... more studied by people often impact their careers. The relation between education and careers has been well studied by social scientists, however limited research on this relation is available in network science. Network science has emerged as a promising field to understand complex systems. We study the relation between education and careers from a network science perspective. In this research we propose methods from network science to understand the relation between the subjects studied by a person and the impact of these subjects on the career of the person. We model the relation between favorite subjects and careers using a network. The model helps in understanding the positive and negative contributions of certain subjects towards other subjects and careers. The results show that mathematics and English are the two basic subjects that are highly related to most of the professions. The detailed results are of particular significance for people associated with higher education and career development.

Research paper thumbnail of A Review of Career Selection Models

Researchpedia Journal of Computing, 2020

Career selection plays an important role in every industry. It has immense influence on individu... more Career selection plays an important role in every industry. It has immense influence on individuals’ futures. Due to its importance, many researchers have proposed models and guidelines for career selection to help people select careers systematically. This article discusses various career selection models and guidelines. It elaborates the elements and factors that contribute to the career selection process and classifies them according to specific professions. It also highlights the limitations of existing models and suggests solutions and directions for future work. As the studies discussed in this article were conducted in various countries, the findings provide an international perspective that includes implications for the Arab world, China and Europe

Research paper thumbnail of Measuring the Impact of Topic Drift in Scholarly Networks

With the increase in collaboration among researchers of various disciplines, changing the researc... more With the increase in collaboration among researchers of various disciplines, changing the research topic or working on multiple topics is not an unusual behavior. Several comprehensive efforts have been made for predicting, quantifying, and studying the researcher's impact. The question, that how the change in the field of interest over time or working in more than one topics can influence the scientific impact, remains unanswered. In this research, we study the effect of topic drift on the scientific impact of an author. We apply Author Conference Topic (ACT) model to extract topic distribution of individual authors who are working on multiple topics to compare and analyze with authors who work on a single topic. We analyze the productivity of the authors on the basis of publication count, citation count and h-index. We find that authors who stick to one topic, produce a higher impact and gain more attention. To further strengthen our results we gather the h-index of top-ranked authors working on one topic and top-ranked authors working on multiple topics and examine whether there are similar trends in their progress. The results show an evidence of significant impact of topic drift on career choices of researchers.

Research paper thumbnail of Region-wise Ranking of Sports Players based on Link Fusion

Players are ranked in various sports to show their importance over other players. Existing method... more Players are ranked in various sports to show their importance over other players. Existing methods only consider intra-type links (e.g., player to player and team to team), but ignore inter-type links (e.g., one type of player to another type of player, such as batsman to bowler and player to team) based on cognitive aspects. They also ignore the spatiality of the players. There is a strong relationship among players and their teams, which can be represented as a network consisting of multi-type interrelated objects. In this paper, we propose a players' ranking method, called Region-wise Players Link Fusion (RPLF) which is applied to the sport of cricket. RPLF considers players' region-wise intra-type and inter-type relation-based features to rank the players. Considering multi-type interrelated objects is based on the intuition that a batsman scoring high against top bowlers of a strong team or a bowler taking wickets against top batsmen of a strong team is considered as a good player. The experimental results show that RPLF provides promising insights of players' rankings. RLFP is a generic method and can be applied to different sports for ranking players.

Research paper thumbnail of Predicting Student Performance using Advanced Learning Analytics

Educational Data Mining (EDM) and Learning Analytics (LA) research have emerged as interesting ar... more Educational Data Mining (EDM) and Learning Analytics (LA) research have emerged as interesting areas of research, which are unfolding useful knowledge from educational databases for many purposes such as predicting students' success. The ability to predict a student's performance can be beneficial for actions in modern educational systems. Existing methods have used features which are mostly related to academic performance, family income and family assets; while features belonging to family expenditures and students' personal information are usually ignored. In this paper, an effort is made to investigate aforementioned feature sets by collecting the scholarship holding students' data from different universities of Pakistan. Learning analytics, discriminative and generative classification models are applied to predict whether a student will be able to complete his degree or not. Experimental results show that proposed method significantly outperforms existing methods due to exploitation of family expenditures and students' personal information feature sets. Outcomes of this EDM/LA research can serve as policy improvement method in higher education.

Research paper thumbnail of Finding Rising Stars in Co-Author Networks via Weighted Mutual Influence

Finding rising stars is a challenging and interesting task which has recently been investigated in co-author networks. Rising stars are authors who have a low research profile at the start of their career but may become experts in the future. This paper introduces a new method, Weighted Mutual Influence Rank (WMIRank), for finding rising stars. WMIRank exploits the influence of co-authors' citations, order of appearance and publication venues. Comprehensive experiments are performed to analyze the performance of WMIRank in comparison to baseline methods, which have ignored weighted mutual influence. AMiner data for the years 1995-2000 is used for the experiments. Lists of the top 30 authors as per the proposed and baseline methods are compared for their average number of papers, average number of citations and achievements. Experimental results provide convincing evidence of the effectiveness of the investigated weighted mutual influence.

Research paper thumbnail of An Adaptive Method for Clustering by Fast Search-and-Find of Density Peaks [Adaptive-DP]

Clustering by fast search and find of density peaks (DP) is a method in which density peaks are used to select the cluster centers. DP has two input parameters: 1) the cutoff distance and 2) the cluster centers. Also, in DP, different methods are used to measure the density of the underlying datasets. To overcome the limitations of DP, an Adaptive-DP method is proposed. In the Adaptive-DP method, heat diffusion is used to estimate density, the cutoff distance is simplified, and a novel method is used to discover the exact number of cluster centers adaptively. To validate the proposed method, we tested it on synthetic and real datasets, and comparisons are made with state-of-the-art clustering methods. The experimental results validate the robustness and effectiveness of the proposed method.
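To make the two inputs mentioned in this abstract concrete, the following is a minimal sketch of the baseline DP idea, not of Adaptive-DP (the heat-diffusion density estimate is not reproduced here). It assumes a simple cutoff-count density, and the function names `density_peaks` and `pick_centers`, as well as the explicit number of centers `k`, are illustrative choices rather than anything taken from the paper.

```python
import math

def density_peaks(points, cutoff):
    """Return (rho, delta) per point for the baseline density-peaks idea."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points strictly closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        # delta: distance to the nearest point with strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # Convention for the densest points: use the largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

def pick_centers(rho, delta, k):
    """Pick the k points with the largest rho * delta (k is an assumed input)."""
    order = sorted(range(len(rho)), key=lambda i: rho[i] * delta[i], reverse=True)
    return sorted(order[:k])

# Two small, well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (10.5, 10.5)]
rho, delta = density_peaks(points, cutoff=1.0)
centers = pick_centers(rho, delta, k=2)
```

Points that combine a high local density with a large distance to any denser point stand out as candidate centers. Both `cutoff` and `k` are supplied by hand in this sketch, which is the dependence on user input that the abstract says Adaptive-DP aims to reduce.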

Research paper thumbnail of Topic-based heterogeneous rank

Research paper thumbnail of Finding the top influential bloggers based on productivity and popularity features

New Review of Hypermedia and Multimedia

A blog acts as a platform of virtual communication to share comments or views about products, events and social issues. Like other social web activities, blogging actions spread to a large number of people. Users influence others in many ways, such as buying a product, having a particular political or social opinion or initiating a new activity. Finding the top influential bloggers is an active research domain as it helps in various fields, such as online marketing, e-commerce, product search and e-advertisements. There exist various models to find the influential bloggers, but they consider limited features using a non-modular approach. This paper proposes a new model, the Popularity and Productivity Model (PPM), based on a modular approach to find the top influential bloggers. It consists of popularity and productivity modules which exploit various features. We discuss the role of each proposed and existing feature and evaluate the proposed model against the standard baseline models using datasets from real-world blogs. The analysis using standard performance evaluation measures verifies that both the productivity and popularity modules play a vital role in finding influential bloggers in a blogging community effectively.

Research paper thumbnail of Standing on the shoulders of giants

Young scholars in academia often seek to work in collaboration with top researchers in their field in pursuit of a successful career. While success in academia can be defined differently, everyone agrees that training with a well-known researcher can help lead to an efficacious career. This study aims to investigate whether collaborating with established scientists does, in fact, improve junior scholars' chances of success. If not, what makes young scientists soar in their academic careers? We investigate this question by analyzing the effect of collaboration with a known star on the success of a young scholar. The results suggest that working with leading experts can lead to a successful career, but that it is not the only way. Researchers who were not fortunate enough to start their career with an elite researcher could still succeed through hard work and passion. These findings emerged from analyses of the effect of two discrete sets of well-known scholars on the careers of newcomers, suggesting their strength and validity.

Research paper thumbnail of Unified Author Ranking based on Integrated Publication and Venue Rank

Authors' ranking can be used to determine the authenticity of authors in a particular domain. Several different methods for author ranking focusing on the number of publications and the number of citations have been proposed. In this paper, we propose ranking algorithms for publications, conferences, journals and their respective authors. In publication ranking, both incoming and outgoing citations are considered. If a publication appears in a well-reputed venue (conference or journal), then it is expected to have a high number of citations. Consequently, due importance is given to venues, and their scores are computed from the popularity of their publications. Both the publications' ranking and the venue scores are used to rank authors, where authors who have published in well-reputed venues receive added benefit. We use multiple features to rank publications and venues effectively. These scores are then used for ranking authors, instead of using only the number of citations. Results of a comparative study show a significant improvement in author ranking due to the inclusion of the proposed features.

Research paper thumbnail of Modelling to identify influential bloggers in the blogosphere: A survey

The user participatory nature of the social web has revolutionized the use of the conventional web. The social web is an integral part of our daily life. Due to the resulting exponential growth of the social web, a number of research domains have emerged, involving research activities that aim to study human nature, to analyse human sentiments and emotions, and to find the impact of various users in the social networks. Recently, the research focus has shifted to identifying a user's influence on other users in a social network. In the recent literature, we find a number of models proposed to find the most influential users in the blogging community. In this paper, we review the models to find these influential bloggers. The existing models are classified into feature-based and network-based categories. The feature-based models consider the salient factors to measure bloggers' influence. The network models, on the other hand, consider the graph-based social network structure of the bloggers to identify those who have the most impact on fellow members. This survey introduces each model with its features, novel aspects, and the datasets used. In addition to the discussion about the model, a comparative analysis of the datasets is presented. We conclude by discussing applications of the relevant literature, exploring open research issues and challenges, and sharing possible future directions in this active area of research.

Research paper thumbnail of Urdu language processing: a survey

Far more work has been done on natural language processing tasks for Western languages than for their Eastern counterparts, particularly South Asian languages. Western languages are termed resource-rich languages: core linguistic resources such as corpora, WordNet, dictionaries, gazetteers and associated tools developed for Western languages are customarily available. Most South Asian languages are low-resource languages. Urdu, for example, is a South Asian language that is among the most widely spoken languages of the sub-continent, yet due to resource scarcity not enough work has been conducted for it. The core objective of this paper is to present a survey of the linguistic resources that exist for Urdu language processing, to highlight different tasks in Urdu language processing and to discuss the available state-of-the-art techniques. Overall, this paper attempts to describe in detail the recent increase in interest and progress made in Urdu language processing research. Initially, the available datasets for the Urdu language are discussed. The characteristics, resource sharing between Hindi and Urdu, orthography, and morphology of the Urdu language are provided. Aspects of pre-processing activities such as stop-word removal, diacritics removal, normalization and stemming are illustrated. State-of-the-art research is reviewed for tasks such as tokenization, sentence boundary detection, part-of-speech tagging, named entity recognition, parsing and the development of WordNet. In addition, the impact of ULP on application areas such as information retrieval, classification and plagiarism detection is investigated. Finally, open issues and future directions for this new and dynamic area of research are provided. The goal of this paper is to organize the ULP work so that it can provide a platform for future ULP research activities.

Keywords: Urdu language processing (ULP) · Datasets · Characteristics · Natural language processing (NLP) · Part-of-speech (POS) · Named entity recognition (NER) · Sentence boundary detection (SBD)

Research paper thumbnail of Expert Ranking using Reputation and Answer Quality of Co-existing Users

Online discussion forums provide knowledge-sharing facilities to online communities. Usage of online discussion forums has increased tremendously due to the variety of services and the ability of common users to ask questions and provide answers. With the passage of time, these forums can accumulate huge amounts of content. Some of the posted discussions may not contain quality content and may reflect users' personal opinions about a topic, which may contradict a relevant answer. These low-quality discussions indicate the existence of unprofessional users. Therefore, it is imperative to rank experts in online forums. Most of the existing expert-ranking techniques consider only a user's social network authority and content relevancy features as parameters for evaluating user expertise, but user reputation as a group member of thread repliers is not considered. In this context, a novel solution for expert ranking in online discussion forums is proposed. We propose two expert-ranking techniques: the first is based on the reputation of a user and of their co-existing users in different threaded discussions, and the second is based on the quality of a user's answers and their category specialty features. Furthermore, we extend the ExpertiseRank technique with our proposed feature sets. The experimental study based on a real dataset shows that the proposed techniques perform better than existing techniques.

Research paper thumbnail of AUTHOR PRODUCTIVITY INDEXING VIA TOPIC SENSITIVE WEIGHTED CITATIONS

Different author productivity indexing methods have been proposed in order to rank scientists on the basis of their research work. The author productivity indexing methods present in the literature do not consider the topic-based contribution of authors when assigning them weighted citations in a multi-authored paper. This study proposes the TSWC-index, which assigns Topic Sensitive Weighted Citations to the authors of a paper according to their topic relatedness. The topic of each co-author of a paper is checked against that of its first author, and more weight is assigned to co-authors whose topic is the same as the first author's. The results are compared with the h-index and the kth rank index. The proposed method shows a clear difference among authors' full citation scores, weighted citation scores and topic sensitive weighted citation scores.
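For reference, the h-index used as a comparison baseline in this abstract can be computed as in the sketch below. It follows the standard definition (the largest h such that h papers have at least h citations each) and does not implement the TSWC-index, whose weighting scheme is not specified here. The function name `h_index` is an illustrative choice.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h
```

Because every co-author receives full credit for each citation under this definition, it cannot distinguish contributions within a multi-authored paper, which is the gap the weighted-citation approach described above is meant to address.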

Research paper thumbnail of 2016-PLOS One Jrl-MIIB A Metric to Identify Top Influential Bloggers in a Community.pdf

Research paper thumbnail of 2016-KJS Jrl-Impact of mutual influence while ranking authors in a co-authorship network.pdf

Online bibliographic databases provide significant resources for analyzing academic social networks. We believe that the work of an author is always influenced by the work of his or her co-authors. In this study, we investigate the impact of the productivity and quality of work of an author's co-authors on his or her ranking, along with his or her own contribution. We propose a mutual influence (MI) based ranking method, which ranks authors based on (1) the publications of an author, along with the impact of the publications of his or her co-authors, (2) a normalized author-position-based citation weight, which is calculated from the citations received by an author with respect to the position of his or her name in the co-author list, and (3) MINCC, which combines the impact of both factors. A series of experiments has been conducted, and the results show that the proposed approach is capable of ranking authors in a significant way.

Research paper thumbnail of 2016-KJS Journal-A survey on the state-of-the-art machine learning models in the context of NLP.pdf

Machine learning and statistical techniques are powerful analysis tools yet to be fully incorporated in the new multidisciplinary field variously termed natural language processing (NLP) or computational linguistics. Linguistic knowledge may be ambiguous; therefore, various NLP tasks are carried out in order to resolve the ambiguity in speech and language processing. The currently prevailing techniques for addressing various NLP tasks as supervised learning are hidden Markov models (HMM), conditional random fields (CRF), maximum entropy models (MaxEnt), support vector machines (SVM), Naïve Bayes, and deep learning (DL). The goal of this survey paper is to highlight ambiguity in speech and language processing, to provide a brief overview of the basic categories of linguistic knowledge, to discuss different existing machine learning models and their classification into different categories, and finally to provide a comprehensive review of different state-of-the-art machine learning models, so that new researchers can look into these techniques and, building on them, develop advanced techniques. In this survey we review how avant-garde machine learning models can help in this dilemma.

Research paper thumbnail of 2016-JOS Jrl-A novel framework for social web forums thread ranking based on semantics and post quality features.pdf

Online discussion forums are a valuable source of knowledge. Users may share or exchange ideas by posting content in the form of questions and answers. With the increasing volume of online content in the form of forums, finding relevant information in forums can be a challenging task, and knowledge management and quality assurance of this content are of critical importance. Although online discussion forums offer search services, in most cases only keyword search is provided. In keyword search techniques, such as cosine similarity, lexical overlap between query and document terms is considered; however, these techniques do not consider the context or meaning of the terms, and thus fail to retrieve the relevant documents. Earlier content-based research efforts for improving the performance of thread retrieval were primarily based on the cosine similarity technique. The cosine similarity technique assigns term weights based on term frequency and inverse document frequency; however, it does not consider discussion semantics, which may lead to less effective document retrieval. To address these issues, we have proposed two thread ranking techniques for online discussion forums: (1) threads are ranked on the basis of a semantic similarity score between posts and (2) threads are ranked based on their participants' reputation and posts' quality. The proposed work provides a performance comparison between semantic similarity techniques and cosine similarity techniques, along with reputation and post quality features, in the thread ranking process. Experimental results obtained using a real online forum dataset demonstrate that the proposed techniques have significantly improved thread ranking performance.
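The limitation of keyword-based cosine similarity described in this abstract can be illustrated with a small sketch. It assumes whitespace tokenization and a plain `tf * log(N / df)` weighting, which is one common TF-IDF variant and not necessarily the one used in the paper. The helper names `tfidf_vectors` and `cosine` are illustrative.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) using tf * log(N / df) weights."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

docs = ["car repair tips", "automobile fix advice", "car repair manual"]
vecs = tfidf_vectors(docs)
same_words = cosine(vecs[0], vecs[2])      # shares "car" and "repair"
same_meaning = cosine(vecs[0], vecs[1])    # no shared terms at all
```

Here `same_meaning` is zero even though the first two documents are about the same thing, because the score depends only on lexical overlap. This is the gap that a semantic similarity score between posts is intended to close.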