Computing Semantic Similarity between Skill Statements for Approximate Matching

Computing Similarities between Natural Language Descriptions of Knowledge and Skills

Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). Copies may be requested from IBM T.

Finding Skills through ranked semantic match of descriptions

2003

Abstract: We propose a formal approach to Ontology-Based Semantic Matchmaking between Skills request and offer, devised as a virtual marketplace of knowledge. In such a knowledge market metaphor, skills are a peculiar kind of goods that have distinguishing characteristics with respect to traditional assets. Buyers are entities that need the skills of people, such as projects, departments and organizations; sellers are workers that offer their own skills.

A Graph-Based Approach to Skill Extraction from Text

Proceedings of the TextGraphs-8 Workshop, pages 79–87, Seattle, Washington, USA, 18 October 2013. Association for Computational Linguistics, 2013

This paper presents a system that performs skill extraction from text documents. It outputs a list of professional skills that are relevant to a given input text. We argue that the system can be practical for hiring and management of personnel in an organization. We make use of the texts and the hyperlink graph of Wikipedia, as well as a list of professional skills obtained from the LinkedIn social network. The system is based on first computing similarities between an input document and the texts of Wikipedia pages and then using a biased, hub-avoiding version of the Spreading Activation algorithm on the Wikipedia graph in order to associate the input document with skills.
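The second stage described above can be illustrated with a minimal sketch of spreading activation over a link graph. This is not the paper's implementation: the function name `spread_activation`, the `decay` and `steps` parameters, and the use of in-degree as the hub penalty are all assumptions made for illustration, and the initial activations stand in for the document-to-page similarities computed in the first stage.

```python
from collections import defaultdict

def spread_activation(graph, initial, skills, steps=2, decay=0.5):
    """Rank skill nodes by activation received from seed pages.

    graph:   dict mapping a page to the list of pages it links to
    initial: dict mapping seed pages to their starting activation
             (e.g. text similarity to the input document)
    skills:  collection of nodes that count as skills
    """
    # Assumption of this sketch: a node's in-degree acts as the hub penalty,
    # so heavily linked pages receive proportionally less activation.
    in_degree = defaultdict(int)
    for neighbors in graph.values():
        for nb in neighbors:
            in_degree[nb] += 1

    activation = dict(initial)
    frontier = dict(initial)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            neighbors = graph.get(node, [])
            if not neighbors:
                continue
            # Split the decayed activation evenly among outgoing links.
            share = decay * value / len(neighbors)
            for nb in neighbors:
                next_frontier[nb] += share / in_degree[nb]
        for node, value in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + value
        frontier = next_frontier

    ranked = sorted(((activation.get(s, 0.0), s) for s in skills), reverse=True)
    return [(s, a) for a, s in ranked if a > 0]


# Toy graph with hypothetical page names.
toy_graph = {
    "page_a": ["Python", "Hub"],
    "page_b": ["Python"],
    "Hub": ["Java"],
}
toy_seeds = {"page_a": 1.0, "page_b": 0.5}
toy_ranking = spread_activation(toy_graph, toy_seeds, skills={"Python", "Java"})
```

In this toy graph, "Python" is reached directly from both seed pages and ranks above "Java", which is reached only through an intermediate page after a second decay step.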

Towards A Skills Taxonomy

When evaluating job applications, recruiters and employers try to determine whether the information provided by a job seeker is accurate and whether it describes an individual who possesses sufficient skills. These questions are related to the hidden skills and skills resolution problems. In this paper we argue that using a skills taxonomy to identify and resolve unknown relationships between texts that describe an applicant and job descriptions is the best way of addressing these problems. Unfortunately, no comprehensive, publicly available taxonomy exists. To this end, this work proposes an automated process for creating a skills taxonomy. Effective and efficient methods for bootstrapping a taxonomy are critical to any process that names and characterizes the properties and interrelationships of entities. We therefore present three potential methods for bootstrapping and extending our skills taxonomy, and propose a hybrid scheme that combines the beneficial features of those methods. Our hybrid approach seeds the bootstrapping process with publicly available resources and identifies new skill terms and corresponding entity relationships. In this paper, we focus specifically on using Wikipedia as our corpus and exploiting its structure to populate the taxonomy. We begin by constructing a relationship graph of possible skill terms from Wikipedia. We then use a data mining methodology to identify skill terms. Our results are promising, and we are able to achieve a 98% classification rate.

Integrating Semantic Knowledge into Text Similarity and Information Retrieval

International Conference on Semantic Computing (ICSC 2007), 2007

This paper studies the influence of lexical semantic knowledge upon two related tasks: ad-hoc information retrieval and text similarity. For this purpose, we compare the performance of two algorithms: (i) using semantic relatedness, and (ii) using a conventional extended Boolean model [12]. For the evaluation, we use two different test collections in the German language: (i) GIRT [5] for the information retrieval task, and (ii) a collection of descriptions of professions built to evaluate a system for electronic career guidance in the information retrieval and text similarity task. We found that integrating lexical semantic knowledge improves performance for both tasks. On the GIRT corpus, the performance is improved only for short queries. The performance on the collection of professional descriptions is improved, but crucially depends on the preprocessing of natural language essays employed as topics.

Determination of Professional Competencies Using an Alignment Algorithm of Academic Profiles and Job Advertisements, Based on Competence Thesauri and Similarity Measures

International Journal of Artificial Intelligence in Education, 2019

Describing the competencies required by a profession is essential for aligning online profiles of job seekers and job advertisements. Comparing the competencies described within each context has typically not been done, which has generated a complete disconnect in language between them. This work presents an approach for the alignment of online profiles and job advertisements, according to knowledge and skills, using measures of lexical, syntactic and taxonomic similarity. In addition, we use a ranking that allows the alignment of the profiles to the topics of a thesaurus that define competencies. The results are promising, because combining the similarity measures with the alignment to competence thesauri makes the generation of professional competence descriptions more robust. This combination allows dealing with the common problems of synonymy, homonymy, hypernymy/hyponymy and meronymy of the terms in Spanish. This research uses natural language processing to offer a novel approach for assessing the match of the competencies described by the applicants and by the employers, even if they use different terminology. The resulting approach, while developed in Spanish for computer science jobs, can be extended to other languages and domains, such as recruitment, where it will contribute to the creation of better tools that give feedback to job seekers about how to best align their competencies with job opportunities.

On Constructing, Grouping and Using Topical Ontology for Semantic Matching

2009

An ontology topic is used to group concepts from different contexts (or even from different domain ontologies). This paper presents a pattern-driven modeling methodology for constructing and grouping topics in an ontology (PAD-ON methodology), which is used for matching similarities between competences in the human resource management (HRM) domain. The methodology is supported by a tool called PAD-ON. This paper demonstrates our recent achievements in the EC Prolix project. The approach is applied to the training processes at British Telecom as the test bed.

Identifying Competences in IT Professionals through Semantics

Analyzing the Future

In current organizations, the importance of knowledge and competence is unquestionable. In Information Technology (IT) companies, which are, by definition, knowledge intensive, this importance is critical. In such organizations, the models of knowledge exploitation include specific processes and elements that drive the production of knowledge aimed at satisfying organizational objectives. However, competence evidence recollection is a highly intensive and time-consuming task, and it is the problem this system addresses. SeCEC-IT is a tool based on software artifacts that extracts relevant information using natural language processing techniques. It enables competence evidence detection by deducing competence facts from documents in an automated way. The technological components of SeCEC-IT include semantic technologies, natural language processing, and human resource communication standards (HR-XML).

A syntactic approach for searching similarities within sentences

Proceedings of the eleventh international conference on Information and knowledge management - CIKM '02, 2002

Textual data is the main electronic form of knowledge representation. Sentences, meant as logic units of meaningful word sequences, can be considered its backbone. In this paper, we propose a solution based on a purely syntactic approach for searching similarities within sentences, named approximate sub² sequence matching. Because this process is very time-consuming, efficiency in retrieving the most similar parts available in large repositories of textual data is ensured by making use of new filtering techniques. As far as the design of the system is concerned, we chose a solution that allows us to deploy approximate sub² sequence matching without changing the underlying database.
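To make the idea of approximately matching part of a sentence concrete, here is a minimal sketch that finds the contiguous run of words in a text closest to a query, using word-level edit distance. It is a plain dynamic-programming stand-in, not the paper's algorithm: it includes none of the filtering techniques or database integration the abstract mentions, and the function name `best_subsequence_match` is hypothetical.

```python
def best_subsequence_match(query, text):
    """Return (distance, end_index) for the contiguous part of `text`
    that is closest to `query` under word-level edit distance.

    end_index is the 1-based position of the last matched word in `text`;
    it is 0 if skipping the text entirely is the best option.
    """
    q = query.split()
    t = text.split()

    # prev[i] = cost of aligning q[:i] against a part of the text ending
    # at the previous word; before any text word is read this is i deletions.
    prev = list(range(len(q) + 1))
    best = (prev[-1], 0)

    for j, word in enumerate(t, start=1):
        curr = [0]  # a match may start at any text position at zero cost
        for i in range(1, len(q) + 1):
            cost = 0 if q[i - 1] == word else 1
            curr.append(min(
                prev[i - 1] + cost,  # match or substitute a word
                prev[i] + 1,         # extra word in the text
                curr[i - 1] + 1,     # query word missing from the text
            ))
        if curr[-1] < best[0]:
            best = (curr[-1], j)
        prev = curr
    return best


match = best_subsequence_match(
    "install the printer driver",
    "please install a printer driver today",
)
```

In this example the best match ends at the fifth word of the text ("driver") with a distance of 1, because "the" has to be replaced by "a". The cost is proportional to the product of the two lengths, which is the kind of expense that motivates filtering before matching on large repositories.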