Semi-automatic taxonomy development for research data collections: the case of wind energy

Trend Topic Analysis for Wind Energy Researches: A Data Mining Approach Using Text Mining

Journal of Technology Innovations in Renewable Energy, 2016

This study reviews and analyses recent research, development, and trends in the applications of wind energy, and summarizes the topic. We show the usage and influence of text mining on different aspects of wind energy systems, especially for identifying hot topics and trends in the wind energy area. Text mining provides an overview of the state of the art in this area that can guide future research work. The main results of the study show that text mining techniques are adequate for serving as a proof of concept and as a test-bed for deriving requirements for the development of more generally applicable text mining tools and services within wind energy science.

A Unified Approach for Taxonomy-Based Technology Forecasting

2011

For decision makers and researchers working in a technical domain, understanding the state of their area of interest is of the highest importance. For this reason, we consider in this chapter a novel framework for Web-based technology forecasting using bibliometrics (i.e. the analysis of information from trends and patterns of scientific publications). The proposed framework consists of a few conceptual stages based on a data acquisition process from online bibliographic repositories: the extraction of domain-relevant keywords, the generation of a taxonomy of the research field of interest, and the development of early growth indicators which help to find interesting technologies in their first phase of development. To provide a concrete application domain for developing and testing our tools, we conducted a case study in the field of renewable energy and in particular one of its subfields: Waste-to-Energy (W2E). The results on this particular research domain confirm the benefit of our approach.

K-Words Lab: studying the dynamics of a scientific field with keyword analysis

Research subject and hypothesis. We argue that it is possible to follow the dynamics of a scientific field as it emerges. Being able to do so allows, for example, industrial actors to position themselves with regard to their competences and public policy makers to support the development of the field. To that end, we follow traces of the field's emergence through publication analysis. Traditional methods used in bibliometric analysis (journal analysis, co-author analysis, co-word analysis...) are relevant when the field is structured and bounded. These methods cannot be used when the main authors of the field and its boundaries are unknown. The keywords we use are either determined by the author, given by the Web of Science, or included in the publication abstract and title. Characterising this kind of source requires programs that are robust enough to handle large amounts of data (over 500 000 publication abstracts for the nano sciences and technology corpus).

The Approach to the Meta-description of the Interdisciplinary Research Terminological Landscape

CEUR Workshop Proceedings, 2020

The paper aims to conduct comprehensive research on methods and tools for searching and explicating the contexts of arrays of scientific information, and for visualizing hierarchical and associative relations between terms with the context taken into account. The findings indicate that both a terminological landscape and a meta-description construction are necessary and useful for most interdisciplinary research, for further explication of contextual knowledge, and for forecasting the societal impact of studies. The research focuses on the ontological approach to the structured description of contextual knowledge and on the thesaurus of a scientific interdisciplinary domain. As part of a study on developing an integrated approach to analysing the terminological base of emerging interdisciplinary research, the paper proposes an ontological approach to the structured description of contextual knowledge, and to the structure and meta-description of the thesaurus of individual interdisciplinary areas. The structure of the thesaurus and the meta-description of its elements are proposed to be based on the Dublin Core Metadata Element Set. This allows the thesaurus to be used for automated search and identification of contextual knowledge by search engines. In its applied aspect, the paper considers a method of creating open-access electronic archives for the further replenishment, systematization, and study of contextual knowledge. Describing contextual knowledge with Dublin Core makes it possible to automate the exchange of meta-descriptions using the standard OAI-PMH protocol.

Determining influential factors and challenges in automatic taxonomy generation: a systematic literature review of techniques 1999-2016

Inf. Res., 2019

Introduction. Taxonomy is an effective means of managing and accessing a large amount of digital information. Various techniques have been developed to generate taxonomies automatically. The purpose of this study is threefold: (i) review the methods and approaches adopted during taxonomy generation, (ii) identify the factors influencing the choice of a particular method or approach, (iii) highlight issues and open challenges. Method. This paper adopts the systematic literature review approach proposed by Kitchenham, and the nature of this review is qualitative. Analysis. A total of thirty techniques were reviewed and grouped into categories and subcategories. An in-depth analysis of the existing techniques was performed based on this categorization. This helps to identify the factors influencing the choice of a particular method or approach, and to determine the issues and challenges associated with automatic taxonomy generation. Results. Four major factors influencing the choice of a particular method or approach for generating a taxonomy have been identified. Moreover, five major challenges associated with the existing automatic taxonomy generation techniques have been highlighted. Conclusions. This paper presents a comprehensive review of taxonomy generation so that taxonomies can be used effectively, and it highlights open challenges for future research in the area of taxonomy generation so that new and improved techniques can be developed.

Classification of Keywords Selected from Research Articles on Physics and Development of a Quantitative Subject Access Tool

2013

All research articles begin with a title. Most include an abstract. Several include keywords. All three of these features describe an article's content in detail. The title gives an instant reflection of the central theme of the research topic. The abstract summarizes the content. The keywords indicate the core and allied fields of concern. Researchers and indexers quickly and easily locate particular articles within their areas of interest with the aid of keywords. Keywords hold prime importance in abstracting and indexing services and play a major role in information retrieval. This paper is based on an analysis of 14,221 keywords collected from 2,526 research articles published in three journals, viz. Chaos, Physics of Plasmas and Low Temperature Physics, from 2006 to 2012. Out of all these author-assigned keywords, the number of distinct bits obtained was 2,571. After collection, the lexically close keywords are identified and grouped into clusters. Several such clusters are found, and the composition of keywords in nearly all clusters varies over the said time span. Four indicators have been defined on the basis of the fluctuating keyword composition within clusters. These four indicators are named stability index, integrated visibility index, momentary visibility index and potency index, respectively. The indicators take different values for different clusters. Their value ranges are categorized into five groups, viz. very high, high, medium, low and very low. A new quantitative subject access tool has been proposed on the basis of these indicators, which can predict the probable new and obsolete keywords in any subject domain. This new tool is named keysaurus, i.e., keyword-based thesaurus.
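The step of grouping lexically close keywords into clusters can be illustrated with a minimal Python sketch. The abstract does not say how lexical closeness is measured, so the use of `difflib.SequenceMatcher`, the greedy comparison against each cluster's first member, and the `threshold` value are all assumptions made for illustration, not the authors' method.

```python
from difflib import SequenceMatcher


def lexical_clusters(keywords, threshold=0.8):
    """Greedily group keywords by string similarity.

    Assumption: a keyword joins the first cluster whose first member
    has a SequenceMatcher ratio of at least `threshold` with it.
    """
    clusters = []
    for kw in keywords:
        norm = kw.lower().strip()  # normalise case and whitespace
        for cluster in clusters:
            if SequenceMatcher(None, norm, cluster[0]).ratio() >= threshold:
                cluster.append(norm)
                break
        else:
            # No sufficiently close cluster found: start a new one.
            clusters.append([norm])
    return clusters


# Hypothetical sample keywords, not taken from the study's data.
sample = ["Plasma wave", "plasma waves", "superconductivity"]
clusters = lexical_clusters(sample)
```

With the sample above, the two plasma-related spellings fall into one cluster and the unrelated keyword forms its own. Tracking how the membership of such clusters changes per year would be the starting point for indicators like those the abstract names.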

Dynamic Analysis of the Similarity of Objects in Research on the Use of Renewable Energy Resources in European Union Countries

Energies

The energy transformation towards renewable energy sources in the conditions of climate change and the accompanying climate risk is a priority for all countries in the world. However, the degree of advancement of activities in this area varies significantly between countries, which is the result of different activities for renewable energy sources in individual countries. The aim of this article is to determine the trends of changes in the use of renewable energy sources in EU countries. The study uses TMD (taxonomic measure of development) methods and dynamic classification, which made it possible to distinguish typological groups of objects with similar dynamics of the studied phenomenon. The EU 28 countries were analyzed. Statistics (Eurostat database) are provided for the period 2004–2019. As a result of the research, it was found that the Scandinavian countries and the countries of Western Europe were characterized by the highest stability in terms of the use of renewable en...

Mapping research topics using word-reference co-occurrences: A method and an exploratory case study

Scientometrics, 2006

Mapping of science and technology can be done at different levels of aggregation, using a variety of methods. In this paper, we propose a method in which title words are used as indicators for the content of a research topic, and cited references are used as the context in which words get their meaning. Research topics are represented by sets of papers that are similar in terms of these word-reference combinations. In this way we use words without neglecting differences and changes in their meanings. The method has several advantages, such as high coverage of publications. As an illustration we apply the method to produce knowledge maps of information science.
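The idea of representing a paper by word-reference combinations can be sketched in Python as follows. The abstract does not specify how similarity between papers is computed, so the Jaccard measure used here is an assumption, and the reference labels in the example are hypothetical placeholders.

```python
from itertools import product


def word_reference_pairs(title_words, cited_refs):
    """Build the set of (title word, cited reference) combinations for a paper."""
    return {(w.lower(), r) for w, r in product(title_words, cited_refs)}


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not from the paper)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical papers: title words plus placeholder reference labels.
p1 = word_reference_pairs(["citation", "analysis"], ["RefA", "RefB"])
p2 = word_reference_pairs(["citation", "mapping"], ["RefB"])
p3 = word_reference_pairs(["citation"], ["RefC"])

sim_12 = jaccard(p1, p2)  # share the pair ("citation", "RefB")
sim_13 = jaccard(p1, p3)  # same word, different reference context
```

Here `p1` and `p3` both use the word "citation" but cite different references, so they share no combination. This reflects the abstract's point that words are used without neglecting differences in their meaning.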

Why IEEE Xplore Matters for Research Trend Analysis in the Energy Sector

Energy Systems Research, 2021

The paper aims to briefly compare and analyze the results of queries to IEEE Xplore and the leading abstract databases Scopus and Web of Science to identify research trends. Some errors were revealed in the Author Keywords in Web of Science. Therefore, a more detailed analysis that involved comparing various types of key terms was made only for the IEEE Xplore and Scopus platforms. The study employed IEEE Access journal metadata as indexed on both platforms. Sample matching for IEEE Xplore and Scopus was achieved by comparing DOIs. The IEEE Xplore metadata contains more key term types, which provides an advantage in analyzing research trends. Using INSPEC Controlled Terms from an expert-compiled vocabulary provides more stable data, which gives an advantage when considering the change of terms over time. Apriori, an algorithm for finding association rules, was used to compare the co-occurrence of the terms for a more detailed description of sample subjects on both platforms. VOSviewer was used to analyze trends in scientific research based on IEEE Xplore data. The 2011-2021 ten-year period was divided into two sub-intervals for comparing the occurrence of Author Keywords, IEEE Terms, and INSPEC Controlled Terms. Bibliometric data of the IEEE conference proceedings was used to illustrate the importance of context in estimating the growth rate of publishing activity on a topic of interest.
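The kind of term co-occurrence analysis that Apriori supports can be sketched in Python as below. This is only a simplified illustration covering the first two levels of Apriori-style counting (frequent single terms, then frequent pairs built from them) plus rule confidence. The function names, the support threshold, and the sample records are assumptions, not the study's actual setup or data.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for term pairs meeting min_support.

    Pairs are built only from terms that are frequent on their own,
    which is the pruning idea behind Apriori.
    """
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c / n >= min_support}
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)  # sort for stable pair keys
        pair_counts.update(combinations(kept, 2))
    return {p: c / n for p, c in pair_counts.items() if c / n >= min_support}


def rule_confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    with_a = [t for t in transactions if antecedent in t]
    if not with_a:
        return 0.0
    return sum(consequent in t for t in with_a) / len(with_a)


# Hypothetical keyword sets, one per publication record.
records = [
    {"wind power", "forecasting"},
    {"wind power", "forecasting", "neural networks"},
    {"wind power", "power grids"},
    {"solar energy", "forecasting"},
]
pairs = frequent_pairs(records, min_support=0.5)
conf = rule_confidence(records, "wind power", "forecasting")
```

Running the same counting separately on the key terms from each platform, or on each sub-interval of the period, would give comparable support and confidence values for the same term pairs.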