User-Oriented Approach to Data Quality Evaluation
Related papers
Executable Data Quality Models
Procedia Computer Science, 2017
The paper discusses an external solution for data quality management in information systems. In contrast to traditional data quality assurance methods, the proposed approach uses a domain-specific language (DSL) to describe data quality models. Data quality models consist of graphical diagrams whose elements contain requirements for data object values and procedures for data object analysis. The DSL interpreter makes the data quality model executable, thereby enabling the measurement and improvement of data quality. The described approach can be applied: (1) to check the completeness, accuracy and consistency of accumulated data; (2) to support data migration in cases when software architecture and/or data models are changed; (3) to gather data from different data sources and to transfer them to a data warehouse.
Domain-Specific Characteristics of Data Quality
Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, 2017
The research discusses how to describe data quality and what should be taken into account when developing a universal data quality management solution. The proposed approach is to create quality specifications for each kind of data object and to make them executable. The specification can be executed step by step according to business process descriptions, ensuring the gradual accumulation of data in the database and data quality checking according to the specific use case. The described approach can be applied to check the completeness, accuracy, timeliness and consistency of accumulated data.
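As a rough illustration of what an executable quality specification for one kind of data object could look like, the sketch below expresses requirements as named predicates and runs them against a record. The names `Rule`, `QualitySpec`, and the sample "person" fields are hypothetical and are not taken from the paper, which describes the idea at the level of specifications rather than Python code.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Rule:
    """One named requirement on a data object (hypothetical structure)."""
    name: str
    check: Callable[[Record], bool]


@dataclass
class QualitySpec:
    """A quality specification for one kind of data object."""
    rules: List[Rule] = field(default_factory=list)

    def evaluate(self, record: Record) -> List[str]:
        """Return the names of the rules that the record violates."""
        return [rule.name for rule in self.rules if not rule.check(record)]


# Illustrative specification for a made-up "person" data object.
person_spec = QualitySpec(rules=[
    Rule("name is present", lambda r: bool(r.get("name"))),
    Rule("age is a non-negative integer",
         lambda r: isinstance(r.get("age"), int) and r["age"] >= 0),
])
```

Calling `person_spec.evaluate(record)` returns an empty list for a record that meets every requirement, so the same specification could be run at each step of a business process as data accumulates.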
Riga, Latvia Executable Data Quality Models
2018
The paper discusses an external solution for data quality management in information systems. In contrast to traditional data quality assurance methods, the proposed approach uses a domain-specific language (DSL) to describe data quality models. Data quality models consist of graphical diagrams whose elements contain requirements for data object values and procedures for data object analysis. The DSL interpreter makes the data quality model executable, thereby enabling the measurement and improvement of data quality. The described approach can be applied: (1) to check the completeness, accuracy and consistency of accumulated data; (2) to support data migration in cases when software architecture and/or data models are changed; (3) to gather data from different data sources and to transfer them to a data warehouse. © 2016 The Authors. Published by Elsevier B.V. Peer-review under responsibility of organizing committee of the scientific committee of the internat...
Developing Data Quality Aware Applications
2009 Ninth International Conference on Quality Software, 2009
Inadequate levels of Data Quality (DQ) in Information Systems (IS) pose a very important problem for organizations, which therefore seek to assure data quality from the early stages of information system development. This paper proposes incorporating mechanisms into software development methodologies in order to integrate users' DQ requirements, with the aim of assuring data quality from the beginning of development. It presents a framework of well-defined processes, activities and tasks that can be incorporated into an existing software development methodology, such as METRICA V3, and thereby assure the data quality of software products created according to this methodology. The extension presented is a guideline that can be extended and applied to other development methodologies, such as the Unified Development Process.
An Extended Data Object-driven Approach to Data Quality Evaluation: Contextual Data Quality Analysis
2019
This research extends a data object-driven approach to data quality evaluation, allowing data object quality to be analysed in the scope of multiple data objects. The previously presented approach was used to analyse one particular data object, focusing mainly on syntactic analysis. With the extension, the quality of a primary data object can be analysed against an unlimited number of secondary data objects. This allows a more comprehensive, in-depth contextual analysis of data objects. The analysis was applied to open data sets, and the results were compared with previously obtained results, underlining the importance and benefits of the extension.
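One simple form of contextual analysis, checking a primary data object against a secondary one, might be sketched as follows. This is only an assumption about what such a check could involve; the function name `contextual_violations` and the field names in the usage example are hypothetical and do not come from the paper.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def contextual_violations(primary: Iterable[Record],
                          secondary: Iterable[Record],
                          primary_field: str,
                          secondary_field: str) -> List[Record]:
    """Return primary records whose field value has no match in the secondary data."""
    # Collect the values that the secondary data object treats as valid.
    known = {row[secondary_field] for row in secondary}
    # A primary record is flagged when its value is absent from that set.
    return [row for row in primary if row.get(primary_field) not in known]


# Made-up data: companies (primary) checked against a country register (secondary).
companies = [
    {"name": "Alpha", "country": "LV"},
    {"name": "Beta", "country": "XX"},
]
countries = [{"code": "LV"}, {"code": "EE"}]
flagged = contextual_violations(companies, countries, "country", "code")
```

Here `flagged` holds only the "Beta" record, whose country value is syntactically plausible but does not appear in the secondary data set. That is the kind of defect a purely syntactic check on the primary object alone would miss.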
A Software Engineering View of Data Quality
Thirty years ago, software was not considered a concrete value. Everyone agreed on its importance, but it was not considered as a good or possession. Nowadays, software is part of the balance of an organization. Data is slowly following the same process. The information owned by an organization is an important part of its assets. Information can be used as a competitive advantage. However, data has long been underestimated by the software community. Usually, methods and techniques apply to software (including data schemata), but the data itself has often been considered as an external problem. Validation and verification techniques usually assume that data is provided by an external agent and concentrate only on software.
Data quality assessment and improvement
International Journal of Business Information Systems, 2016
Data quality has significance to companies, but is an issue that can be challenging to approach and operationalise. This study focuses on data quality from the perspective of operationalisation by analysing the practices of a company that is a world leader in its business. A model is proposed for managing data quality to enable evaluation and operationalisation. The results indicate that data quality is best ensured when organisation specific aspects are taken into account. The model acknowledges the needs of different data domains, particularly those that have master data characteristics. The proposed model can provide a starting point for operationalising data quality assessment and improvement. The consequent appreciation of data quality improves data maintenance processes, IT solutions, data quality and relevant expertise, all of which form the basis for handling the origins of products.
A formal definition of data quality problems
2005
The exploration of data to extract information or knowledge to support decision making is a critical success factor for an organization in today's society. However, several problems can affect data quality. These problems have a negative effect on the results extracted from data, affecting their usefulness and correctness. In this context, it is important to know and understand the data problems. This paper presents a taxonomy of data quality problems, organizing them by the granularity level at which they occur. A formal definition is presented for each problem included. The taxonomy provides rigorous definitions, which are richer in information than the textual definitions used in previous works. These definitions are useful for the development of a data quality tool that automatically detects the identified problems.
A Classification of Data Quality Assessment Methods
Proceedings of the 16th International …, 2011
Data quality (DQ) assessment can be significantly enhanced with the use of the right DQ assessment methods, which provide automated solutions to assess DQ. The range of DQ assessment methods is very broad: from data profiling and semantic profiling to data matching and data validation. This paper gives an overview of current methods for DQ assessment and classifies the DQ assessment methods into an existing taxonomy of DQ problems. Specific examples of the placement of each DQ method in the taxonomy are provided and illustrate why the method is relevant to the particular taxonomy position. The gaps in the taxonomy, where no current DQ methods exist, show where new methods are required and can guide future research and DQ tool development.
A process for assessing data quality
Proceedings of the 8th international workshop on Software quality - WoSQ '11, 2011
This industrial contribution describes a tool-supported approach to assessing the quality of relational databases. The approach combines two separate audits: an audit of the database structure as described in the schema and an audit of the database content at a given point in time. The audit of the database schema checks for design weaknesses, data rule violations and deviations from the original data model. It also measures the size, complexity and structural quality of the database. The audit of the database content compares the state of selected data attributes to identify incorrect data and checks for missing and redundant records. The purpose is to initiate a data clean-up process to ensure or restore the quality of the data.
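The missing-record and redundant-record part of such a content audit can be illustrated with a minimal sketch over in-memory rows. It assumes that the set of expected keys is known in advance and does not cover the attribute-state comparison or the schema audit. The function `audit_content` and its parameters are hypothetical, not the tool described in the paper.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def audit_content(records: Iterable[Record],
                  key_field: str,
                  expected_keys: Iterable[Any]) -> Tuple[List[Any], List[Any]]:
    """Return (missing keys, redundant keys) for a snapshot of table content."""
    # Count how often each key value occurs in the snapshot.
    counts = Counter(row[key_field] for row in records)
    # Missing: keys that should exist but do not occur at all.
    missing = sorted(set(expected_keys) - set(counts))
    # Redundant: keys that occur more than once.
    redundant = sorted(key for key, n in counts.items() if n > 1)
    return missing, redundant


# Made-up snapshot: id 3 is absent and id 2 appears twice.
rows = [{"id": 1}, {"id": 2}, {"id": 2}, {"id": 4}]
missing, redundant = audit_content(rows, "id", expected_keys=[1, 2, 3, 4])
```

For this snapshot, `missing` is `[3]` and `redundant` is `[2]`. A real audit would typically run equivalent queries against the database itself rather than over a list of dictionaries.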