An improvement on fragmentation in Distribution Database Design Based on Knowledge-Oriented Clustering Techniques
Related papers
Review on Fragmentation in Distributed Database Environment
IOSR Journal of Engineering, 2014
In traditional environments, distributed data warehouses offer many advantages. Distributed processing is an efficient way to improve data handling, but the efficiency of query processing remains a critical issue in data warehousing systems, because decision support applications require minimal response times for complex, ad hoc queries involving aggregations and multi-way joins over vast repositories of data. Fragmenting the data warehouse is an effective way to reduce query execution time, since execution time falls when queries run over smaller datasets. System performance also improves when data is spread across data marts. It is therefore important to adopt an appropriate methodology for data fragmentation and fragment allocation. The focus here is on distributed data warehouses (DDW): the approach combines known predicate construction techniques with a clustering method to fragment data warehouse relations, using a data-mining-based horizontal fragmentation methodology for a relational DDW environment. Decentralizing the data warehouse gives better performance; the fragments are allocated to the corresponding sites according to their frequency.
A Modified Vertical Fragmentation Strategy for Distributed Database
In the last decade, distributed databases (DDB) have been developed to address the problem of the huge amounts of data uploaded across different network sites. Fragmentation, in its three types (vertical, horizontal, and mixed/hybrid), is one of the most important strategies for distributing an organization's database. It also reduces the amount of irrelevant data accessed, the number of disk accesses, and the single-source bottleneck. This paper presents a modified vertical fragmentation strategy that divides the attributes of a single relation into two or more partitions while exploiting the advantages of fuzzy logic. The modified strategy allows the distributed database designer to specify a weight (membership function) for the importance of each attribute before partitioning, in order to obtain an accurate projection. Applying this strategy leads to a more accurate fragmentation.
Fragmentation in Distributed Database Design Based on KR Rough Clustering Technique
Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 2018
Distributed database design solutions depend heavily on the exploitation of input data sources by using clustering techniques in data mining. A new approach of biomimetic computation systems such as ant colony optimization (ACO) for this solution is of interest to informatics experts. Using ACO techniques for this solution has the advantages such as faster algorithms thanks to the randomness of ant colony behavior. The use of random numbers based on heuristic information to pick up (drop) points will facilitate the flexible search on a large data space, so that it provides us with a better answer. In this article, the authors present ACO algorithms application solutions to clustering techniques for the problem of vertical fragmentation of distributed data.
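The pick-up and drop decisions mentioned in this abstract can be illustrated with the classic ant-clustering probabilities of Deneubourg et al. The abstract does not state which formulas the article uses, so the following is only a sketch under that assumption; the function names and the constants `k1` and `k2` are illustrative choices, and `f` stands for the fraction of similar items an ant perceives in its neighbourhood.

```python
import random


def pick_probability(f: float, k1: float = 0.1) -> float:
    """Chance that an unladen ant picks up an item.

    f is the perceived local similarity (0 = isolated item, 1 = item sits
    among very similar ones). Isolated items are picked up almost surely.
    """
    return (k1 / (k1 + f)) ** 2


def drop_probability(f: float, k2: float = 0.15) -> float:
    """Chance that a laden ant drops its item where local similarity is f."""
    return (f / (k2 + f)) ** 2


def decide(probability: float, rng: random.Random) -> bool:
    """Turn a probability into a random yes/no decision."""
    return rng.random() < probability


# Example: an item in a dissimilar neighbourhood is far more likely to be moved.
rng = random.Random(0)  # fixed seed only to make the sketch repeatable
isolated = pick_probability(0.05)
clustered = pick_probability(0.90)
moved = decide(isolated, rng)
```

In a vertical fragmentation setting the items would be attributes and `f` would be derived from how often attributes are used together, but that mapping is an assumption rather than something the abstract specifies.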
A Survey on Distributed Databases Fragmentation, Allocation and Replication Algorithms
Current Journal of Applied Science and Technology, 2018
Because of the huge amount of data stored in databases, a single centralized database cannot provide good performance and availability when it holds very large volumes of data used by a large number of users. A distributed database is a good technique for overcoming this problem: the database is fragmented and each fragment is allocated to the right site. Many studies present static optimized algorithms for distributed database fragmentation (horizontal/vertical), allocation and replication at the initial stage of distributed database design, using different or similar techniques, which affect the performance of the database system. This study therefore reviews and compares the best-presented algorithms from the design perspective, with the aim of identifying the strengths and weaknesses of each algorithm. Furthermore, this study could be considered the first to attempt to identify the most critical criteria used for comparing the optimized algorithms that have been proposed and used in distributed database fragmentation and allocation.
Initial horizontal fragmentation and allocation in a distributed database
Both in research and in production, distributed technologies have some advantages over centralized databases. But both of these opposing tendencies (centralization and distribution) have disadvantages as well as advantages. Data fragmentation, an operation specific to distributed databases, involves partitioning the database into disjoint fragments and allocating each fragment to a single node. The advantages in this case are reduced storage and communication costs. However, availability and data security are reduced, even if they are better than in the centralized case. The decision to use a fragmented database is very important because it determines the execution performance of a distributed query. This paper presents a technique for horizontal fragmentation of a relation based on the priority location of its attributes, which makes it possible to take correct fragmentation decisions in the initial phase, based on the results gathered during the analysis phase, without empirical data about query execution. The technique can also be applied in later phases of a distributed database system for partitioning relations. Fragmentation is synchronized with allocation and thus adds no extra complexity to allocating fragments to the DDBMS's sites.
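The idea of partitioning a relation into disjoint fragments, each allocated to a single node, can be sketched as follows. This is a generic illustration of horizontal fragmentation by a selection predicate, not the priority-location technique of the paper; the relation `employees`, its `branch` attribute, and the function `horizontal_fragments` are hypothetical names chosen for the example.

```python
from collections import defaultdict


def horizontal_fragments(rows, site_of):
    """Split rows into disjoint fragments, one per site.

    site_of maps a row to the site that should store it, so each row lands
    in exactly one fragment (disjointness) and no row is lost (completeness).
    """
    fragments = defaultdict(list)
    for row in rows:
        fragments[site_of(row)].append(row)
    return dict(fragments)


# Hypothetical relation: each tuple is stored at the site of its branch.
employees = [
    {"id": 1, "name": "An", "branch": "Hanoi"},
    {"id": 2, "name": "Binh", "branch": "Hue"},
    {"id": 3, "name": "Chi", "branch": "Hanoi"},
    {"id": 4, "name": "Dung", "branch": "Danang"},
]

fragments = horizontal_fragments(employees, lambda row: row["branch"])

# Reconstruction check: the union of all fragments is the original relation.
rebuilt = sorted(
    (row for rows in fragments.values() for row in rows),
    key=lambda row: row["id"],
)
```

Because the fragment key is also the target site, fragmentation and allocation happen in the same pass, which mirrors the abstract's point that the two steps can be synchronized.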
A New Technique for Database Fragmentation in Distributed Systems
Improving the performance of database systems is one of the key research issues nowadays. Distributed processing is an effective way to improve the reliability and performance of a database system. Data distribution combines the processes of fragmentation, allocation and replication. Previous research provided fragmentation solutions based on empirical data about the type and frequency of the queries submitted to a centralized system. These solutions are not suitable at the initial stage of database design for a distributed system. In this paper we present a fragmentation technique that can be applied at the initial stage as well as in later stages of a distributed database system for partitioning the relations. Allocation of fragments is done simultaneously in our algorithm. The results show that the proposed technique can properly solve the initial fragmentation problem of relational databases for distributed systems.
A Review on Fragmentation Techniques in Distributed Database
International Journal of Modern Trends in Engineering and Research, 2014
Distributed database systems are developed to balance the load and spread the data over the different sites of an organization. To distribute the database across those sites, fragmentation methods are used. Several fragmentation methods are reviewed in this article.
Automatic Database Clustering: Issues and Algorithms
International Journal of Computer Trends and Technology, 2014
Clustering is the process of grouping data, where the grouping is established by finding similarities between data items based on their characteristics. Such groups are termed clusters. Clustering is an unsupervised learning problem that groups objects based on distance or similarity. While a lot of work has been published on clustering data on storage media, little has been done about automating this process. There should be an automatic and dynamic database clustering technique that re-clusters a database with little intervention from a database administrator (DBA) and maintains an acceptable query response time at all times. A good physical clustering of data on disk is essential to reducing the number of disk I/Os in response to a query, whether clustering is implemented by itself or coupled with indexing, parallelism, or buffering. In this paper we describe the issues faced when designing an automatic and dynamic database clustering technique for relational dat...
Comparative Analysis of Vertical Fragmentation Techniques in Distributed Environment
2018
A distributed database management system is a software system that manages the distributed database and makes distribution transparent to the user. Efficient distributed databases can be designed using database fragmentation, allocation and replication. Fragmentation can be applied horizontally or vertically. In this paper, various algorithms used in vertical fragmentation techniques are reviewed and compared. The algorithms reviewed are the Apriori algorithm, the Enhanced Minimum Spanning Tree algorithm, the Modified Bond Energy algorithm and the Knowledge Oriented clustering technique. These algorithms produce a fragmented database, and allocation and replication can then be applied to the resulting fragments.
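Affinity-based vertical fragmentation methods, such as the bond energy family, conventionally start from an attribute affinity matrix that records how often pairs of attributes are accessed together. The sketch below computes that matrix in the standard textbook way; it is not taken from the reviewed paper, and the attribute names, query workload, and frequencies are invented for illustration.

```python
def attribute_affinity(attributes, queries):
    """Build the attribute affinity matrix as a nested dict.

    queries is a list of (used_attributes, frequency) pairs. The affinity of
    two attributes is the total frequency of the queries that use both; the
    diagonal holds the total frequency of queries using that attribute.
    """
    aff = {a: {b: 0 for b in attributes} for a in attributes}
    for used, freq in queries:
        for a in used:
            for b in used:
                aff[a][b] += freq
    return aff


# Hypothetical workload: which attributes each query touches, and how often.
attributes = ["A1", "A2", "A3", "A4"]
queries = [
    ({"A1", "A3"}, 45),
    ({"A2", "A3"}, 5),
    ({"A2", "A4"}, 75),
    ({"A3", "A4"}, 3),
]

aff = attribute_affinity(attributes, queries)
```

Attributes with high mutual affinity (here `A2` and `A4`) are natural candidates for the same vertical fragment; how the matrix is then clustered is where the reviewed algorithms differ.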
Towards Vertical Fragmentation in Distributed Databases
2007
The design of a distributed database is an optimization problem that requires solving several subproblems: data fragmentation (horizontal, vertical, and hybrid), data allocation (with or without redundancy), and the optimization and allocation of operations (request transformation, selection of the best execution strategy, and allocation of operations to sites). There are different approaches to each subproblem, which makes distributed database design quite hard. Much research addresses data fragmentation, both for relational databases and for object-oriented databases. This paper presents the implementation of a previously conceived heuristic algorithm that uses an objective function which takes in information about the data managed in a distributed database and evaluates all the vertical fragmentation schemes of the database.