Distributed Database Research Papers - Academia.edu
2025, International Joint Conference on Artificial Intelligence
We investigate three parameterized algorithmic schemes for graphical models that can accommodate trade-offs between time and space: 1) AND/OR Adaptive Caching (AOC(i)); 2) Variable Elimination and Conditioning (VEC(i)); and 3) Tree Decomposition with Conditioning (TDC(i)). We show that AOC(i) is better than the vanilla versions of both VEC(i) and TDC(i), and use the guiding principles of AOC(i) to improve the other two schemes. Finally, we show that the improved versions of VEC(i) and TDC(i) can be simulated by AOC(i), which emphasizes the unifying power of the AND/OR framework.
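The time-space trade-off these schemes exploit can be illustrated with a toy sketch (not the authors' AOC(i), VEC(i), or TDC(i) algorithms): depth-first counting over a chain-structured constraint problem, where cached search contexts are truncated to the last i assigned variables. The function name and problem setup are hypothetical; caching on i >= 1 variables is exact here only because each constraint spans consecutive variables.

```python
from itertools import product

def count_chain(domains, ok, i=1):
    """Count satisfying assignments of a chain-structured CSP by DFS.

    `ok(prev, cur)` constrains consecutive variables only, so a cache
    keyed on the last i >= 1 assigned values is exact; a larger i means
    a bigger cache (more space) but no loss of correctness.
    """
    n = len(domains)
    cache = {}
    def rec(depth, ctx):
        if depth == n:
            return 1
        key = (depth, ctx[-i:])          # context truncated to i variables
        if key in cache:
            return cache[key]
        total = 0
        for v in domains[depth]:
            if depth == 0 or ok(ctx[-1], v):
                total += rec(depth + 1, ctx + (v,))
        cache[key] = total
        return total
    return rec(0, ())

# Proper 2-colourings of a 4-node path: 2 choices, then 1 each.
doms = [(0, 1)] * 4
assert count_chain(doms, lambda a, b: a != b, i=1) == 2
# Agrees with brute-force enumeration.
brute = sum(all(a != b for a, b in zip(s, s[1:])) for s in product(*doms))
assert count_chain(doms, lambda a, b: a != b, i=1) == brute
```

Larger i trades memory for (potentially) more cache reuse in richer graphs; i = 0-style conditioning would recompute subtrees instead.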
2025
Sentiment analysis on Twitter data has attracted much attention recently. One of the system's key features is the immediacy of communication with other users in an easy, user-friendly and fast way. Consequently, people tend to express their feelings freely, which makes Twitter an ideal source for accumulating a vast amount of opinions towards a wide diversity of topics. This amount of information offers huge potential and can be harnessed to receive the sentiment tendency towards these topics. However, since no one can invest an infinite amount of time to read through these tweets, an automated decision making approach is necessary. Nevertheless, most existing solutions are limited to centralized environments only. Thus, they can only process at most a few thousand tweets. Such a sample is not representative of the sentiment polarity towards a topic due to the massive number of tweets published daily. In this paper, we go one step further and develop a novel method for sentiment le...
2025, ArXiv
Sentiment analysis (or opinion mining) on Twitter data has attracted much attention recently. One of the system's key features is the immediacy of communication with other users in an easy, user-friendly and fast way. Consequently, people tend to express their feelings freely, which makes Twitter an ideal source for accumulating a vast amount of opinions towards a wide diversity of topics. This amount of information offers huge potential and can be harnessed to receive the sentiment tendency towards these topics. However, since no one can invest an infinite amount of time to read through these tweets, an automated decision making approach is necessary. Nevertheless, most existing solutions are limited to centralized environments only. Thus, they can only process at most a few thousand tweets. Such a sample is not representative of the sentiment polarity towards a topic due to the massive number of tweets published daily. In this paper, we go one step further and develop a...
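A minimal sketch of the distributed idea, assuming a map-reduce style split (the toy lexicon, labels, and function names are illustrative, not the paper's method): tweets are scored in parallel and the per-label counts are merged.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Toy lexicon; a real system would use a full lexicon or a trained model.
POSITIVE = {"love", "great", "happy"}
NEGATIVE = {"hate", "awful", "sad"}

def score(tweet):
    """Label one tweet by counting lexicon hits."""
    words = tweet.lower().split()
    s = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "pos" if s > 0 else "neg" if s < 0 else "neu"

def sentiment_counts(tweets, workers=4):
    # "Map": score each tweet in parallel; "reduce": merge label counts.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(score, tweets))

tweets = ["I love this great phone", "awful battery sad", "it is a phone"]
print(sentiment_counts(tweets))  # Counter({'pos': 1, 'neg': 1, 'neu': 1})
```

In a真 distributed setting the map step would run on separate machines over disjoint tweet partitions; only the small count dictionaries travel over the network.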
2025, Very Large Data Bases
2025, INTERNATIONAL JOURNAL OF COMPUTER APPLICATION (IJCA)
Database security has gained wide notoriety among many individuals worldwide due to the increase in publicized incidents of the loss of, or unauthorized exposure to, sensitive or confidential data from major corporations, government agencies, and academic institutions. The amount of data collected, retained, and shared electronically by many institutions is steadily increasing. Consequently, the need for individuals to understand the issues, challenges, and available solutions of database security is being realized. At its core, database security strives to ensure that only authenticated users perform authorized activities at authorized times. A useful summary of the essence of database security is provided in the literature. More formally, database security encompasses the constructs of confidentiality, integrity, and availability, designated as the CIA triad. In computing discipline curricula, database security is often a topic covered in either an introductory database or an introductory computer security course. This paper proposes an outline of a database security component to be included in computer science or computer engineering undergraduate or early graduate curricula by mapping a number of sub-topics to the three constructs of data security. These sub-topics include access control, application access, vulnerability, inference, and auditing.
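The core guarantee stated above, that only authenticated users perform authorized activities at authorized times, can be sketched as a small policy check. All names and policy tables below are hypothetical teaching examples, not from the paper:

```python
from datetime import time

# Hypothetical policy tables for illustration.
ROLES = {"alice": {"dba"}, "bob": {"analyst"}}          # user -> roles
PERMS = {"dba": {"read", "write"}, "analyst": {"read"}}  # role -> permissions
WINDOW = (time(8, 0), time(18, 0))                       # authorized times

def authorized(user, action, now):
    """Authenticated user -> role -> permission -> time-of-day check."""
    if user not in ROLES:                       # not an authenticated user
        return False
    if not any(action in PERMS[r] for r in ROLES[user]):
        return False                            # activity not authorized
    return WINDOW[0] <= now <= WINDOW[1]        # must be an authorized time

assert authorized("bob", "read", time(10, 30))
assert not authorized("bob", "write", time(10, 30))   # role lacks permission
assert not authorized("alice", "write", time(22, 0))  # outside the window
```

Real databases implement the same three gates with authentication mechanisms, GRANT/REVOKE privilege tables, and audit or session policies.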
2025
Alhamdulillah, thanks God; without Him I am nothing. I would like to express my sincere thanks to my major adviser Dr. Mitchell L. Neilsen for his guidance, encouragement, help, and support for the completion of my thesis. Without his constant support and inspiring ideas, this thesis would have been impossible. I also would like to thank Dr. K.M. George and Dr. H. Lu for serving on my graduate committee and for the precious time they spent on my thesis. I also want to thank Dr. J.P. Chandler for his generous support and help. I would also like to thank the Indonesian Government, represented by The Agency for the Assessment and Application of Technology (BPP Teknologi), for giving me an opportunity to pursue my education and for sponsoring me. I want to thank Nusantara Aircraft Industries Limited (PT IPTN), where I have been working, for all the support that has been given to me and for giving me the opportunity to pursue my dream of higher education. Last, but certainly not least, I wish to express from the deepest of my heart my sincere thanks to my parents, the ones who always instilled in me the importance of learning and always pray to God for me. My thanks also go to all of my friends who have helped, encouraged, and advised me during my stay in Stillwater.
2025
In today's data-driven world, the integrity, confidentiality, and availability of enterprise data are critical. Oracle databases, widely adopted by large-scale organizations across various sectors, play a fundamental role in managing and storing this sensitive information. However, as the reliance on Oracle systems grows, so does the surface area for potential security threats. This paper explores the key security challenges faced in Oracle database environments and presents a comprehensive analysis of best practices and solutions to mitigate these risks effectively. Several factors contribute to the growing concern around Oracle database security, including the rise of sophisticated cyberattacks, increased regulatory demands, and the shift toward cloud-based infrastructure. Oracle databases are often targeted due to misconfigurations, unpatched vulnerabilities, over-privileged accounts, and insufficient auditing mechanisms. Among the most pressing security issues are SQL injection attacks, privilege escalation, data leakage, and insider threats. Furthermore, the complexity of Oracle's architecture often leads to implementation challenges, especially when integrating advanced security tools like Oracle Database Vault, Transparent Data Encryption (TDE), and Label Security. This paper provides an in-depth literature survey of previous research on Oracle security, highlighting known vulnerabilities and industry-recommended countermeasures. It discusses the architectural principles of Oracle's built-in security mechanisms, including authentication models, access controls, encryption, and auditing frameworks. Real-world case studies are also presented to emphasize the practical implications of these threats and the effectiveness of implemented solutions. To overcome the identified challenges, the paper recommends a layered defense strategy encompassing secure configuration, regular patching, role-based access control, data encryption, and continuous monitoring.
It also stresses the importance of compliance with international security standards such as GDPR and ISO/IEC 27001. Looking ahead, emerging technologies such as machine learning for anomaly detection, blockchain for immutable auditing, and automation in security patch deployment present promising directions for strengthening Oracle database security. The paper concludes by suggesting areas for future research and enhancement, particularly in the context of Oracle's evolving cloud infrastructure and hybrid deployments. By understanding the multifaceted security landscape of Oracle databases and implementing the strategies outlined in this study, organizations can significantly reduce the risk of data breaches, ensure compliance, and maintain trust in their information systems.
2025, Science, Technology and Development
In today's data-driven world, the integrity, confidentiality, and availability of enterprise data are critical. Oracle databases, widely adopted by large-scale organizations across various sectors, play a fundamental role in managing and storing this sensitive information. However, as the reliance on Oracle systems grows, so does the surface area for potential security threats. This paper explores
2025, International Journal of Innovative Research in Science, Engineering and Technology
In a rising era of information and communication technology, data plays a crucial role in all types of cross-organizational research and business applications. Data grids rely on the coordinated sharing of, and interaction across, multiple autonomous database management systems to provide transparent access to heterogeneous and autonomous data resources stored in grid nodes. In this paper, we present a grid-based model which provides a uniform access interface and a distributed query mechanism to access heterogeneous and geographically distributed educational digital resources. We first present an overview of the grid-based model and then discuss the architectural view and implementation details with regard to educational resources.
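The uniform access interface described above can be sketched with a mediator that fans a single query out to heterogeneous source adapters. The class names, record layout, and `query` method are illustrative assumptions, not the paper's API:

```python
class SqlSource:
    """Stand-in for a relational node; a real one would issue SQL."""
    def __init__(self, rows): self.rows = rows
    def query(self, predicate): return [r for r in self.rows if predicate(r)]

class DocumentSource:
    """Stand-in for a document store holding dicts."""
    def __init__(self, docs): self.docs = docs
    def query(self, predicate): return [d for d in self.docs if predicate(d)]

class GridMediator:
    """Uniform interface: one query fans out to every registered node."""
    def __init__(self, sources): self.sources = sources
    def query(self, predicate):
        results = []
        for src in self.sources:            # each node answers in its own way
            results.extend(src.query(predicate))
        return results

mediator = GridMediator([
    SqlSource([{"title": "Algebra I", "year": 2001}]),
    DocumentSource([{"title": "Algebra II", "year": 2005}]),
])
hits = mediator.query(lambda r: "Algebra" in r["title"])
assert len(hits) == 2
```

The caller never learns which backend produced which row; that opacity is exactly the "uniform access interface" the abstract describes.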
2025, International Journal of Computer Application, 15(3), 1-14, ISSN: 2250-1797, 2025.
Database security has gained wide notoriety among many individuals worldwide due to the increase in publicized incidents of the loss of, or unauthorized exposure to, sensitive or confidential data from major corporations, government agencies, and academic institutions. The amount of data collected, retained, and shared electronically by many institutions is steadily increasing. Consequently, the need for individuals to understand the issues, challenges, and available solutions of database security is being realized. At its core, database security strives to ensure that only authenticated users perform authorized activities at authorized times. A useful summary of the essence of database security is provided in the literature. More formally, database security encompasses the constructs of confidentiality, integrity, and availability, designated as the CIA triad. In computing discipline curricula, database security is often a topic covered in either an introductory database or an introductory computer security course. This paper proposes an outline of a database security component to be included in computer science or computer engineering undergraduate or early graduate curricula by mapping a number of sub-topics to the three constructs of data security. These sub-topics include access control, application access, vulnerability, inference, and auditing.
2025, Agrekon
Linear programming models are widely used for farm-level investment decisions. The particular advantage of using this spatial decision support system is its ability to include region-wide competitive forces and local, national and international market constraints. The most apparent advantage of the optimisation technique can be summarised as follows: the technique integrates resource potential and economic determinants in predicting land-use patterns. This interactive capability determined the relative profitability and competitive advantage of each of the selected crops vis-à-vis the resource units.
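As a hedged illustration of such an optimisation model (the crops, profit figures, and constraint coefficients below are invented for the example, not from the study), a two-variable farm LP can be solved by enumerating constraint intersections; real spatial models would use a dedicated LP solver:

```python
from itertools import combinations

# Hypothetical farm model: maximize 300*wheat + 500*maize (profit per ha)
# subject to land  w + m <= 100 ha  and labour  2w + 4m <= 320 days.
# Each constraint is a*w + b*m <= c; the last two encode w >= 0, m >= 0.
cons = [(1, 1, 100), (2, 4, 320), (-1, 0, 0), (0, -1, 0)]

def vertices(cons):
    # Candidate optima of a 2-variable LP lie at feasible intersections
    # of pairs of constraint boundary lines.
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue                    # parallel boundaries: no vertex
        w = (c1 * b2 - c2 * b1) / det
        m = (a1 * c2 - a2 * c1) / det
        if all(a * w + b * m <= c + 1e-9 for a, b, c in cons):
            yield w, m

best = max(vertices(cons), key=lambda v: 300 * v[0] + 500 * v[1])
print(best)  # (40.0, 60.0): 40 ha wheat, 60 ha maize, profit 42000
```

Vertex enumeration only scales to toy models; it is shown here purely to make the "corner-point optimum" logic of LP concrete.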
2025
Many challenges facing urban and built environment researchers stem from the complexity and diversity of the urban data landscape. This landscape is typified by multiple independent organizations each holding a variety of heterogeneous data sets of relevance to the urban community. Furthermore, urban research itself is diverse and multi-faceted covering areas as disparate as health, population demographics, logistics, energy and water usage, through to socio-economic indicators associated with communities. The Australian Urban Research Infrastructure Network (AURIN) project (www.aurin.org.au) is tasked with developing an e-Infrastructure through which a range of urban and built environment research areas will be supported. This will be achieved through development and support of a common (underpinning) e-Infrastructure. This paper outlines the requirements and design principles of the e-Infrastructure and how it aims to provide seamless, secure access to diverse, distributed data sets and tools of relevance to the urban research community. We also describe the initial case studies and their implementation that are currently shaping this e-Infrastructure.
2025, IBM Systems Journal
interfaces and two sets of error messages and their codes. On retrieval, it also has to perform the processing needed to combine data from the two databases. Application development is simplified if the two database systems support a common interface. Examples of such interfaces are the Microsoft Open Database Connectivity (ODBC) suite of functions, the X/Open SQL (Structured Query Language) Call Level Interface (CLI), and the IBM Distributed Relational Database Architecture (DRDA). The application still recognizes that it is dealing with multiple data sources, but now their interfaces are the same. Integration processing, however, is still the responsibility of the application. Application development is simplified even further if all details of how to access the two database systems are delegated to a separate system. The term multidatabase system (MDBS) describes systems with this capability. The objective is to provide the application with the view that it is dealing with a single data source. If a request requires data from multiple sources, the multidatabase system will determine what data are required from each source, retrieve the data, and perform any integration processing needed. Large user organizations consistently express a strong need for systems that provide better data connectivity and data integration. We believe that the data connectivity problem is more or less solved: applications are now able to retrieve or update data in several different databases on several different platforms. However, simply being able to "get at" the data is not enough. CORDS (a name stemming from an early group called "COnsortium for Research on Distributed Systems") is a research project focused on distributed applications. It is a collaborative effort involving IBM and several universities.
More information about the project can be found in Reference 5. As part of this project, we have designed and prototyped an MDBS, called the CORDS-MDBS, that provides an integrated, relational view of multiple heterogeneous database systems. Currently, five data sources are supported: three different relational database systems, a network database system, and a hierarchical database system. In this paper, we present an overview of the architecture of the CORDS-MDBS and the current state of the prototype implementation. We describe the approaches taken in managing catalog information, schema integration, global query optimization, distributed transaction management, and interfacing to heterogeneous data sources. We also recommend that a few additional facilities be provided by database systems to ease the integration task. The objective of an MDBS is to provide an integrated view of data from multiple, autonomous, heterogeneous, distributed sources. Although an MDBS resembles a "traditional" distributed database system, there are major differences, mainly caused by the autonomy and heterogeneity of the underlying data sources. Autonomy implies that, to a component data source (CDS), the multidatabase system is just another application with no special privileges. It has no control over, or influence on, how the data are modeled by the CDS, how requests are processed, how transaction management is handled, and so on. Simply put, when developing a multidatabase system, we cannot rely on being able to change a CDS; we have to use whatever interface and capabilities a target CDS provides. Heterogeneity implies that the CDSs may differ in terms of data models, data representation, capabilities, and interfaces. Commonly used models include flat (indexed) files, hierarchical, network, relational, or object-oriented models.
Different data models provide different primitives for structuring data, but many other properties and features are typically associated with a data model. These are, for example, the constraints that can
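The integration processing described earlier, determining what each source must contribute, retrieving it, and combining the results, can be sketched as follows. The two in-memory "sources" and all names are illustrative stand-ins, not the CORDS-MDBS interfaces:

```python
# Two autonomous sources with different native shapes.
relational_rows = [(1, "Ada"), (2, "Grace")]            # (cust_id, name)
hierarchical_recs = {1: ["ord-17", "ord-20"], 2: []}    # cust_id -> orders

def global_query(name_prefix):
    """Retrieve from each source, then do integration processing locally."""
    # Step 1: determine what the relational source must contribute.
    custs = [(cid, n) for cid, n in relational_rows if n.startswith(name_prefix)]
    # Step 2: fetch the matching records from the second source and join.
    return [(n, o) for cid, n in custs for o in hierarchical_recs.get(cid, [])]

assert global_query("A") == [("Ada", "ord-17"), ("Ada", "ord-20")]
```

The application sees one call and one result shape; the mediator alone knows that a tuple list and a hierarchy of order records had to be reconciled.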
2025, Proceedings IEEE INFOCOM 2000. Conference on Computer Communications. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No.00CH37064)
In this paper, we present the implementation issues of a virtual backbone that supports the operations of the Uniform Quorum System (UQS) and the Randomized Database Group (RDG) mobility management schemes in an ad hoc network. The virtual backbone comprises nodes that are dynamically selected to contain databases that store the location information of the network nodes. Together with the UQS and RDG schemes, the virtual backbone allows both dynamic database residence and dynamic database access, which provide a high degree of location data availability and reliability. We introduce a Distributed Database Coverage Heuristic (DDCH), which is equivalent to the centralized greedy algorithm for virtual backbone generation, but only requires local information exchange and local computation. We show how DDCH can be employed to dynamically maintain the structure of the virtual backbone, along with database merging, as the network topology changes. We also provide means to maintain connectivity among the virtual backbone nodes. We discuss optimization issues of DDCH through simulations. Simulation results suggest that the cost of ad hoc mobility management with a virtual backbone can be far below that of the conventional link-state routing.
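The centralized greedy algorithm that DDCH is said to match can be sketched for a toy topology. This is an illustrative greedy dominating-set heuristic, not the paper's distributed protocol:

```python
def greedy_cover(adj):
    """Centralized greedy: repeatedly pick the node that covers (itself
    plus its neighbours) the most still-uncovered nodes."""
    uncovered = set(adj)
    backbone = []
    while uncovered:
        best = max(adj, key=lambda v: len(({v} | adj[v]) & uncovered))
        backbone.append(best)
        uncovered -= {best} | adj[best]
    return backbone

# A star on nodes 0..3 plus a pendant node 4 hanging off node 3.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
print(greedy_cover(adj))  # [0, 3] -- 0 covers {0,1,2,3}, then 3 covers 4
```

DDCH reaches the same kind of cover using only each node's local neighbourhood information, which is what makes it deployable in an ad hoc network.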
2025, IEEE Transactions on Wireless Communications
Virtual Backbone Routing (VBR) is a scalable hybrid routing framework for ad hoc networks, which combines local proactive and global reactive routing components over a variable-sized zone hierarchy. The zone hierarchy is maintained through a novel distributed virtual backbone maintenance scheme, termed the Distributed Database Coverage Heuristic (DDCH), also presented in this paper. Borrowing from the design philosophy of the Zone Routing Protocol, VBR limits the proactive link information exchange to the local routing zones only. Furthermore, the reactive component of VBR restricts the route queries to within the virtual backbone only, thus improving the overall routing efficiency. Our numerical results suggest that the cost of the hybrid VBR scheme can be a small fraction of that of either one of the purely proactive or purely reactive protocols, with or without route caching. Since the data routes do not necessarily pass through the virtual backbone nodes, traffic congestion is considerably reduced. Yet, the average length of the VBR routes tends to be close to optimal. Compared with the traditional one-hop hierarchical protocols, our results indicate that, for a network of moderate to large size, VBR with an optimal zone radius larger than one can significantly reduce the routing traffic. Furthermore, we demonstrate VBR's improved scalability through analysis and simulations.
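The local routing-zone idea above, proactive link information limited to nodes within the zone radius, can be sketched as a bounded breadth-first search (an illustrative sketch, not the VBR implementation):

```python
from collections import deque

def routing_zone(adj, src, radius):
    """Return the nodes within `radius` hops of `src` (its proactive zone)."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        if dist[u] == radius:      # zone boundary: do not expand further
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return set(dist)

# A 5-node chain 0-1-2-3-4: with radius 2, node 0's zone is {0, 1, 2}.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
assert routing_zone(chain, 0, 2) == {0, 1, 2}
```

A larger radius enlarges the proactive zones (more maintenance traffic, fewer reactive queries), which is the tuning knob the abstract's "optimal zone radius" results explore.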
2025
This document, developed by the Rule Interchange Format (RIF) Working Group, specifies the Basic Logic Dialect, RIF-BLD, a format that allows logic rules to be exchanged between rule systems. The RIF-BLD presentation syntax and semantics are specified both directly and as specializations of the RIF Framework for Logic Dialects, or RIF-FLD. The XML serialization syntax of RIF-BLD is specified via a mapping from the presentation syntax. A normative XML schema is also provided.
2025, arXiv (Cornell University)
This paper presents Odyssey, a novel distributed data-series processing framework that efficiently addresses the critical challenges of exhibiting good speedup and ensuring high scalability in data series processing by taking advantage of the full computational capacity of modern distributed systems comprised of multi-core servers. Odyssey addresses a number of challenges in designing an efficient and highly scalable distributed data series index, including efficient scheduling and load balancing without paying the prohibitive cost of moving data around. It also supports a flexible partial replication scheme, which enables Odyssey to navigate a fundamental trade-off between data scalability and good performance during query answering. Through a wide range of configurations and using several real and synthetic datasets, our experimental analysis demonstrates that Odyssey achieves its challenging goals. This paper appeared in PVLDB 2023, Volume 16.
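The partial-replication trade-off can be sketched with a simple placement rule. Round-robin placement and the parameter names are assumptions for illustration; Odyssey's actual scheme is more sophisticated:

```python
def assign_replicas(n_partitions, servers, r):
    """Place each partition on r distinct servers, round-robin.

    r = 1 maximizes data scalability (no copies); r = len(servers)
    maximizes query-time flexibility at full storage cost.
    """
    placement = {}
    for p in range(n_partitions):
        placement[p] = [servers[(p + k) % len(servers)] for k in range(r)]
    return placement

print(assign_replicas(4, ["s0", "s1", "s2"], r=2))
# {0: ['s0', 's1'], 1: ['s1', 's2'], 2: ['s2', 's0'], 3: ['s0', 's1']}
```

With r copies of each partition, a query scheduler can route work to whichever replica holder is least loaded, which is the performance side of the trade-off the abstract describes.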
2025, International Journal of Computer Trends and Technology
Database development in web development has significantly evolved, particularly over the past few decades, driven by the internet's growth and user needs. Initially rooted in the relational database model from the 1970s, advancements were influenced by major companies like Oracle and IBM. The complexity of databases has increased due to big data requirements, leading to the creation of NoSQL databases. Open-source systems like PostgreSQL, MySQL, and MongoDB have further accelerated web application development by offering flexible, cost-effective solutions. However, challenges like big data management and security remain prevalent. Future trends indicate that databases will become smarter through AI and machine learning, with technologies like blockchain potentially reshaping the landscape. This study seeks to explore these developments in depth.
2025
The Genetic Algorithm (GA) has been widely used in many fields of optimization; one of them is the Traveling Salesman Problem (TSP). GA in the TSP is primarily used in cases involving many vertices, where it is not possible to enumerate the shortest route. One of the stages in GA is the crossover operation, which generates the offspring's chromosome from the parents'. Examples of crossover operators in GA for TSP are Partially Mapped Crossover (PMX), Order Crossover (OX), Cycle Crossover (CX), and some others. However, when constructing the route, they do not consider the length of the route to maximize its fitness. The use of random numbers in constructing the route is likely to produce offspring (a new route) that is no better than its parent. The sequence of nodes in the route affects the length of the route. To minimize this uncertainty, the crossover operation should consider a method to arrange the chromosomes. This article studied incorporating two methods into the crossover stage, in order to ensure the offsp...
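As a concrete example of one of the operators named above, a common formulation of Partially Mapped Crossover (PMX) can be sketched as follows (the crossover points are fixed here for clarity; a GA would choose them randomly):

```python
def pmx(p1, p2, a, b):
    """Partially Mapped Crossover: the child inherits p1[a:b] directly;
    each remaining slot takes p2's gene, following the p1 -> p2 mapping
    whenever that gene already appears in the copied segment."""
    child = [None] * len(p1)
    child[a:b] = p1[a:b]
    segment = set(p1[a:b])
    for i in list(range(a)) + list(range(b, len(p1))):
        gene = p2[i]
        while gene in segment:          # resolve conflicts via the mapping
            gene = p2[p1.index(gene)]
        child[i] = gene
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8]
p2 = [3, 7, 5, 1, 6, 8, 2, 4]
child = pmx(p1, p2, 3, 6)
assert child == [3, 7, 8, 4, 5, 6, 2, 1]   # a valid tour, no repeated cities
assert sorted(child) == p1
```

Note that PMX preserves permutation validity but, as the abstract points out, nothing in the operator itself looks at route length; the article's proposal is to add such a criterion at this stage.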
2025
Cheese whey is a dairy industry effluent with a strong organic and saline content. The growing concern about pollution and environmental control, as well as greater knowledge about its nutritional value, has led to the addition of whey to the food chain. The purpose of this study was to develop a whey-based fruit beverage, and to compare the proximate composition and mineral content of experimental and commercial brands. From the analysis of raw materials, the proximate composition and mineral content of the experimental whey-based fruit beverage was calculated. The information related to the commercial brand was obtained from the label. The experimental beverage presented proximate composition and mineral content similar to the commercial brand, except for the high content of selenium (70 μg/100g), which could be attributed to the proximate composition of whey. The production of whey-based fruit beverages is a good source of nutrients and a viable alternative to use the whey i...
2025
Storage has been extensively studied during the past few decades (Foster et al., 1997; Jose Guimaraes, 2001). However, emerging trends in distributed computing bring new solutions to existing problems. Grid computing proposes a distributed approach to data storage. In this paper, we introduce a Grid-based system (ARCO) developed for multimedia storage of large amounts of data. The system is being developed for Biblioteca Nacional, the National Library of Portugal. Using the Grid information system and resource management, we propose a transparent system where terabytes of data are stored in a Beowulf cluster built from commodity components, with a backup solution and error recovery mechanisms.
2025, ijecce.org
A database is not static but grows rapidly in size. The issues involved include how to allocate data, communication within the system, coordination among the individual systems, distributed transaction control and query processing, concurrency control over distributed relations, the design of a global user interface, the design of component systems in different physical locations, and the integration of existing database system security. The system architecture makes use of software partitioning of the database based on data clustering, an SQMD (Single Query Multiple Database) architecture, a web services interface, and virtualization software technologies. The system allows uniform access to concurrently distributed databases using the SQMD architecture. This paper explains design strategies for distributed databases under the SQMD architecture.
2025, ACM Transactions on Database Systems
Many algorithms have been devised for minimizing the costs associated with obtaining the answer to a single, isolated query in a distributed database system. However, if more than one query may be processed by the system at the same time and if the arrival times of the queries are unknown, the determination of optimal query-processing strategies becomes a stochastic optimization problem. In order to cope with such problems, a theoretical state-transition model is presented that treats the system as one operating under a stochastic load. Query-processing strategies may then be distributed over the processors of a network as probability distributions, in a manner which accommodates many queries over time. It is then shown that the model leads to the determination of optimal query-processing strategies as the solution of mathematical programming problems, and analytical results for several examples are presented. Furthermore, a divide-and-conquer approach is introduced for decomposing ...
2025, IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339)
Most existing nonlinear data analysis and modelling techniques, including neural networks, become computationally prohibitively expensive when the available data set exceeds the capacity of the computer's main memory, due to slow disk access operations [1]. For data received on-line from a source with an unknown probability distribution, the question addressed in this article is how to efficiently partition it into smaller representative subsets (databases) and how to organize these data subsets in order to minimize the computational cost of later data analysis. The proposed linear-time, on-line problem decomposition method achieves these objectives by balancing the probability distributions of the individual disjoint data subsets, each aimed at approximating the original data-source distribution. Consequently, computationally efficient statistical data analysis and neural network modelling on data subsets fitting into a computer's central memory will produce results similar to those obtained through a global, computationally infeasible data analysis.
2025, Net-Centric Approaches to Intelligence and National Security
On both the public Internet and private Intranets, there is a vast amount of data available that is owned and maintained by different organizations, distributed all around the world. These data resources are rich and recent; however, information gathering and knowledge discovery from them, in a particular knowledge domain, confronts major difficulties. The objective of this article is to introduce an autonomous methodology to provide for domain-specific information gathering and integration from multiple distributed sources.
2025, 2011 IEEE World Haptics Conference
2025, Lecture Notes in Computer Science
GlobData is a project that aims to design and implement a middleware tool offering the abstraction of a global object database repository. This tool, called Copla, supports transactional access to geographically distributed persistent objects independent of their location. Additionally, it supports replication of data according to different consistency criteria. For this purpose, Copla implements a number of consistency protocols offering different tradeoffs between performance and fault-tolerance. This paper presents the work on strong consistency protocols for the GlobData system. Two protocols are presented: a voting protocol and a non-voting protocol. Both these protocols rely on the use of atomic broadcast as a building block to serialize conflicting transactions. The paper also introduces the total order protocol being developed to support large-scale replication.
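How a totally ordered broadcast lets replicas apply conflicting updates in the same order can be illustrated with a toy single-sequencer sketch. The class names and structure below are hypothetical and far simpler than the Copla protocols themselves:

```python
class Sequencer:
    """Assigns a global sequence number to every broadcast message."""
    def __init__(self):
        self.next_seq = 0

    def order(self, msg):
        seq = self.next_seq
        self.next_seq += 1
        return seq, msg

class Replica:
    """Applies updates strictly in sequence-number order."""
    def __init__(self):
        self.state = {}
        self.pending = {}
        self.applied = 0

    def deliver(self, seq, update):
        self.pending[seq] = update
        while self.applied in self.pending:  # apply only in total order
            key, value = self.pending.pop(self.applied)
            self.state[key] = value
            self.applied += 1

seq = Sequencer()
r1, r2 = Replica(), Replica()
msgs = [seq.order(("x", 1)), seq.order(("x", 2)), seq.order(("y", 9))]
for s, m in msgs:            # r1 receives messages in order
    r1.deliver(s, m)
for s, m in reversed(msgs):  # r2 receives them out of order
    r2.deliver(s, m)
print(r1.state == r2.state)  # replicas converge to the same state
```

Even though r2 receives the messages in reverse, buffering by sequence number forces both replicas through the same serial history, which is the property the voting and non-voting protocols build on.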
2025, IEEE Transactions on Systems, Man, and Cybernetics
A graphical representation tool, updated Petri nets (UPN), has been developed to model rule-base specifications for CIM databases. UPN facilitates the modeling of relationships between operations of various manufacturing application systems and the database updates and retrievals among all the respective distributed databases. Based on this representation, a hierarchical modeling technique which includes refining and aggregating rules has also been developed. An application of the UPN is demonstrated in designing rule-based systems for controlling and integrating the information flow between manufacturing applications, including computer-aided design, computer-aided process planning, manufacturing resources planning, and shop floor control.
2025, 2011 11th International Conference on Intelligent Systems Design and Applications
Due to the dramatic increase of data volumes in different applications, it is becoming infeasible to keep these data in one centralized machine. It is becoming more and more natural to deal with distributed databases and networks. That is why distributed data mining techniques have been introduced. One of the most important data mining problems is data clustering. While many clustering algorithms exist for centralized databases, there is a lack of efficient algorithms for distributed databases. In this paper, an efficient algorithm is proposed for clustering distributed databases. The proposed methodology employs an iterative optimization technique to achieve better clustering objective. The experimental results reported in this paper show the superiority of the proposed technique over a recently proposed algorithm based on a distributed version of the well known K-Means algorithm (Datta et al. 2009) [1].
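The distributed K-Means baseline referenced above can be sketched in one dimension: each site ships only per-cluster sufficient statistics (sums and counts) to a coordinator, never the raw tuples. The names and toy data below are hypothetical:

```python
def assign(point, centroids):
    """Index of the nearest centroid (1-D squared distance)."""
    return min(range(len(centroids)), key=lambda k: (point - centroids[k]) ** 2)

def local_stats(points, centroids):
    """Each site summarizes its data as per-cluster (sum, count) pairs."""
    sums = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for p in points:
        k = assign(p, centroids)
        sums[k] += p
        counts[k] += 1
    return sums, counts

def merge_step(sites, centroids):
    """Coordinator merges the sufficient statistics from every site."""
    total_s = [0.0] * len(centroids)
    total_c = [0] * len(centroids)
    for pts in sites:
        s, c = local_stats(pts, centroids)
        total_s = [a + b for a, b in zip(total_s, s)]
        total_c = [a + b for a, b in zip(total_c, c)]
    return [s / c if c else m for s, c, m in zip(total_s, total_c, centroids)]

sites = [[1.0, 1.2, 0.8], [9.0, 9.5], [1.1, 8.9]]  # three distributed databases
centroids = [0.0, 10.0]
for _ in range(5):
    centroids = merge_step(sites, centroids)
print(centroids)
```

Only a few numbers per cluster cross the network each round, which is what makes the approach viable when centralizing the data is infeasible; the paper's proposal replaces this fixed update with an iterative optimization of the clustering objective.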
2025, Annales de Limnologie - International Journal of Limnology
2025, Semiconductor science and information devices
Data and internet usage are growing rapidly, which causes problems in the management of big data. For these kinds of problems, many software frameworks are used to increase the performance of distributed systems and to provide availability of large data storage. One of the most beneficial software frameworks used to utilize data in distributed systems is Hadoop. This paper introduces the Apache Hadoop architecture, the components of Hadoop, and their significance in managing vast volumes of data in a distributed system. The Hadoop Distributed File System enables the storage of enormous chunks of data over a distributed network. The Hadoop framework maintains the fsImage and edits files, which support the availability and integrity of data. This paper includes cases of Hadoop implementation, such as weather monitoring and bioinformatics processing.
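The interplay between the fsImage checkpoint and the edits journal can be illustrated with a toy model. This is an illustrative sketch of the checkpoint-plus-journal idea only, not HDFS NameNode code; all names are hypothetical:

```python
import json

class ToyNameNode:
    """Illustrative only: namespace state = checkpoint (fsImage) + edit log."""
    def __init__(self):
        self.namespace = {}   # path -> metadata
        self.edits = []       # journal of operations since the last checkpoint

    def create(self, path, size):
        self.edits.append(("create", path, size))  # journal first
        self.namespace[path] = {"size": size}

    def checkpoint(self):
        fsimage = json.dumps(self.namespace)  # persisted snapshot
        self.edits = []                       # edits are merged into the image
        return fsimage

    @staticmethod
    def recover(fsimage, edits):
        """Rebuild the namespace: load the snapshot, replay the journal."""
        ns = json.loads(fsimage)
        for op, path, size in edits:
            if op == "create":
                ns[path] = {"size": size}
        return ns

nn = ToyNameNode()
nn.create("/logs/a.txt", 128)
image = nn.checkpoint()
nn.create("/logs/b.txt", 256)          # recorded only in the edit log
recovered = ToyNameNode.recover(image, nn.edits)
print(sorted(recovered))
```

Operations made after the checkpoint survive a restart because they are replayed from the journal, which is the availability and integrity role the abstract attributes to the fsImage and edits files.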
2025
Abstract. The design of a distributed database is a rather complex process involving distinct aspects that must be addressed to achieve an adequate distribution of the data (Buretta 1997). Many of these aspects correspond to non-functional requirements (quality properties or system constraints), such as availability, cost and performance. These requirements are usually little explored during the design stages and thus fail to assist in the distribution process. This article aims to represent non-functional requirements in the early phases of distributed database design by integrating strategies proposed in the Requirements Engineering field. The work focuses in particular on the use of the NFR Framework and extends its catalogues of non-functional requirements in order to incorporate the main aspects related to data distribution.
2025, Databases, Knowledge, and Data Applications
Every simulation is based on an appropriate model. Particularly in 3D simulation, models are often large and complex recommending the usage of database technology for an efficient data management. However, the predominant and well-known relational databases are less suitable for the hierarchical structure of 3D models. In contrast, graph databases from the NoSQL field store their contents in the nodes and edges of a mathematical graph. The open source Neo4j is such a graph database. In this paper, we introduce an approach to use Neo4j as persistent storage for 3D simulation models. For that purpose, a runtime in-memory simulation database is synchronized with the graph database back end.
2025, Indian Engineering Journal
In this paper, the author seeks to determine the extent to which generative AI, particularly Large Language Models (LLMs), can redefine database migration. The conventional techniques used for migrating data to next-generation databases entail manual scripting and mapping work, which is error-prone, cumbersome and demands the services of an expert. This research aims at developing an integrated solution based on LLMs that can assist at specific and critical phases of the migration process, especially for heterogeneous migration between distinct database platforms. The authors specifically point out how LLMs are used for analyzing the source database schema, for handling schema translation and data type mapping automatically, and for interpreting and converting other database-dependent code such as stored procedures and functions. The use of LLMs in this research also seeks to achieve a major reduction in manual work, an enhancement in accuracy, and a reduction in the overall time taken by the migration process. The paper also considers the role of LLMs in performance enhancement and security. Experiments with a modified version of a Gemini model on a sample Oracle to PostgreSQL database migration justify the proposed approach. The analysis points out significant gains in precision and performance, besides a noticeable reduction in the likelihood of errors compared with traditional techniques.
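The deterministic core of schema translation that such a system automates, namely mapping source data types to target equivalents, can be sketched as a rule table. The mappings and helper below are illustrative assumptions; the paper's point is that LLMs handle the cases such fixed rules miss:

```python
# Hypothetical Oracle -> PostgreSQL type rules; real migrations need many more.
TYPE_MAP = {
    "NUMBER": "NUMERIC",
    "VARCHAR2": "VARCHAR",
    "DATE": "TIMESTAMP",
    "CLOB": "TEXT",
}

def translate_column(name, ora_type, length=None):
    """Map one Oracle column declaration to a PostgreSQL one."""
    pg_type = TYPE_MAP.get(ora_type, ora_type)  # pass through unknown types
    if length and pg_type == "VARCHAR":
        pg_type = f"VARCHAR({length})"
    return f"{name} {pg_type}"

ddl = ", ".join([
    translate_column("id", "NUMBER"),
    translate_column("email", "VARCHAR2", 120),
    translate_column("created", "DATE"),
])
print(f"CREATE TABLE users ({ddl})")
```

A static table like this cannot translate stored procedures or vendor-specific functions, which is precisely the gap the LLM-assisted approach is meant to fill.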
2025, Journal of Parallel and Distributed Computing
This paper addresses the processing of a query in distributed database systems using a sequence of semijoins. The objective is to minimize the intersite data traffic incurred by a distributed query. A method is developed which accurately and efficiently estimates the size of an intermediate result of a query. This method provides the basis of the query optimization algorithm. Since the distributed query optimization problem is known to be intractable, a heuristic algorithm is developed to determine a low-cost sequence of semijoins. The cost comparison with an existing algorithm is provided. The complexity of the main features of the algorithm is analytically derived. The scheduling time for sequences of semijoins is measured for example queries using the PASCAL program which implements the algorithm.
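The semijoin reduction underlying such algorithms can be sketched with two toy relations. The data and names are hypothetical; the paper's contribution is the size estimation and heuristic scheduling of semijoin sequences, not the operator itself:

```python
def semijoin(r, s, attr):
    """R semijoin S: keep only the R tuples whose join attribute appears in S."""
    keys = {t[attr] for t in s}
    return [t for t in r if t[attr] in keys]

# Site A holds EMP, site B holds DEPT; the query joins them on "dept".
emp = [{"name": "ann", "dept": 1}, {"name": "bob", "dept": 2},
       {"name": "eve", "dept": 9}]
dept = [{"dept": 1, "loc": "NY"}, {"dept": 2, "loc": "LA"}]

# Ship only DEPT's join-attribute values to site A, reduce EMP there,
# then ship the (smaller) reduced EMP to site B for the final join.
reduced_emp = semijoin(emp, dept, "dept")
print(len(reduced_emp), "of", len(emp), "tuples must cross the network")
```

The intersite traffic saved depends on how many tuples the semijoin eliminates, which is why accurate estimation of intermediate result sizes is central to the optimization algorithm.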
2025, Communications of the ACM
Diverse database management systems are used in large organizations. A heterogeneous distributed database system (DDS) can provide a flexible integration of diverse databases for users and applications, because it allows for the retrieval and update of distributed data under different database systems while giving the illusion of accessing a single centralized database system.
2025, Indian Scientific Journal Of Research In Engineering And Management
In both distributed and real-time database systems, replication is an interesting area for new researchers. In this paper, we provide an overview comparing the replication techniques available for these database systems. Data consistency and scalability are the issues considered in this paper: maintaining consistency between the actual state of a real-time object in the external environment and its images as reflected by all its replicas distributed over multiple nodes. We discuss a framework to create a replicated real-time database while preserving all timing constraints. In order to extend the idea to modelling a large-scale database, we present a general outline that improves data consistency and scalability by applying an accessible algorithm to both kinds of database. The goal is to lower the degree of replication by letting segments have individual degrees of replication, avoiding excessive resource usage; together, these measures contribute to solving the scalability problem for distributed real-time database systems.
2025
Concurrency control manages the execution of concurrent transactions. Distributed database management systems enforce concurrency control to ensure the serializability and isolation of transactions. A lot of research has been done in this area and a number of algorithms have been proposed. In this article, we compare a few algorithms for preserving the ACID properties (atomicity, consistency, isolation, and durability) of transactions in a DDBMS.
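One family of algorithms such comparisons cover is lock-based two-phase locking (2PL). A minimal shared/exclusive lock table can be sketched as follows (hypothetical class names; no deadlock handling or distributed coordination):

```python
class LockManager:
    """A minimal shared/exclusive lock table, as used by 2PL schedulers."""
    def __init__(self):
        self.locks = {}  # item -> (mode, set of holding transactions)

    def acquire(self, txn, item, mode):
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)                 # shared locks are compatible
            return True
        if holders == {txn}:                 # re-entrant or lock upgrade
            self.locks[item] = ("X" if mode == "X" else held_mode, holders)
            return True
        return False                         # conflict: caller must wait

    def release_all(self, txn):
        """Phase two: release every lock the transaction holds."""
        for item in list(self.locks):
            mode, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]

lm = LockManager()
ok1 = lm.acquire("T1", "x", "S")
ok2 = lm.acquire("T2", "x", "S")   # shared with T1: granted
ok3 = lm.acquire("T2", "x", "X")   # T1 also holds x: denied
lm.release_all("T1")
ok4 = lm.acquire("T2", "x", "X")   # T2 is now the sole holder: upgrade granted
print(ok1, ok2, ok3, ok4)
```

Blocking the exclusive request until conflicting readers release is what makes the resulting schedules serializable; the algorithms compared in the article differ mainly in how they enforce this across distributed sites.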
2025, CEUR Workshop Proceedings, 2024, 3668, рр 120–132
This research article presents an approach to performance tuning in distributed data streaming systems through the development of the Holistic Adaptive Optimization Technique (HAOT). The importance of parameter tuning is underscored by its potential to significantly improve system performance without altering the existing design, thereby saving costs and avoiding the expenses associated with system redesign. However, traditional tuning methods often fall short by failing to optimize all components of the streaming architecture, leading to suboptimal performance. To address these shortcomings, our study introduces HAOT, a comprehensive optimization framework that dynamically integrates machine learning techniques to continuously analyze and adapt the configurations of sources, streaming engines, and sinks in real time. This holistic approach not only aims to overcome the limitations of existing parameter tuning methods but also reduces the reliance on skilled engineers by automating the optimization process. Our results demonstrate the effectiveness of HAOT in enhancing the performance of distributed data streaming systems, offering significant improvements over traditional tuning methods.
2025, 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)
2025, Proceedings. 15th International Workshop on Database and Expert Systems Applications, 2004.
The needs for various forms of information systems relating to the European environment and ecosystem are reviewed, and limitations indicated. Existing information systems are reviewed and compared in terms of aims and functionalities. We consider two technical challenges involved in attempting to develop an IEEICS. First, there is the challenge of developing an internet-based communication system which allows fluent access to information stored in a range of distributed databases; some of the currently available solutions, i.e. Web Service Federations, are considered. The second main challenge arises from the fact that there is general intra-national heterogeneity in the definitions adopted and the measurement systems used throughout the nations of Europe. Integrated strategies are needed.
2025, International Journal of Computer Trends and Technology
Data migration is essential in modern data management systems. An effective data migration strategy should enable seamless integrations across diverse database systems. This paper introduces advanced SQL techniques for migrating data between heterogeneous systems, such as MySQL and PostgreSQL, ensuring data integrity and minimizing inconsistencies. The key concept is to develop strategies for data transformation, parallel query execution, and batch processing so that an automated framework is developed to reduce manual intervention. The proposed approach achieves impressive performance metrics, with precision at 92%, recall at 90%, and accuracy at 95%, showcasing its effectiveness in detecting positive migrations and minimizing errors. By combining optimization techniques with validation mechanisms, this study offers a robust, scalable solution for efficient and reliable data migration, emphasizing the importance of metric-driven evaluations in achieving seamless system integration.
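The batch-processing and validation ideas can be sketched with Python's built-in sqlite3 standing in for both endpoints (the paper targets MySQL and PostgreSQL; the schema and helper below are illustrative assumptions):

```python
import sqlite3

def migrate_in_batches(src, dst, table, batch_size):
    """Copy rows in fixed-size batches, then validate by comparing row counts."""
    cur = src.execute(f"SELECT id, name FROM {table} ORDER BY id")
    while True:
        rows = cur.fetchmany(batch_size)   # bounded memory per batch
        if not rows:
            break
        dst.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)
        dst.commit()                        # one transaction per batch
    src_count = src.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    dst_count = dst.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return src_count == dst_count           # basic post-migration validation

src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
dst.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
src.executemany("INSERT INTO users VALUES (?, ?)",
                [(i, f"user{i}") for i in range(10)])
ok = migrate_in_batches(src, dst, "users", batch_size=4)
print(ok)
```

Committing per batch bounds the cost of a retry after a failure, and the count check is the simplest of the validation mechanisms the paper combines with transformation and parallel execution.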
2025, Conference on Parallel andDistributed Information Systems
The concept of serializability has been the traditionally accepted notion of correctness in database systems. However, in a heterogeneous distributed database system (HDBMS) environment, ensuring serializability is a difficult task, mainly due to the desire to preserve the local autonomy of the participating local database systems. In this paper, we introduce a new correctness criterion for HDBMSs, two-level serializability (2LSR),
2025, Nutrition Bulletin
The UK national food composition dataset, maintained at the Quadram Institute Bioscience, is a valuable national resource for a variety of users. The UK has a long history of compiling and utilising food composition data, which started for the specific purpose of understanding war-time nutrition, and is now fundamental to multiple areas of research, policy, food manufacturing and consumer behaviour. The rise of mHealth technologies has brought food and nutrition data direct to the consumer and presents new challenges for food data compilers relating to coverage of foods and nutrients, and accessibility and transparency of data. In addition, emerging efforts in sustainable food production, changing diets and the ever-increasing burden of non-communicable diseases requires an integrated approach that will span the agri-food, nutrition and health space. In order to achieve this, there needs to be continued efforts in food data standardisation, international collaboration and stronger emphasis in making food and nutrition data FAIR (findable, accessible, interoperable and reusable). The UK national food composition data and the emerging initiatives in food and nutrition it supports are playing an important role in the future development of healthy and sustainable UK diets.
2024, Інфокомунікаційні та комп’ютерні технології
Methods of increasing the effectiveness of threat detection in distributed databases using the event monitoring system are considered in the work. It is noted that the system works on the basis of the event monitoring model of heterogeneous distributed databases. This model involves three stages of event processing, which are based on appropriate methods. It is highlighted that the mechanisms underlying the functioning of the mentioned methods should be reduced to a single data format to eliminate the possible appearance of incorrect work in future calculations. The event processing method on the monitoring server allows for processing event matrices and transferring the function to control tools, on the basis of which appropriate decisions are made to improve reliability. The article develops a modified method of analyzing and monitoring events in non-relational distributed databases. Event monitoring options for lookup operations in distributed databases are offered. To confirm th...
2024, ACM Computing Surveys
It is increasingly important for organizations to achieve additional coordination of diverse computerized operations. To do so, it is necessary to have database systems that can operate over a distributed network and can encompass a heterogeneous mix of computers, operating systems, communications links, and local database management systems. This paper outlines approaches to various aspects of heterogeneous distributed data management and describes the characteristics and architectures of seven existing heterogeneous distributed database systems developed for production use. The objective is a survey of the state of the art in systems targeted for production environments as opposed to research prototypes.
2024, International Journal of INTELLIGENT SYSTEMS AND APPLICATIONS IN ENGINEERING
AWS, or Amazon Web Services, is a cloud computing platform that is adaptable, affordable, and simple to use. Relational database management systems (RDBMS) are frequently used in the Amazon cloud. We describe how to set up Oracle Database on AWS; Oracle Database may be operated on the Relational Database Service (Amazon RDS). We show how you can operate Oracle Database on Amazon RDS, explain the benefits of each strategy, and describe how to deploy and monitor your Oracle database, as well as how to handle scalability, performance, backup and recovery, high availability, and security in Amazon RDS. In this paper, we propose the DM-DATA Model to establish an emergency recovery solution spanning an onsite Oracle system and AWS, and to migrate your existing Oracle database to AWS. We provide a strategy for designing an architecture that protects against hardware failures, datacenter issues, and disasters by using replication technologies for stock market data. In the performance analysis, several alternatives are chosen to optimize the performance of the proposed infrastructure with Oracle Database, based on metrics such as disk I/O management, sizing, database replicas, etc.