COCCUS: Self-Configured Cost-Based Query Services in the

Admission control mechanisms for continuous queries in the cloud

2010

Amazon, Google, and IBM now sell cloud computing services. We consider the setting of a for-profit business selling data stream monitoring/management services and we investigate auction-based mechanisms for admission control of continuous queries. When submitting a query, each user also submits a bid of how much she is willing to pay for that query to run. The admission control auction mechanism then determines which queries to admit, and how much to charge each user in a way that maximizes system revenue while being strategyproof and sybil immune, incentivizing users to use the system honestly. Specifically, we require that each user maximizes her payoff by bidding her true value of having her query run. We design several payment mechanisms and experimentally evaluate them. We describe the provable game theoretic characteristics of each mechanism alongside its performance with respect to maximizing profit and total user payoff.
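To make the strategyproofness requirement concrete, here is a minimal sketch of one textbook mechanism, a uniform-price auction in which every admitted user pays the highest rejected bid. It is not one of the mechanisms designed in the paper. It assumes that every query imposes the same load, that the system can run at most `capacity` queries, and that each user submits one query; the function name `admit_queries` and the sample bids are hypothetical. The sketch makes no attempt at sybil immunity or revenue maximization.

```python
from typing import Dict, Tuple


def admit_queries(bids: Dict[str, float], capacity: int) -> Tuple[Dict[str, float], float]:
    """Admit up to `capacity` queries and charge every admitted user one price.

    Assumption (not from the paper): all queries have equal load, so the
    system can simply admit the `capacity` highest bids. The price is the
    highest rejected bid, or 0.0 when nobody is rejected, so a winner's
    payment never depends on her own bid.
    """
    if capacity <= 0:
        return {}, 0.0
    # Highest bid first; ties are broken by user id to keep the result deterministic.
    ranked = sorted(bids.items(), key=lambda item: (-item[1], item[0]))
    winners = ranked[:capacity]
    losers = ranked[capacity:]
    price = losers[0][1] if losers else 0.0
    payments = {user: price for user, _ in winners}
    revenue = price * len(winners)
    return payments, revenue


# Hypothetical bids: four users compete for two query slots.
bids = {"alice": 10.0, "bob": 7.0, "carol": 4.0, "dave": 2.0}
payments, revenue = admit_queries(bids, capacity=2)
print(payments)  # {'alice': 4.0, 'bob': 4.0}
print(revenue)   # 8.0
```

Because the price is set by a losing bid, overbidding cannot lower what alice pays, and underbidding below 4.0 only risks losing a slot she values at more than its price. This is the sense in which truthful bidding maximizes her payoff under these simplifying assumptions.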

Coccus

Proceedings of the 2013 international conference on Management of data - SIGMOD '13, 2013

A large number of pay-as-you-go data services are now offered over cloud infrastructures. Data service providers need appropriate and flexible query charging mechanisms and query optimization that take into consideration cloud operational expenses, pricing strategies, and user preferences. Yet existing solutions are static and non-configurable. We demonstrate COCCUS, a modular system for cost-aware query execution, adaptive query charging, and optimization of cloud data services. The audience can submit queries along with their execution preferences and budget constraints, while COCCUS adaptively determines the query charge and manages secondary data structures according to various economic policies. We demonstrate COCCUS's operation over centralized and shared-nothing CloudDBMS architectures on top of public and private IaaS clouds. The audience can set economic policies and execute various workloads through a comprehensive GUI. COCCUS's adaptability is showcased using real-time graphs depicting a number of key performance metrics.

PolarDBMS: Towards a Cost-Effective and Policy-Based Data Management in the Cloud

The proliferation of Cloud computing has attracted a large variety of applications that are deployed entirely on the resources of Cloud providers. As data management is an essential part of these applications, Cloud providers have to deal with many different requirements for data management, depending on the characteristics and guarantees these applications are supposed to have. The objective of a Cloud provider is to support these diverse requirements with a basic set of customizable modules and protocols that can be (dynamically) combined. With the pay-as-you-go cost model of the Cloud, literally each user action and resource usage has a price tag attached to it. Thus, for application providers, it is essential that the needs of their applications are met in a cost-optimized manner. In this paper, we present work in progress on PolarDBMS, a flexible and dynamically adaptable system for managing data in the Cloud. PolarDBMS derives policies from application and service objectives. Based on these policies, it will automatically deploy the most efficient and cost-optimized set of modules and protocols and monitor their compliance. If necessary, the modules and/or their customization are changed dynamically at run-time. Several modules and protocols that have already been developed are presented. Additionally, we discuss the challenges that have to be met to fully implement PolarDBMS.

Cost- and workload-driven data management in the cloud

2016

This thesis deals with the challenge of finding the right balance between consistency, availability, latency and costs, captured by the CAP/PACELC trade-offs, in the context of distributed data management in the Cloud. At the core of this work, cost- and workload-driven data management protocols, called CCQ protocols, are developed. The first is C3, an adaptive consistency protocol that adjusts consistency at runtime by considering consistency and inconsistency costs. The second is Cumulus, an adaptive data partitioning protocol that adapts partitions to the application workload so that expensive distributed transactions are minimized or avoided. The third is QuAD, a quorum-based replication protocol that constructs quorums so that, given a set of constraints, the best possible performance is achieved. The behavior of each CCQ protocol is steered by a cost model, which aims ...

A Cost Model for Query Execution in Cloud Computing Based on Both Shared-Disk and Shared-Nothing Architecture

The cloud is an emerging paradigm: a large pool of computing resources that makes use of existing technologies such as virtualization, service orientation and grid computing. In cloud computing, on-demand services are delivered to users according to the service level agreements (SLAs) between the service provider and the user. To provide scalable business services and cost allocation flexibility, so that customers can choose their preferred services according to their budget, defining a cost-effective query execution strategy within the cloud environment is of utmost importance. To this end, this paper introduces a cost model for query execution in cloud computing based on both shared-disk and shared-nothing architectures. The complexity of the cost model is analyzed with respect to the cost factors that matter most in cloud computing. Finally, the cost model is illustrated with the help of an example.

Query-based data pricing

Proceedings of the 31st symposium on Principles of Database Systems - PODS '12, 2012

In this demonstration, we showcase a database management system extended with a new type of component that we call a Data Use Manager (DUM). The DUM enables DBAs to attach policies to data loaded into the DBMS. It then monitors how users query the data, flags potential policy violations, recommends possible fixes, and supports offline analysis of user activities related to data policies. The demonstration uses real healthcare data.

Towards Differential Query Services in Cost-Efficient Clouds

Cloud computing, as an emerging technology trend, is expected to reshape advances in information technology. The Efficient Information Retrieval for Ranked Queries (EIRQ) scheme retrieves ranked files on user demand. EIRQ is built on an Aggregation and Distribution Layer (ADL), which acts as a mediator between the cloud and end users. The EIRQ scheme reduces communication cost and communication overhead. A mask matrix is used to filter the matched data down to what the user really wants before it is returned to the ADL. A user can retrieve files on demand by choosing queries of different ranks. This feature is useful when there are a large number of matched files but the user only needs a small subset of them. Under different parameter settings, extensive evaluations have been conducted on both analytical models and a real cloud environment in order to examine the effectiveness of our schemes. To avoid small-scale interruptions in cloud computing, two essential issues must be addressed: privacy and efficiency. A private keyword-based file retrieval scheme was proposed by Ostrovsky.

Adaptive query execution for data management in the cloud

Proceedings of the second international workshop on Cloud data management - CloudDB '10, 2010

A major component of many cloud services is query processing on data stored in the underlying cloud cluster. The traditional techniques for query processing on a cluster are those offered by parallel DBMS. These techniques, however, cannot guarantee high performance in the cloud: parallel DBMS lack adequate fault tolerance mechanisms to deal with non-negligible software and hardware failures. MapReduce, on the other hand, allows query processing solutions that are fault tolerant, but imposes substantial overheads. Therefore, existing technology covers only the edge cases of query processing: (i) at one end, parallel DBMS are appropriate only for very short-running queries, as the likelihood of failure during execution is low, and (ii) at the other, MapReduce is appropriate for very long-running queries, as the overhead is relatively small and failures may be frequent during execution. In this paper, we propose an adaptive software architecture that can effortlessly switch between MapReduce and parallel DBMS in order to process queries efficiently regardless of their response times. Switching between the two architectures is performed transparently, based on a straightforward cost model that computes the expected execution time in the presence of failures. The experimental results show that the adaptive architecture achieves the lowest possible query execution time for various scenarios.
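The abstract does not spell out its cost model, so the following is only an illustrative sketch of how an expected execution time under failures could drive the choice of engine. It assumes failures arrive as a Poisson process with a constant rate, that a parallel DBMS restarts the whole query after a failure, and that MapReduce splits the work into equal tasks that are run back to back and restarted individually, at the price of a constant overhead factor. All function names and the default values for `overhead_factor` and `num_tasks` are hypothetical.

```python
import math


def expected_time_with_restarts(work: float, failure_rate: float) -> float:
    """Expected time to finish `work` seconds when any failure forces a restart.

    Under the assumed Poisson failures with rate `failure_rate` (per second)
    and no restart delay, the expectation is (exp(rate * work) - 1) / rate.
    """
    if failure_rate == 0:
        return work
    return math.expm1(failure_rate * work) / failure_rate


def expected_parallel_dbms(work: float, failure_rate: float) -> float:
    # Assumption: no intermediate state survives, so the whole query restarts.
    return expected_time_with_restarts(work, failure_rate)


def expected_mapreduce(work: float, failure_rate: float,
                       overhead_factor: float = 3.0, num_tasks: int = 100) -> float:
    # Assumption: the overhead inflates total work by a constant factor, and
    # only the task that was running at the time of a failure is repeated.
    task = overhead_factor * work / num_tasks
    return num_tasks * expected_time_with_restarts(task, failure_rate)


def choose_engine(work: float, failure_rate: float) -> str:
    """Pick the engine with the lower expected execution time."""
    dbms = expected_parallel_dbms(work, failure_rate)
    mapreduce = expected_mapreduce(work, failure_rate)
    return "parallel_dbms" if dbms <= mapreduce else "mapreduce"


# Hypothetical failure rate of one failure per 100 seconds on average.
print(choose_engine(work=10.0, failure_rate=0.01))    # parallel_dbms
print(choose_engine(work=1000.0, failure_rate=0.01))  # mapreduce
```

With these made-up parameters the short query stays on the parallel DBMS because the MapReduce overhead dominates, while the long query moves to MapReduce because whole-query restarts become far more expensive than the overhead. This matches the two edge cases described in the abstract.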

An Economic Model for Self-Tuned Cloud Caching

2009 IEEE 25th International Conference on Data Engineering, 2009

Cloud computing, the new trend for service infrastructures, requires user multi-tenancy as well as minimal capital expenditure. In a cloud that services large amounts of data that are massively collected and queried, such as scientific data, users typically pay for query services. The cloud supports caching of data in order to provide quality query services. User payments cover query execution costs and maintenance of the cloud infrastructure, and generate cloud profit. The challenge lies in providing efficient and resource-economic query services while maintaining a profitable cloud. In this work we propose an economic model for self-tuned cloud caching targeting the service of scientific data. The proposed economy is adapted to policies that encourage high-quality individual and overall query services but also brace the profit of the cloud. We propose a cost model that takes into account all possible query and infrastructure expenditure. The experimental study shows that the proposed solution is viable for a variety of workloads and data.

Profit-driven service request scheduling in clouds

2010

A primary driving force of the recent cloud computing paradigm is its inherent cost effectiveness. As in many basic utilities, such as electricity and water, consumers/clients in cloud computing environments are charged based on their service usage, hence the term 'pay-per-use'. While this pricing model is very appealing for both service providers and consumers, fluctuating service request volume and conflicting objectives (e.g., profit vs.