DQE: Distributed Quota Enforcement (original) (raw)

DQE is a proposal based on quotas (or bankable postage; the term was introduced by Microsoft Research's Penny Black project in this paper). The goal is to limit the number of messages any sender can send by making it expensive to send mail in bulk. In general, quota-based proposals work as follows: every sender gets a quota of_stamps_. How this quota is determined varies among proposals; options include proof of CPU or memory cycles (as in the Penny Black project), having an email account with an ISP (as in theSHRED project), having a driver's license, etc. The sending host or its email server attaches a stamp to each email message, and the receiving host or its email server tests the incoming stamp by asking a quota enforcer whether the enforcer has seen the stamp before. If not, the receiving host infers that the stamp is "fresh" and then cancels it by asking the enforcer to store a record of the stamp. The receiving host delivers only messages with fresh stamps to the human user; messages with used stamps are assumed to be spam. The hope is that allocating reasonable quotas to everyone and then enforcing those quotas would cripple spammers, who need huge volumes to be profitable, while leaving legitimate users largely unaffected.

Here is a depiction of the DQE system architecture (TEST and SET are the messages that the recipient uses to test stamps for freshness and then to cancel them):

DQE Architecture

The Quota Allocators (QAs) have distinct public/private key pairs (QA pub,QA priv); the QA pub are well known. A participant S constructs public/private key pair (S_pub_,S_priv_) and presents S_pub_ to a QA. The QA determines a quota for S and returns the signed certificate to S (the notation {A}B means that string A is signed with key B):

CS={ S__pub_,expiration time,quota}_QA priv.

This certificate expresses to the world that sender S has the right to "mint" quota stamps in a given era (e.g., per day).

Each stamp has the form

Each i in [1,_quota_] is supposed to be used no more than once in the current epoch. t is a unique identifier of the current epoch.

Given a stamp and given QA pub, a receiver can check whether a given stamp is authentic and "below quota". To check that the stamp has not been used before, the receiver communicates with the enforcer, as mentioned above.

It differs in two major ways (and many minor ways):

Strict separation of allocation and enforcement. What we mean is that the allocator of the quota is a separate entity from the enforcer (which, recall, stores records of canceled stamps). This separation is important because the allocation function requires care and should happen on, for example, yearly intervals whereas the enforcement function requires much higher "throughput" and -- as long as the system never makes a fresh stamp appear to be canceled -- much less care.
Design, implementation, and evaluation of an enforcer that is scalable and practical. We believe that DQE is the first research project to design, build, and experimentally evaluate an enforcer that can scale to the volume of the world's email (between 100 and 200 billion email messages daily, by some estimates). In addition, DQE contrasts with previous work in that DQE addresses the challenges mentioned below.

Of course, there are many other spam-fighting proposals. Our conference paper, below, mentions some of them. (A related comment is that DQE probably fits into this litany somewhere!)

We separate the goals and technical challenges into two categories: general requirements for DQE and specific challenges for the enforcer. These requirements all concern quota enforcement; indeed, in this work we mostly do not address quota allocation. The reason for this focus is that these two are different concerns: the former is a purely technical matter while the latter involves social, economic, and policy factors.

General DQE Requirements

No false positives Our high-level goal is reliable email. We assume reused stamps indicate spam. Thus, a fresh stamp must never appear to have been used before.Untrusted enforcer We do not know the likely economic model of the enforcer, whether monolithic (i.e., owned and operated by a single entity) or federated (i.e., many organizations with an interest in spam control donate resources to a distributed system). No matter what model is adopted, it would be wise to design the system so that clients place minimal trust in the infrastructure.Privacy To reduce (already daunting) deployment hurdles, we seek to preserve the current "semantics" of email. In particular, queries of the quota enforcer should not identify email senders (otherwise, the enforcer knows which senders are communicating with which receivers, violating email's privacy model), and a receiver should not be able to use a stamp to prove to a third party that a sender communicated with it.

Challenges for the Enforcer

Scalability The enforcer must scale to current and future email volumes. Studies estimate that 80-90 billion emails will be sent daily this year (see, for example,

a report from IDC). We set an initial target of 100 billion daily messages (an average of about 1.2 million stamp checks per second) and strive to keep pace with future growth. To cope with these rates, the enforcer must be composed of many hosts.

Fault-tolerance Given the required number of hosts, it is highly likely that some subset will experience crash faults (e.g., be down) or Byzantine faults (e.g., become subverted). The enforcer should be robust to these faults. In particular, it should guarantee no more than a small amount of stamp reuse, despite such failures.

High throughput To control management and hardware costs, we wish to minimize the required number of machines, which requires maximizing throughput.

Attack-resilience Spammers will have a strong incentive to cripple the enforcer; it should thus resist denial-of-service (DoS) and resource exhaustion attacks.

Mutually untrusting nodes In both federated and monolithic enforcer organizations, nodes could be compromised. In the federated case, even when the nodes are uncompromised, they may not trust each other. Thus, in either case, besides being untrust_ed_ (by clients), nodes should also be untrust_ing_ (of other nodes), even as they do storage operations for each other.

Our conference paper at NSDI 2006, below, describes how DQE addresses the above challenges.

Here are snapshots of the code that implements the components described in Sections 5 and 6 of our NSDI paper. Here is a HOWTO for getting the code working. The HOWTO is still quite rough, so please email us with questions, feedback, bugs, etc.