Design Dropbox: A System Design Interview Question

Last Updated : 1 Apr, 2026

Most of us use file hosting services in our daily life to store, access, and share files like documents, images, and videos from anywhere. Platforms like Dropbox make this process simple by allowing users to upload files to the cloud and access them across multiple devices seamlessly.

In system design interviews, designing a system like Dropbox is a very common and important question. It helps interviewers evaluate your understanding of scalability, storage systems, synchronization, and distributed architecture.


1. System Requirements

This section defines what features the system should provide and how well it should perform under different conditions.

Functional Requirements

These describe the core features and operations that users can perform in the system.

Non-Functional Requirements

These define the performance, reliability, and quality attributes of the system.

2. Capacity Estimation

Storage Estimations

This section estimates how much storage the system will require based on user activity and data usage.

**Assumptions:**

The total number of users = 500 million.
Total number of daily active users = 100 million
The average number of files stored by each user = 200
The average size of each file = 100 KB
Total number of active connections per minute = 1 million

**Calculations:**
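As a rough worked example, the storage figures follow directly from the assumptions above. The sketch below uses decimal units (1 PB = 10^12 KB), which is an assumption of this example rather than something the article specifies.

```python
# Storage estimate derived from the assumptions listed above.
TOTAL_USERS = 500_000_000      # 500 million users
FILES_PER_USER = 200           # average files stored per user
AVG_FILE_SIZE_KB = 100         # average file size in KB

total_files = TOTAL_USERS * FILES_PER_USER
total_storage_kb = total_files * AVG_FILE_SIZE_KB

# Assumption: decimal units, so 1 PB = 10**12 KB.
total_storage_pb = total_storage_kb / 10**12

print(f"Total files: {total_files:,}")
print(f"Total storage: {total_storage_pb:,.0f} PB")
```

Under these assumptions the system holds about 100 billion files and needs roughly 10 PB of raw storage, before replication or version history is taken into account.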

3. High-Level Design (HLD)

This section describes the overall architecture of the system, including key components and how they interact to handle file storage, synchronization, and user requests efficiently.


1. User Uploading

Users interact with the client application or web interface to initiate file uploads. The client application communicates with the Upload Service on the server side. Large files may be broken into smaller chunks for efficient transfer.

2. Upload Service

The Upload Service receives file upload requests from clients and generates Presigned URLs for S3 so that clients can upload directly. It coordinates the upload process, breaking large files into manageable chunks if necessary and ensuring data integrity and completeness. After a successful upload, it updates the Metadata Database with the file details.

3. Getting Presigned URL

The client application requests a Presigned URL from the Upload Service. The server generates the Presigned URL by interacting with the S3 service, creating a unique token for the specific upload operation. These URLs grant temporary, secure access to upload a specific file to a designated S3 bucket, which allows clients to bypass the server and communicate directly with the storage layer.
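The idea behind a presigned URL can be sketched with a plain HMAC signature over the HTTP method, object path, and expiry time. This is only an illustration of the general mechanism: S3's actual signing scheme (Signature Version 4) is more involved, and the key, host name, and function names below are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"                   # hypothetical key known to server and storage
STORAGE_HOST = "https://storage.example.com"  # hypothetical storage endpoint


def _sign(method: str, path: str, expires: int) -> str:
    """HMAC-SHA256 over the parts of the request the URL is allowed to perform."""
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_presigned_url(object_key: str, ttl_seconds: int = 900,
                       now: Optional[float] = None) -> str:
    """Upload Service side: return a URL that authorizes one PUT until it expires."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    path = f"/{object_key}"
    query = urlencode({"expires": expires, "signature": _sign("PUT", path, expires)})
    return f"{STORAGE_HOST}{path}?{query}"


def verify_presigned_url(url: str, method: str = "PUT",
                         now: Optional[float] = None) -> bool:
    """Storage side: accept the request only if it is unexpired and untampered."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _sign(method, parsed.path, expires)
    return hmac.compare_digest(expected, signature)


url = make_presigned_url("user42/report.pdf", ttl_seconds=60)
print(url, verify_presigned_url(url))
```

Because the signature covers the method, path, and expiry, the URL cannot be reused for a different object, a different operation, or after it has expired.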

4. S3 Bucket

S3 serves as the scalable and durable storage backend. Presigned URLs allow clients to upload directly to S3, minimizing server involvement in the actual file transfer. The bucket structure may organize files based on user accounts and metadata.

5. Metadata Database

Stores metadata associated with each file, including details like name, size, owner, access permissions, and timestamps. Enables quick retrieval of file details without accessing S3. Ensures that file metadata is consistent with the actual content in S3.

6. Uploading to S3 using Presigned URL and Metadata

The client uses the Presigned URL to upload the file directly to the designated S3 bucket. Metadata associated with the file, such as file name and owner, is included in the upload process. This ensures that the file's metadata is synchronized with its corresponding data in S3.

7. Role of Task Runner

After the file is successfully uploaded to S3, a task runner process is triggered. The task runner communicates with the Metadata Database to update or perform additional tasks related to the uploaded file. This may include updating file status, triggering indexing for search functionality, or sending notifications.

8. Downloading Services

Clients initiate file download requests through the client application. The server's Download Service queries the Metadata Database for the file details, which include information such as file name, size, owner, and access permissions.

4. Low-Level Design (LLD)

Many people assume that designing Dropbox only requires using a cloud service to upload a file and download it whenever needed, but that is not how it works. The core problem is "Where and how to save the files?". Suppose you want to share a file of any size (small or big) and you upload it to the cloud.

Everything is fine up to this point, but if you later need to update the file, it is not a good idea to edit it and upload the whole file to the cloud again and again. The reasons are:

**1. High Bandwidth and Storage Usage:** Maintaining file history requires storing multiple versions of the same file. Even small changes force the system to re-upload the entire file, leading to unnecessary bandwidth consumption and increased cloud storage usage.

**2. Increased Latency:** Uploading the entire file for minor updates increases the time required for each operation. This results in higher latency and a slower user experience.

**3. Poor Concurrency Utilization:** Since files are uploaded as a whole, it is difficult to leverage parallelism. The system cannot efficiently use multi-threading or multi-processing to upload or download file parts concurrently.

**Solution to this problem:**


We can break files into multiple chunks to overcome the problems discussed above. After a change, only the modified chunks need to be uploaded or downloaded rather than the whole file.
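A minimal sketch of this idea splits a file into fixed-size chunks, fingerprints each chunk, and compares fingerprints to find which chunks need to be re-uploaded. Using SHA-256 hashes to detect changes and the helper names below are assumptions of this example, not details given in the article.

```python
import hashlib
from typing import List

CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size of about 4 MB


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut the file content into consecutive fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Fingerprint every chunk so that versions can be compared cheaply."""
    return [hashlib.sha256(c).hexdigest() for c in split_into_chunks(data, chunk_size)]


def changed_chunk_indexes(old_hashes: List[str], new_hashes: List[str]) -> List[int]:
    """Indexes of chunks in the new version that are new or differ from the old one."""
    changed = []
    for i, digest in enumerate(new_hashes):
        if i >= len(old_hashes) or old_hashes[i] != digest:
            changed.append(i)
    return changed


# Tiny demonstration with 10-byte chunks instead of 4 MB ones.
old = b"a" * 10 + b"b" * 10 + b"c" * 10
new = b"a" * 10 + b"X" * 10 + b"c" * 10 + b"d" * 5
to_upload = changed_chunk_indexes(chunk_hashes(old, 10), chunk_hashes(new, 10))
print(to_upload)
```

Note that fixed-size chunking only handles in-place edits and appends well. Inserting bytes in the middle of a file shifts every later chunk, so production systems often use content-defined chunk boundaries instead.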

The complete low-level design of Dropbox consists of the following components.

Assume we have a client (an app) installed on our computer, and this client has four basic components: Watcher, Chunker, Indexer, and Internal DB. We have considered only one client, but there can be multiple clients belonging to the same user, each with the same basic components.

1. Client Components

These components run on the user’s device and handle file monitoring, processing, and synchronization with the cloud.

2. Metadata Database

The metadata database maintains the indexes of the various chunks. It stores file and chunk names and their different versions, along with information about users and workspaces.

Let's understand how we can efficiently scale a relational database.

**Relational Database Scaling:**

Relational databases like MySQL may face scalability challenges as the data and traffic grow.

**Database Sharding:**

Database sharding is a horizontal partitioning technique where a large database is divided into smaller, more manageable parts called shards.

**Challenges with Database Sharding:**

Managing multiple shards can become complex, especially when updates or new information needs to be added. Coordinating transactions across shards can be challenging. Maintenance, backup, and recovery operations become more intricate.

**Edge Wrapper:**

An edge wrapper is an abstraction layer that sits between the application and the sharded databases.

**Object-Relational Mapping (ORM):**

ORM is a programming technique that allows data to be seamlessly converted between the relational database format and the application's object-oriented format.

**Edge Wrapper and ORM:**

The edge wrapper integrates ORM functionality to provide a convenient interface for the application to interact with sharded databases.
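To make the idea concrete, here is a minimal sketch of an edge wrapper that hides shard selection from the application and maps rows to objects. In-memory SQLite databases stand in for separate shard servers, and the `FileObject` fields, the CRC32-based routing, and the choice of the owner as shard key are all assumptions made for illustration.

```python
import sqlite3
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileObject:
    """Application-side object (hypothetical fields)."""
    object_id: str
    owner_id: str
    name: str


class EdgeWrapper:
    """Routes each call to one shard and converts rows to FileObject instances."""

    def __init__(self, num_shards: int = 2) -> None:
        # Each in-memory database plays the role of one shard.
        self.shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for db in self.shards:
            db.execute(
                "CREATE TABLE objects ("
                "object_id TEXT PRIMARY KEY, owner_id TEXT, name TEXT)"
            )

    def _shard(self, owner_id: str) -> sqlite3.Connection:
        # Assumed shard key: the owner, so one user's objects live together.
        return self.shards[zlib.crc32(owner_id.encode()) % len(self.shards)]

    def save(self, obj: FileObject) -> None:
        db = self._shard(obj.owner_id)
        db.execute(
            "INSERT OR REPLACE INTO objects VALUES (?, ?, ?)",
            (obj.object_id, obj.owner_id, obj.name),
        )
        db.commit()

    def get(self, owner_id: str, object_id: str) -> Optional[FileObject]:
        row = self._shard(owner_id).execute(
            "SELECT object_id, owner_id, name FROM objects WHERE object_id = ?",
            (object_id,),
        ).fetchone()
        return FileObject(*row) if row else None


wrapper = EdgeWrapper(num_shards=3)
wrapper.save(FileObject("o1", "u1", "notes.txt"))
print(wrapper.get("u1", "o1"))
```

The application only calls `save` and `get`; which shard holds the row, and how the row becomes an object, stays inside the wrapper.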


3. Message Queuing Service

The message queuing service is responsible for asynchronous communication between the clients and the synchronization service.


Below are the main requirements of the Message Queuing Service.

There will be two types of messaging queues in the service.

**1. Request Queue (Global Queue)**
A single shared queue used by all clients to send updates. Whenever a client makes changes (file upload, update, delete), it pushes a message to this queue. The Synchronization Service consumes these messages and updates the Metadata Database accordingly.

**2. Response Queue (Per-Client Queue)**
Each client has its own dedicated response queue. After processing updates, the Synchronization Service broadcasts changes to all relevant clients through their respective queues.
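The interaction between the two queue types can be sketched in-process with Python's standard `queue` module. A real deployment would use a message broker; the `SyncHub` class, the message fields, and the decision to skip the client that sent the update are assumptions of this example.

```python
import queue
from collections import defaultdict
from typing import Dict, List


class SyncHub:
    """Toy model of the request queue, response queues, and Synchronization Service."""

    def __init__(self, clients_by_user: Dict[str, List[str]]) -> None:
        self.clients_by_user = clients_by_user
        self.request_queue = queue.Queue()                # single queue shared by all clients
        self.response_queues = defaultdict(queue.Queue)   # one queue per client
        self.metadata_db = {}                             # stand-in for the Metadata Database

    def push_update(self, client_id: str, user_id: str, path: str, version: int) -> None:
        """Client side: announce a change on the global request queue."""
        self.request_queue.put(
            {"client_id": client_id, "user_id": user_id, "path": path, "version": version}
        )

    def process_one(self) -> None:
        """Synchronization Service side: consume one update and fan it out."""
        msg = self.request_queue.get()
        self.metadata_db[(msg["user_id"], msg["path"])] = msg["version"]
        for client_id in self.clients_by_user.get(msg["user_id"], []):
            if client_id != msg["client_id"]:  # assumption: the sender already has the change
                self.response_queues[client_id].put(msg)
        self.request_queue.task_done()


hub = SyncHub({"u1": ["laptop", "phone", "tablet"]})
hub.push_update("laptop", "u1", "/docs/a.txt", version=2)
hub.process_one()
print(hub.response_queues["phone"].qsize())
```

Per-client response queues let an offline client pick up the updates it missed once it reconnects, since its messages wait in its own queue.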

4. Synchronization Service

The Synchronization Service ensures that all clients stay consistent with the latest file updates across the system.

5. Cloud Storage

Cloud storage is used to store the actual file data (chunks) uploaded by users in a scalable and durable manner.

5. Database Design for Dropbox System Design


We need the following tables to store our data:

1. Users

```
{
  user_id(PK)
  name
  email
  password
  last_login_at
  created_at
  updated_at
}
```

2. Devices

```
{
  device_id(PK)
  user_id(FK)
  created_at
  updated_at
}
```

3. Objects

```
{
  object_id(PK)
  device_id(PK,FK)
  object_type
  parent_object_id
  name
  created_at
  updated_at
}
```

4. Chunks

```
{
  chunks_id(PK)
  object_id(PK,FK)
  url
  created_at
  updated_at
}
```

5. AccessControlList

```
{
  user_id(PK,FK1)
  object_id(PK,FK2)
  created_at
  updated_at
}
```

6. API Design for Dropbox System Design

APIs define how clients interact with the system to perform operations like upload, download, synchronization, and file management in a scalable way.

1. Download Chunk

This API is used to download a chunk of a file.

Request

```
GET /api/v1/chunks/:chunk_id
X-API-Key: api_key
Authorization: auth_token
```

Response

```
200 OK
Content-Disposition: attachment; filename=""
Content-Length: 4096000
```

The response contains a Content-Disposition header set to attachment, which instructs the client to download the chunk. Note that Content-Length is set to 4096000 because each chunk is about 4 MB.

2. Upload Chunk

This API is used to upload a chunk of a file.

Request

```
POST /api/v1/chunks/:chunk_id
X-API-Key: api_key
Authorization: auth_token
Content-Type: application/octet-stream

/path/to/chunk
```

Response

```
200 OK
```

3. Get Objects

Clients use this API to query the Meta Service for new files/folders when they come online. The client passes the maximum object id present locally and the unique device id.

Request

```
GET /api/v1/objects?local_object_id=&device_id=
X-API-Key: api_key
Authorization: auth_token
```

Response

```
200 OK
{
  new_objects: [
    {
      object_id:
      object_type:
      name:
      chunk_ids: [ chunk1, chunk2, chunk3 ]
    }
  ]
}
```

The Meta Service checks the database and returns an array of objects, each containing the object's name, object id, object type, and an array of chunk_ids. The client then calls the Download Chunk API with these chunk_ids to download the chunks and reconstruct the file.
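The client-side flow described above can be sketched as follows. In-memory dictionaries stand in for the Meta Service and the chunk storage, and the sample data and function names are hypothetical. The `device_id` parameter of the real request is omitted here for brevity.

```python
from typing import Dict, List

# Hypothetical server-side state: object metadata and stored chunks.
OBJECTS = [
    {"object_id": 1, "object_type": "file", "name": "a.txt", "chunk_ids": ["c1"]},
    {"object_id": 2, "object_type": "file", "name": "b.txt", "chunk_ids": ["c2", "c3"]},
]
CHUNKS: Dict[str, bytes] = {"c1": b"old", "c2": b"hello ", "c3": b"world"}


def get_objects(local_object_id: int) -> List[dict]:
    """Mimics GET /api/v1/objects: return objects newer than the client's max id."""
    return [o for o in OBJECTS if o["object_id"] > local_object_id]


def download_chunk(chunk_id: str) -> bytes:
    """Mimics GET /api/v1/chunks/:chunk_id."""
    return CHUNKS[chunk_id]


def sync(local_object_id: int) -> Dict[str, bytes]:
    """Fetch new objects, download their chunks in order, and rebuild each file."""
    files = {}
    for obj in get_objects(local_object_id):
        files[obj["name"]] = b"".join(download_chunk(c) for c in obj["chunk_ids"])
    return files


# A client that already has object 1 only receives object 2.
print(sync(local_object_id=1))
```

The order of `chunk_ids` matters: the file is reconstructed by concatenating the chunks in the order the Meta Service returns them.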

7. Scalability for Dropbox System Design

To handle growing users and data efficiently, Dropbox uses multiple techniques to scale its system.