
Using Checksums to Verify Integrity of Direct Uploads (with Shrine & Uppy)

When doing direct uploads to your app or to a cloud service such as AWS S3, it's good practice to have the upload endpoint verify the integrity of the upload using a checksum. You can do this by calculating a base64-encoded MD5 hash of the file on the client side before the upload, and including it in the Content-MD5 request header (AWS S3, Google Cloud Storage, and Shrine's upload_endpoint support this).

You can calculate the base64-encoded MD5 hash of the file using the spark-md5 and chunked-file-reader JavaScript libraries. You can pull them from [unpkg]:

...

Now you can create a fileMD5() function that calculates a base64-encoded MD5 hash of a File object and returns it as a Promise:

    function fileMD5 (file) {
      return new Promise(function (resolve, reject) {
        var spark  = new SparkMD5.ArrayBuffer(),
            reader = new ChunkedFileReader();

        // feed each chunk into the incremental MD5 hasher
        reader.subscribe('chunk', function (e) {
          spark.append(e.chunk);
        });

        // once the whole file has been read, finish the hash and base64-encode it
        reader.subscribe('end', function (e) {
          var rawHash    = spark.end(true); // true => raw binary string instead of hex
          var base64Hash = btoa(rawHash);

          resolve(base64Hash);
        });

        reader.readChunks(file);
      })
    }
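Reading the file in chunks works because MD5 can be fed incrementally: hashing the pieces in order yields the same digest as hashing the whole file at once. The following is a minimal Node.js sketch of that property, useful for sanity-checking what fileMD5() should resolve to for a given file. It assumes Node's built-in crypto module rather than spark-md5, and chunkedMD5Base64 and md5Base64 are hypothetical helper names, not part of any library.

```javascript
// Node.js sketch (assumption: run with Node, not in the browser).
const crypto = require('crypto');

// Hypothetical helper: hash a Buffer in fixed-size chunks, similar to how
// a chunked reader feeds the hasher piece by piece.
function chunkedMD5Base64(buffer, chunkSize = 4) {
  const hash = crypto.createHash('md5');
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    hash.update(buffer.subarray(offset, offset + chunkSize));
  }
  return hash.digest('base64');
}

// Hypothetical helper: one-shot digest of the same bytes, for comparison.
function md5Base64(buffer) {
  return crypto.createHash('md5').update(buffer).digest('base64');
}

const data = Buffer.from('hello world');
console.log(chunkedMD5Base64(data)); // XrY7u+Ae7tCTyyK7j1rNww==
console.log(md5Base64(data));        // XrY7u+Ae7tCTyyK7j1rNww==
```

Both calls print the same 24-character string, which is the form the Content-MD5 header expects (base64 of the 16 raw digest bytes, not of the hex string).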

Now, how you're going to include that MD5 checksum depends on whether you're uploading directly to the cloud service (with Shrine's presign_endpoint plugin), or to your app using the upload_endpoint plugin.

AWS S3, Google Cloud Storage etc.

When fetching upload parameters from the presign endpoint, the Shrine storage's #presign method needs to know that you'll be adding the Content-MD5 request header to the upload request. For both the AWS S3 and Google Cloud Storage Shrine storages this is done by passing the :content_md5 presign option:

    Shrine.plugin :presign_endpoint, presign_options: -> (request) do
      {
        content_md5: request.params["checksum"],
        method:      :put # only for AWS S3 storage
      }
    end

The above setup allows you to pass the MD5 hash via the checksum query parameter in the request to the presign endpoint. With Uppy it could look like this:

    Uppy.Core({
      // ...
    })
      .use(Uppy.AwsS3, {
        getUploadParameters: function (file) {
          return fileMD5(file.data)
            .then(function (hash) {
              // URL-encode both values: base64 output can contain "+", "/" and "="
              return fetch('/presign?filename=' + encodeURIComponent(file.name) + '&checksum=' + encodeURIComponent(hash))
            })
            .then(function (response) { return response.json() })
        }
      })
      // ...
      .run()
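One detail worth watching when passing the checksum as a query parameter: base64 output can contain "+", "/" and "=", and an unencoded "+" in a query string is typically decoded as a space on the server, which would corrupt the checksum. Below is a small sketch of building the presign URL safely with the standard URLSearchParams API. buildPresignUrl is a hypothetical helper, and the /presign path and parameter names are simply taken from the example above.

```javascript
// Hypothetical helper: builds the presign URL with properly encoded
// query parameters. '/presign', 'filename' and 'checksum' follow the
// example above and may differ in your app.
function buildPresignUrl(filename, checksum) {
  const params = new URLSearchParams({ filename: filename, checksum: checksum });
  return '/presign?' + params.toString();
}

const url = buildPresignUrl('my photo.jpg', 'XrY7u+Ae7tCTyyK7j1rNww==');
console.log(url);
// /presign?filename=my+photo.jpg&checksum=XrY7u%2BAe7tCTyyK7j1rNww%3D%3D
```

The "+" inside the checksum is sent as %2B, so the server-side request.params["checksum"] should receive the original base64 string unchanged.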

Upload endpoint

When uploading the file directly to your app using the upload_endpoint Shrine plugin, you can also use checksums, as the upload endpoint automatically detects the Content-MD5 header. With Uppy it could look like this:

    fileMD5(file).then(function (hash) {
      Uppy.Core({
        // ...
      })
        .use(Uppy.XHRUpload, {
          endpoint: '/upload', // Shrine's upload endpoint
          fieldName: 'file',
          headers: {
            'Content-MD5': hash,
            'X-CSRF-Token': document.querySelector('meta[name=_csrf]').content,
          }
        })
        // ...
        .run()
    })
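Conceptually, verifying a Content-MD5 header means recomputing the digest of the received bytes and comparing it with the value the client sent. The Node.js sketch below illustrates only that idea; it is not Shrine's implementation (Shrine is a Ruby library), and verifyContentMD5 is a hypothetical function name.

```javascript
// Node.js sketch of a Content-MD5 check (illustration only, not Shrine's code).
const crypto = require('crypto');

// Hypothetical helper: returns true when the base64-encoded MD5 of the
// received body matches the value from the Content-MD5 request header.
function verifyContentMD5(body, headerValue) {
  const actual = crypto.createHash('md5').update(body).digest('base64');
  return actual === headerValue;
}

const body = Buffer.from('hello world');

console.log(verifyContentMD5(body, 'XrY7u+Ae7tCTyyK7j1rNww==')); // true
console.log(verifyContentMD5(Buffer.from('hello w0rld'), 'XrY7u+Ae7tCTyyK7j1rNww==')); // false
```

A mismatch means the bytes changed somewhere between the client and the endpoint, so the upload should be rejected and retried rather than stored.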