Perform MD5 hash calculation during upload #5186

cyberduck · 2010-09-13T18:24:12Z

4cddcf2 created the issue

Currently, an MD5 hash of every upload to S3 is calculated before the upload starts. This can take a long time, and because no progress can be reported during that operation, the upload time estimate is useless.

I suggest calculating the MD5 hash during the upload, while reading from the stream. For an example, see: http://stackoverflow.com/questions/304268/using-java-to-get-a-files-md5-checksum

With this change, S3 will no longer return an error for a corrupted upload, since it has no hash to compare against. Instead, the ETag returned by S3 has to be used to verify that the upload was successful: http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectPOST.html
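A minimal sketch of the idea, not the Cyberduck implementation: wrap the source stream in a `java.security.DigestInputStream` so the MD5 is updated as bytes are read for the upload, then compare the hex digest with the ETag from the response. The class and method names (`StreamingMd5`, `copyAndDigest`, `matchesETag`) are hypothetical, and the `OutputStream` stands in for the request body of the upload. It assumes the ETag is the hex MD5 of the object, which holds for a plain single-request PUT but not, for example, for multipart uploads.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only.
final class StreamingMd5 {

    /** Copies in to out and returns the lowercase hex MD5 of the bytes that were copied. */
    static String copyAndDigest(InputStream in, OutputStream out)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        // Every read through digestIn also updates md5, so no second pass over the file is needed.
        try (DigestInputStream digestIn = new DigestInputStream(in, md5)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = digestIn.read(buffer)) != -1) {
                out.write(buffer, 0, read); // progress could be reported here
            }
        }
        return toHex(md5.digest());
    }

    /** Assumption: the ETag is the quoted hex MD5 of the object (single-request upload). */
    static boolean matchesETag(String md5Hex, String etag) {
        String normalized = etag.replace("\"", "");
        return normalized.equalsIgnoreCase(md5Hex);
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

If `matchesETag` returns false after the upload, the client would have to treat the upload as failed itself, since S3 had no checksum to reject it with.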

Alternatively, it would be good to have at least an option to disable the hash computation, since there are cases where the overhead is not justified.

cyberduck · 2010-09-13T18:35:28Z

@dkocher commented

I agree. Calculating the hash on the fly would be an improvement. The only downside is that we need a second request if we still want to set the MD5 value in the metadata of the file, as we currently do (see md5-hash in metadata).

cyberduck · 2010-09-13T19:17:08Z

4cddcf2 commented

Agreed, and I was thinking about that. However, I think the metadata entry is obsolete; one could use the ETag throughout instead.

See:
http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectGET.html
http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectHEAD.html

cyberduck · 2010-11-19T12:05:35Z

@dkocher commented

If the property s3.upload.metadata.md5 is set to true (the default is false), we set the Content-MD5 header and let S3 check the integrity of the upload. Otherwise, we calculate the MD5 on the fly during the upload and compare it to the ETag returned for the upload.
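The two modes differ in how the digest is encoded, which a short sketch can make concrete. This is an illustration under assumptions, not the code from the commit: the class and method names are hypothetical, and only the property name s3.upload.metadata.md5 comes from the comment above. The Content-MD5 header carries the Base64 encoding of the raw 16-byte digest, whereas the ETag comparison uses the hex form.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class UploadChecksumMode {

    /** True if the MD5 should be computed up front and sent to S3; the default is false. */
    static boolean sendContentMd5(Properties prefs) {
        return Boolean.parseBoolean(prefs.getProperty("s3.upload.metadata.md5", "false"));
    }

    /** Content-MD5 header value: Base64 of the raw MD5 digest (not of the hex string). */
    static String contentMd5Header(byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(data);
        return Base64.getEncoder().encodeToString(digest);
    }
}
```

With the header set, S3 rejects a body whose digest does not match, at the cost of reading the file once before the upload; with the default, the file is read only once and the check happens on the client after the response arrives.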

In 82239e1.

cyberduck · 2010-11-19T12:16:14Z

@dkocher commented

Same fix for Rackspace Cloudfiles in 52ab01c. Addendum in d7f153a.

cyberduck closed this as completed Nov 19, 2010

cyberduck assigned dkocher Nov 26, 2021

iterate-ch locked as resolved and limited conversation to collaborators Nov 26, 2021
