
#8798 closed defect (fixed)

MD5 checksum failure for downloads

Reported by: anonymous Owned by: dkocher
Priority: normal Milestone: 4.7.1
Component: webdav Version: 4.7
Severity: normal Keywords: md5, hash, etag
Cc: Architecture: Intel
Platform: Windows 7

Description (last modified by dkocher)

Since updating to 4.7, virtually all my downloads via Cyberduck have failed. When I begin a download, I get an error message that reads:

Error
Download [x.pdf] failed
Mismatch between MD5 hash [x] of downloaded data and ETag [x] returned by server.

I have worked around this issue by reinstalling 4.6.4 (16610). This is fine for my needs, but I thought I should report this issue.

Change History (21)

comment:1 Changed on May 4, 2015 at 4:38:41 PM by dkocher

  • Description modified

comment:2 Changed on May 4, 2015 at 6:27:22 PM by dkocher

  • Milestone set to 4.8

Please post the transcript from the log drawer of the Transfers window. On Mac, choose ⌘-L; on Windows, right-click the toolbar of the Transfers window and choose Log.

comment:3 Changed on May 4, 2015 at 6:38:14 PM by dkocher

  • Component changed from core to webdav
  • Owner set to dkocher
  • Status changed from new to assigned

comment:4 Changed on May 4, 2015 at 8:21:13 PM by dkocher

  • Summary changed from MD5 hash check error to MD5 checksum failure for downloads

comment:5 Changed on May 5, 2015 at 2:31:02 PM by dkocher

Reviewed and refactored checksum validation in r17479.

comment:6 Changed on May 5, 2015 at 2:31:27 PM by dkocher

Please note that the download is still complete but failed verification of checksums.

comment:7 Changed on May 7, 2015 at 8:07:42 AM by dkocher

#8806 closed as duplicate.

comment:8 Changed on May 7, 2015 at 8:14:48 AM by dkocher

  • Priority changed from low to normal

comment:9 Changed on May 15, 2015 at 2:50:37 PM by dkocher

  • Resolution set to thirdparty
  • Status changed from assigned to closed

Please report this issue to the server vendor. You can choose Continue to ignore the verification failure.

comment:10 Changed on May 18, 2015 at 9:58:07 PM by cmcapellan

Hi dkocher, I don't see any way to continue past the error when I get this message doing a WebDAV download in the Windows client. Also, the server in this case is run by a large, unresponsive academic institution. Any chance of getting a way in the next release to ignore checksums the way it behaved in prior versions?

Thanks!

comment:11 Changed on May 19, 2015 at 9:38:55 AM by dkocher

Replying to cmcapellan:

Set the property queue.download.checksum to false using a hidden configuration option.

defaults write ~/Library/Preferences/ch.sudo.cyberduck.plist queue.download.checksum false

comment:12 Changed on May 20, 2015 at 1:47:48 AM by cmcapellan

Thank you, this is very helpful!

TL;DR for hidden configuration for Windows 7/8 users:

  1. Go to C:\Users\<YOUR USER NAME>\AppData\Roaming\iterate_GmbH\Cyberduck.exe_Url_<random string>\4.7.0.17432
  2. Open user.config and add a new entry under <settings> (it was not already in my file):
<setting name="queue.download.checksum" value="false" />
Last edited on May 20, 2015 at 6:00:42 AM by dkocher

comment:13 follow-ups: Changed on May 21, 2015 at 5:14:07 PM by samottenhoff

The assumption in r17479 that all ETags that look like an MD5 hash *must* be calculated with the same logic as S3 is wrong. The spec is clear that there is no formula to an ETag. The ETag is not a method for byte validation:

http://www.webdav.org/specs/rfc4918.html#etag

Sakai is an open-source learning management system, and we calculate our ETag using a simple MD5 hash:

https://crucible.sakaiproject.org/changelog/Sakai.Git?cs=7f8f50feda276a68d2488990a4fb91b2de079f02

We follow the spec: our ETag is unique per item. Cyberduck 4.7 is Sakai's recommended WebDAV client and no longer works for hundreds of Sakai-using institutions around the world.

comment:14 Changed on May 21, 2015 at 6:17:40 PM by dkocher

We do not assume an S3-specific implementation. However, if the ETag returned matches [a-fA-F0-9]{32}, we assume it is an MD5. We also detect SHA1 or SHA256 checksums with similar patterns (see r16956).
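To illustrate the heuristic described in this comment, here is a minimal sketch in Python. It is not the Cyberduck code from r16956 or r17479; the function name guess_checksum_algorithm is hypothetical, and the 40- and 64-character patterns are assumptions based on the standard hex digest lengths of SHA1 and SHA256.

```python
import re

# Assumed patterns: a hex string whose length equals a known digest size.
# Only the 32-character pattern is quoted in the ticket; the others are inferred.
_PATTERNS = [
    ("md5", re.compile(r"[a-fA-F0-9]{32}")),
    ("sha1", re.compile(r"[a-fA-F0-9]{40}")),
    ("sha256", re.compile(r"[a-fA-F0-9]{64}")),
]


def guess_checksum_algorithm(etag):
    """Return the digest name the ETag resembles, or None if it matches no pattern."""
    # ETag header values are normally wrapped in double quotes.
    value = etag.strip().strip('"')
    for name, pattern in _PATTERNS:
        if pattern.fullmatch(value):
            return name
    return None
```

Note that the result is only a guess from the shape of the string: the function cannot tell what input the server actually hashed to produce the value.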

comment:15 in reply to: ↑ 13 Changed on May 21, 2015 at 6:19:47 PM by dkocher

Replying to samottenhoff:

Sakai is an open-source learning management system, and we calculate our ETag using a simple MD5 hash:

https://crucible.sakaiproject.org/changelog/Sakai.Git?cs=7f8f50feda276a68d2488990a4fb91b2de079f02

This changeset should just make it work, with matching MD5 hashes being calculated in Sakai and Cyberduck. Do you have any more insight into why the checksums might not match?

comment:16 in reply to: ↑ 13 Changed on May 21, 2015 at 6:24:08 PM by dkocher

Replying to samottenhoff:

We follow the spec: our ETag is unique per item. Cyberduck 4.7 is Sakai's recommended WebDAV client and no longer works for hundreds of Sakai-using institutions around the world.

If your unique ID matches the above pattern but is not actually an MD5 digest of the file content, then that would explain the trouble.

comment:17 follow-up: Changed on May 21, 2015 at 7:01:06 PM by samottenhoff

However, if the ETag returned matches [a-fA-F0-9]{32}, we assume it is an MD5.

r17479 assumes that if it sees a 32-character ETag, it is an MD5 calculated using a certain formula for checksum verification. But there is no specification that dictates how the ETag should be calculated. Sakai just happens to use a 32-character ETag. Why is r17479 assuming that our ETag should be used for checksum verification?

comment:18 in reply to: ↑ 17 Changed on May 21, 2015 at 9:19:58 PM by dkocher

Replying to samottenhoff:

However, if the ETag returned matches [a-fA-F0-9]{32}, we assume it is an MD5.

r17479 assumes that if it sees a 32-character ETag, it is an MD5 calculated using a certain formula for checksum verification. But there is no specification that dictates how the ETag should be calculated. Sakai just happens to use a 32-character ETag. Why is r17479 assuming that our ETag should be used for checksum verification?

We make this educated guess using the regular expression match because we like to have checksums to verify the integrity of file transfers. It is just too bad that you have chosen to create an ID that looks like an MD5 but is not. We will possibly need to disable this by default for WebDAV.

comment:19 Changed on May 21, 2015 at 9:20:15 PM by dkocher

  • Resolution thirdparty deleted
  • Status changed from closed to reopened

comment:20 Changed on May 21, 2015 at 9:29:40 PM by samottenhoff

It is just too bad that you have chosen to create an ID that looks like an MD5 but is not.

You are misunderstanding what an MD5 is. An MD5 is an arbitrary hashing algorithm. Any string or series of bytes can have an MD5 computed (http://www.md5.cz/) for it.

r17479 is making a huge assumption that any 32-character string that looks like an MD5 *must* have been created to provide an MD5 file checksum. This assumption is incorrect. It appears this assumption began with how Amazon S3 implemented their ETag.
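The disagreement can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not Sakai's or Cyberduck's actual code: the helper names are hypothetical, and the identifier string hashed for the second ETag is invented purely to stand in for "an MD5 of something other than the file bytes".

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of the given bytes (always 32 hex characters)."""
    return hashlib.md5(data).hexdigest()


def verify_download(content: bytes, etag: str) -> bool:
    """Client-side check: compare the MD5 of the downloaded bytes with the ETag."""
    return md5_hex(content) == etag.strip('"').lower()


content = b"example file body"

# ETag derived from the file content (the convention the client guesses at).
content_etag = md5_hex(content)

# ETag derived from a resource identifier instead of the content.
# The identifier below is a made-up placeholder.
id_etag = md5_hex(b"/hypothetical/resource/x.pdf:2015-05-04")

print(verify_download(content, content_etag))  # content-derived ETag
print(verify_download(content, id_etag))       # identifier-derived ETag
```

Both ETags are valid 32-character hexadecimal strings and both are unique per item, yet only the content-derived one passes verification. A pattern match alone cannot distinguish the two cases, which is why the download is reported as failed even though the data is intact.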

comment:21 Changed on May 22, 2015 at 1:35:54 PM by dkocher

  • Resolution set to fixed
  • Status changed from reopened to closed

In r17628.
