Opened 4 years ago

Closed 4 years ago

#8798 closed defect (fixed)

MD5 checksum failure for downloads

Reported by: anonymous Owned by: dkocher
Priority: normal Milestone: 4.7.1
Component: webdav Version: 4.7
Severity: normal Keywords: md5, hash, etag
Cc: Architecture: Intel
Platform: Windows 7

Description (last modified by dkocher)

Since updating to 4.7, virtually all my downloads via Cyberduck have failed. When I begin a download, I get an error message that reads:

Error
Download [x.pdf] failed
Mismatch between MD5 hash [x] of downloaded data and ETag [x] returned by server.

I have worked around this issue by reinstalling 4.6.4 (16610). This is fine for my needs, but I thought I should report this issue.
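For readers unfamiliar with the check behind this message, here is a minimal Python sketch. It assumes the client hashes the downloaded file and compares the digest with the value of the ETag header. The function names `md5_of_file`, `verify_against_etag` and `demo` are hypothetical and are not taken from Cyberduck's source.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_etag(path, etag):
    """Raise ValueError if the file's MD5 differs from the ETag value.

    Assumption: the ETag is treated as an MD5 of the content, which is
    the guess this ticket is about.
    """
    expected = etag.strip().strip('"').lower()
    actual = md5_of_file(path)
    if actual != expected:
        raise ValueError(
            "Mismatch between MD5 hash %s of downloaded data and ETag %s "
            "returned by server." % (actual, expected)
        )


def demo():
    """Check one small file against a matching and a non-matching ETag."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"hello")
        path = handle.name
    try:
        good = '"' + hashlib.md5(b"hello").hexdigest() + '"'
        verify_against_etag(path, good)  # matching ETag: no exception
        try:
            verify_against_etag(path, '"' + "0" * 32 + '"')
        except ValueError:
            return True  # non-matching ETag was rejected
        return False
    finally:
        os.remove(path)
```

In this sketch the file is already fully written to disk when the comparison runs, so a mismatch says nothing about whether the transfer itself completed.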

Change History (21)

comment:1 Changed 4 years ago by dkocher

  • Description modified (diff)

comment:2 Changed 4 years ago by dkocher

  • Milestone set to 4.8

Please post the transcript from the log drawer of the Transfers window. Choose ⌘-L on Mac, or on Windows right-click the toolbar of the Transfers window and choose Log.

comment:3 Changed 4 years ago by dkocher

  • Component changed from core to webdav
  • Owner set to dkocher
  • Status changed from new to assigned

comment:4 Changed 4 years ago by dkocher

  • Summary changed from MD5 hash check error to MD5 checksum failure for downloads

comment:5 Changed 4 years ago by dkocher

Reviewed and refactored checksum validation in r17479.

comment:6 Changed 4 years ago by dkocher

Please note that the download itself still completes; only the verification of checksums fails.

comment:7 Changed 4 years ago by dkocher

#8806 closed as duplicate.

comment:8 Changed 4 years ago by dkocher

  • Priority changed from low to normal

comment:9 Changed 4 years ago by dkocher

  • Resolution set to thirdparty
  • Status changed from assigned to closed

Please report this issue to the server vendor. You can choose Continue to ignore the verification failure.

comment:10 Changed 4 years ago by cmcapellan

Hi dkocher, I don't see any way to continue past the error when I get this message doing a WebDAV download in the Windows client. Also, the server in this case is run by a large, unresponsive academic institution. Any chance of getting a way in the next release to ignore checksums the way it behaved in prior versions?

Thanks!

comment:11 Changed 4 years ago by dkocher

Replying to cmcapellan:

Set the property queue.download.checksum to false using a hidden configuration option.

defaults write ~/Library/Preferences/ch.sudo.cyberduck.plist queue.download.checksum false

comment:12 Changed 4 years ago by cmcapellan

Thank you, this is very helpful!

TL;DR for hidden configuration for Windows 7/8 users:

  1. Go to C:\Users\<YOUR USER NAME>\AppData\Roaming\iterate_GmbH\Cyberduck.exe_Url_<random string>\4.7.0.17432
  2. Open user.config, add a new entry under <settings> (was not already in my file):
     <setting name="queue.download.checksum" value="false" />
Last edited 4 years ago by dkocher (previous) (diff)

comment:13 follow-ups: Changed 4 years ago by samottenhoff

The assumption in r17479 that all ETags that look like an MD5 hash *must* be calculated with the same logic as S3 is wrong. The spec is clear that there is no formula to an ETag. The ETag is not a method for byte validation:

http://www.webdav.org/specs/rfc4918.html#etag

Sakai is an open-source learning management system, and we calculate our ETag using a simple MD5 hash:

https://crucible.sakaiproject.org/changelog/Sakai.Git?cs=7f8f50feda276a68d2488990a4fb91b2de079f02

We follow the spec: our ETag is unique per item. Cyberduck 4.7 is Sakai's recommended WebDAV client and no longer works for hundreds of Sakai-using institutions around the world.

comment:14 Changed 4 years ago by dkocher

We do not assume an S3-specific implementation. However, if the ETag returned matches [a-fA-F0-9]{32}, we assume it is an MD5. We also detect SHA1 or SHA256 checksums with similar patterns (see r16956).
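The pattern-based guess described here could be sketched roughly as follows in Python. Only the MD5 pattern `[a-fA-F0-9]{32}` comes from this comment. The SHA-1 and SHA-256 patterns are assumptions derived from their hex digest lengths (40 and 64 characters), the handling of weak `W/` validators is an addition for illustration, and the name `guess_algorithm` is hypothetical rather than Cyberduck's API.

```python
import re

# Hex digest lengths: MD5 = 32, SHA-1 = 40, SHA-256 = 64 characters.
# Only the MD5 pattern is quoted in the ticket; the others are assumed.
PATTERNS = [
    ("md5", re.compile(r"[a-fA-F0-9]{32}")),
    ("sha1", re.compile(r"[a-fA-F0-9]{40}")),
    ("sha256", re.compile(r"[a-fA-F0-9]{64}")),
]


def guess_algorithm(etag):
    """Guess a hash algorithm from the shape of an ETag, or return None."""
    value = etag.strip()
    if value.startswith("W/"):
        # Weak validators do not promise byte equality, so make no guess.
        return None
    value = value.strip('"')
    for name, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return name
    return None
```

A guess of this kind only inspects the shape of the string. It cannot tell whether the server actually computed the value over the file's bytes.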

comment:15 in reply to: ↑ 13 Changed 4 years ago by dkocher

Replying to samottenhoff:

> Sakai is an open-source learning management system, and we calculate our ETag using a simple MD5 hash:
>
> https://crucible.sakaiproject.org/changelog/Sakai.Git?cs=7f8f50feda276a68d2488990a4fb91b2de079f02

This changeset should just make it work with matching MD5 being calculated in Sakai and Cyberduck. Do you have any more insight why the checksums might not match?

comment:16 in reply to: ↑ 13 Changed 4 years ago by dkocher

Replying to samottenhoff:

> We follow the spec: our ETag is unique per item. Cyberduck 4.7 is Sakai's recommended WebDAV client and no longer works for hundreds of Sakai-using institutions around the world.

If your unique ID matches the above pattern but is not actually an MD5 digest, then that would explain the trouble.

comment:17 follow-up: Changed 4 years ago by samottenhoff

> However, if the ETag returned matches [a-fA-F0-9]{32}, we assume it is an MD5.

r17479 assumes that any 32-character ETag it sees is an MD5 calculated using a certain formula for checksum verification. But there is no specification that dictates how the ETag should be calculated. Sakai just happens to use a 32-character ETag. Why is r17479 assuming that our ETag should be used for checksum verification?

comment:18 in reply to: ↑ 17 Changed 4 years ago by dkocher

Replying to samottenhoff:

> However, if the ETag returned matches [a-fA-F0-9]{32}, we assume it is an MD5.
>
> r17479 assumes that any 32-character ETag it sees is an MD5 calculated using a certain formula for checksum verification. But there is no specification that dictates how the ETag should be calculated. Sakai just happens to use a 32-character ETag. Why is r17479 assuming that our ETag should be used for checksum verification?

We make this educated guess using the regular expression match because we like to have checksums to verify the integrity of file transfers. It is just too bad that you have chosen to create an ID that looks like an MD5 but is not. We will possibly need to disable this by default for WebDAV.
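One way to picture "disabled by default for WebDAV" is a small policy function like the Python sketch below. It is only an illustration and does not describe what any Cyberduck release actually does. The `DEFAULT_VERIFY` table, the protocol names and the function `should_verify` are hypothetical. The only name taken from this ticket is the `queue.download.checksum` property from comment:11.

```python
# Hypothetical per-protocol defaults: trust an ETag as a content hash only
# where the server is expected to produce one.
DEFAULT_VERIFY = {"s3": True, "webdav": False}


def should_verify(protocol, guessed_algorithm, preferences=None):
    """Decide whether to compare a download's hash against the ETag."""
    preferences = preferences or {}
    if guessed_algorithm is None:
        # The ETag does not look like any known digest: nothing to compare.
        return False
    if not preferences.get("queue.download.checksum", True):
        # The user switched verification off globally.
        return False
    # Unknown protocols fall back to no verification in this sketch.
    return DEFAULT_VERIFY.get(protocol, False)
```

With these assumed defaults, `should_verify("webdav", "md5")` is false while `should_verify("s3", "md5")` is true, and setting `queue.download.checksum` to false turns the check off everywhere.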

comment:19 Changed 4 years ago by dkocher

  • Resolution thirdparty deleted
  • Status changed from closed to reopened

comment:20 Changed 4 years ago by samottenhoff

> It is just too bad that you have chosen to create an ID that looks like an MD5 but is not.

You are misunderstanding what an MD5 is. An MD5 is an arbitrary hashing algorithm. Any string or series of bytes can have an MD5 computed (http://www.md5.cz/) for it.

r17479 is making a huge assumption that any 32-character string that looks like an MD5 *must* have been created to provide an MD5 file checksum. This assumption is incorrect. It appears this assumption began with how Amazon S3 implemented their ETag.
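The point can be illustrated with a short Python sketch. The input used for the ETag here (a path plus a timestamp) is purely hypothetical and is not a description of how Sakai builds its ETag. It only shows that an MD5 over something other than the file's bytes has the same 32-character hex shape and still fails a content comparison.

```python
import hashlib
import re

# The shape test quoted earlier in this ticket.
MD5_SHAPE = re.compile(r"[a-fA-F0-9]{32}")

# Bytes of the file as the client would download them (example data).
content = b"%PDF-1.4 example file body"

# Hypothetical server-side ETag: an MD5 over an identifier and a timestamp,
# not over the file content.
resource_key = "/group/site-123/x.pdf" + "2015-06-01T12:00:00Z"
etag = hashlib.md5(resource_key.encode("utf-8")).hexdigest()

content_md5 = hashlib.md5(content).hexdigest()

looks_like_md5 = MD5_SHAPE.fullmatch(etag) is not None  # shape check passes
matches_content = etag == content_md5                   # content check fails
```

Here `looks_like_md5` is true and `matches_content` is false, which is the combination that triggers the reported mismatch error even though the download is intact.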

comment:21 Changed 4 years ago by dkocher

  • Resolution set to fixed
  • Status changed from reopened to closed

In r17628.
