MD5 checksum failure for downloads #8798

Closed
cyberduck opened this issue May 4, 2015 · 16 comments
Labels: bug, fixed, webdav (WebDAV Protocol Implementation)

Comments

@cyberduck
Collaborator

anonymous created the issue

Since updating to 4.7, virtually all my downloads via CyberDuck have failed. When I begin a download, I get an error message that reads:

Error
Download [x.pdf] failed
Mismatch between MD5 hash [x] of downloaded data and ETag [x] returned by server.

I have worked around this issue by reinstalling 4.6.4 (16610). This is fine for my needs, but I thought I should report this issue.

@cyberduck
Collaborator Author

@dkocher commented

Please post the transcript from the log drawer of the Transfers window. To open it, press ⌘-L on Mac, or on Windows right-click the toolbar of the Transfers window and choose Log.

@cyberduck
Collaborator Author

@dkocher commented

Reviewed and refactored checksum validation in bfe5a45.

@cyberduck
Collaborator Author

@dkocher commented

Please note that the download is still complete but failed verification of checksums.

@cyberduck
Collaborator Author

@dkocher commented

#8806 closed as duplicate.

@cyberduck
Collaborator Author

@dkocher commented

Please report this issue to the server vendor. You can choose Continue to ignore the verification failure.

@cyberduck
Collaborator Author

e1772a6 commented

Hi dkocher, I don't see any way to continue past the error when I get this message doing a WebDAV download in the Windows client. Also, the server in this case is run by a large, unresponsive academic institution. Any chance of getting a way in the next release to ignore checksums the way it behaved in prior versions?

Thanks!

@cyberduck
Collaborator Author

@dkocher commented

Replying to [comment:10 cmcapellan]:

Set the property queue.download.checksum to false using a hidden configuration option.

defaults write ~/Library/Preferences/ch.sudo.cyberduck.plist queue.download.checksum false
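
To illustrate what a switch like this controls, here is a minimal Python sketch of post-download verification gated by a flag. It is not Cyberduck's code: `verify_download`, `ChecksumMismatch`, and the `verify` parameter are hypothetical names that only mirror the idea behind `queue.download.checksum`.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when the downloaded bytes do not hash to the expected value."""


def verify_download(data, etag, verify=True):
    """Compare the MD5 of the downloaded bytes against the server's ETag.

    `verify` is a hypothetical stand-in for a setting such as
    queue.download.checksum; when it is False the check is skipped.
    """
    if not verify:
        return
    # ETags are usually sent as quoted strings, so drop the quotes first.
    expected = etag.strip('"').lower()
    actual = hashlib.md5(data).hexdigest()
    if actual != expected:
        raise ChecksumMismatch(
            f"Mismatch between MD5 hash {actual} of downloaded data "
            f"and ETag {expected} returned by server."
        )
```

Note that the data has already been fully received when the check runs, which is consistent with the earlier remark that the download itself is complete and only the verification fails.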

@cyberduck
Collaborator Author

e1772a6 commented

Thank you, this is very helpful!

TL;DR for hidden configuration for Windows 7/8 users:

  1. Go to C:\Users\<YOUR USER NAME>\AppData\Roaming\iterate_GmbH\Cyberduck.exe_Url_<random string>\4.7.0.17432

  2. Open user.config, add a new entry under (was not already in my file)

<setting name="queue.download.checksum" value="false" />

@cyberduck
Collaborator Author

b6ea704 commented

The assumption in bfe5a45 that all ETags that look like an MD5 hash must be calculated with the same logic as S3 is wrong. The spec is clear that there is no formula to an ETag. The ETag is not a method for byte validation:

http://www.webdav.org/specs/rfc4918.html#etag

Sakai is an open-source learning management system, and we calculate our ETag using a simple MD5 hash:

https://crucible.sakaiproject.org/changelog/Sakai.Git?cs=7f8f50feda276a68d2488990a4fb91b2de079f02

We follow the spec: our ETag is unique per item. Cyberduck 4.7 is Sakai's recommended WebDAV client and no longer works for hundreds of Sakai-using institutions around the world.

@cyberduck
Collaborator Author

@dkocher commented

We do not assume an S3-specific implementation. However, if the returned ETag matches [a-fA-F0-9]{32}, we assume it is an MD5. We also detect SHA1 or SHA256 checksums with similar patterns (see 946cb53).
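
A rough Python sketch of this kind of pattern-based guess is shown below. Only the 32-character MD5 pattern comes from the comment above; the SHA1 and SHA256 patterns are assumptions based on the standard hex-digest lengths (40 and 64 characters), and `guess_checksum_algorithm` is a hypothetical name, not the function in 946cb53.

```python
import re

# Hex-digest lengths: MD5 = 32, SHA-1 = 40, SHA-256 = 64 characters.
# Only the MD5 pattern is quoted in the thread; the other two are assumed.
_PATTERNS = {
    "md5": re.compile(r"[a-fA-F0-9]{32}"),
    "sha1": re.compile(r"[a-fA-F0-9]{40}"),
    "sha256": re.compile(r"[a-fA-F0-9]{64}"),
}


def guess_checksum_algorithm(etag):
    """Guess a hash algorithm from the shape of an ETag, or return None."""
    # ETags are usually sent as quoted strings, so drop the quotes first.
    value = etag.strip().strip('"')
    for name, pattern in _PATTERNS.items():
        # fullmatch: the whole value must be hex of exactly that length.
        if pattern.fullmatch(value):
            return name
    return None
```

The guess is purely syntactic: it says what the value looks like, not how the server actually produced it.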

@cyberduck
Collaborator Author

@dkocher commented

Replying to [comment:13 samottenhoff]:

Sakai is an open-source learning management system, and we calculate our ETag using a simple MD5 hash:

https://crucible.sakaiproject.org/changelog/Sakai.Git?cs=7f8f50feda276a68d2488990a4fb91b2de079f02

With this changeset it should just work, since Sakai and Cyberduck would calculate matching MD5s. Do you have any more insight into why the checksums might not match?

@cyberduck
Collaborator Author

@dkocher commented

Replying to [comment:13 samottenhoff]:

We follow the spec: our ETag is unique per item. Cyberduck 4.7 is Sakai's recommended WebDAV client and no longer works for hundreds of Sakai-using institutions around the world.

If your unique ID matches the above pattern but is not actually an MD5 digest of the file content, then that would explain the trouble.

@cyberduck
Collaborator Author

b6ea704 commented

However, if the returned ETag matches [a-fA-F0-9]{32}, we assume it is an MD5.

bfe5a45 assumes that any 32-character ETag it sees is an MD5 calculated using a certain formula for checksum verification. But there is no specification that dictates how the ETag should be calculated. Sakai just happens to use a 32-character ETag. Why is bfe5a45 assuming that our ETag should be used for checksum verification?

@cyberduck
Collaborator Author

@dkocher commented

Replying to [comment:17 samottenhoff]:

However, if the returned ETag matches [a-fA-F0-9]{32}, we assume it is an MD5.

bfe5a45 assumes that any 32-character ETag it sees is an MD5 calculated using a certain formula for checksum verification. But there is no specification that dictates how the ETag should be calculated. Sakai just happens to use a 32-character ETag. Why is bfe5a45 assuming that our ETag should be used for checksum verification?

We make this educated guess using the regular expression match because we like to have checksums to verify the integrity of file transfers. It is just too bad that you have chosen to create an ID that looks like an MD5 but is not. We will possibly need to disable this by default for WebDAV.

@cyberduck
Collaborator Author

b6ea704 commented

It is just too bad that you have chosen to create an ID that looks like an MD5 but is not.

You are misunderstanding what an MD5 is. MD5 is a general-purpose hashing algorithm: an MD5 digest can be computed for any string or series of bytes (http://www.md5.cz/).

bfe5a45 is making a huge assumption that any 32-character string that looks like an MD5 must have been created to provide an MD5 file checksum. This assumption is incorrect. It appears this assumption began with how Amazon S3 implemented their ETag.
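
The point can be shown with a short Python sketch. It assumes, for illustration only, that a server derives its ETag as the MD5 of some per-item identifier rather than of the file bytes; the identifier format and file content below are invented and do not describe Sakai's actual formula.

```python
import hashlib
import re

# The pattern quoted earlier in the thread.
MD5_SHAPE = re.compile(r"[a-fA-F0-9]{32}")

content = b"%PDF-1.4 example file body"      # hypothetical file bytes
resource_id = "/dav/site/x.pdf:2015-05-04"   # hypothetical unique item key

# A genuine MD5 digest, but of the identifier, not of the content.
etag = hashlib.md5(resource_id.encode("utf-8")).hexdigest()
content_md5 = hashlib.md5(content).hexdigest()

looks_like_md5 = MD5_SHAPE.fullmatch(etag) is not None
matches_content = etag == content_md5
print(looks_like_md5, matches_content)
```

This prints `True False`: the ETag is a real MD5 digest and passes the pattern test, yet a client that compares it against the MD5 of the downloaded bytes reports a mismatch.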

@cyberduck
Collaborator Author

@dkocher commented

In 9458ab9.

@iterate-ch iterate-ch locked as resolved and limited conversation to collaborators Nov 26, 2021