Add SHA1 or other verification for BackBlaze B2 #9733

Closed
cyberduck opened this issue Oct 16, 2016 · 8 comments
Labels
b2 Backblaze B2 Protocol Implementation bug fixed

@cyberduck
Collaborator

00353a4 created the issue

CyberDuck Version: 5.2 (Beta)

**Why mark this as a defect, and not a new feature:**
Because comparing only timestamps renders the entire B2 upload feature useless.

**What happens:**

  1. You have to enable TIMESTAMP for uploads.
    (I think this should be enabled by default.)

  2. Upload a folder using Synchronize or Upload.

  3. Stop the transfer halfway.

  4. Upload the same folder using Synchronize.
    Only the timestamp gets checked.

**What should happen:**

  1. Upload a folder using Synchronize, or by hand.

  2. The client should calculate the SHA1 sum of each local file and verify it against the remote file.

  3. If a file differs: transfer it.
    Then no timestamp would even be needed.

**The problem:**
The current approach is no way to transfer files.
Backblaze provides 99.99+% redundancy, while the client provides a 50% chance of getting your files up there safely.
Yes, a file may get uploaded, but we have no idea what it ended up being.
Bits may be missing here and there.
In my case, on a server with a 1 Gbps connection, the transfer got stuck a few times.
With timestamps I was able to "continue", but I have no idea whether the files that ended up in B2 are MY files or some damaged binaries.

**How to implement:**
Java has SHA1 calculation built in.
For Backblaze B2, see https://www.backblaze.com/b2/docs/b2_get_file_info.html
Compare both; if they do not match, delete the file on B2 and restart the transfer.
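
The comparison proposed above could be sketched roughly as follows. This is a minimal illustration using only the JDK's `MessageDigest`, not Cyberduck's actual code; the class `LocalSha1` and its method names are hypothetical, and the remote SHA1 is assumed to have been fetched separately (for example from `b2_get_file_info`).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of Cyberduck.
final class LocalSha1 {

    // Streams the file through SHA-1 so large files are never held fully in memory.
    static String sha1Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[64 * 1024];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        // Render the 20-byte digest as 40 lowercase hex characters.
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True when the local file's SHA-1 equals the value reported by the server.
    static boolean matchesRemote(Path file, String remoteSha1)
            throws IOException, NoSuchAlgorithmException {
        return sha1Hex(file).equalsIgnoreCase(remoteSha1);
    }
}
```

A caller would re-upload a file only when `matchesRemote` returns `false`, which would make the timestamp comparison unnecessary for files that have a remote checksum.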

@cyberduck
Collaborator Author

00353a4 commented

Meanwhile I checked the B2 source, but I can only see that the code retrieves the SHA1 that Backblaze returns and parses it.
I see no place where it calculates the SHA1 of the local files. The transfer also starts way too quickly.
On a Xeon 1245v2, a 4 GB file starts to upload within mere seconds. I doubt SHA1 can be calculated that fast.

7-Zip's SHA1 calculation takes 22 seconds on the same machine.
So something is fishy.

I also read in the wiki page about the Synchronize feature that SHA comparison is supported for S3.
But S3 for Reduced Availability says you will lose files. That is not appealing.
B2 is cheaper and more redundant.

@cyberduck
Collaborator Author

00353a4 commented

I think this wiki article also needs a refresh:
https://trac.cyberduck.io/wiki/help/en/howto/sync

@cyberduck
Collaborator Author

00353a4 commented

Upon further testing, LARGE FILES (I have no idea what counts as large) have NO SHA1 sum.

I uploaded a 4 GB file, and the response is:

 "contentLength": 3910021120,
 "contentSha1": "none",

Still, IF the returned SHA1 sum is NOT "none", Cyberduck should do a SHA1 check.
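
One way to express that rule is a small decision function that skips verification when the server reports `"none"` instead of failing. The sketch below is illustrative only; the class `RemoteChecksumPolicy` and the `Result` enum are hypothetical names, not Cyberduck or Backblaze APIs.

```java
// Hypothetical policy for illustration; not part of Cyberduck.
final class RemoteChecksumPolicy {

    enum Result { VERIFIED, MISMATCH, SKIPPED }

    // Compares hex SHA-1 strings, treating a missing or "none" remote value as "cannot verify".
    static Result verify(String localSha1, String remoteSha1) {
        if (remoteSha1 == null || remoteSha1.equalsIgnoreCase("none")) {
            return Result.SKIPPED;
        }
        return remoteSha1.equalsIgnoreCase(localSha1) ? Result.VERIFIED : Result.MISMATCH;
    }
}
```

Keeping `SKIPPED` separate from `MISMATCH` lets a client avoid deleting and re-uploading large files just because no whole-file checksum is available.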

@cyberduck
Collaborator Author

@dkocher commented

We do include the checksum in every upload request. A sample request is:

Date: Sun, 16 Oct 2016 19:20:09 GMT
POST /b2api/v1/b2_upload_file/caed09e66bf60e3a556e0d10/c001_v0001033_t0025 HTTP/1.1
X-Bz-Content-Sha1: 63e5fa074c55e34e430a6ac7482f299edfec0769
Content-Type: image/jpeg
X-Bz-File-Name: IMG_5562.jpg
Content-Length: 3474026
Host: pod-000-1033-07.backblaze.com
Connection: Keep-Alive
User-Agent: Cyberduck/5.2.0.21187 (Mac OS X/10.12) (x86_64)
Accept-Encoding: gzip,deflate
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Cache-Control: max-age=0, no-cache, no-store
Content-Type: application/json;charset=UTF-8
Content-Length: 401
Date: Sun, 16 Oct 2016 19:20:14 GMT

The checksum is calculated in B2SingleUploadService#upload and included in the request in B2WriteFeature#write. Currently we do not check the SHA1 of the response.
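
Checking the response could look roughly like the sketch below. It assumes the upload response body is JSON containing a `contentSha1` field, as in the `b2_get_file_info` output quoted earlier in this thread; the response body itself is not shown in the log above, so that is an assumption. The class `UploadResponseCheck` is hypothetical, and the regex extraction is a stand-in for a real JSON parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the code from B2WriteFeature.
final class UploadResponseCheck {

    // Matches "contentSha1": "<value>" in a flat JSON object (assumed response shape).
    private static final Pattern CONTENT_SHA1 =
            Pattern.compile("\"contentSha1\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the contentSha1 value if the field is present in the response body.
    static Optional<String> contentSha1(String responseJson) {
        Matcher matcher = CONTENT_SHA1.matcher(responseJson);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    // True only when the server echoes the same SHA-1 that was sent in X-Bz-Content-Sha1.
    static boolean responseMatches(String sentSha1, String responseJson) {
        return contentSha1(responseJson).map(sentSha1::equalsIgnoreCase).orElse(false);
    }
}
```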

@cyberduck
Collaborator Author

@dkocher commented

If you upload a file that is transferred as a large upload it will not be visible to synchronize when interrupted. Please restart the upload to resume a large file upload.

@cyberduck
Collaborator Author

@dkocher commented

Checksum verification after file transfer in 77e79ef.

@cyberduck
Collaborator Author

00353a4 commented

That was blazing fast, thank you!

@cyberduck
Collaborator Author

@dkocher commented

Milestone renamed

@iterate-ch iterate-ch locked as resolved and limited conversation to collaborators Nov 26, 2021