#9733 closed defect (fixed)

Add SHA1 or other verification for BackBlaze B2

Reported by: cyborgduck Owned by: dkocher
Priority: normal Milestone: 5.2
Component: b2 Version: 5.1
Severity: normal Keywords: sha backblaze b2
Cc: Architecture:
Platform:

Description (last modified by cyborgduck)

CyberDuck Version: 5.2 (Beta)

Why mark this as a defect and not a new feature:
Because the current timestamp-only comparison renders the entire B2 upload feature useless.

What happens:

0) You have to enable TIMESTAMP for Uploads.
(I think this should be enabled by default.)
1) Upload a folder using Synchronize or with Upload.
2) Stop transfer half-way.
3) Upload same folder using Synchronize
Only the timestamp will get checked.

What should happen:

1) Upload a folder using Synchronize, OR by hand.
2) The client should calculate the SHA1 sum of each local file and compare it with the SHA1 stored on B2.
3) IF they differ: transfer the file.
Then no timestamp would even be needed.

THE PROBLEM:
The current approach is no way to transfer files.
Backblaze provides 99.99+% redundancy, while the client provides a 50% chance of getting your files up there safely.
I mean, yes, a file may get uploaded, but we have no idea what it ended up being.
Bits may be missing here and there.
And in my case, with a 1 Gbps server, the transfer got stuck a few times.
With timestamps I was able to "continue", but I have no idea whether the files that ended up in B2 are MY files or some damaged binaries.

How to implement:
Well, Java has SHA1 calculation included.
For Backblaze B2: https://www.backblaze.com/b2/docs/b2_get_file_info.html
Compare both; if they do not match, delete the file on B2 and restart the transfer.
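
A minimal sketch of the comparison proposed above, using only the JDK's MessageDigest. The class and method names (Sha1Compare, sha1Hex, needsUpload) are hypothetical and are not taken from the Cyberduck source; the remote SHA1 is assumed to have already been fetched from B2, for example through b2_get_file_info.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Cyberduck.
final class Sha1Compare {

    /** Streams the file through SHA-1 and returns the digest as lowercase hex. */
    static String sha1Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** True when the local SHA1 differs from the one reported by the server. */
    static boolean needsUpload(Path localFile, String remoteSha1)
            throws IOException, NoSuchAlgorithmException {
        return !sha1Hex(localFile).equalsIgnoreCase(remoteSha1);
    }
}
```

A caller would delete the remote file and restart the transfer whenever needsUpload returns true. The file is streamed in small chunks, so memory use stays constant even for multi-gigabyte files.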

Change History (10)

comment:1 Changed on Oct 16, 2016 at 10:43:16 AM by cyborgduck

  • Description modified (diff)
  • Version 5.1.3 deleted

comment:2 Changed on Oct 16, 2016 at 11:01:22 AM by cyborgduck

  • Version set to 5.1

I checked the source for b2 in the meantime, but I can only see that the code retrieves the SHA1 that Backblaze returns and parses it.
I see no place where it would calculate the SHA1 of the local files. The transfer also starts way too quickly.
On a Xeon 1245v2, a 4 GB file starts to upload within mere seconds. I doubt SHA1 can be calculated that fast.

7-zip's SHA1 calculation takes 22 seconds on the same machine.
So something is fishy.

I also read in the wiki page about the Synchronize feature that SHA comparison is supported for S3.
But S3 for Reduced Availability says _you will lose files_. Now that's not that appealing.
B2 is cheaper, and more redundant.

comment:3 Changed on Oct 16, 2016 at 11:07:20 AM by cyborgduck

I think this wiki article also needs a refresh:
https://trac.cyberduck.io/wiki/help/en/howto/sync

comment:4 Changed on Oct 16, 2016 at 12:00:45 PM by cyborgduck

Well, upon further testing, LARGE FILES (have no idea what they call large) have NO SHA1 sum.

Uploaded a 4GB file, and the response is:

 "contentLength": 3910021120,
 "contentSha1": "none",



Still, IF the returned SHA1 sum is NOT "none", Cyberduck should do a SHA1 check.
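
The rule suggested here could be sketched roughly as follows. The class, enum and method names are hypothetical; the only detail taken from the response above is that contentSha1 can hold the literal string "none", in which case no comparison is possible.

```java
// Hypothetical helper, not part of Cyberduck.
final class RemoteChecksum {

    enum Verdict { MATCH, MISMATCH, UNVERIFIABLE }

    /**
     * Compares a locally computed SHA1 (hex) with the contentSha1 value
     * reported by the server. A missing value or the literal "none" means
     * the server has no checksum to compare against.
     */
    static Verdict verify(String localSha1, String remoteContentSha1) {
        if (remoteContentSha1 == null || "none".equalsIgnoreCase(remoteContentSha1)) {
            return Verdict.UNVERIFIABLE;
        }
        return localSha1.equalsIgnoreCase(remoteContentSha1)
                ? Verdict.MATCH
                : Verdict.MISMATCH;
    }
}
```

Keeping UNVERIFIABLE separate from MISMATCH lets the client decide on its own what to do with large files, for example falling back to a size or timestamp comparison instead of re-uploading.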

comment:5 Changed on Oct 16, 2016 at 4:05:36 PM by dkocher

  • Component changed from core to b2
  • Milestone set to 6.0
  • Owner set to dkocher
  • Status changed from new to assigned

comment:6 Changed on Oct 16, 2016 at 7:27:39 PM by dkocher

We do include the checksum in every upload request. A sample request and response:

Date: Sun, 16 Oct 2016 19:20:09 GMT
POST /b2api/v1/b2_upload_file/caed09e66bf60e3a556e0d10/c001_v0001033_t0025 HTTP/1.1
X-Bz-Content-Sha1: 63e5fa074c55e34e430a6ac7482f299edfec0769
Content-Type: image/jpeg
X-Bz-File-Name: IMG_5562.jpg
Content-Length: 3474026
Host: pod-000-1033-07.backblaze.com
Connection: Keep-Alive
User-Agent: Cyberduck/5.2.0.21187 (Mac OS X/10.12) (x86_64)
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Cache-Control: max-age=0, no-cache, no-store
Content-Type: application/json;charset=UTF-8
Content-Length: 401
Date: Sun, 16 Oct 2016 19:20:14 GMT

The checksum is calculated in B2SingleUploadService#upload and included in the request in B2WriteFeature#write. Currently we do not check the SHA1 of the response.
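
As an illustration only, the flow described above (compute the SHA1, send it in the X-Bz-Content-Sha1 header, then check the value echoed by the server) might look like the sketch below. This is not the code of B2SingleUploadService or B2WriteFeature. All names are hypothetical, the response body is assumed to be JSON with a contentSha1 field like the one quoted in comment:4, and a real client would use a JSON parser instead of a regular expression.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the Cyberduck implementation.
final class UploadChecksumSketch {

    // Assumption: the response JSON carries a "contentSha1" string field.
    private static final Pattern SHA1_FIELD =
            Pattern.compile("\"contentSha1\"\\s*:\\s*\"([^\"]*)\"");

    /** SHA-1 of an in-memory payload as lowercase hex. */
    static String sha1Hex(byte[] content) throws NoSuchAlgorithmException {
        StringBuilder hex = new StringBuilder();
        for (byte b : MessageDigest.getInstance("SHA-1").digest(content)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Request headers as seen in the sample request, checksum included. */
    static Map<String, String> uploadHeaders(String fileName, String contentType, byte[] content)
            throws NoSuchAlgorithmException {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-Bz-Content-Sha1", sha1Hex(content));
        headers.put("Content-Type", contentType);
        headers.put("X-Bz-File-Name", fileName);
        headers.put("Content-Length", String.valueOf(content.length));
        return headers;
    }

    /** True only when the response reports the same SHA1 that was sent. */
    static boolean responseMatches(String sentSha1, String responseJson) {
        Matcher m = SHA1_FIELD.matcher(responseJson);
        return m.find() && sentSha1.equalsIgnoreCase(m.group(1));
    }
}
```

The second step, responseMatches, corresponds to the check that is missing according to the comment above: a response whose contentSha1 differs from the value sent, or carries none at all, is treated as a failed verification.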

comment:7 Changed on Oct 17, 2016 at 9:14:41 AM by dkocher

If a file is transferred as a large file upload and the transfer is interrupted, the unfinished upload is not visible to Synchronize. Please restart the upload to resume a large file upload.

comment:8 Changed on Oct 17, 2016 at 12:50:40 PM by dkocher

  • Resolution set to fixed
  • Status changed from assigned to closed

Checksum verification after file transfer added in r21678.

comment:9 Changed on Oct 17, 2016 at 12:58:55 PM by cyborgduck

That was blazing fast, thank you!

comment:10 Changed on Oct 19, 2016 at 2:17:19 PM by dkocher

  • Milestone changed from 6.0 to 5.2

Milestone renamed
