Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support SHA256 checksums and use checksum verification #9662

Closed
cyberduck opened this issue Aug 24, 2016 · 9 comments
Closed

Support SHA256 checksums and use checksum verification #9662

cyberduck opened this issue Aug 24, 2016 · 9 comments
Assignees
Labels
enhancement fixed irods IRODS Protocol Implementation
Milestone

Comments

@cyberduck
Copy link
Collaborator

40fbe1f created the issue

cyberduck.log says:

WARN ch.cyberduck.core.io.Checksum - Failure to detect algorithm for checksum sha2:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=

I've seen that core/src/main/java/ch/cyberduck/core/io/Checksum.java guesses different checksum algorithms, and its likely confused by the 'sha2:' prefix it gets from iRODS.

Can you add support for this, please?

@cyberduck
Copy link
Collaborator Author

@dkocher commented

It looks like this checksum is additionally Base64 encoded.

@cyberduck
Copy link
Collaborator Author

@dkocher commented

Dependent on upstream issue.

@cyberduck
Copy link
Collaborator Author

40fbe1f commented

Thanks for looking into this. I also wondered if the custom checksum comparison implementation in Cyberduck (as it currently exists) is even necessary here. Jargon should compute and transmit a client checksum and have the iRODS server validate it, if this feature is not turned off. Currently it is turned off with setComputeAndVerifyChecksumAfterTransfer in the IRODSSesson class.

BTW, having Jargon handle the client checksumming would have the added benefit of that checksum being stored in the iRODS database for future comparisons (I'm not 100% sure about that, but the C client has a similar behavior).

@cyberduck
Copy link
Collaborator Author

40fbe1f commented

(I should add that the Jargon checksumming does not work if you upload a file with the streaming API. I found out in a similar thread for a different Jargon client: irods-contrib/irods-cloud-browser#182 )

@cyberduck
Copy link
Collaborator Author

40fbe1f commented

I wanted to add that I tested Cyberduck 5.1.0 upload with iRODS 4.1.9 today in more depth, with Cyberduck debug log enabled.

Cyberduck does this call:
org.irods.jargon.core.pub.CollectionAndDataObjectListAndSearchAOImpl - ObjStat ...
for retrieving the file size and checksum from iRODS after transfer.

The iRODS/Jargon response will contain a checksum only if an iRODS server rule for calculating the checksum on upload is defined (acPostProcForPut {msiSysChksumDataObj; }) - otherwise, the returned checksum is empty (...dataId=321827, checksum=, ownerName=...). The usage of msiSysChksumDataObj is not necessary if the iRODS client submits a checksum during upload - the client-calculated checksum will be transferred along with the data and the server will check it upon receipt. However, because Cyberduck does setComputeAndVerifyChecksumAfterTransfer(false), the Jargon default behavior (as defined in jargon.properties) is overridden and Jargon does not send a checksum. Thus, when Cyberduck asks for the object checksum, it does not receive any.

The iRODS default rule file does not contain the msiSysChksumDataObj call, but Cyberduck seems to depend on it, as well as on the MD5 checksum algorithm setting, which is SHA256 on modern iRODS servers. Switching back to MD5 and including the msiSysChksumDataObj rule, I can verify that Cyberduck does not complain about the checksum any more.

There is another dependency on server configuration built in: the ObjStat information contains an empty iRODS checksum even if msiSysChksumDataObj is present, in the case that Cyberduck writes to a 'compound' iRODS resource. This may be an iRODS or Jargon bug.

I think the ideal implementation would be to remove setComputeAndVerifyChecksumAfterTransfer(false) and setComputeChecksumAfterTransfer(false), so that the default transfer.computeandvalidate.checksum=true from jargon.properties is in effect. Do not attempt to parse iRODS checksums and compare it to a self-computed checksum. Rather, have Jargon handle this internally. Thus, Cyberduck would automatically support all known iRODS checksum algorithms and wouldn't depend on a particular server-side rule being present.

Maybe the Jargon verify feature was turned off for performance reasons - then it would be good to have this configurable for people who would like to use the Jargon default, maybe as a "hidden option" somewhere.

@cyberduck
Copy link
Collaborator Author

@dkocher commented

In 9d076ff.

1 similar comment
@cyberduck
Copy link
Collaborator Author

@dkocher commented

In 9d076ff.

@cyberduck
Copy link
Collaborator Author

40fbe1f commented

I've had a look at the changes - looks good in general, but I wanted to point out that there are still two types of checksums that iRODS may return, depending on iRODS server configuration and/or how the client uploaded a file.

If you want to work with the checksum obtained from iRODS, the checksum string may

  • start with "sha2:" - then remove the prefix and treat it as base64 encoded with SHA256 algorithm (looks like the prefix is currently not removed)
  • does not start with "sha2:" - then treat it as base16 encoded with MD5 algorithm (just as the earlier implementation did)

Supporting both cases would cover many possible iRODS setups (IMHO).

@cyberduck
Copy link
Collaborator Author

@dkocher commented

Milestone renamed

@iterate-ch iterate-ch locked as resolved and limited conversation to collaborators Nov 26, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement fixed irods IRODS Protocol Implementation
Projects
None yet
Development

No branches or pull requests

2 participants