Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backblaze B2 bucket file listing speed regression #14527

Closed
Cherry opened this issue Apr 12, 2023 · 7 comments · Fixed by #14542
Closed

Backblaze B2 bucket file listing speed regression #14527

Cherry opened this issue Apr 12, 2023 · 7 comments · Fixed by #14542
Assignees
Labels
b2 Backblaze B2 Protocol Implementation bug
Milestone

Comments

@Cherry
Copy link

Cherry commented Apr 12, 2023

Describe the bug

Since version 8.5.8, listing files in large Backblaze B2 buckets is SIGNIFICANTLY slower. A bucket that would previous take a few seconds to load, now takes over 20x that. It appears to be caused by the changes in the following PR, listing files in every folder. During my testing, some of these folders have thousands of files in them. #14431

To Reproduce

Steps to reproduce the behavior:

  • Create a bucket in B2 with tens of thousands of "directories" and files
  • Browse around the bucket in Cyberduck 8.5.9 vs the previous 8.5.7

Expected behavior

  • Listing bucket files contents should be as fast as it was in prior versions

Logs

I enabled debug logging to try and get an idea about how many requests were made. I then simply connected to B2, and expanded the root of the bucket.

8.5.9

Time Taken: ~120s
Number of times B2FileInfoResponse is in log file: >20,0000
Total debug log size: 81,508 KB

8.5.7

Time Taken: ~3s
Number of times B2FileInfoResponse is in log file: 131
Total debug log size: 70 KB

If there is any further debug I can do, please let me know.

Versions

  • OS: Windows 10 Pro 22H2
  • Cyberduck: 8.5.9 (39636), and tested on 8.5.7 (39535)
@Cherry Cherry changed the title Backblaze B2 bucket listing speed regression Backblaze B2 bucket file listing speed regression Apr 12, 2023
@ylangisc ylangisc added bug b2 Backblaze B2 Protocol Implementation labels Apr 13, 2023
@Aenigma
Copy link

Aenigma commented Apr 14, 2023

I'm experiencing this in 8.5.9 as well after updating from 8.5.7. Some of my b2 buckets with large folder hierarchies containing about 15k files effectively just loads forever.

@dkocher dkocher added this to the 8.9.0 milestone Apr 16, 2023
@syneryder
Copy link

Another +1, though I wouldn't describe it as just a "speed regression", as it has taken Cyberduck 8.5.9 10 minutes to list the root directory of a B2 bucket - and the root directory only has 9 folders and zero files.

10 minutes for every Backblaze B2 folder list command makes Cyberduck completely unusable.

8.5.7 works fine, but the new behavior occurred in 8.5.8 and continues in 8.5.9. I'll need to downgrade back to Cyberduck 8.5.7 until resolved.

dkocher added a commit that referenced this issue Apr 17, 2023
@dkocher dkocher linked a pull request Apr 17, 2023 that will close this issue
@dkocher
Copy link
Contributor

dkocher commented May 18, 2023

This is resolved in the current snapshot and beta builds.

@SchoNie
Copy link

SchoNie commented May 30, 2023

This still fails on 8.6.0 for me. It just times out after waiting 1 minute. I have about 600.000 files in the bucket. But the folder I am listing only has 10 folders (keys)
8.5.7 last working version.

Last part of debug log:

2023-05-30 19:33:29,692 [background-9] INFO  ch.cyberduck.transcript.request - POST /b2api/v2/b2_list_file_names HTTP/1.1
2023-05-30 19:33:29,692 [background-9] INFO  ch.cyberduck.transcript.request - Content-Length: 91
2023-05-30 19:33:29,692 [background-9] INFO  ch.cyberduck.transcript.request - Content-Type: application/json; charset=UTF-8
2023-05-30 19:33:29,692 [background-9] INFO  ch.cyberduck.transcript.request - Host: api003.backblazeb2.com
2023-05-30 19:33:29,692 [background-9] INFO  ch.cyberduck.transcript.request - Connection: Keep-Alive
2023-05-30 19:33:29,692 [background-9] INFO  ch.cyberduck.transcript.request - User-Agent: Cyberduck/8.6.0.39818 (Windows 10/10.0.22621.0) (amd64)
2023-05-30 19:33:29,692 [background-9] INFO  ch.cyberduck.transcript.request - Accept-Encoding: gzip,deflate
2023-05-30 19:33:29,692 [background-9] INFO  ch.cyberduck.transcript.request - Authorization: ********
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.response - HTTP/1.1 200 
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.response - Cache-Control: max-age=0, no-cache, no-store
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.response - Content-Type: application/json;charset=UTF-8
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.response - Content-Length: 955
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.response - Date: Tue, 30 May 2023 17:33:29 GMT
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.response - Keep-Alive: timeout=5
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.response - Connection: keep-alive
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.execchain.MainClientExec - Connection can be kept alive for 5000 MILLISECONDS
2023-05-30 19:33:29,708 [background-9] DEBUG synapticloop.b2.request.BaseB2Request - Received status code of: 200, for POST request to url 'https://api003.backblazeb2.com/b2api/v2/b2_list_file_names'
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection [id: 4][route: {s}->https://api003.backblazeb2.com:443] can be kept alive for 5.0 seconds
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.conn.DefaultManagedHttpClientConnection - http-outgoing-4: set socket timeout to 0
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection released: [id: 4][route: {s}->https://api003.backblazeb2.com:443][total available: 2; route allocated: 1 of 10; total allocated: 2 of 2147483647]
2023-05-30 19:33:29,708 [background-9] WARN  synapticloop.b2.response.B2ListFilesResponse - No field for key accountId
2023-05-30 19:33:29,708 [background-9] WARN  synapticloop.b2.response.B2ListFilesResponse - No field for key bucketId
2023-05-30 19:33:29,708 [background-9] WARN  synapticloop.b2.response.B2ListFilesResponse - No field for key nextFileId
2023-05-30 19:33:29,708 [background-9] WARN  synapticloop.b2.response.B2FileInfoResponse - No field for key size
2023-05-30 19:33:29,708 [background-9] WARN  synapticloop.b2.response.B2FileInfoResponse - Found an unexpected key of 'contentMd5' in JSON that is not mapped to a field, with value 'null'.
2023-05-30 19:33:29,708 [background-9] WARN  synapticloop.b2.response.B2FileInfoResponse - Found an unexpected key of 'serverSideEncryption' in JSON that is not mapped to a field, with value '{"mode":null,"algorithm":null}'.
2023-05-30 19:33:29,708 [background-9] WARN  synapticloop.b2.response.B2FileInfoResponse - Found an unexpected key of 'legalHold' in JSON that is not mapped to a field, with value '{"isClientAuthorizedToRead":false,"value":null}'.
2023-05-30 19:33:29,708 [background-9] WARN  synapticloop.b2.response.B2FileInfoResponse - Found an unexpected key of 'fileRetention' in JSON that is not mapped to a field, with value '{"isClientAuthorizedToRead":false,"value":null}'.
2023-05-30 19:33:29,708 [background-9] DEBUG ch.cyberduck.core.b2.B2VersionIdProvider - Return version 116d57d9edb4f4eb7a500615 from attributes for file Path{path='/REDACTEDBUCKETNAME', type=[directory, volume]}
2023-05-30 19:33:29,708 [background-9] DEBUG synapticloop.b2.request.BaseB2Request - Setting key 'startFileName' to value 'FOLDER/.bzEmpty'
2023-05-30 19:33:29,708 [background-9] DEBUG synapticloop.b2.request.BaseB2Request - Setting key 'maxFileCount' to value '1'
2023-05-30 19:33:29,708 [background-9] DEBUG synapticloop.b2.request.BaseB2Request - Setting key 'bucketId' to value '116d57d9edb4f4eb7a500615'
2023-05-30 19:33:29,708 [background-9] DEBUG synapticloop.b2.request.BaseB2Request - POST request to URL 'https://api003.backblazeb2.com/b2api/v2/b2_list_file_names'
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.client.protocol.RequestAddCookies - CookieSpec selected: default
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.client.protocol.RequestAuthCache - Auth cache not set in the context
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection request: [route: {s}->https://api003.backblazeb2.com:443][total available: 2; route allocated: 1 of 10; total allocated: 2 of 2147483647]
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection leased: [id: 4][route: {s}->https://api003.backblazeb2.com:443][total available: 1; route allocated: 1 of 10; total allocated: 2 of 2147483647]
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.conn.DefaultManagedHttpClientConnection - http-outgoing-4: set socket timeout to 0
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.conn.DefaultManagedHttpClientConnection - http-outgoing-4: set socket timeout to 30000
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.execchain.MainClientExec - Executing request POST /b2api/v2/b2_list_file_names HTTP/1.1
2023-05-30 19:33:29,708 [background-9] DEBUG org.apache.http.impl.execchain.MainClientExec - Proxy auth state: UNCHALLENGED
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.request - POST /b2api/v2/b2_list_file_names HTTP/1.1
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.request - Content-Length: 91
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.request - Content-Type: application/json; charset=UTF-8
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.request - Host: api003.backblazeb2.com
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.request - Connection: Keep-Alive
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.request - User-Agent: Cyberduck/8.6.0.39818 (Windows 10/10.0.22621.0) (amd64)
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.request - Accept-Encoding: gzip,deflate
2023-05-30 19:33:29,708 [background-9] INFO  ch.cyberduck.transcript.request - Authorization: ********
2023-05-30 19:33:59,721 [background-9] DEBUG org.apache.http.impl.conn.DefaultManagedHttpClientConnection - http-outgoing-4: Close connection
2023-05-30 19:34:29,730 [background-9] DEBUG org.apache.http.impl.conn.DefaultManagedHttpClientConnection - http-outgoing-4: Shutdown connection
2023-05-30 19:34:29,730 [background-9] DEBUG org.apache.http.impl.execchain.MainClientExec - Connection discarded
2023-05-30 19:34:29,730 [background-9] DEBUG org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection released: [id: 4][route: {s}->https://api003.backblazeb2.com:443][total available: 1; route allocated: 0 of 10; total allocated: 1 of 2147483647]
2023-05-30 19:34:29,730 [background-9] WARN  ch.cyberduck.core.http.ExtendedHttpRequestRetryHandler - Cancel retry request with execution count 1 for failure java.net.SocketTimeoutException: Read timed out

@metadaddy
Copy link

@ylangisc Would it be helpful to have access to a B2 bucket with a large number of files to test this? I could spin one up and give you a read-only app key.

@dkocher
Copy link
Contributor

dkocher commented Jun 11, 2023

Reopened in #14775.

@dkocher
Copy link
Contributor

dkocher commented Jun 11, 2023

@ylangisc Would it be helpful to have access to a B2 bucket with a large number of files to test this? I could spin one up and give you a read-only app key.

Thanks for the help. We should be able to reproduce easily.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
b2 Backblaze B2 Protocol Implementation bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants