Firestore: Fix spurious "Backend didn't respond within 10 seconds" errors when network just slow by dconeybe · Pull Request #8145 · firebase/firebase-js-sdk (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation19 Commits9 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Firestore was erroneously concluding that the connection to the backend was broken when the response to the initial listen request was large (e.g. 1000 documents) and the network speed was slow.
This bug resulted in erroneous errors that looked like this:
Could not reach Cloud Firestore backend. Backend didn't respond within 10 seconds.
This typically indicates that your device does not have a healthy Internet connection at the moment. The client will operate in offline mode until it is able to successfully connect to the backend.
The fix is to propagate the first "OPEN" message from WebChannel to the online state detector so that it will consider the connection to be "alive" even though no proto messages have been received on the stream.
Googlers see b/325591749
Size Report 1
Affected Products
@firebase/firestore
| Type | Base (d1eae3a) | Merge (e03fd3f) | Diff |
|---|---|---|---|
| browser | 377 kB | 377 kB | +275 B (+0.1%) |
| esm5 | 362 kB | 362 kB | +345 B (+0.1%) |
| main | 580 kB | 580 kB | +464 B (+0.1%) |
| module | 377 kB | 377 kB | +275 B (+0.1%) |
| react-native | 377 kB | 377 kB | +275 B (+0.1%) |
bundle
| Type | Base (d1eae3a) | Merge (e03fd3f) | Diff |
|---|---|---|---|
| firestore (CSI Auto Indexing Disable and Delete) | 268 kB | 269 kB | +113 B (+0.0%) |
| firestore (CSI Auto Indexing Enable) | 268 kB | 269 kB | +113 B (+0.0%) |
| firestore (Persistence) | 303 kB | 303 kB | +172 B (+0.1%) |
| firestore (Query Cursors) | 247 kB | 241 kB | -6.36 kB (-2.6%) |
| firestore (Query) | 245 kB | 238 kB | -6.36 kB (-2.6%) |
| firestore (Read data once) | 233 kB | 226 kB | -6.36 kB (-2.7%) |
| firestore (Read Write w Persistence) | 323 kB | 323 kB | +172 B (+0.1%) |
| firestore (Realtime updates) | 235 kB | 229 kB | -6.36 kB (-2.7%) |
| firestore (Transaction) | 212 kB | 205 kB | -6.47 kB (-3.1%) |
| firestore (Write data) | 212 kB | 205 kB | -6.39 kB (-3.0%) |
firebase
| Type | Base (d1eae3a) | Merge (e03fd3f) | Diff |
|---|---|---|---|
| firebase-compat.js | 786 kB | 786 kB | +165 B (+0.0%) |
| firebase-firestore-compat.js | 342 kB | 342 kB | +165 B (+0.0%) |
| firebase-firestore.js | 436 kB | 436 kB | +275 B (+0.1%) |
Test Logs
- Base (d1eae3a): https://github.com/firebase/firebase-js-sdk/actions/runs/8724885836
- Merge (e03fd3f): https://github.com/firebase/firebase-js-sdk/actions/runs/8741678167
@dconeybe thanks for working on this issue. I tested the PR, the query can load about 100 documents then found the below error
Error: FIRESTORE (10.6.0) INTERNAL ASSERTION FAILED: Unexpected state
at fail (index.esm2017.js:194:1)
at __PRIVATE_AsyncQueueImpl.Iu (index.esm2017.js:18538:1)
at __PRIVATE_AsyncQueueImpl.enqueue (index.esm2017.js:18454:1)
at __PRIVATE_AsyncQueueImpl.enqueueAndForget (index.esm2017.js:18439:1)
at DelayedOperation.handleDelayElapsed (index.esm2017.js:15141:1)
at index.esm2017.js:15123:1
Hi @dconeybe thank you for working on the fix.
Do you have a tip how I can build the sdk and import it as a dart package?
thomasdao
Please try these steps:
cd packages/firestoreyarn build:debugcp -r dist ~/YOUR_PROJECT/node_modules/@firebase/firestore
The build:debug script will build dist/index.esm2017.js without all of the optimizations and code mangling. This will make your stack traces easier to interpret.
Now, please reproduce again and post an updated stack trace. Hopefully the stack trace from unmangled code will make more sense.
Hi @dconeybe thank you for working on the fix. Do you have a tip how I can build the sdk and import it as a dart package?
@Valansch Unfortunately, no, I don't have any experience with dart or flutter. Would you be able to ask this question on the https://github.com/firebase/flutterfire repository?
@dconeybe please see the error below
Error log
firebase.worker.js:50 Error: FIRESTORE (10.6.0) INTERNAL ASSERTION FAILED: AsyncQueue is already failed: Error: FIRESTORE (10.6.0) INTERNAL ASSERTION FAILED: `pendingResponses` is less than 0. Actual value: -1. This indicates that the SDK received more target acks from the server than expected. The SDK should not continue to operate.
at fail (index.esm2017.js:213:1)
at hardAssert (index.esm2017.js:223:1)
at TargetState.recordTargetResponse (index.esm2017.js:8464:1)
at index.esm2017.js:8523:1
at Array.forEach (<anonymous>)
at WatchChangeAggregator.forEachTarget (index.esm2017.js:8570:1)
at WatchChangeAggregator.handleTargetChange (index.esm2017.js:8512:1)
at onWatchStreamChange (index.esm2017.js:20721:1)
at PersistentListenStream.onMessage (index.esm2017.js:20010:1)
at index.esm2017.js:19945:1
at fail (index.esm2017.js:213:1)
at AsyncQueueImpl.verifyNotFailed (index.esm2017.js:25633:1)
at AsyncQueueImpl.enqueue (index.esm2017.js:25539:1)
at AsyncQueueImpl.enqueueAndForget (index.esm2017.js:25521:1)
at DelayedOperation.handleDelayElapsed (index.esm2017.js:21168:1)
at index.esm2017.js:21145:1
@thomasdao Thanks for providing the updated stack trace. That error looks like the bug that you reported in #7652 has resurfaced. That is surprising because the bug you reported should have been fixed in v10.5.2 (and you are using v10.6.0).
To rule out this issue, could you try the following:
- Upgrade your dependency on firebase in
package.jsonto the latest:"firebase": "^10.11.0" - Delete your
package-lock.jsonandnode_modulesdirectory. - Run
npm installin your application. - Make sure that the
firebase-js-sdkthat you checked out from GitHub is at the latest commit in the master branch.
Thanks!
@dconeybe I updated Firebase to 10.11.0, then copy the firestore dist folder, and the query still hangs with below error:
index.esm2017.js:78 [2024-04-16T15:19:20.510Z] @firebase/firestore: Firestore (10.11.0): WebChannelConnection RPC 'Listen' stream 0x5a8c4d87 transport errored:
Qd {type: 'c', target: Q, g: Q, defaultPrevented: false, status: 1}
@thomasdao Can you confirm that you have the dconeybe/WebChannelOnOpenFix_Bug325591749 branch checked out in your local clone of this repository? If so, could you try re-running the yarn build:debug and cp -r dist ~/YOUR_PROJECT/node_modules/@firebase/firestore steps?
Also, if you're willing, could you enable Firestore debug logging (i.e. call setLogLevel('debug')) and provide the entire log that is produced? Feel free to email your log to me directly (my email is my github username at google.com) if you're concerned about posting the logs publicly.
@dconeybe I've sent the debug log to your email. I've checked the branch again and it's correct:
$ git pull
Already up to date.
$ git rev-parse --abbrev-ref HEAD
dconeybe/WebChannelOnOpenFix_Bug325591749
$ git branch --show-current
dconeybe/WebChannelOnOpenFix_Bug325591749
@thomasdao Thank you for sending me the logs. Unfortunately, based on the logs I can confirm that this PR will NOT fix the issue you are reporting (#7860). One thing, however, that I noticed from your logs is that your app seems to using Firestore before a user is signed in, and then continue to use it after the user is signed in. This could be innocuous, or could be undesirable behavior, depending on what your app is doing. I just wanted to point it out.
dconeybe marked this pull request as ready for review
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with comments.
| * |
|---|
| * A stream can be closed locally with close() or can be closed remotely or |
| * through network errors. onClose is guaranteed to be called. onOpen will only |
| * be called if the stream successfully established a connection. |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify the difference between onOpen and onConnected in the documentation? Currently the docs state that "onOpen will only be called if the stream successfully established a connection." That definition of onOpen is what I would expect for onConneted.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the docs. LMK if they clarify.
The naming "onOpen" is somewhat unfortunate, as it does not actually indicate that an opened connection with the backend has been established. I think at one point this was the meaning, but the channel is now optimized to send the "handshake" with the first message to avoid an extra round-trip with the backend. So WebChannelConnection now sends onOpen as soon as it is ready to send a message so that it can send the handshake along with the first message. I added onConnected to send an event when an actual connection with the backend has been established.
| }); |
|---|
| }); |
| it('gets connected event before first message', () => { |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the title of this a little misleading because the test doesn't appear to test for any messages. Or does it and I'm just not seeing how?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The awaitCallback() will fail if some other callback occurs (such as a message). The test doesn't care about if the "message" callback occurs, only that "connected" occurs before it. Does that clarify?
dconeybe deleted the dconeybe/WebChannelOnOpenFix_Bug325591749 branch
One thing, however, that I noticed from your logs is that your app seems to using Firestore before a user is signed in, and then continue to use it after the user is signed in.
@dconeybe beside user data, I also store public settings in Firestore, so I need to setup Firestore even if the user does not sign in.
firebase locked and limited conversation to collaborators