xdsclient: support fallback within each xDS server authority [A71] · Issue #6902 · grpc/grpc-go
Description
To support xDS client fallback, we need a bunch of changes in the `xdsclient.authority`:

1. `xdsclient.authority` needs to support an ordered list of transports
   - Instead of supporting a single transport, an ordered slice of transports will be maintained
   - When transport N is active, all transports N+1 and greater will be shut down
   - When transport N goes down, transport N+1 will be activated
2. Communicate the connectivity state from the `xdsclient/transport.Transport` to the `xdsclient.authority`
   - We recently added functionality to support a generic pub-sub mechanism. This is used by the client channel to report connectivity state changes via an internal-only API.
   - We could switch the `xdsclient/transport.Transport` to use this API and report connectivity state to the `xdsclient.authority`, or we could rely on the ADS stream being closed before receiving the first response from the server as being a signal for connectivity failure, and the subsequent successful receipt of a message from the server as a signal for connectivity success.
   - Either way, the `xdsclient/transport.Transport` needs to make this available to the `xdsclient.authority`
3. `xdsclient.authority` already maintains the state of all registered watches. So, we have the required data to determine if at least one watcher exists for a resource that is not cached.
4. `xdsclient.authority` will have a long-running goroutine that uses the connectivity state of the transports (as mentioned in item 2) and the state of the registered watches (as mentioned in item 3) to initiate fallback to a lower priority server when the higher priority server goes down, and initiate reverting to a higher priority server (and shutting down lower priority servers) when it comes back up
   - When a switch from one transport to the other happens, the LRS stream needs to be started on the new one, if any user of the xDS client had initiated one earlier
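
To make the proposed behaviour concrete, here is a minimal sketch of the fallback and revert decisions as a small state machine. Everything in it is hypothetical: the `transport` and `authority` types, `activate`, `onDown`, `onUp`, and the `lrsWanted` flag are illustrative stand-ins, not the actual grpc-go types or API. It also assumes that fallback is only triggered when at least one watched resource is not cached, which is one reading of how items 3 and 4 combine. In a real implementation, the long-running goroutine would call these handlers serially as connectivity events arrive.

```go
package main

import "fmt"

// transport is a hypothetical stand-in for xdsclient/transport.Transport.
type transport struct {
	server  string
	running bool // created and not yet shut down
	lrs     bool // LRS stream currently started on this transport
}

// authority is a hypothetical, simplified stand-in for xdsclient.authority.
type authority struct {
	transports []*transport // ordered by priority; index 0 is the highest
	active     int          // index of the transport currently in use
	lrsWanted  bool         // a user of the xDS client started LRS earlier
}

func newAuthority(servers ...string) *authority {
	a := &authority{}
	for _, s := range servers {
		a.transports = append(a.transports, &transport{server: s})
	}
	a.activate(0)
	return a
}

// activate makes transport i the active one, starts LRS on it if a user
// asked for it earlier, and shuts down every lower priority transport.
func (a *authority) activate(i int) {
	a.active = i
	t := a.transports[i]
	t.running = true
	t.lrs = a.lrsWanted
	for _, lower := range a.transports[i+1:] {
		lower.running, lower.lrs = false, false
	}
}

// onDown handles a connectivity failure on transport i. Assumption: we only
// fall back when some watched resource is not cached (uncachedWatch).
func (a *authority) onDown(i int, uncachedWatch bool) {
	if i != a.active || !uncachedWatch || i+1 >= len(a.transports) {
		return
	}
	// The failed transport keeps running so it can report recovery later,
	// but LRS moves to the newly activated transport.
	a.transports[i].lrs = false
	a.activate(i + 1)
}

// onUp handles connectivity success on transport i. If i has a higher
// priority than the active transport, revert to it.
func (a *authority) onUp(i int) {
	if i < a.active {
		a.activate(i)
	}
}

func main() {
	a := newAuthority("primary", "fallback-1", "fallback-2")
	a.lrsWanted = true

	a.onDown(0, true)
	fmt.Println(a.transports[a.active].server) // prints "fallback-1"

	a.onUp(0)
	fmt.Println(a.transports[a.active].server) // prints "primary"
}
```

Keeping the decision logic free of I/O like this would also make it straightforward to unit test independently of the ADS stream.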
Other changes that could improve the state of things here:
- Currently, `xdsclient.authority` does not pass a context to its transport. The latter creates a context from the background context to use for the ADS stream. The transport is closed by specifically invoking a `Close` method.
  - With the need to support multiple transports within the same authority, we could have the `xdsclient.authority` pass a context to every transport that it creates. This will help with closing all transports when the authority is being shut down.
  - But, we still need an explicit `Close` method on the transport because we need the ability to shut down individual transports when a higher priority server comes back up.
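
The context idea above could look roughly like the following sketch. The `authority` and `transport` types, `addTransport`, and `stopped` are hypothetical names used only for illustration and do not reflect the real grpc-go code. Each transport derives its own cancellable context from the authority's context, so cancelling the authority stops every transport, while `Close` on a single transport stops only that one.

```go
package main

import (
	"context"
	"fmt"
)

// transport is a hypothetical stand-in for xdsclient/transport.Transport.
type transport struct {
	server string
	cancel context.CancelFunc
	done   chan struct{} // closed when the run goroutine exits
}

// run stands in for the ADS stream loop; here it only waits for cancellation.
func (t *transport) run(ctx context.Context) {
	defer close(t.done)
	<-ctx.Done()
}

// Close shuts down this transport only and waits for its goroutine to exit.
func (t *transport) Close() {
	t.cancel()
	<-t.done
}

// stopped reports whether the transport's goroutine has exited.
func (t *transport) stopped() bool {
	select {
	case <-t.done:
		return true
	default:
		return false
	}
}

// authority is a hypothetical, simplified stand-in for xdsclient.authority.
type authority struct {
	ctx        context.Context
	cancel     context.CancelFunc
	transports []*transport
}

func newAuthority() *authority {
	ctx, cancel := context.WithCancel(context.Background())
	return &authority{ctx: ctx, cancel: cancel}
}

// addTransport creates a transport whose context is derived from the
// authority's context, so it is cancelled when the authority shuts down.
func (a *authority) addTransport(server string) *transport {
	ctx, cancel := context.WithCancel(a.ctx)
	t := &transport{server: server, cancel: cancel, done: make(chan struct{})}
	a.transports = append(a.transports, t)
	go t.run(ctx)
	return t
}

// Close cancels the shared context and waits for every transport to stop.
func (a *authority) Close() {
	a.cancel()
	for _, t := range a.transports {
		<-t.done
	}
}

func main() {
	a := newAuthority()
	primary := a.addTransport("primary")
	fallback := a.addTransport("fallback")

	// The higher priority server came back up: stop only the fallback.
	fallback.Close()
	fmt.Println(fallback.stopped(), primary.stopped()) // prints "true false"

	// Authority shutdown stops everything that is still running.
	a.Close()
	fmt.Println(primary.stopped()) // prints "true"
}
```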