Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rds: make RouteConfigProvider shared among ListenerImpls, or move factory_context_ outside of Listener. #8303

Closed
stevenzzzz opened this issue Sep 20, 2019 · 8 comments · Fixed by #8485
Labels
design proposal Needs design doc/proposal before implementation stale stalebot believes this issue/PR has not been touched recently

Comments

@stevenzzzz
Copy link
Contributor

Title: To resolve issues like #7923, or migrate RDS to config-provider-framework, we need to make sure RdsRouteConfigProvider "Listener independent".

Description:

At present a RdsRouteConfigProviderImpl is bound to a listenerImpl, as it holds a FactoryContext references to the ListenerImpl.

If we want to save memory on dedup the copy of RDS route config providers (e.g., #7923 ), we would need to make sure the referenced FactoryContext outlives a ListenerImpl.

Besides that, a RDS Subscription is held globally (weak_ptr) by the SingletonManager[RdsConfigProviderManager],which could be shared between Listeners, but the RDSSubscription also holds a reference to the factoryContext (in this case, ListenerImpl). If LDS updates a listener A, a live RDSSubscription owned by another listener B may crash Envoy when it tries to read from the destructed FactoryContext(ListenerImpl). VHDS implementation has a TODO that stated this concern:
/~https://github.com/envoyproxy/envoy/blob/master/source/common/router/rds_impl.cc#L113

Proposal:

One way to make the factoryContext referenced by router/... is to pass a Server::CommonFactoryContext (e.g. the Server::Instance ) instead of a Server::FactoryContext (only implemented by ListenerImpl) into the router/...

Right now, most of the APIs desired by objects under router/... are actually from CommonFactoryContext, except for several places that are kinda Listener specific:

  • initManager(), which is used to initiate a Subscription, but as SRDS(also VHDS, see issue#7617) comes into Envoy, we need something like the noop_init_manager in scoped_rds.cc
    We could possibly pass the current ListernerImpl::initManager() along with the CommonFactoryContext(a.k.a the Server::Instance) to router/... As it's only required at Subscription Start time.
  • RouteEntryImplBase::per_filter_configs_
    This is the place where users could insert their own per-filter-per-route config data, a FactoryContext is passed in to translate the opaque protobuf::Struct into C++ RouteSpecificFilterConfig object. In the upstream codebase, this parameter is not used in any translation so far. But there is a risk some users' translation may fail if they depends on some per-listener data during the translation, if we decide to move forward by passing a CommonFactoryContext into router/... instead of the ListenerImpl itself.
@zuercher zuercher added the design proposal Needs design doc/proposal before implementation label Sep 20, 2019
@lambdai
Copy link
Contributor

lambdai commented Sep 20, 2019

Thank you for writing the summary!

WRT InitManager:
It's arguable if the current usage of InitManager is in a good shape. I am seeing

  1. undefined behavior: Adding target to initialized InitManager (@ vhds)
  2. not adding target to another init manager (@ reusing subscription)
    I think we need SharedTarget(see my prototype here), also it's probably not ok to use listener's init manager everywhere.

WRT PerFilterConfigs
It's not allowed to use listener specific config if RDS is shared among listeners. As currently the code base is not utilizing listener specific config we should remove it ASAP to avoid people not aware of the design change.

Below is a list each RDS related FactoryContext usage. It may contain half truth.

vhds subscription
scope() clusterManager() initManager() at ctor

staticrouteconfigproviderimpl
timesource()

RdsRouteConfigSubscription
scope() messageValidationVisitor() clusterManager() timeSource()

RdsRouteConfigProviderImpl
threadLocal()

ConfigImpl
scope()

RouteEntryImplBase
runtime() messageValidationVisitor() api() Dispatcher().Timesource()

WeightedClusterEntry

PerFilterConfig
runtime()

VirtualHostImpl
scope() runtime() cm()

RouteMatcher

@stevenzzzz
Copy link
Contributor Author

stevenzzzz commented Sep 20, 2019 via email

@mattklein123
Copy link
Member

@stevenzzzz @lambdai I like the idea of splitting the context into effectively a server/global context and a listener context. Can we do that first?

@lambdai
Copy link
Contributor

lambdai commented Sep 21, 2019

My current wip on rds dedup is using the impl of ListenerContext but leaving the listener specific stuff unimplemented.
So ugly that I believe no one would approve.

Once I get the PR close to workable, I can propose the granular of ServerContext.
The idea is that server context should be naturally compatible with RDS dedup impl, unless we have a strong reason not to

@stevenzzzz
Copy link
Contributor Author

@alyssawilk @htuch FYI

htuch pushed a commit that referenced this issue Oct 18, 2019
Introduce ServerFactoryContext which is an implementation of CommonFactoryContext. Comparing to ListenerFactoryContext, ServerFactoryContext has a wider lifetime.
This is the prerequisite to share rds configuration among listeners.

Also experimenting move messageValidateVisitor out of CommonFactoryContext. The benefit is that now Server::Instance perfectly implements CommonFactoryContext.

ServerFactoryContext doesn't implement the function. The test shows it is never invoked.
Will put messageValidateVisitor into FactoryContext: this piece of work will be done in follow up PR

This PR is extracted from #8454

Fixes #8303

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai
Copy link
Contributor

lambdai commented Oct 19, 2019

@htuch This design is half done:
#8485 enables #8523
The latter one implements the shared rds config

@stale
Copy link

stale bot commented Nov 18, 2019

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Nov 18, 2019
@stale
Copy link

stale bot commented Nov 25, 2019

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted". Thank you for your contributions.

@stale stale bot closed this as completed Nov 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design proposal Needs design doc/proposal before implementation stale stalebot believes this issue/PR has not been touched recently
Projects
None yet
4 participants