Observed envoy memory while adding|removing ingress #499
Comments
|
@mattalberts thank you for your detailed bug report. I'm sorry that Contour is behaving like that; there is no reason for Contour to consume hundreds of megabytes of memory, let alone gigabytes. Internally at Heptio I've also had a report that matches your symptoms, but from memory it never made it into an issue (/cc @alexbrand, please correct me if I'm wrong). I see that you're using Contour 0.5.0. The tricky bit about investigating this issue is that over the last two weeks almost all of Contour's business logic has been rewritten to support the new CRD we're adding in 0.6. The downside of that is any effort spent investigating what is going on with 0.5 is probably wasted, as the code that may be causing the issue has likely gone to data heaven. I just landed #504, which marks the rewrite complete with respect to the features of Contour 0.5 -- everything that worked in 0.5 is confirmed to work in master now. How would you feel about trying to reproduce the problem with master? I'm going to continue to investigate this issue and have tagged it as a blocker for 0.6 final. |
We saw a memory consumption issue in the contour process, which I was not able to reproduce in 0.6-alpha.1 (#424 (comment)). It seems to me like this is a leak in Envoy though? |
@alexbrand @davecheney Hi! Thanks for writing back. Contour actually appears to be the reasonable container (I also really like the project :P); it's Envoy that appears to be leaky (though I wasn't certain where to file the issue). I'm happy to start eval'ing from master! It's pretty easy to stand up a test environment and re-run my scripts :). |
Thanks for confirming. Even though Contour's container is well behaved, the issue causing Envoy to grow huge might be Contour's fault; something with the way we're communicating with Envoy, perhaps. I'm going to keep this issue open as a high priority, but also reference #443, which we'll do whenever we start on the 0.7 cycle (probably August). |
Update ~ contour:master
Based on interest, I've updated to contour:master and re-run the tests. The results are largely the same, though the slope of the line plotting memory growth is lower (e.g. we still grow into GBs of memory used, but top out at ~32GB rather than ~64GB).
Observation ~ Adding Ingresses
malbook:ingress-system malberts$ ./tools/clone-http-ingress.sh -n 5000 -b 100
REPORT
total=5000
block_size=100
delay enabled=0 duration=0.000000
BLOCK MEMORY DURATION
00000000 38Mi 12.73000000
00000001 38Mi 13.51000000
00000002 38Mi 15.69000000
00000003 727Mi 16.91000000
00000004 727Mi 19.13000000
00000005 727Mi 18.27000000
00000006 1397Mi 19.66000000
00000007 1397Mi 17.98000000
00000008 1397Mi 20.06000000
00000009 2322Mi 20.20000000
00000010 2322Mi 21.42000000
00000011 2322Mi 23.06000000
00000012 3201Mi 26.34000000
00000013 3201Mi 26.48000000
00000014 3962Mi 26.93000000
00000015 3962Mi 31.62000000
00000016 4750Mi 26.29000000
00000017 4750Mi 29.35000000
00000018 5308Mi 31.72000000
00000019 5308Mi 29.80000000
00000020 6104Mi 33.49000000
00000021 6104Mi 33.40000000
00000022 7012Mi 36.17000000
00000023 7745Mi 44.15000000
00000024 8615Mi 48.12000000
00000025 8615Mi 46.05000000
00000026 9419Mi 42.14000000
00000027 10281Mi 41.55000000
00000028 10281Mi 43.36000000
00000029 11109Mi 45.34000000
00000030 11907Mi 47.71000000
00000031 12713Mi 70.68000000
00000032 13480Mi 70.05000000
00000033 15110Mi 76.67000000
00000034 15853Mi 66.57000000
00000035 16718Mi 75.32000000
00000036 17501Mi 77.54000000
00000037 19168Mi 72.29000000
00000038 20075Mi 74.16000000
00000039 20848Mi 82.56000000
00000040 22403Mi 95.71000000
00000041 23176Mi 75.30000000
00000042 23908Mi 76.13000000
00000043 25675Mi 85.38000000
00000044 26451Mi 81.63000000
00000045 27945Mi 83.64000000
00000046 28724Mi 85.74000000
00000047 30249Mi 87.59000000
00000048 31112Mi 90.99000000
00000049 32721Mi 100.62000000
Heap Trace
Again, I've only included a portion of the output to reduce the copy-paste size. At least I was able to get a more legible stack trace (the raw-trace is attached).
|
Thanks for the update.
Just to confirm, the large process is envoy, not contour? Is that correct?
Looking at the stack trace you supplied the underlying issue might be due to contour constantly updating lds which introduces new ssl certificates to the https listener. This might be causing, or exposing an underlying issue in envoy.
… On 7 Jul 2018, at 07:12, Matthew Alberts ***@***.***> wrote:
(pprof) top
Total: 9982.5 MB
Leak of 697222424 bytes in 831016 objects allocated from:
@ 0090f485 unknown
@ 00000000008df612 BUF_memdup ??:0
@ 000000000091b217 CRYPTO_BUFFER_new ??:0
@ 00000000008ba473 bssl::x509_to_buffer ssl_x509.cc:0
@ 00000000008baa39 ssl_use_certificate ssl_x509.cc:0
@ 0000000000688f0c Envoy::Ssl::ContextImpl::ContextImpl /proc/self/cwd/source/common/ssl/context_impl.cc:144
@ 0000000000689e06 Envoy::Ssl::ServerContextImpl::ServerContextImpl /proc/self/cwd/source/common/ssl/context_impl.cc:439
@ 000000000068c6af Envoy::Ssl::ContextManagerImpl::createSslServerContext /proc/self/cwd/source/common/ssl/context_manager_impl.cc:75
@ 000000000056e9bc Envoy::Ssl::ServerSslSocketFactory::ServerSslSocketFactory /proc/self/cwd/source/common/ssl/ssl_socket.cc:375
@ 000000000051f83c Envoy::Server::Configuration::DownstreamSslSocketFactory::createTransportSocketFactory /proc/self/cwd/source/server/config/network/ssl_socket.cc:36
@ 000000000055a02c Envoy::Server::ListenerImpl::ListenerImpl /proc/self/cwd/source/server/listener_manager_impl.cc:203
@ 000000000055b575 Envoy::Server::ListenerManagerImpl::addOrUpdateListener /proc/self/cwd/source/server/listener_manager_impl.cc:330
@ 000000000076b0dd Envoy::Server::LdsApi::onConfigUpdate /proc/self/cwd/source/server/lds_api.cc:59
@ 000000000076cce0 Envoy::Config::GrpcMuxSubscriptionImpl::onConfigUpdate /proc/self/cwd/bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:53
@ 00000000007722b3 Envoy::Config::GrpcMuxImpl::onReceiveMessage /proc/self/cwd/source/common/config/grpc_mux_impl.cc:174
@ 000000000076f622 Envoy::Grpc::TypedAsyncStreamCallbacks::onReceiveMessageUntyped /proc/self/cwd/bazel-out/k8-opt/bin/include/envoy/grpc/_virtual_includes/async_client_interface/envoy/grpc/async_client.h:172
@ 0000000000788c65 Envoy::Grpc::AsyncStreamImpl::onData /proc/self/cwd/source/common/grpc/async_client_impl.cc:131
@ 000000000078d93b Envoy::Http::AsyncStreamImpl::encodeData /proc/self/cwd/source/common/http/async_client_impl.cc:108
@ 00000000006b2a93 Envoy::Http::Http2::ConnectionImpl::onFrameReceived /proc/self/cwd/source/common/http/http2/codec_impl.cc:445
@ 00000000006b58b6 nghttp2_session_on_data_received /tmp/nghttp2.dep.build/nghttp2-1.29.0/lib/nghttp2_session.c:4881
@ 00000000006b94e1 nghttp2_session_mem_recv /tmp/nghttp2.dep.build/nghttp2-1.29.0/lib/nghttp2_session.c:6443
@ 00000000006b1aee Envoy::Http::Http2::ConnectionImpl::dispatch /proc/self/cwd/source/common/http/http2/codec_impl.cc:302
@ 000000000066634e Envoy::Http::CodecClient::onData /proc/self/cwd/source/common/http/codec_client.cc:115
@ 00000000006664cc Envoy::Http::CodecClient::CodecReadFilter::onData /proc/self/cwd/bazel-out/k8-opt/bin/source/common/http/_virtual_includes/codec_client_lib/common/http/codec_client.h:159
@ 000000000056dc66 Envoy::Network::FilterManagerImpl::onContinueReading /proc/self/cwd/source/common/network/filter_manager_impl.cc:56
@ 000000000056c5ce Envoy::Network::ConnectionImpl::onReadReady /proc/self/cwd/source/common/network/connection_impl.cc:443
@ 000000000056cded Envoy::Network::ConnectionImpl::onFileEvent /proc/self/cwd/source/common/network/connection_impl.cc:419
@ 0000000000566307 _FUN /proc/self/cwd/source/common/event/file_event_impl.cc:61
@ 00000000008a1d11 event_process_active_single_queue.isra.29 /tmp/libevent.dep.build/libevent-2.1.8-stable/event.c:1639
@ 00000000008a246e event_base_loop /tmp/libevent.dep.build/libevent-2.1.8-stable/event.c:1961
@ 000000000054dcdd Envoy::Server::InstanceImpl::run /proc/self/cwd/source/server/server.cc:356
@ 0000000000464850 Envoy::MainCommonBase::run /proc/self/cwd/source/exe/main_common.cc:83
Leak of 498609600 bytes in 831016 objects allocated from:
@ 0090f485 unknown
@ 00000000008ae16c SSL_CTX_new ??:0
@ 00000000006888ae Envoy::Ssl::ContextImpl::ContextImpl /proc/self/cwd/source/common/ssl/context_impl.cc:34
@ 0000000000689e06 Envoy::Ssl::ServerContextImpl::ServerContextImpl /proc/self/cwd/source/common/ssl/context_impl.cc:439
@ 000000000068c6af Envoy::Ssl::ContextManagerImpl::createSslServerContext /proc/self/cwd/source/common/ssl/context_manager_impl.cc:75
@ 000000000056e9bc Envoy::Ssl::ServerSslSocketFactory::ServerSslSocketFactory /proc/self/cwd/source/common/ssl/ssl_socket.cc:375
@ 000000000051f83c Envoy::Server::Configuration::DownstreamSslSocketFactory::createTransportSocketFactory /proc/self/cwd/source/server/config/network/ssl_socket.cc:36
@ 000000000055a02c Envoy::Server::ListenerImpl::ListenerImpl /proc/self/cwd/source/server/listener_manager_impl.cc:203
@ 000000000055b575 Envoy::Server::ListenerManagerImpl::addOrUpdateListener /proc/self/cwd/source/server/listener_manager_impl.cc:330
@ 000000000076b0dd Envoy::Server::LdsApi::onConfigUpdate /proc/self/cwd/source/server/lds_api.cc:59
@ 000000000076cce0 Envoy::Config::GrpcMuxSubscriptionImpl::onConfigUpdate /proc/self/cwd/bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:53
@ 00000000007722b3 Envoy::Config::GrpcMuxImpl::onReceiveMessage /proc/self/cwd/source/common/config/grpc_mux_impl.cc:174
@ 000000000076f622 Envoy::Grpc::TypedAsyncStreamCallbacks::onReceiveMessageUntyped /proc/self/cwd/bazel-out/k8-opt/bin/include/envoy/grpc/_virtual_includes/async_client_interface/envoy/grpc/async_client.h:172
@ 0000000000788c65 Envoy::Grpc::AsyncStreamImpl::onData /proc/self/cwd/source/common/grpc/async_client_impl.cc:131
@ 000000000078d93b Envoy::Http::AsyncStreamImpl::encodeData /proc/self/cwd/source/common/http/async_client_impl.cc:108
@ 00000000006b2a93 Envoy::Http::Http2::ConnectionImpl::onFrameReceived /proc/self/cwd/source/common/http/http2/codec_impl.cc:445
@ 00000000006b58b6 nghttp2_session_on_data_received /tmp/nghttp2.dep.build/nghttp2-1.29.0/lib/nghttp2_session.c:4881
@ 00000000006b94e1 nghttp2_session_mem_recv /tmp/nghttp2.dep.build/nghttp2-1.29.0/lib/nghttp2_session.c:6443
@ 00000000006b1aee Envoy::Http::Http2::ConnectionImpl::dispatch /proc/self/cwd/source/common/http/http2/codec_impl.cc:302
@ 000000000066634e Envoy::Http::CodecClient::onData /proc/self/cwd/source/common/http/codec_client.cc:115
@ 00000000006664cc Envoy::Http::CodecClient::CodecReadFilter::onData /proc/self/cwd/bazel-out/k8-opt/bin/source/common/http/_virtual_includes/codec_client_lib/common/http/codec_client.h:159
@ 000000000056dc66 Envoy::Network::FilterManagerImpl::onContinueReading /proc/self/cwd/source/common/network/filter_manager_impl.cc:56
@ 000000000056c5ce Envoy::Network::ConnectionImpl::onReadReady /proc/self/cwd/source/common/network/connection_impl.cc:443
@ 000000000056cded Envoy::Network::ConnectionImpl::onFileEvent /proc/self/cwd/source/common/network/connection_impl.cc:419
@ 0000000000566307 _FUN /proc/self/cwd/source/common/event/file_event_impl.cc:61
@ 00000000008a1d11 event_process_active_single_queue.isra.29 /tmp/libevent.dep.build/libevent-2.1.8-stable/event.c:1639
@ 00000000008a246e event_base_loop /tmp/libevent.dep.build/libevent-2.1.8-stable/event.c:1961
@ 000000000054dcdd Envoy::Server::InstanceImpl::run /proc/self/cwd/source/server/server.cc:356
@ 0000000000464850 Envoy::MainCommonBase::run /proc/self/cwd/source/exe/main_common.cc:83
@ 00000000004156c8 main /proc/self/cwd/source/exe/main.cc:30
@ 00007f114a497b8d unknown
Leak of 464537944 bytes in 831016 objects allocated from:
@ 0090f485 unknown
@ 00000000008d7047 asn1_enc_save ??:0
@ 00000000008d4ebd ASN1_item_ex_d2i ??:0
@ 00000000008d519c asn1_template_noexp_d2i tasn_dec.c:0
@ 00000000008d53db asn1_template_ex_d2i tasn_dec.c:0
@ 00000000008d4c3b ASN1_item_ex_d2i ??:0
@ 00000000008d4f6a ASN1_item_d2i ??:0
@ 000000000090f3c8 d2i_X509_AUX ??:0
@ 0000000000913487 PEM_ASN1_read_bio ??:0
@ 0000000000688edc Envoy::Ssl::ContextImpl::ContextImpl /proc/self/cwd/source/common/ssl/context_impl.cc:143
@ 0000000000689e06 Envoy::Ssl::ServerContextImpl::ServerContextImpl /proc/self/cwd/source/common/ssl/context_impl.cc:439
@ 000000000068c6af Envoy::Ssl::ContextManagerImpl::createSslServerContext /proc/self/cwd/source/common/ssl/context_manager_impl.cc:75
@ 000000000056e9bc Envoy::Ssl::ServerSslSocketFactory::ServerSslSocketFactory /proc/self/cwd/source/common/ssl/ssl_socket.cc:375
@ 000000000051f83c Envoy::Server::Configuration::DownstreamSslSocketFactory::createTransportSocketFactory /proc/self/cwd/source/server/config/network/ssl_socket.cc:36
@ 000000000055a02c Envoy::Server::ListenerImpl::ListenerImpl /proc/self/cwd/source/server/listener_manager_impl.cc:203
@ 000000000055b575 Envoy::Server::ListenerManagerImpl::addOrUpdateListener /proc/self/cwd/source/server/listener_manager_impl.cc:330
@ 000000000076b0dd Envoy::Server::LdsApi::onConfigUpdate /proc/self/cwd/source/server/lds_api.cc:59
@ 000000000076cce0 Envoy::Config::GrpcMuxSubscriptionImpl::onConfigUpdate /proc/self/cwd/bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:53
@ 00000000007722b3 Envoy::Config::GrpcMuxImpl::onReceiveMessage /proc/self/cwd/source/common/config/grpc_mux_impl.cc:174
@ 000000000076f622 Envoy::Grpc::TypedAsyncStreamCallbacks::onReceiveMessageUntyped /proc/self/cwd/bazel-out/k8-opt/bin/include/envoy/grpc/_virtual_includes/async_client_interface/envoy/grpc/async_client.h:172
@ 0000000000788c65 Envoy::Grpc::AsyncStreamImpl::onData /proc/self/cwd/source/common/grpc/async_client_impl.cc:131
@ 000000000078d93b Envoy::Http::AsyncStreamImpl::encodeData /proc/self/cwd/source/common/http/async_client_impl.cc:108
@ 00000000006b2a93 Envoy::Http::Http2::ConnectionImpl::onFrameReceived /proc/self/cwd/source/common/http/http2/codec_impl.cc:445
@ 00000000006b58b6 nghttp2_session_on_data_received /tmp/nghttp2.dep.build/nghttp2-1.29.0/lib/nghttp2_session.c:4881
@ 00000000006b94e1 nghttp2_session_mem_recv /tmp/nghttp2.dep.build/nghttp2-1.29.0/lib/nghttp2_session.c:6443
@ 00000000006b1aee Envoy::Http::Http2::ConnectionImpl::dispatch /proc/self/cwd/source/common/http/http2/codec_impl.cc:302
@ 000000000066634e Envoy::Http::CodecClient::onData /proc/self/cwd/source/common/http/codec_client.cc:115
@ 00000000006664cc Envoy::Http::CodecClient::CodecReadFilter::onData /proc/self/cwd/bazel-out/k8-opt/bin/source/common/http/_virtual_includes/codec_client_lib/common/http/codec_client.h:159
@ 000000000056dc66 Envoy::Network::FilterManagerImpl::onContinueReading /proc/self/cwd/source/common/network/filter_manager_impl.cc:56
@ 000000000056c5ce Envoy::Network::ConnectionImpl::onReadReady /proc/self/cwd/source/common/network/connection_impl.cc:443
@ 000000000056cded Envoy::Network::ConnectionImpl::onFileEvent /proc/self/cwd/source/common/network/connection_impl.cc:419
@ 0000000000566307 _FUN /proc/self/cwd/source/common/event/file_event_impl.cc:61
Leak of 352350784 bytes in 831016 objects allocated from:
@ 0068c684 unknown
@ 000000000056e9bc Envoy::Ssl::ServerSslSocketFactory::ServerSslSocketFactory /proc/self/cwd/source/common/ssl/ssl_socket.cc:375
@ 000000000051f83c Envoy::Server::Configuration::DownstreamSslSocketFactory::createTransportSocketFactory /proc/self/cwd/source/server/config/network/ssl_socket.cc:36
@ 000000000055a02c Envoy::Server::ListenerImpl::ListenerImpl /proc/self/cwd/source/server/listener_manager_impl.cc:203
@ 000000000055b575 Envoy::Server::ListenerManagerImpl::addOrUpdateListener /proc/self/cwd/source/server/listener_manager_impl.cc:330
@ 000000000076b0dd Envoy::Server::LdsApi::onConfigUpdate /proc/self/cwd/source/server/lds_api.cc:59
@ 000000000076cce0 Envoy::Config::GrpcMuxSubscriptionImpl::onConfigUpdate /proc/self/cwd/bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:53
@ 00000000007722b3 Envoy::Config::GrpcMuxImpl::onReceiveMessage /proc/self/cwd/source/common/config/grpc_mux_impl.cc:174
@ 000000000076f622 Envoy::Grpc::TypedAsyncStreamCallbacks::onReceiveMessageUntyped /proc/self/cwd/bazel-out/k8-opt/bin/include/envoy/grpc/_virtual_includes/async_client_interface/envoy/grpc/async_client.h:172
@ 0000000000788c65 Envoy::Grpc::AsyncStreamImpl::onData /proc/self/cwd/source/common/grpc/async_client_impl.cc:131
@ 000000000078d93b Envoy::Http::AsyncStreamImpl::encodeData /proc/self/cwd/source/common/http/async_client_impl.cc:108
@ 00000000006b2a93 Envoy::Http::Http2::ConnectionImpl::onFrameReceived /proc/self/cwd/source/common/http/http2/codec_impl.cc:445
@ 00000000006b58b6 nghttp2_session_on_data_received /tmp/nghttp2.dep.build/nghttp2-1.29.0/lib/nghttp2_session.c:4881
@ 00000000006b94e1 nghttp2_session_mem_recv /tmp/nghttp2.dep.build/nghttp2-1.29.0/lib/nghttp2_session.c:6443
@ 00000000006b1aee Envoy::Http::Http2::ConnectionImpl::dispatch /proc/self/cwd/source/common/http/http2/codec_impl.cc:302
@ 000000000066634e Envoy::Http::CodecClient::onData /proc/self/cwd/source/common/http/codec_client.cc:115
@ 00000000006664cc Envoy::Http::CodecClient::CodecReadFilter::onData /proc/self/cwd/bazel-out/k8-opt/bin/source/common/http/_virtual_includes/codec_client_lib/common/http/codec_client.h:159
@ 000000000056dc66 Envoy::Network::FilterManagerImpl::onContinueReading /proc/self/cwd/source/common/network/filter_manager_impl.cc:56
@ 000000000056c5ce Envoy::Network::ConnectionImpl::onReadReady /proc/self/cwd/source/common/network/connection_impl.cc:443
@ 000000000056cded Envoy::Network::ConnectionImpl::onFileEvent /proc/self/cwd/source/common/network/connection_impl.cc:419
@ 0000000000566307 _FUN /proc/self/cwd/source/common/event/file_event_impl.cc:61
@ 00000000008a1d11 event_process_active_single_queue.isra.29 /tmp/libevent.dep.build/libevent-2.1.8-stable/event.c:1639
@ 00000000008a246e event_base_loop /tmp/libevent.dep.build/libevent-2.1.8-stable/event.c:1961
@ 000000000054dcdd Envoy::Server::InstanceImpl::run /proc/self/cwd/source/server/server.cc:356
@ 0000000000464850 Envoy::MainCommonBase::run /proc/self/cwd/source/exe/main_common.cc:83
@ 00000000004156c8 main /proc/self/cwd/source/exe/main.cc:30
@ 00007f114a497b8d unknown
8605.2 86.2% 86.2% 8605.2 86.2% OPENSSL_malloc
821.0 8.2% 94.4% 821.0 8.2% OPENSSL_realloc
336.0 3.4% 97.8% 9843.3 98.6% Envoy::Ssl::ContextManagerImpl::createSslServerContext
53.5 0.5% 98.3% 53.5 0.5% __gnu_cxx::new_allocator::allocate (inline)
46.3 0.5% 98.8% 46.3 0.5% std::__cxx11::basic_string::_M_construct
19.0 0.2% 99.0% 20.0 0.2% std::_List_node::_List_node (inline)
13.2 0.1% 99.1% 13.2 0.1% std::__cxx11::basic_string::reserve
12.8 0.1% 99.2% 12.8 0.1% std::__cxx11::basic_string::_M_mutate
12.7 0.1% 99.4% 9856.9 98.7% std::make_unique (inline)
9.5 0.1% 99.5% 9.5 0.1% std::__fill_a (inline)
envoy.hprof.0220.heap.txt
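One way to read the leak records above is to divide each byte total by its object count. The sketch below does only that arithmetic on the four "Leak of" lines copied from the trace; the helper name `summarize_leaks` is made up for illustration. All four records report the same object count, which would be consistent with one allocation of each kind per SSL server context created, though the trace alone does not prove that.

```python
import re

# The four "Leak of" summary lines copied from the pprof output above.
LEAK_LINES = """\
Leak of 697222424 bytes in 831016 objects allocated from:
Leak of 498609600 bytes in 831016 objects allocated from:
Leak of 464537944 bytes in 831016 objects allocated from:
Leak of 352350784 bytes in 831016 objects allocated from:
"""

LEAK_RE = re.compile(r"Leak of (\d+) bytes in (\d+) objects")


def summarize_leaks(text):
    """Return (total_bytes, object_count, bytes_per_object) for each leak line."""
    rows = []
    for match in LEAK_RE.finditer(text):
        total, count = int(match.group(1)), int(match.group(2))
        rows.append((total, count, total / count))
    return rows


leaks = summarize_leaks(LEAK_LINES)
for total, count, per_object in leaks:
    # Report each leak in MiB alongside the average size of one leaked object.
    print(f"{total / 2**20:8.1f} MiB  {count} objects  {per_object:.0f} B/object")
```

The per-object sizes are small (hundreds of bytes), so the volume comes from the number of objects rather than from any single large allocation.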
|
@davecheney Correct. The memory-hungry container is Envoy. Contour remains within a reasonable range across ingress insertion. Here, I've grabbed a screenshot of each container (in isolation) over the same time window. I've annotated the images to separate the different stages.
[Screenshots: Contour Memory, Envoy Memory] |
@mattalberts can you share your script? Do all your Ingress objects share the same TLS key? FWIW, I also hacked something similar together: https://github.com/rosskukulinski/contour-envoy-memory-batch. Leaving this in the 0.6.0 milestone for now, but we may also want to see how things change with Envoy 1.7. |
I sure can! It's a little rough, especially right here.
MEM=$(kubectl -n ingress-system top pods -l'app=contour-ingress,component=debug' | awk 'NR==2{print $3}')
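For anyone who wants the same sampling step without awk, here is a rough Python equivalent of the `awk 'NR==2{print $3}'` part: it takes the third whitespace-separated field of the second output line. The sample text is hypothetical (the pod name and CPU value are made up), and the function names are illustrative, not part of the original script.

```python
def memory_from_top(output):
    """Mimic awk 'NR==2{print $3}': third field of the second line, or None."""
    lines = output.splitlines()
    if len(lines) < 2:
        return None
    fields = lines[1].split()
    return fields[2] if len(fields) >= 3 else None


def mebibytes(quantity):
    """Convert a quantity such as '727Mi' to an integer number of MiB."""
    if not quantity.endswith("Mi"):
        raise ValueError(f"unexpected unit in {quantity!r}")
    return int(quantity[:-2])


# Hypothetical sample of `kubectl top pods` output; the row is invented.
SAMPLE = """\
NAME                     CPU(cores)   MEMORY(bytes)
contour-debug-example    12m          727Mi
"""

print(mebibytes(memory_from_top(SAMPLE)))
```

Like the awk version, this only reads the first pod row, so it assumes the label selector matches a single pod.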
|
@rosskukulinski I have this script too (a derivation of the first script) that I used for log rate testing.
|
Bumping to 0.8. This issue is important, but can't be p0 this late in the cycle. |
Updates projectcontour#499 Updates projectcontour#273 Updates projectcontour#1176 The XDS spec says that Envoy will always initiate a stream with a discovery request, and expects the management server to respond with only one discovery response. After that, Envoy will initiate another discovery request containing an ACK or a NACK from the previous response. Currently Contour ignores the ACK/NACK, this is projectcontour#1176, however after inspection of the current code it is evident that we're also not waiting for Envoy to send the next discovery request. This PR removes the inner `for {}` loop that would continue to reuse the initial discovery request until the client disconnected. The previous code was written in a time when we'd just implemented filtering and it was possible for the filter to return no results, hence the inner loop was--incorrectly--trying to loop until there was a result to return. Huge thanks to @lrouquette who pointed this out. Signed-off-by: Dave Cheney <dave@cheney.net>
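The difference described in this commit message can be illustrated with a toy model. This is not Contour's Go code; the function names, request labels, and version strings are all invented for the sketch. The first function keeps answering the initial discovery request every time the configuration changes, while the second sends one response per request and then waits for the next request (the ACK or NACK) before replying again.

```python
from collections import deque


def serve_reusing_first_request(requests, versions):
    """Toy model of the old loop: every new version is sent against the first request."""
    first = requests.popleft()
    return [(first, version) for version in versions]


def serve_one_response_per_request(requests, versions):
    """Toy model of the fix: each response consumes exactly one pending request."""
    responses = []
    for version in versions:
        if not requests:
            break  # nothing to answer until the client sends another request
        responses.append((requests.popleft(), version))
    return responses


# Three configuration versions become available, but the client has only
# sent its initial request and one ACK so far.
versions = ["v1", "v2", "v3"]
old = serve_reusing_first_request(deque(["initial", "ack-v1"]), versions)
new = serve_one_response_per_request(deque(["initial", "ack-v1"]), versions)
print(len(old), len(new))
```

In the second model the third version is held back until the client acknowledges the second, which is the pacing the xDS description in the commit message calls for.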
We've continued to work on #499 throughout 0.14 by reducing the number of spurious updates sent to Envoy. Moving to 0.15 as work continues. |
In 0.15 we added filtering for unrelated secrets and services. Moving to the next milestone as there is no more work scheduled for this release. |
Moving to the 1.0 release milestone as there is work scheduled for the release candidates. |
Fixes projectcontour#1425 Fixes projectcontour#1385 Updates projectcontour#499 This PR threads the leader elected signal through to contour.EventHandler allowing it to skip writing status back to the API unless it is currently the leader. This should fix projectcontour#1425 by removing the condition where several Contours would fight to update status. This updates projectcontour#499 by continuing to reduce the number of updates that Contour generates, thereby processes. This PR does create a condition where during startup no Contour may be the leader and the xDS tables reach steady state before anyone is elected. This would mean the status of an object would be stale until the next update from the API server after leadership was established. To address this a mechanism to force a rebuild of the dag is added to the EventHandler and wired to election success. Signed-off-by: Dave Cheney <dave@cheney.net>
Hello, TL;DR upgrade to Contour 1.2.0 or later and follow the recommendation to use Envoy 1.13.0 or later. After some investigation of an internally reported issue I am pleased to say this issue can be brought to a close. The root cause of the issue was envoyproxy/envoy#7923 which caused envoy to keep N squared copies of the RDS database in memory for each LDS update. This meant that as the number of vhosts defined across Ingress/IngressRoute/HTTPProxy documents that used TLS grew, this would consume N*N memory on the envoy side for each configuration update. Said another way, the memory consumed by Envoy for each configuration was quadratic, not linear. This issue was resolved upstream in envoyproxy/envoy#9209 and shipped as part of Envoy 1.13.0. I am marking this issue as complete against the 1.2.0 milestone. The remaining work to reduce the cost of LDS updates is tracked on #1039 which at the time of writing remains blocked on upstream support for FDS. |
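To make the "quadratic, not linear" point concrete, the sketch below compares the two growth rates for a few vhost counts. It is only arithmetic: the unit cost of 1 and the function names are assumptions for illustration and do not correspond to any measured per-vhost size in Envoy.

```python
def linear_cost(vhosts, unit=1):
    """Memory that grows in proportion to the number of vhosts."""
    return vhosts * unit


def quadratic_cost(vhosts, unit=1):
    """Memory that grows with the square of the number of vhosts (N*N)."""
    return vhosts * vhosts * unit


for n in (100, 1000, 5000):
    # Columns: vhost count, linear cost, quadratic cost (arbitrary units).
    print(n, linear_cost(n), quadratic_cost(n))
```

Going from 100 to 1000 vhosts multiplies the linear figure by 10 and the quadratic one by 100, which is why the problem only became obvious at a few thousand ingress definitions.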
great work team to bring this to a close! |
Fantastic!
|
Signed-off-by: Steve Kriss <krisss@vmware.com>
Envoy Memory Investigation
Odd container memory observations do not affect the contour container. As seen in the graphs, contour memory changes are dwarfed by envoy memory changes. I wanted to start the question/discussion with the contour project to see if similar observations have been seen.
I found two possibly related issues in the envoy project
The likelihood of a relation is low, but I've included links to the issues for the sake of completeness.
I have yet to find the root cause, but it feels related to config handling (either a leak or a purposeful check-point while merging changes).
@davecheney
Summary
The goal is to observe container memory while adding and removing ingress definitions.
Spoilers
Launch Context
To help eliminate possible causes for memory growth, both hot restart and envoy metrics scraping have been disabled.
Observation ~ Adding Ingresses
Let's observe container memory while adding ingress definitions.
Though Envoy-specific metrics are currently unavailable (the scrape is disabled), container memory metrics were mirrored by Envoy heap allocation prior to disabling.
The tool clone-http-ingress.sh is a bash script that adds ingress definitions in blocks. The ingress definition is generated from this template.
The command below will add a total of 5000 definitions in blocks of 100.
>$ ./clone-http-ingress.sh -b 100 -n 5000
REPORT
total=5000
block_size=100
delay enabled=0 delay=0.000000
BLOCK MEMORY DURATION
00000000 91Mi 17.20000000
00000001 91Mi 15.18000000
00000002 91Mi 19.02000000
00000003 1628Mi 21.68000000
00000004 1628Mi 19.76000000
00000005 1628Mi 20.33000000
00000006 3550Mi 19.08000000
00000007 3550Mi 23.22000000
00000008 3550Mi 24.30000000
00000009 6123Mi 25.92000000
00000010 6123Mi 24.22000000
00000011 8075Mi 28.61000000
00000012 8075Mi 24.51000000
00000013 10248Mi 24.16000000
00000014 10248Mi 30.01000000
00000015 12601Mi 27.30000000
00000016 12601Mi 27.13000000
00000017 12601Mi 29.85000000
00000018 14741Mi 31.88000000
00000019 16793Mi 32.22000000
00000020 16793Mi 34.20000000
00000021 18388Mi 34.93000000
00000022 18388Mi 38.84000000
00000023 20918Mi 40.79000000
00000024 22845Mi 41.11000000
00000025 22845Mi 40.71000000
00000026 24758Mi 43.63000000
00000027 27056Mi 45.06000000
00000028 28994Mi 50.16000000
00000029 30965Mi 46.90000000
00000030 30965Mi 47.86000000
00000031 33168Mi 47.39000000
00000032 35386Mi 48.52000000
00000033 37886Mi 51.76000000
00000034 39765Mi 58.24000000
00000035 42228Mi 56.12000000
00000036 44600Mi 51.11000000
00000037 46879Mi 54.84000000
00000038 46879Mi 53.24000000
00000039 49202Mi 51.93000000
00000040 51236Mi 55.18000000
00000041 53300Mi 53.77000000
00000042 54895Mi 55.32000000
00000043 56731Mi 57.56000000
00000044 58954Mi 58.81000000
00000045 60353Mi 61.28000000
00000046 62116Mi 57.53000000
00000047 63968Mi 61.20000000
00000048 65922Mi 61.22000000
00000049 67487Mi 61.11000000
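A quick way to summarize a report like this is to compute the average memory growth per ingress definition between two rows. The sketch below parses rows in the BLOCK MEMORY DURATION layout and does that calculation; only the first and last rows of the report are embedded, and the function names are illustrative rather than part of clone-http-ingress.sh.

```python
# First and last rows of the report above (remaining rows omitted for brevity).
REPORT = """\
BLOCK MEMORY DURATION
00000000 91Mi 17.20000000
00000049 67487Mi 61.11000000
"""


def parse_report(text):
    """Return (block, memory_mib, duration_seconds) for each data row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0].isdigit() and parts[1].endswith("Mi"):
            rows.append((int(parts[0]), int(parts[1][:-2]), float(parts[2])))
    return rows


def growth_per_ingress(rows, block_size):
    """Average MiB gained per ingress definition between the first and last row."""
    (first_block, first_mib, _), (last_block, last_mib, _) = rows[0], rows[-1]
    added = (last_block - first_block) * block_size
    return (last_mib - first_mib) / added


rows = parse_report(REPORT)
rate = growth_per_ingress(rows, block_size=100)
print(f"{rate:.2f} MiB per ingress definition")
```

This is an average over the whole run; because memory is sampled once per block and often repeats across adjacent blocks, the per-block deltas are much noisier than the overall rate.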
Observation ~ Deleting Ingresses
Continuing from the conditions above, let's observe memory while cleaning up the cluster.
Heap Trace
Selecting a heap trace from the middle of insertion, there does appear to be a fairly large leak. I've included the top set of leaks (not all), to avoid a huge copy-paste.