Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: query: TLS handshake error #6230

Closed
crystalgreen opened this issue Nov 20, 2024 · 5 comments · Fixed by #6239 or #6241
Closed

[Bug]: query: TLS handshake error #6230

crystalgreen opened this issue Nov 20, 2024 · 5 comments · Fixed by #6239 or #6241
Labels

Comments

@crystalgreen
Copy link

What happened?

Using v1.63.0 on Windows.
When loading https://localhost:16686/ in the browser
I get the error on the console where I run jaeger-all-in-one.exe

http: TLS handshake error from [::1]:55388: tls: first record does not look like a TLS handshake

I run with args:

--query.additional-headers="Access-Control-Allow-Origin: " --collector.otlp.enabled=true --collector.otlp.http.cors.allowed-origins="" --collector.otlp.http.cors.allowed-headers="*" --collector.otlp.http.host-port=4318 --collector.otlp.http.tls.key=%TLS%\key.pem --collector.otlp.http.tls.cert=%TLS%\cert.pem --collector.otlp.http.tls.enabled=true --query.http.tls.enabled=true --query.http.tls.key=%TLS%\key.pem --query.http.tls.cert=%TLS%\cert.pem

This works with v1.62.0.

Maybe this is related to PR #6023
I don't see an update to the CLI doc.

Steps to reproduce

  1. Run jaeger-all-in-one.exe on windows with args similar to above
  2. Open UI at default port in browser
  3. See described error in jaeger console

Expected behavior

It should work as before - the UI should load but it remains empty.

Relevant log output

http: TLS handshake error from [::1]:55388: tls: first record does not look like a TLS handshake

Screenshot

No response

Additional context

No response

Jaeger backend version

v1.63.0

SDK

No response

Pipeline

No response

Stogage backend

none (all-in-one)

Operating system

Windows

Deployment model

No response

Deployment configs

@yurishkuro
Copy link
Member

@mahadzaryab1 can you take a look?

@yurishkuro
Copy link
Member

yurishkuro commented Nov 22, 2024

@mahadzaryab1 I am able to reproduce, so it looks like it might be related to introducing OTEL helpers:

go run ./cmd/all-in-one \
    --query.http.tls.enabled=true \
    --query.http.tls.cert=./pkg/config/tlscfg/testdata/example-server-cert.pem \
    --query.http.tls.key=./pkg/config/tlscfg/testdata/example-server-key.pem

after the request the server logs a bunch of errors:

2024/11/21 23:27:00 http: TLS handshake error from [::1]:62373: client sent an HTTP request to an HTTPS server
2024/11/21 23:27:04 http: TLS handshake error from [::1]:62374: EOF
2024/11/21 23:27:05 http: TLS handshake error from [::1]:62375: remote error: tls: unknown certificate
2024/11/21 23:27:05 http: TLS handshake error from [::1]:62376: remote error: tls: unknown certificate
2024/11/21 23:27:13 http: TLS handshake error from [::1]:62377: remote error: tls: unknown certificate
2024/11/21 23:27:13 http: TLS handshake error from [::1]:62378: remote error: tls: unknown certificate
2024/11/21 23:27:13 http: TLS handshake error from [::1]:62379: tls: first record does not look like a TLS handshake
$ curl -v -k https://localhost:16686/
...
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://localhost:16686/
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: localhost:16686]
* [HTTP/2] [1] [:path: /]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
> GET / HTTP/2
> Host: localhost:16686
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
* Closing connection
* Recv failure: Connection reset by peer
* Send failure: Broken pipe
curl: (16) Recv failure: Connection reset by peer

@yurishkuro
Copy link
Member

yurishkuro commented Nov 22, 2024

@mahadzaryab1 I think this is the mistake:

if s.queryOptions.HTTP.TLSSetting != nil {
err = s.httpServer.ServeTLS(s.httpConn, "", "")
} else {
err = s.httpServer.Serve(s.httpConn)
}

The listener is already upgraded to TLS by ToListener() function. If I replace it with only s.httpServer.Serve(s.httpConn) , then the connection works, but ONLY with http/1.1:

$ curl -k --http1.1 https://localhost:16686/ > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4454    0  4454    0     0   893k      0 --:--:-- --:--:-- --:--:-- 1087k

So there's still something funky going on, as by default curl tries to use http/2 and the server doesn't like it. Meanwhile, OTEL's test do appear to be testing both 2 and 1 /~https://github.com/open-telemetry/opentelemetry-collector/blob/00ad49af88794a8ad03d9994d290d1dea410ede2/config/confighttp/confighttp_test.go#L681

@mahadzaryab1
Copy link
Collaborator

@yurishkuro Thanks for the reproduction steps. I've got a patch at #6239. I'm still unsure as to why the unit tests weren't able to catch this and will see if I can come up with a regression test for this.

@yurishkuro
Copy link
Member

Let's keep this open for a bit until we find out why the tests did not catch this.

One suspicious thing I see is here

if test.HTTPTLSEnabled && test.TLS.ClientCAFile != "" {

The HTTP/TLS tests are only run when ClientCA is present. If I remove that 2nd clause, the tests still pass - but both before and after your fix, so they are still not catching the issue. But maybe we have other similar filters.

It's also not clear to me why we are doing manual dial outs in this block

if serverOptions.HTTP.TLSSetting != nil {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants