Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement HTTP spec #2438

Merged
merged 72 commits into from
Aug 29, 2023
Merged

feat: Implement HTTP spec #2438

merged 72 commits into from
Aug 29, 2023

Conversation

MarcoPolo
Copy link
Collaborator

@MarcoPolo MarcoPolo commented Jul 20, 2023

A MVP for libp2p HTTP spec libp2p/specs#508.

This adds an Host for doing request-response with HTTP semantics. The libp2phttp.Host contrasts with a "stream host" (basichost or host.Host interface). It is capable of doing HTTP over an HTTP transport and/or a stream transport (provided by the stream host).

What works in this MVP:

  • .well-known/libp2p
  • A helper func to give you the best roundtripper given a list of multiaddrs and an optional stream host.
  • A way of listening for HTTP on stream transports as well as a native HTTP transport.

My thoughts after implementing the spec is that the spec is a nice minimal set of constraints that are relatively easy to implement. Even making a compliant node using the the stock HTTP library is fairly straightforward.

If this looks good after a review, I would suggest merging this as an experimental feature to allow folks to start using it and getting feedback. I may try to integrate this into the IPNI stack to see how it feels.

Note: This branch targets marco/gostream which pulls in /~https://github.com/libp2p/go-libp2p-gostream into the monorepo.


Update Aug 2:
I've removed the libp2p-noise auth from the http spec, and replaced it with this spec focused only on auth: libp2p/specs#564.

I think for the mvp we can merge this without doing any peer id auth, since peer id auth is optional anyways.

@BigLep
Copy link
Contributor

BigLep commented Jul 24, 2023

@MarcoPolo : for the "libp2p and HTTP" effort, what will the testing story be? I'm not saying that all needs to be in this PR, but I'm curious about where we're aiming to end up.

@MarcoPolo
Copy link
Collaborator Author

@MarcoPolo : for the "libp2p and HTTP" effort, what will the testing story be? I'm not saying that all needs to be in this PR, but I'm curious about where we're aiming to end up.

This PR adds basic tests that test end-to-end communication on top of stream transports or an HTTP transport. It tests that a stock HTTP client/server can interoperate. It also adds tests around peer id authentication. It doesn't currently, but will include a test that tests the SNI+TLS flow for peer id authentication, and tests that incorrect auth steps correctly fail (these are noted in before merge list in the PR body).

Once we have multiple implementations of the spec, we will leverage the multidim interop tester to test this as well. It should just work, with no changes needed to the test runner.

Was there a test you had in mind that was missing here? Or something that you feel should be highlighted?

@BigLep
Copy link
Contributor

BigLep commented Jul 25, 2023

Was there a test you had in mind that was missing here? Or something that you feel should be highlighted?

Thanks for explaining. I didn't have anything else in mind. I had missed the tests you described when I took a quick scan earlier. My bad for not catching those.

Concerning multidimen interop running, we'll also need that for testing version compatibility as well right (even if go-libp2p is the only implementation to support this functionality for awhile)?

@MarcoPolo
Copy link
Collaborator Author

I've removed the libp2p-noise auth from the http spec, and replaced it with this spec focused only on auth: libp2p/specs#564. I've also removed auth from this PR.

I think for the mvp we can merge this without doing any peer id auth, since peer id auth is optional anyways.

Copy link
Contributor

@gammazero gammazero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. I plan on using it soon.
Left some trivial suggestions.

p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
p2p/http/ping.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
p2p/http/ping.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
@MarcoPolo
Copy link
Collaborator Author

@gammazero I have a surprise for you: ipni/go-libipni#102

p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
// lets us know if we've recently connected to an HTTP endpoint and might
// have a warm idle connection for it (managed by the underlying HTTP
// roundtripper). In some cases, this lets us reuse our existing custom roundtripper (i.e. SNI != host).
recentHTTPAddrs *lru.Cache[peer.ID, httpAddr]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be keyed by the addr+scheme+sni combo.

@MarcoPolo MarcoPolo mentioned this pull request Aug 11, 2023
21 tasks
Copy link
Contributor

@marten-seemann marten-seemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of my comments come from looking at the GoDoc (godoc -http=:6060), and putting myself into the shoes of a user of this package.

Should we split up libp2phttp.go into two files, one for the client and one for the server?


The following code doesn't work:

host, err := libp2phttp.New(libp2phttp.ListenAddrs([]ma.Multiaddr{ma.StringCast("/ip4/127.0.0.1/tcp/8080/http")}))
if err != nil {
	log.Fatal(err)
}
host.SetHttpHandler("/website", http.FileServer(http.Dir("/tmp")))
log.Fatal(host.Serve())

Haven't debugged it yet. It creates the well-known handler, but http://localhost:8080/website/ returns a 404.

p2p/security/noise/pb/payload.proto Outdated Show resolved Hide resolved
p2p/http/ping.go Outdated Show resolved Hide resolved
p2p/http/ping.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
ServerMustAuthenticatePeerID bool
}

func RoundTripperPreferHTTPTransport(o roundTripperOpts) roundTripperOpts {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing documentation.

p2p/http/options.go Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
Comment on lines 575 to 681
ma.ForEach(addr, func(c ma.Component) bool {
switch c.Protocol().Code {
case ma.P_IP4, ma.P_IP6, ma.P_DNS, ma.P_DNS4, ma.P_DNS6:
out.host = c.Value()
case ma.P_TCP, ma.P_UDP:
out.port = c.Value()
case ma.P_TLS:
out.useHTTPS = true
case ma.P_SNI:
out.sni = c.Value()

}
return out.host == "" || out.port == "" || !out.useHTTPS || out.sni == ""
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't reimplement multiaddr parsing. manet.ToNetAddr should provide half of what you need here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I did originally, but it's not actually what I want here. manet.ToNetAddr resolves the address, which I don't want to do. And it fails if there's an {/http,/tls/http, /https, /tls/sni/.../http} component in the multiaddr, which is expected in these cases.

But I think I agree with the underlying assumption that this should be easier to describe than a foreach and switch (which doesn't even check ordering). Maybe extend multiformats/go-multiaddr-fmt to include doing something like capture groups? Something like this regex:

/(?<host>ip4|ip6|dns|dns4|dns6)/(?<port>tcp|udp)...

p2p/http/libp2phttp.go Show resolved Hide resolved
@MarcoPolo
Copy link
Collaborator Author

Lots of my comments come from looking at the GoDoc (godoc -http=:6060), and putting myself into the shoes of a user of this package.

Nice! Thanks!

The following code doesn't work:

host, err := libp2phttp.New(libp2phttp.ListenAddrs([]ma.Multiaddr{ma.StringCast("/ip4/127.0.0.1/tcp/8080/http")}))
if err != nil {
	log.Fatal(err)
}
host.SetHttpHandler("/website", http.FileServer(http.Dir("/tmp")))
log.Fatal(host.Serve())

When you go to "/website" the fileserver is seeing the full path "/website" and you get a 404 because you don't have a /tmp/website folder. This is the exact same behavior you get with a servemux (m.Handle("/website/", http.FileServer(http.Dir("/tmp")))).

You should strip the prefix of the path:

host.SetHttpHandler("/website", http.StripPrefix("/website", http.FileServer(http.Dir("/tmp"))))

Maybe this should be the default behavior for SetHttpHandler?

@marten-seemann marten-seemann mentioned this pull request Aug 15, 2023
23 tasks
@MarcoPolo
Copy link
Collaborator Author

I've changed the behavior of SetHTTPHandler to strip the prefix for the handler automatically. I can't think of use cases where you wouldn't want that. You can still avoid it by setting a handler on the underlying servemux directly.

p2p/http/ping/ping.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
Copy link
Contributor

@gammazero gammazero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am currently using this here: ipni/go-libipni#113

This looks great and I cannot wait until it gets merged.

p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
p2p/http/libp2phttp.go Show resolved Hide resolved
// ServeInsecureHTTP indicates if the server should serve unencrypted HTTP requests over TCP.
ServeInsecureHTTP bool
// ServeMux is the http.ServeMux used by the server to serve requests
ServeMux http.ServeMux
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not set, http.DefaultServeMux is used? Not sure if that's how it's implemented right now, but that's how http.Server works.

Copy link
Collaborator Author

@MarcoPolo MarcoPolo Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it will not use http.DefaultServeMux. To me, that's a default for the standard library, and I think it would be confusing if we referenced it here. Users would create a new host that automatically exposes things they weren't expecting.

The reason this is public is to allow a user to set a custom servemux, or to manually manage the mapping outside of SetHTTPHandler. I'll expand the comment to mention this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be a pointer if we want to let users set a custom servemux

// - integrate with the conn gater and resource manager

// ProtocolMeta is metadata about a protocol.
type ProtocolMeta struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I like Meta. Alternative: ProtocolConfig?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config implies that this is a configurable, but this is metadata about a peer's supported protocols.

Path string `json:"path"`
}

type PeerMeta map[protocol.ID]ProtocolMeta
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PeerProtocols?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong preference here, but I do prefer that PeerMeta is consistent with ProtocolMeta.

}

// GetPeerMetadata gets a peer's cached protocol metadata from the http host.
func (h *Host) GetPeerMetadata(server peer.ID) (PeerMeta, bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to return the whole map here? Alternative: SupportsProtocol(peer.ID) (path string, bool). It would be nice if we didn't have to make the map an exported type.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason to export the whole type is to give users flexibility in managing the Host's peerMetadata cache without exposing that cache. This is useful if they want to add out of band knowledge of a peer's metadata. This method along with Set and Remove should let users do anything they need to with this metadata cache.

p2p/http/libp2phttp.go Outdated Show resolved Hide resolved
// NewRoundTripper returns an http.RoundTripper that can fulfill and HTTP
// request to the given server. It may use an HTTP transport or a stream based
// transport. It is valid to pass an empty server.ID.
func (h *Host) NewRoundTripper(server peer.AddrInfo, opts ...RoundTripperOption) (http.RoundTripper, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling with this abstraction. An http.RoundTripper is supposed to be able to issue requests to any HTTP host. This is more than just a theoretical concern: The http.Client has a Transport field that takes an http.RoundTripper, and would then be able to follow redirects. If we return a RoundTripper here that's scoped a single host, it won't be possible to redirect from one libp2p+HTTP host to another.

I don't have a good solution for this now, I'll have to think about the correct abstraction more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it won't be possible to redirect from one libp2p+HTTP host to another.

I agree this is an interesting use case we should try to support.

The thing I like about this api is that a user can pass in many multiaddrs and the best transport is chosen for them. There isn't anywhere else we could do this besides when we create a round tripper. But, thinking more about this, this doesn't have to be a property of the core round tripper. We can have a round tripper on top of some core round tripper that can do this logic. It would be great if the core round tripper could, as you say, fulfill any HTTP request regardless of the transport.

Here's one idea how to do this:

  1. Change this function to accept no parameters. It just returns an http.RoundTripper.
  2. This roundtripper can fulfill HTTP requests to the following URIs:
    a. http:// and https://, just like the http.DefaultTransport does.
    b. multiaddr:<peer-multiaddr>. Which would be a new URI scheme for multiaddrs. The value after multiaddr: is simply the string encoded version of the multiaddr. The multiaddr can include a peer's transport information, or could include only the peer id /p2p/<peer-id> and rely on discovery or out-of-band knowledge of a peer's addresses. Note this multiaddr can also be an http transport multiaddr.

This would work but would lead to us having to define the httppath component since we'll need a way to reference the HTTP path we want to use in the multiaddr.

We would probably also want to register the multiaddr uri scheme in IANA (but could defer).

Here's a branch of this idea: /~https://github.com/libp2p/go-libp2p/compare/marco/http...marco/newroundtripper?expand=1. It includes an httppath implementation, but we should standardize that (not use the private area) if we go that route. It also has a test demonstrating the redirect to other host (using multiaddrs).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now let’s remove the NewRoundtripper and only expose the NamespacedRoundtripper and NamespacedClient. We can then add the NewRoundTripper later.

The only thing we loose is redirects across hosts. I thinks that’s fine to omit for this initial version. And we have plan above on how to add it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed this to NewConstrainedRoundTripper, I think that will unblock us here. In a future version we can introduce the NewRoundTripper that accepts a multiaddr uri and can handle redirects across hosts.

Copy link
Contributor

@marten-seemann marten-seemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

@MarcoPolo MarcoPolo changed the base branch from marco/gostream to master August 29, 2023 05:51
@MarcoPolo MarcoPolo merged commit ac038db into master Aug 29, 2023
@MarcoPolo MarcoPolo deleted the marco/http branch August 29, 2023 05:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants