Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REST: AuthManager API #10753

Closed
wants to merge 17 commits into from
Closed

REST: AuthManager API #10753

wants to merge 17 commits into from

Conversation

adutra
Copy link
Contributor

@adutra adutra commented Jul 23, 2024

As discussed in the sync meeting, here is a PR that builds on top of @jackye1995 original work here: #10621. I kept his original commit intact for reference.

The main components introduced in this PR are:

  • AuthManager: manages authentication data for an owning catalog or other service.
  • AuthSession: represents a session of authentication data (typically, headers), which can be used to authenticate outgoing requests.

Auth schemes

Two existing auth schemes were retrofitted to implement the new interfaces: OAuth (see OAuth2Manager) and Sigv4 (see RESTSigV4AuthManager and RESTSigV4Signer).

Two other auth schemes were introduced: BasicAuthManager and NoopAuthManager. The former is a simple manager that authenticates with a username and password using basic auth, and the latter is a manager that does nothing and that is used when authentication is disabled. I chose to include these components mainly to serve as examples of simple
managers, but who knows if some catalogs out there could find interest in using the Basic one.

Configuration

A new AuthProperties class was introduced to hold new properties that are used to configure authentication. Following the idea of properly namespaced options, the new properties are all prefixed with rest.auth.<auth-type>. where <auth-type> is the name of the auth scheme. Selecting an auth scheme is done by setting the rest.auth.type property. Schemes can be selected by their logical name (e.g. oauth2, basic, sigv4, none) or directly by a class name implementing AuthManager.

OAuth-specific components

This PR doesn't change the way OAuth is used in the REST catalog, only moves the logic to the new OAuth2Manager class. OAuth properties weren't modified either. Imho it would be good at some point to refactor OAuth2Util into something more extensible, but that's out of the scope for now.

AWS-specific components

AWS SigV4 authentication was previously enabled by the rest.sigv4-enabled option. This property is still supported, but it is now deprecated in favor of rest.auth.type=sigv4.

Also, RESTSigV4Signer used to implement Apache HTTP client's HttpRequestInterceptor interface, and was injected (at a rather low layer) by HTTPClient.Builder.build(). This has been replaced and RESTSigV4Signer is now created in a higher layer (catalog level).

Note that this change opens up new opportunities; e.g. it is now possible to do more fine-grained operations e.g. create context- or table-specific signers. But the current implementation of RESTSigV4AuthManager does not take full advantage of this new flexibility for now and simply creates one single signer to sign all requests, just as it was before.

Another AWS-specific component had to be adapted: S3V4RestSignerClient (enabled by the s3.remote-signing-enabled option): contrary to RESTSigV4Signer, S3V4RestSignerClient sends signing requests to the REST catalog itself, which in turn may require authentication. This component has some shortcomings that are not addressed in this PR (e.g., it never closes its resources), but nevertheless, it is now delegating authentication to a (somewhat simplified) AuthManager instead of managing that itself.

Implementation details

Caching of sessions: responsibility for caching sessions has been transferred from RESTSessionCatalog and S3V4RestSignerClient to become an implementation detail of AuthManager impls.

Interference with HTTP Client CredentialsProvider: a CredentialsProvider can be passed to HTTPClient.Builder to implement proxy authentication. This PR does not change that. Care must be taken to avoid interference between the CredentialsProvider and the AuthManager (i.e. the proxy authentication should apply only proxy auth headers to outgoing requests, but this isn't enforced and interference was already possible).

.palantir/revapi.yml Outdated Show resolved Hide resolved
@@ -81,14 +71,12 @@ public abstract class S3V4RestSignerClient
private static final String SCOPE = "sign";

@SuppressWarnings("immutables:incompat")
private static volatile ScheduledExecutorService tokenRefreshExecutor;
// FIXME auth manager is not properly closed
Copy link
Contributor Author

@adutra adutra Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not new to this PR; the executor wasn't being closed either. I think ideally this component should share the same REST client and the same Auth Manager as the catalog, but that's far from trivial to implement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make this AutoClosable and make sure we close if it is called (even though it won't be right now).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind if we do that in a follow-up PR? I can work on it immediately after this one is merged.

@adutra adutra force-pushed the auth-manager-adutra branch 5 times, most recently from 903091f to 8fd8204 Compare July 24, 2024 12:03
.palantir/revapi.yml Outdated Show resolved Hide resolved
@adutra adutra force-pushed the auth-manager-adutra branch 2 times, most recently from a705635 to d8e1981 Compare July 30, 2024 15:06
@adutra adutra force-pushed the auth-manager-adutra branch from d8e1981 to 5db7d36 Compare August 8, 2024 09:11
@adutra adutra force-pushed the auth-manager-adutra branch from 94308ba to 98787b2 Compare August 26, 2024 11:39
@adutra adutra force-pushed the auth-manager-adutra branch from 98787b2 to 4cb2d6c Compare September 18, 2024 16:38
@adutra adutra force-pushed the auth-manager-adutra branch from 4cb2d6c to 893cc3c Compare October 10, 2024 10:15
Copy link
Contributor

@danielcweeks danielcweeks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a quick pass through and overall, I really like the abstraction. I have some concerns around the header handling as it appears that there are behavioral changes.

@adutra adutra force-pushed the auth-manager-adutra branch from 4abc8dc to 2252d16 Compare October 22, 2024 08:17
@adutra
Copy link
Contributor Author

adutra commented Oct 22, 2024

FYI I rebased the PR, it's been a while since the last rebase.

@adutra
Copy link
Contributor Author

adutra commented Oct 24, 2024

@danielcweeks it took me some time to address your comments but this is ready for a second round of review whenever you have time.

Apart from polishing commits, two major changes since your last review:

  1. No more double initialization of the manager in RESTSessionCatalog.
  2. Static headers passed through configuration via header.* are now correctly honored across the board. They are distinct from headers generated by the AuthSession.

🙏

@jbonofre
Copy link
Member

As @adutra addressed @danielcweeks comments, I wonder if we can't include this PR for 1.7.0 ? @RussellSpitzer @rdblue @danielcweeks thoughts ?

In the meantime, I'm doing the review on this PR.

@jbonofre
Copy link
Member

@flyrain do you mind to take a look on this one ? Thanks !

@adutra adutra force-pushed the auth-manager-adutra branch from fb70329 to 737b0b2 Compare December 4, 2024 11:51
Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adutra I did a first pass and was mostly focusing on the AuthManager abstractions (having them makes sense to me and APIs around that mostly LGTM).

I still have to go through all the changes within the catalog and the signer. Also having the refactorings around introducing BaseHTTPClient makes this quite difficult to review (I would try and do that in a separate PR)

@@ -139,4 +148,100 @@ public void testOAuth2FormDataDecoding() {

assertThat(RESTUtil.decodeFormData(formString)).isEqualTo(expected);
}

@ParameterizedTest
@MethodSource
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we typically use explicit method names here rather than naming the argument method and the test method the same.

Suggested change
@MethodSource
@MethodSource("validRequestUris")

assertThat(RESTUtil.buildRequestUri(request)).isEqualTo(expected);
}

public static Stream<Arguments> buildRequestUri() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public static Stream<Arguments> buildRequestUri() {
public static Stream<Arguments> validRequestUris() {

.build();
assertThatThrownBy(() -> RESTUtil.buildRequestUri(request))
.isInstanceOf(RESTException.class)
.hasMessageContaining("Paths should not start with /");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.hasMessageContaining("Paths should not start with /");
.hasMessage(
"Received a malformed path for a REST request: /v1/namespaces. Paths should not start with /");

if possible, we should always try to assert against the full msg. This also helps in seeing how the entire error msg will look like when this happens

.build();
assertThatThrownBy(() -> RESTUtil.buildRequestUri(request2))
.isInstanceOf(RESTException.class)
.hasMessageContaining("Failed to create request URI");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.hasMessageContaining("Failed to create request URI");
.hasMessage(
"Failed to create request URI from base http://localhost/ not a valid path, params {}");

}

private static Long expiresAtMillis(Map<String, String> props) {
Long expiresInMillis = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be named expiresAtMillis, because that's what we're returning here

return new Builder();
}

public static Builder builder(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of having such a builder method. We already have a generated builder, so we should rather go through the builder: HTTPRequest.builder().baseUri(baseUri).method(method).path(path)...build()


/** Returns the query parameters of this request. */
@Value.Parameter(order = 3)
public abstract Map<String, String> parameters();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would name this queryParams or queryParameters as it's not immediately obvious that these are the query parameters when using the builder

import org.apache.iceberg.rest.responses.ErrorResponse;

/** A base class for {@link RESTClient} implementations using {@link HTTPRequest}. */
public abstract class BaseHTTPClient implements RESTClient {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while such refactorings make a lot of sense, it would be easier to have them in a separate PR as this reduces the review complexity of what you're trying to do in this PR.
Could you please extract the introduction of BaseHTTPClient into a separate PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more than just a refactoring, BaseHTTPClient is playing an important role here: it's "bridging" the existing RESTClient methods (get(), post(), postForm() etc.) towards a single target: the buildRequest() method, which returns an HTTPRequest instance. Without this class, each implementation of get(), post(), postForm() etc. would need to do the job of creating an HTTPRequest then passing it to the authenticator, then executing it – it would be a lot more confusing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it still adds too much reviewing complexity to an already large PR, hence why I'm suggesting to do this in a separate PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, BaseHTTPClient has been removed.

public AuthSession catalogSession(RESTClient sharedClient, Map<String, String> properties) {
Preconditions.checkArgument(
properties.containsKey(AuthProperties.BASIC_USERNAME),
"Property %s is required",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about Invalid username: missing required property ...

Map<String, String> queryParameters();

/** Returns all the headers of this request. The map is case-sensitive! */
@Value.Parameter(order = 4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of these annotations shouldn't be needed

return RESTObjectMapper.mapper();
}

HTTPRequest withBaseUri(URI baseUri);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why these are needed but the below diff should achieve the same with less code:

--- a/core/src/main/java/org/apache/iceberg/rest/HTTPRequest.java
+++ b/core/src/main/java/org/apache/iceberg/rest/HTTPRequest.java
@@ -43,7 +43,6 @@ public interface HTTPRequest {
    * Returns the base URI configured at the REST client level. The base URI is used to construct the
    * full {@link #requestUri()}.
    */
-  @Value.Parameter(order = 0)
   URI baseUri();

   /**
@@ -56,19 +55,15 @@ public interface HTTPRequest {
   }

   /** Returns the HTTP method of this request. */
-  @Value.Parameter(order = 1)
   HTTPMethod method();

   /** Returns the path of this request. */
-  @Value.Parameter(order = 2)
   String path();

   /** Returns the query parameters of this request. */
-  @Value.Parameter(order = 3)
   Map<String, String> queryParameters();

   /** Returns all the headers of this request. The map is case-sensitive! */
-  @Value.Parameter(order = 4)
   @Value.Redacted
   Map<String, List<String>> headers();

@@ -84,7 +79,6 @@ public interface HTTPRequest {

   /** Returns the raw, unencoded request body. */
   @Nullable
-  @Value.Parameter(order = 5)
   @Value.Redacted
   Object body();

@@ -105,21 +99,9 @@ public interface HTTPRequest {
     return RESTObjectMapper.mapper();
   }

-  HTTPRequest withBaseUri(URI baseUri);
-
-  HTTPRequest withMethod(HTTPMethod method);
-
-  HTTPRequest withPath(String path);
-
-  HTTPRequest withQueryParameters(Map<String, ? extends String> queryParameters);
-
-  HTTPRequest withHeaders(Map<String, ? extends List<String>> headers);
-
-  HTTPRequest withBody(Object body);
-
   default HTTPRequest putHeadersIfAbsent(Map<String, String> headers) {
     Map<String, List<String>> newHeaders = Maps.newLinkedHashMap(headers());
     headers.forEach((name, value) -> newHeaders.putIfAbsent(name, List.of(value)));
-    return withHeaders(newHeaders);
+    return ImmutableHTTPRequest.builder().from(this).headers(newHeaders).build();
   }

super.close();
} finally {
AuthSessionCache cache = sessionCache;
sessionCache = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should be this.sessionCache to clearly indicate that an instance variable is assigned here (and to align with other places in the codebase

Consumer<ErrorResponse> errorHandler) {
Object body) {
URI baseUri = URI.create("https://localhost:8080");
ObjectMapper mapper = RESTObjectMapper.mapper();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is already the default mapper, so can be omitted

}

@ParameterizedTest
@MethodSource
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one should also explicitly point to the method being used


/** A Bearer token supplier which will be used for interaction with the server. */
@Value.Default
public Supplier<String> token() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing this will actually break existing applications that were providing the token through a supplier

@@ -200,77 +123,6 @@ private RESTClient httpClient() {
return httpClient;
}

private AuthSession authSession() {
String token = token().get();
if (null != token) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how will this now work when the token has been provided through the token supplier?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Auth Manager takes care of it. The token supplier returns properties().get(OAuth2Properties.TOKEN). The auth manager will do the same if configured with that property.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default token supplier returned properties().get(OAuth2Properties.TOKEN) but what if you had an entirely different supplier that would fetch the token from somewhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But how can you create different suppliers? The only way to instantiate this class is through the method S3V4RestSignerClient.create(), which is hard-coded in S3FileIOProperties.applySignerConfiguration(). And that method does not allow custom token suppliers to be provided.

Copy link
Contributor

@nastra nastra Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can always go through

ImmutableS3V4RestSignerClient.builder()
            .properties(...)
            .token(() -> mySecretToken)
            .requestPropertiesSupplier(() -> requestProps)
            .build()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I meant ImmutableS3V4RestSignerClient.builder() (I've updated the above example)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK 👍

To create a signer this way, you need to:

  • Reimplement S3FileIOProperties or extend it and override applySignerConfiguration();
  • Reimplement S3FileIO or extend it and override the constructor and initialize(Map<String,String) since S3FileIOProperties is hard-coded in these places.

So... I don't think this is a realistic scenario.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it actually is a realistic scenario, because that's the API we've been using internally for the Signer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sigh OK let's put back those methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, cf 9b8e7bd

.put(OAuth2Properties.TOKEN, token)
.put(OAuth2Properties.SCOPE, SCOPE)
.buildKeepingLast();
return authManager().catalogSession(httpClient(), properties);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there maybe be an override of this method that takes AuthConfig instead of properties? And catalogSession(..., Map<String, String> properties) could then always call catalogSession(..., AuthConfig)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not, because in spite of its generic name, AuthConfig is only meaningful for the OAuth2 scheme. It doesn't make sense e.g. for SigV4.

@jbonofre
Copy link
Member

@nastra @danielcweeks @RussellSpitzer I thought we had a consensus about moving forward on this PR, targeting 1.8.0. Can we move forward on this one ? Thanks !

@nastra
Copy link
Contributor

nastra commented Dec 11, 2024

@jbonofre I've talked offline with @adutra and he's going to split up this PR into multiple smaller ones. The diff inside this PR is just too large to review and touches a lot of critical pieces, thus requiring large resouce commitments from reviewers. Having this split up into smaller PRs allows us to get stuff reviewed/merged in smaller chunks.

@jbonofre
Copy link
Member

@nastra thanks ! much appreciated ! If I understand the request to split in smaller PRs, I'm surprised that it comes now (this PR has been created in July, and several reviews since then). In order to preserve contributor "motivation", we should try to ask splitting in smaller PRs in the early stages (first reviews round). Just an overall comment. Thanks !

@adutra are you working on "smaller" PRs ? Do you need my help there ?

@adutra
Copy link
Contributor Author

adutra commented Dec 12, 2024

As requested, this PR will be split in many ones. The first one is #11769. I'm going to close this one now.

@adutra adutra closed this Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants