Redshift password in connection string can't contain illegal URL characters #132

Closed
tristanreid opened this issue Dec 3, 2015 · 21 comments

@tristanreid

If you have a password with a caret or question mark, the connection fails with 'Illegal character':

java.net.URISyntaxException: Illegal character in query at index 122: redshift://me.us-west-2.redshift.amazonaws.com:5439/my_db?tcpKeepAlive=true&user=treid&password=M^gI?[12oaV^Mv#$
    at java.net.URI$Parser.fail(URI.java:2829)
    at java.net.URI$Parser.checkChars(URI.java:3002)
    at java.net.URI$Parser.parseHierarchical(URI.java:3092)
    at java.net.URI$Parser.parse(URI.java:3034)
    at java.net.URI.<init>(URI.java:595)
    at com.databricks.spark.redshift.JDBCWrapper.getConnector(RedshiftJDBCWrapper.scala:137)
...

If you have a percentage sign, you get 'Malformed escape pair at index...'

If I URL-encode my password parameter, I just get java.sql.SQLException: [Amazon](500150) Error setting/closing connection: password authentication failed for user "me"

Is there a way around this?

Thanks in advance, apologies if I'm missing something obvious
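For what it's worth, the failure can be reproduced in isolation with plain java.net.URI, without any Spark or JDBC involved (a minimal sketch; the host and password here are made up):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriParseDemo {
    public static void main(String[] args) {
        // '^' is not a legal character in a URI query component,
        // so java.net.URI throws exactly like the stack trace above.
        String url = "redshift://host.us-west-2.redshift.amazonaws.com:5439/my_db"
                   + "?tcpKeepAlive=true&user=treid&password=M^gI";
        try {
            new URI(url);
            System.out.println("parsed OK");
        } catch (URISyntaxException e) {
            System.out.println("parse failed: " + e.getMessage());
        }
    }
}
```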

@tristanreid
Author

Looks like the URI is only instantiated to get at the scheme, so an easy fix is to just drop the query parameters before passing it in. Is this an OK fix? Should I open a PR?

/~https://github.com/databricks/spark-redshift/blob/master/src/main/scala/com/databricks/spark/redshift/RedshiftJDBCWrapper.scala#L196

@JoshRosen
Contributor

I guess that not all valid JDBC connection strings are valid URIs? If that's the case, then modifying this code to not use the URI-parsing code for extracting the subprotocol seems like a fine approach (we could just use a regex). Feel free to submit a PR for this and I'll be happy to review.
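Something like the following is what I mean by a regex (a hypothetical sketch, not the actual patch; the class and method names are illustrative):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubprotocolDemo {
    // Extract the subprotocol ("redshift", "postgresql", ...) from a JDBC
    // URL without going through java.net.URI, so illegal-URI characters
    // later in the string (e.g. in the password) don't matter.
    static String subprotocol(String jdbcUrl) {
        Matcher m = Pattern.compile("^jdbc:([a-zA-Z0-9]+):").matcher(jdbcUrl);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a JDBC URL: " + jdbcUrl);
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(
            subprotocol("jdbc:redshift://host:5439/db?password=M^gI?")); // redshift
    }
}
```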

@JoshRosen
Contributor

This was fixed by #134. Thanks @tristanreid!

@JoshRosen JoshRosen added the bug label Dec 3, 2015
@JoshRosen JoshRosen added this to the 0.5.3 milestone Dec 3, 2015
@jamesjgardner

Hey guys,

Has this been put into any released version? I'm having issues connecting to Redshift because my password has special characters. I've tried wrapping in quotes, and even encodedPass = URLEncoder.encode(password).

It seems that no matter how I try to encode I'm receiving:
Amazon Error setting/closing connection: password authentication failed for user

@JoshRosen
Contributor

Hi @jamesjgardner,

This patch was included in the 0.6.0 release: /~https://github.com/databricks/spark-redshift/releases/tag/v0.6.0

Which version of spark-redshift are you using?

@jamesjgardner

Hi @JoshRosen,

I'm using:
"com.databricks" %% "spark-redshift" % "0.6.0",
"com.amazon.redshift" % "jdbc4" % "1.1.7.1007" % "test" from "https://s3.amazonaws.com/redshift-downloads/drivers/RedshiftJDBC4-1.1.7.1007.jar"

And doing a simple URLEncoder to clean up the password.
val cleanPassword = URLEncoder.encode(password, "UTF-8")


val jdbcURL = s"jdbc:redshift://address:port/db?user=$user&password=$cleanPassword"

@JoshRosen
Contributor

I can try to test this out on my own Redshift instance later, but to help me debug it would be really helpful if you could confirm whether your jdbcURL connection string works outside of spark-redshift. If the exact same connection string works when you use it outside of spark-redshift but fails with this library then that's a bug that we need to fix.
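One thing to check: URLEncoder will percent-encode the special characters, and if the driver passes the query value through without decoding it, the literal encoded string becomes the password, which would explain the authentication failure. A quick sketch of what gets sent (the password here is made up):

```java
import java.net.URLEncoder;

public class EncodeCheck {
    public static void main(String[] args) throws Exception {
        // If the driver does not URL-decode query parameters, this literal
        // percent-encoded string is what gets sent as the password.
        String encoded = URLEncoder.encode("M^gI?", "UTF-8");
        System.out.println(encoded); // M%5EgI%3F
    }
}
```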

@jamesjgardner

I tried just about every combination of URI encoding and different parameter names (e.g. UID and PWD, which Redshift also accepts) to no avail. I ended up just creating another user with a password without a bunch of special characters, and that works fine. I would greatly appreciate an update if this gets worked out. You might also want to see if you can get it to work with a password containing an "=" in it.

@tristanreid
Author

I just confirmed that my earlier fix no longer works - using the same jar that I made before, and same JDBC driver. I am able to use the same password to connect via SQLJWorkbench using the exact same JDBC driver. Weird. I'll take a look at troubleshooting this.

@JoshRosen JoshRosen reopened this Jan 20, 2016
@JoshRosen JoshRosen modified the milestones: 0.6.1, 0.6.0 Jan 20, 2016
@JoshRosen
Contributor

I've re-opened this issue to track it for 0.6.1.

@jamesjgardner

Thanks!

James


@tristanreid
Author

Alright - now I don't know how this succeeded previously - it clearly doesn't work. I just wrote some straight JDBC code, and I can show that it works when you set username/password as properties, but not when you put them into the URL of the connection string.

Compile this with javac JDBCTest.java
Run with java -cp ".:/path/to/RedshiftJDBC41.jar" JDBCTest username password jdbc:postgresql://etc...

The output is that the first connection succeeds, and the second fails with password authentication error.

I propose to change the getConnector function of RedshiftJDBCWrapper.scala such that it parses and uses the username and password as properties instead of leaving them in the URL. Does that sound ok? /~https://github.com/databricks/spark-redshift/blob/master/src/main/scala/com/databricks/spark/redshift/RedshiftJDBCWrapper.scala#L200

import java.sql.*;
import java.util.Properties;

public class JDBCTest {
    public static void executeQuery(Connection conn) throws Exception {
       ResultSet rs = conn.createStatement().executeQuery("select * from information_schema.tables;");

       if (rs.next()){
          System.out.println("Query Successful...");
          System.out.println("\t" + rs.getString("table_name"));
       }
       rs.close();
       conn.close();
    }

    public static void main(String[] args) {
        String username = args[0];
        String password = args[1];
        String url = args[2];
        try{
           Class.forName("com.amazon.redshift.jdbc41.Driver");
           Properties props = new Properties();
           props.setProperty("user", username);
           props.setProperty("password", password);
           executeQuery(DriverManager.getConnection(url, props));
           String fullUrl = url + "?user=" + username + "&password=" + password;
           executeQuery(DriverManager.getConnection(fullUrl));
        }catch(Exception ex){
           ex.printStackTrace();
        }
     }
}

@tristanreid
Author

Here's an example: tristanreid@d3ab1db

I just tested this, and it works. I think the Scala is stylistically a little ugly, though; if this approach is OK I'd welcome feedback on that.

@JoshRosen
Contributor

Here are my initial thoughts:

  • If a user passes us a valid JDBC URI then we need to support it; we cannot break existing user configurations.
  • It's going to be potentially confusing if the JDBC URI that we accept doesn't work in other environments. In other words, deviating too far from the standard URI format may be a bad idea.

Therefore, I have the following proposal:

  • Introduce new configuration parameters for passing in the username and password.
  • To handle precedence:
    • If the credentials are only passed in the URI, use those.
    • If the credentials are only passed via the new conf parameters, use those.
    • If both sources of credentials are set, fail with a clear error message.
  • Add a unit test for the above precedence logic.
  • Document the new parameters in the README.

I don't think we should try the more complex "parse the password out of the URI with string manipulation" approach, since it seems harder to test and reason about and seems somewhat nonstandard.
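Roughly, the precedence logic I have in mind looks like this (a hypothetical sketch; the class and method names are illustrative, not the actual implementation):

```java
// Sketch of the proposed precedence rule: credentials embedded in the JDBC
// URL and credentials passed via the new "user"/"password" parameters are
// mutually exclusive; setting both is a clear, early error.
public class CredentialPrecedence {
    static String[] resolve(String uriUser, String uriPassword,
                            String confUser, String confPassword) {
        boolean inUri = uriUser != null || uriPassword != null;
        boolean inConf = confUser != null || confPassword != null;
        if (inUri && inConf) {
            throw new IllegalArgumentException(
                "Credentials were set in both the JDBC URL and the "
                + "user/password parameters; please use only one.");
        }
        // Use whichever source is set; URL credentials keep the old behavior.
        return inConf ? new String[]{confUser, confPassword}
                      : new String[]{uriUser, uriPassword};
    }

    public static void main(String[] args) {
        String[] creds = resolve(null, null, "treid", "secret");
        System.out.println(creds[0]); // treid
    }
}
```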

@tristanreid
Author

So to ensure I understand: if creds are passed in the URI, use the existing logic, not the parsing? I think that's the simplest approach, and parallels what would happen in other environments.

@JoshRosen
Contributor

Yep, that's right.

@tristanreid
Author

Almost done: tristanreid@8fe5c13

Where's the best place to add the unit test?

@JoshRosen
Contributor

If it was me, I'd probably add a RedshiftJDBCWrapperSuite and throw a test in there. I'm also happy to do this myself. Do you mind submitting a pull request so I can review the code there?

JoshRosen pushed a commit that referenced this issue Jan 30, 2016
Per #132, this PR adds user and password options.  If these options are present and the user/password are still in the URL, an exception is thrown with a clear error message.

Unit tests are still pending - maybe it's better to check for ambiguous parameters in the Parameter class instead, to make it easier to test?  Glad for any other feedback as well!

Author: Tristan Reid <tristan@datascience.com>

Closes #162 from tristanreid/master.
@JoshRosen
Contributor

Going to mark this as fixed now that #162 has been merged.

@Nath5

Nath5 commented Feb 19, 2016

When I put the user/password in the JDBC URL I get an authentication failure; my password contains the following special characters:

?!

When I use the user and password options I get the error below:

Exception in thread "main" java.sql.SQLNonTransientConnectionException: [Amazon][JDBC](10100) Connection Refused: [Amazon][JDBC](11640) Required Connection Key(s): PWD, UID; [Amazon][JDBC](11480) Optional Connection Key(s): AuthMech, BlockingRowsMode, DriverLogLevel, FilterLevel, Language, loginTimeout, socketTimeout, ssl, sslfactory, tcpKeepAlive, TCPKeepAliveMinutes
        at com.amazon.exceptions.ExceptionConverter.toSQLException(Unknown Source)
        at com.amazon.jdbc.common.BaseConnectionFactory.checkResponseMap(Unknown Source)
        at com.amazon.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
        at com.amazon.jdbc.common.AbstractDriver.connect(Unknown Source)
        at com.databricks.spark.redshift.JDBCWrapper.getConnector(RedshiftJDBCWrapper.scala:225)
        at com.databricks.spark.redshift.RedshiftWriter.saveToRedshift(RedshiftWriter.scala:363)
        at com.databricks.spark.redshift.DefaultSource.createRelation(DefaultSource.scala:106)

I am using the latest release, in Java:

compile 'com.databricks:spark-redshift_2.10:0.6.0'

Any help would be appreciated.

@JoshRosen
Contributor

@Nath5, see #162. We haven't cut a release which contains that patch yet.
