Blacklisting questions/concerns with master node. #101

Closed · rmontgomery429 opened this issue Dec 28, 2015 · 15 comments

@rmontgomery429

Does the master node get blacklisted? If so, why? We're running an "all writes to master, all reads to a replica" setup, and it would seem that master should never be blacklisted. Does setting blacklist_duration for the master node to 0 prevent blacklisting?

Also, is there a way to turn off blacklisting completely? If so, what is the recommended way to do this?

@todddickerson

👍

@bleonard (Contributor) commented Jan 6, 2016

In connection_wrapper I see this:

    def _makara_blacklist!
      @connection.disconnect! if @connection
      @connection = nil
      @blacklisted_until = Time.now.to_i + @config[:blacklist_duration]
    end

That suggests that if you set blacklist_duration to zero or negative, it would likely work out.
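
For context, a sketch of where that would go in database.yml, assuming per-connection keys are merged over the top-level makara defaults (hostnames and the adapter line are placeholders following the README's layout):

    production:
      adapter: mysql2_makara
      makara:
        blacklist_duration: 5        # default used by nodes that don't override it
        connections:
          - role: master
            host: mymaster.sql.host
            blacklist_duration: 0    # per the suggestion above: zero, so this node is effectively never blacklisted
          - role: slave
            host: myslave.sql.host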

@bleonard (Contributor) commented Jan 6, 2016

If I'm reading it right, if we never raised BlacklistConnection it would behave like a "normal" error (you'd see the original).

Maybe gracefully in /~https://github.com/taskrabbit/makara/blob/master/lib/makara/error_handler.rb should not be so graceful if blacklist_duration <= 0, or if some other setting is set.
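
A rough sketch of the kind of change being suggested, assuming a gracefully handler that wraps the original error in BlacklistConnection before raising (the blacklist-duration lookup below is purely illustrative, not real Makara API):

    # Hypothetical variant of the gracefully handler: when blacklisting is
    # effectively disabled (blacklist_duration <= 0), re-raise the original
    # error so callers see it directly instead of a BlacklistConnection.
    def gracefully(connection, e)
      # blacklist_duration_for is an illustrative helper, not part of Makara.
      raise e if blacklist_duration_for(connection) <= 0

      err = Makara::Errors::BlacklistConnection.new(connection, e)
      ::Makara::Logging::Logger.log("[Makara] Gracefully handling: #{err}")
      raise err
    end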

@rmontgomery429 (Author)

Yeah, one thing we noticed is that our database was returning errors, but we never saw the underlying error, which made it hard to debug. So I'm not sure gracefully is helpful in that regard. Perhaps blacklisting could include some kind of nested exception or message like "Blacklisting due to: 'blah blah blah'", which would be more helpful.

We also did set the value to 0 but still saw blacklisting happening, though that might have been gracefully stepping in and masking the underlying issue. So maybe it wasn't actually being blacklisted, but it looked like it was.

For the moment we've removed Makara in an attempt to debug the underlying issues, but we're eager to put it back in as soon as possible.

I'm also thinking that a more explicit 'blacklisting: on/off' sort of mechanism would be clearer than setting the duration to 0 or -1. That works and maybe could just be documented as a start. If this were the "correct way", then yes, I would also expect not to see any blacklisting errors.

@bleonard (Contributor) commented Jan 8, 2016

We do log the error:

    ::Makara::Logging::Logger.log("[Makara] Gracefully handling: #{err}")

You might have to point the logger at your Rails one. Something like this:

    Makara::Logging::Logger.logger = Rails.logger

If you catch BlacklistConnection, you can also see the original error via the original_error method.
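
Put together, a minimal sketch of that setup (the initializer path and the User.first query are just placeholders, and whether a BlacklistConnection actually surfaces to application code depends on where Makara handles it internally):

    # config/initializers/makara.rb
    # Route Makara's internal log lines ("[Makara] Gracefully handling: ...")
    # into the Rails logger so they appear alongside the application logs.
    Makara::Logging::Logger.logger = Rails.logger

    # If a BlacklistConnection does bubble up, the underlying cause is
    # reachable through #original_error, as mentioned above.
    begin
      User.first
    rescue Makara::Errors::BlacklistConnection => e
      Rails.logger.error("[Makara] blacklisted because: #{e.original_error.message}")
      raise
    end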

@bleonard (Contributor) commented Jan 8, 2016

I'm going to combine this with #78 -- I think they are both saying that master is special, and especially if there is only one, something interesting should happen where it works like "normal". More investigating needed on what that means, but "merging" them.

@rmontgomery429 (Author)

👍

@robbwagoner

Supporting use case: an AWS RDS node in Multi-AZ mode can fail over automatically. The DNS name remains the same, and the standby node is transparently made master. Reconnection(s) by Makara would be necessary.

Read Replicas are for slave use only. They can be promoted to a stand-alone master, but they are then disconnected from the replication topology.

@rmontgomery429 (Author)

@robbwagoner We ran into the exact same issue.

@clarakwan

Did the master issue ever get addressed? We're seeing similar issues as well.

@NoSync commented Mar 16, 2018

Wondering the same as @clarakwan

@johnwu96822 commented Jun 1, 2018

We are also seeing an issue (in Rails) where master gets blacklisted wrongfully due to #207, so the connection is closed in _makara_blacklist!. But because this happens inside a transaction, ActiveRecord attempts to roll back the transaction on the same connection (a transaction keeps its connection instead of checking out a new one: /~https://github.com/rails/rails/blob/4-2-stable/activerecord/lib/active_record/connection_adapters/abstract/transaction.rb#L51), which was just closed due to blacklisting. Therefore it raises a connection-closed exception.

I monkey patched our use of Makara to never call _makara_blacklist! on the master to avoid this.
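
A rough sketch of that kind of monkey patch (it assumes the wrapper's merged @config hash exposes the node's role, which is worth verifying against the Makara version in use):

    # Hypothetical patch: never blacklist the master wrapper, so a wrongly
    # flagged error inside a transaction can't close the connection the
    # transaction is still holding on to.
    module SkipMasterBlacklist
      def _makara_blacklist!
        return if @config[:role].to_s == 'master' # assumption: role lives in the merged config
        super
      end
    end

    Makara::ConnectionWrapper.prepend(SkipMasterBlacklist)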

@laimison commented May 7, 2019

In my case I discovered that if the master fails, Makara never tries to refresh DNS and get the new IP for the master (not 100% sure whether this is related to blacklisting).

Note that DNS is transparently repointed to the new master by the automated DB failover.

The only thing that helps is restarting Rails, which is an expensive option.

Am I missing some option in database.yml? Are you solving the same issue? For JVMs there is the networkaddress.cache.ttl parameter, so I think an equivalent option should help.

@laimison

To leave some feedback: in my case sonots' patch solved the master DNS issue. More details:

ankane/distribute_reads#24

@laimison

By the way, wasn't the parameter disable_blacklist available at the time of writing to solve your issues?

An example:

    connections:
      - role: master
        host: mymaster.sql.host
        disable_blacklist: true

      - host: myslave.sql.host
        name: Slave

From README.md:

disable_blacklist - do not blacklist node at any error, useful in case of one master
