Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(tests): For DNS discovery only use a local DNS server #2598

Merged
merged 4 commits into from
Aug 9, 2024

Conversation

flub
Copy link
Contributor

@flub flub commented Aug 7, 2024

Description

Because the Node builder by default configures DNS discovery the tests
all do a lookup. This went to the staging DNS server, but even so, we
hit 429 Too Many Requests on this sometimes.

Instead this configures a domain under .test.. This domain is
reserved for tests and any DNS server should return an emtpy response
without going through upstream DNS servers - unless the records were
directly added to the DNS server receiving the query in which case it
will respond.

So now all our default DNS discovery just doesn't discover anything,
which is fine as all tests have other mechanisms to find the nodes.
The tests that do use DNS discovery run their own local DNS server for
this, which is also switched to .test., because that's the right
domain to use for this.

Breaking Changes

When using iroh::node::Node in tests, when cfg(test) is enabled, the
default node discovery will no longer work. You must explicitly configure
the node discovery, options to use a test-local node discovery or a
staging node discovery are described in the API docs.

Notes & open questions

I feel like we should also try this with the relay server. That will
probably be a bit harder and it is not an issue for me right now.

I actually need this for the quinn11 branch, but the fix is so generic
that it makes more sense to do on main and merge it back.

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.

Because the Node builder by default configures DNS discovery the tests
all do a lookup.  This went to the staging DNS server, but even so, we
hit 429 Too Many Requests on this sometimes.

Instead this configures a domain under `.test.`.  This domain is
reserved for tests and any DNS server should return an emtpy response
without going through upstream DNS servers - unless the records were
directly added to the DNS server receiving the query in which case it
will respond.

So now all our default DNS discovery just doesn't discover anything,
which is fine as all tests have other mechanisms to find the nodes.
The tests that do use DNS discovery run their own local DNS server for
this, which is also switched to `.test.`, because that's the right
domain to use for this.
Copy link

github-actions bot commented Aug 7, 2024

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2598/docs/iroh/

Last updated: 2024-08-08T15:19:34Z

let retry_after = Duration::from_secs(failed_attempts);
republish.as_mut().reset(Instant::now() + retry_after);
warn!(
err = %format!("{err:#}"),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this stops an entire traceback being thrown into the log. the traceback is totally useless. anyhow's alt-display formatting is made just for this.

@matheus23
Copy link
Contributor

Okay. Yeah I think I see the reasoning.
The tests would just start sending requests to <node_id>.staging-dns.iroh.link, even though none of the tests would use a PkarrPublisher, so they would never actually be able to find anything there.
The only tests that were using a PkarrPublisher were running the publisher locally, using test_utils::dns_and_pkarr_servers::DnsPkarrServer, and were configuring it to be at "dns.example", even though ideally it'd be hosted on a .test. TLD, such that if we end up actually asking a real DNS server, it'd know not to recurse.
And the tests that do use the PkarrDnsServer would use that server directly for DNS resolving, instead of the current router/system/1.1.1.1 or whatever.

@matheus23
Copy link
Contributor

matheus23 commented Aug 7, 2024

Even if we don't depend on DNS discovery for our unit tests, wouldn't it be possible that someone who's developing something that uses iroh as a library, as well as a PkarrPublisher, ends up suddenly not being able to connect_by_node_id?

You'd need to update the usages of N0_DNS_PKARR_RELAY_STAGING, too, right?

Copy link
Collaborator

@Arqu Arqu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm undecided on this. If the change was purely to have a correct URI for test dns setups I'd say it's a great PR.
But judging it on its goal to provide a better setup for general testing I don't think this is complete yet. I'd rather hit the public infra and do the thing it's supposed to than black hole it by default.
We should be providing a setup that allows you to have a dns server present if you actually need it and then blackhole everything else.

Or to simplify, think we need a latch to provide a custom setup when needed. Otherwise PR is good.

Edit: Don't think we need to solve for how we spin it up in this PR, but making sure tests that currently reach out to public infra at least are "latched" to it until we work out the rest.

@flub
Copy link
Contributor Author

flub commented Aug 8, 2024

Right now we have tools (public!) in iroh_net::test_utils which let you easily spin up a test-local DNS/Pkarr server (and relay server). We use this in the tests that do need this functionality.

I guess it is fair to point out that anyone else using this in tests will now by default black-hole this. That needs to be documented clearly indeed.

@flub
Copy link
Contributor Author

flub commented Aug 8, 2024

The goal is to not hit any public infra for discovery during tests. Unless on purpose.

@Arqu I'm not sure I fully understand your comment, but maybe the DnsPkarrServer from test_utils is what you're after?

I've added a bunch of documentation that was dearly missing anyway. I hope this makes it clearer to users what is happening by default and how this changes during tests.

PTAL

Copy link
Collaborator

@Arqu Arqu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chef's kiss, think the docs is exactly what was needed.
Also, yeah, the test_utils stuff is what I was after in combo with the docs so folks can set up an instance if they actually need it.

Thank you!

@flub flub enabled auto-merge August 9, 2024 10:55
@flub flub added this pull request to the merge queue Aug 9, 2024
Merged via the queue into main with commit 5eee643 Aug 9, 2024
30 checks passed
@dignifiedquire dignifiedquire deleted the flub/no-public-staging-dns branch August 12, 2024 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants