DNS and Let's encrypt certificates for OCaml #27
Comments
Thank you for suggesting this @hannesm, and of course for all the hard work you've done that has made this possible at all. I'm entirely supportive of the suggestion as the HTTP process is a real pain to manage, but would like to see the end-to-end process deployed somewhere other than ocaml.org first to ensure it's suitably mature. @mtelvers, would you like to have a go at this on some other domain such as realworldocaml.org, and document it to your and @hannesm's satisfaction on infra.ocaml.org? Once it's documented and demonstrable elsewhere (particularly with respect to how to edit DNS zone files and so on, presumably via git), we will then need to present this to Xavier Leroy to get his permission to make the (big) change for the ocaml.org domain. I'm also tagging @RyanGibb who is interested in matters of OCaml and DNS and may want to assist. |
What I failed to mention in the initial issue is that such a setup with OCaml-DNS has been used for various domains for more than 5 years; mirage.io also works this way. Also, the let's encrypt challenge: so getting a signed certificate does not require the private key of the certificate signing request :D This is why the setup works pretty nicely. I can certainly help with setting up a primary name server and using DNS zones in a git repository. I can also provide secondary name servers if desired. |
@avsm and thanks for your |
Hi @hannesm, I would be very interested in assisting with this if you could use my help. I have a question if that's okay. I'm wondering if there are possible vulnerabilities with the spoofing of TLSA records. Reading RFC 6698:
As far as I understand, the OCaml-DNS resolver supports DNSSEC, but the authoritative server doesn't. Do you think this is an issue? |
@RyanGibb thanks for your offer. From my observation, the current DNS deployment for ocaml.org / realworldocaml.org does not use DNSSec. Also, DNSSec integration into the authoritative servers (for OCaml-DNS) is on the agenda, and will be done this year. As another point, so you can spoof TLSA records - but what is the attack vector? The service that uploads the CSR authenticates itself to the authoritative servers. The service that downloads the certificate checks that the public key in the certificate matches the private key it has. The certificate chain is checked to be valid (against the system trust anchors). I don't quite understand where DNSSec would be necessary, but maybe I'm failing to see the attack vector (@RyanGibb would you mind elaborating a bit on what you mean by "Do you think this is an issue?"). The service can btw also ask the authoritative server directly for TLSA records (certificates). I also wonder whether @mtelvers has an opinion on and/or time for diving into such a thing ("running authoritative DNS services") or not. There's an IETF document suggesting the use of anycast IP addresses for this; I'm not doing this myself since I don't have sufficiently many machines and BGP speakers for such a setup -- but tbh it works fine with "just normal" IPv4 addresses ;) |
Hi @hannesm, thanks for your reply. After reading https://hannes.nqsb.io/Posts/DnsServer I think I understand that the TLSA records are only used for distributing the CSRs and certificates between DNS servers and services in your solution, not for replacing a CA with the DNSSEC trust anchor as RFC 6698 describes. Hence the TLSA RRs' location at If so, I understand the reason why DNSSEC isn't required. The CA (letsencrypt) is the root of trust. The TLSA records are just a convenient way of distributing the provided certificate. Apologies for my misunderstanding!
Aside from the letsencrypt DNS-01 challenge, that's great to hear :-) |
@RyanGibb yes, your understanding is correct.
I'm not sure I understand what your comment means, would you mind to explain? |
Great, thank you for confirming.
I just mean to say, despite DNSSEC not being required for the letsencrypt DNS challenge, it's good to know that it's on the agenda. |
Hi all, just to give a small update on this: I've created a nameserver primarily targeting Unix using the new effects-based IO library and the mirage OCaml-DNS library that is able to perform dynamic UPDATEs authenticated using TSIG: https://github.com/RyanGibb/aeon/. I'm hoping to add support for the letsencrypt challenge to this nameserver directly (as opposed to running in a separate process), simplifying the communication required. |
Dear @RyanGibb, thanks for your effort. But I'd really like to hear from @mtelvers what would be worthwhile for the OCaml infrastructure. And I opened this issue explicitly to understand whether using MirageOS unikernels would be possible/interesting for that infrastructure. I feel very much torpedoed, and that the issue is being stolen by your "hey, look, I developed something new that supports some parts (certainly no notify, hasn't been tested for years on real domains, etc.) in this shiny new IO framework" -- especially since I've been doing the underlying DNS development since 2017. Your "add support for the letsencrypt challenge to this nameserver directly" is also something that can be done in a MirageOS unikernel. |
My sincere apologies @hannesm. I in no way meant to torpedo this issue. You've been working on this for far longer than myself, and my contribution is a small layer using a different IO library. I just wanted to express my continued interest in this topic and share some work that I've been doing for a different project that relates to this issue. I should have been more clear on that. |
@hannesm @RyanGibb please do take a positive interpretation of each other's efforts. The world of managing OCaml infrastructure is small enough already without us driving each other away. In my view, Ryan has been learning and reproducing Hannes' efforts, and that's appreciated. If you could perhaps split up your experiences with "reproducing the Mirage DNS stack" vs your own reimplementations on eio, that would be most useful for the knowledge sharing in this issue. But let's wait for @mtelvers to comment on his plans first, and if he's not available, then make a wider call to the community for more assistance with reproducing the Mirage DNS stack on other domains. |
@hannesm I think your post may have been inspired by my convoluted implementation of Let's Encrypt certificates which I used for #19. This is not my preferred approach. I prefer to use automatic provisioning, which is included in Caddy. In this case, this option was not available to me as the requirement was for round-robin DNS. With round-robin, I could not guarantee the response would arrive at the requesting server. The natural resolution is to use the DNS challenge, but that was not available as DNS updates are administered manually by @avsm. Therefore, I switched to NGINX, which gave me the granular configuration required to redirect HTTP challenges to the originating server. We would also need to agree on a hosting strategy for the unikernel to ensure a redundant deployment. Reading through https://hannes.nqsb.io/Posts/DnsServer, under the Let's encrypt! section, how would we configure a reverse proxy (such as Caddy or NGINX) to request a certificate with an hmac-secret? @avsm What are the success criteria? Or perhaps more importantly, what administrative controls need to be kept in place by the new solution? If we can use the DNS-01 challenge rather than HTTP-01, then this would be worth implementing for the round-robin DNS for opam.ocaml.org. The alternative solution would be a Gandi API key, which would delegate more access than just creating TXT records. |
The initial CSR can be uploaded with The certificates can be downloaded by the service with the following shell script - e.g. via a cron job (since it is ensured that the certificate is updated 2 weeks before expiry):
#!/bin/sh
set -e
hostname=$1
dig_opts=" +noquestion +nocomments +noauthority +noadditional +nostats"
if [ $# = 2 ]; then
dig_opts="$dig_opts @$2"
fi
data=$(dig tlsa _letsencrypt._tcp.$hostname $dig_opts | awk '{if (NR>3){print}}' | cut -f 5- -d ' ' | sort | grep '^[03] 0 0' | cut -d ' ' -f 4- | sed -e 's/ //g')
file=
hex_to_bin () {
data=$(echo $@ | sed -e 's/\([0-9A-F][0-9A-F]\)/0x\1 /g')
for hex in $data; do
oct=$(printf "%o" $hex)
if [ $oct = "0" ]; then
printf "\0" >> $file
else
printf "%1b" $(echo '\0'$oct) >> $file
fi
done
}
cert_file=$(mktemp)
i=0
for cert in $data; do
i=$(echo $i + 1 | bc)
file=$(mktemp)
hex_to_bin $cert
openssl x509 -inform der -outform pem -in $file -out $cert_file.$i
rm $file
done
# now mix and match, the final $i should be the leaf certificate
inter=$(mktemp)
last_inter=$(echo $i - 1 | bc)
for j in $(seq 1 $last_inter); do
cat $cert_file.$j >> $inter
done
openssl verify -show_chain -verify_hostname $hostname -untrusted $inter $cert_file.$i
out=$hostname.pem
if [ -f $out ]; then
out=$(mktemp)
fi
cat $cert_file.$i >> $out
cat $inter >> $out
rm -f $cert_file* $inter
echo "PEM bundle in $out"
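Since the script above is meant to run from cron and the certificate is renewed two weeks before expiry, a wrapper could skip the download while the local bundle is still fresh. The following is a minimal sketch, assuming OpenSSL is installed; the function name `expires_within_two_weeks` is hypothetical and not part of the script above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): succeed (exit 0)
# when the first certificate in the given PEM file expires within 14 days,
# or when the file is missing or unreadable.
expires_within_two_weeks () {
    pem=$1
    two_weeks=$((14 * 24 * 60 * 60))
    # A missing bundle always needs to be fetched.
    [ -f "$pem" ] || return 0
    # `openssl x509 -checkend N` exits 0 if the certificate is still valid
    # N seconds from now, and non-zero if it will have expired by then.
    if openssl x509 -checkend "$two_weeks" -noout -in "$pem" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}
```

A cron wrapper could then invoke the download script only when `expires_within_two_weeks "$hostname.pem"` succeeds.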
@mtelvers wrote:
Good question. There's one important missing piece in our current infrastructure: secrets management. We already have a bunch of keys lying around, and with the DNS infrastructure will have even more with the various nsupdate pieces. So I think we need to come up with some way to store and securely share the various private material (and ensure that there's robust administrative controls there). Once we have that, I'm satisfied that we can manage the |
Would you mind expanding on your requirements here? As far as I can see, there are (when we consider self-hosted authoritative DNS)
Are there more types of secrets needed? Certainly, the password manager (1) looks out of scope for this discussion. The secrets required by humans (2) to modify the zone file can be (a) access to a (private) git repository hosted on GitHub (as done by the mirage organization) (b) a shared secret in the password manager. For the secrets between machines (3), the current setup (for e.g. mirage.io) is: the git repository with the zone files contains the shared secrets (to communicate between primary and secondary servers). The primary DNS server has access to it (via a ssh key that is provided as boot parameter (command line argument), the public part is registered with GitHub); the secondary DNS servers receive the shared secrets as boot parameter. Now, lifting the boot parameters (none of the below is implemented yet)
Let me know what you think, and/or let's have a discussion (maybe a video meeting?) about other approaches (and about the concrete goals). |
I'd only add to the secrets list for ocaml.org: Ahead of any discussion, it would be good to have the current status of the secrets in the ocaml.org cluster written down @mtelvers, and we can converge on what missing gaps there are in terms of rolling out any change in DNS infrastructure. |
A typical infrastructure deployment uses OCaml services running internally (usually under Docker) with HTTPS offloaded to a reverse proxy. Since we need a reverse proxy, the most straightforward approach is to use Caddy, which is a reverse proxy and manages the certificates automatically. Here is the entire Caddy configuration file needed for a typical service:
In this setup, Caddy resolves the challenges automatically via the HTTP challenge using the DNS entries that @avsm creates. For a more complex setup, such as where www.ocaml.org resolves to multiple addresses, the ideal setup would be to use the DNS-01 challenge. This is achieved like this (complete configuration file given):
A typical invocation would be like this:
There is an outstanding issue to move deploy.ocamllabs.io to deploy.mirage.io. Currently this is deployed using Caddy exactly as described above. Perhaps we can use this as a test case to integrate your hmac script into Caddy? I also see that there is a Caddy module for hmac which may do what we need? |
I have not used caddy before, but I searched around a bit and I found this: https://github.com/caddy-dns/rfc2136 I'm not sure exactly how it works, and it may not be able to take advantage of the tricks in the letsencrypt secondary dns server, but I think it should work. |
@hannesm, I am working through your blog post, and some links may have been renamed/moved since it was reviewed in 2019. Can you help me locate these? Perhaps this is now released?
There is no branch
|
Dear @mtelvers, thanks a lot for your comment(s). Indeed, that changed a bit since the packages are now released. I'll work on revising that blog post. The sources are now:
All these unikernels are as well available as reproducible binaries (hvt -- kvm) from our infrastructure: |
Indeed, RFC2136 is the "dynamic updates for DNS" RFC, which is implemented by OCaml-DNS. And the configuration snippet from the link:
should work directly. This would mean: (a) no need for dns-letsencrypt-secondary (b) enroll your hmac secret with key and key_name available to caddy and dns-primary-git. I've not tested the interaction with caddy (but since it is RFC-specified, and works with bind, it should be fine). |
Relying on RFC2136 and using another interoperable bit of software like Caddy seems ideal here; well spotted @reynir. I'm hopeful that we'll eventually have a Caddy replacement in OCaml (I'm working on one on the side), but it'll obviously take some time to mature before being suitable for OCaml.org deployment. |
@hannesm I have made some progress, but it doesn't seem to work, and I am unsure where I am going wrong. I have a DNS server up and running with this command:
However, it doesn't work when I try to test it with this (per your example).
The console output is
I get rate limited by Git pretty quickly; therefore, I tried to use a local git server. I generated a key with
My apologies; I am probably making some basic error. |
Thanks for your report, @mtelvers. I just pushed an update to the blog post. To answer your trouble:
should be
Indeed that argument changed: Hope that helps. |
@hannesm Thank you. The local git repository is now working. However, it is still not resolving names. I'll have another look in the morning. Thanks again. |
@mtelvers if you pass |
Success! The remote must include the branch even when there is only one branch called master. Thus this works:
|
Happy to hear it works! 🥳 It defaults to trying branch |
Domain tunbury.uk is now using |
Deploying an authoritative DNS server as a MirageOS unikernel
Git Server
These steps create a Git server on the local machine to host the zone repository. A remote machine could be used instead with suitable changes to the commands. The machine acting as the Git server should have
Create a user called
Create an SSH key to secure access to the repository.
Copy the new key to the git user.
Verify that
OCaml and Opam
Install the necessary prerequisites using apt.
Install Opam
Set up Opam with a 4.14 switch.
DNS Zone
Create the remote repository.
Locally, create the folder and zone file.
Now create a HMAC secret using random data.
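One way this step might look is sketched below. It assumes a 32-byte secret (a natural size for an hmac-sha256 key) and the `base64` utility from coreutils; the function name `gen_hmac_secret` is hypothetical, and the command originally used for this step may differ.

```shell
#!/bin/sh
# Hypothetical helper: print a fresh base64-encoded 32-byte secret.
# The secret length and the encoding are assumptions for illustration.
gen_hmac_secret () {
    # 32 random bytes = 256 bits, encoded as 44 base64 characters.
    head -c 32 /dev/urandom | base64
}

gen_hmac_secret
```

The printed value would then be stored next to the key name in the zone repository and given to whichever client is allowed to send updates.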
Set up a git repository and push the zone file.
Git SSH access authentication
Get the fingerprint of the Git server's SSH key in the format
In order for the DNS server to authenticate against the Git server we need to generate a key pair using the
Then generate the key by running
The public key portion needs to be added to the
Seed is in the format
Primary DNS Server
Clone the git repository for
To run the DNS server, we need to specify the host SSH key fingerprint (determined above), the seed of the private key to authenticate against the Git server, the IP address to listen on, and finally, the location of the remote repository. The default branch is
Verify that your name server is operating with
Docker
We need to install Docker to perform the Caddy build.
Add your user to the
Caddy
The default Docker image for Caddy does not include support for RFC2136; therefore, we need a custom Docker build. Create a
Run the build with
Test Usage
To test the automatic creation of certificates, we will implement a typical service with Caddy as the reverse proxy. Create a
And create a
Finally, run
Thanks
Reference: https://hannes.nqsb.io/Posts/DnsServer |
Cool great to hear your success @mtelvers. I've been re-reading http://infra.ocaml.org/opam-ocaml-org and am wondering whether a next step would be to use (as done by mirage.io and other domains) DNS for storing certificates and certificate signing requests. The advantages would be:
Instead, the process would be:
If there's demand, I can provide shell scripts for (a) uploading CSR (based on Please let me know what you think. |
In summary, we would like to provision SSL certificates for a reverse HTTPS proxy, specifically when we have round-robin DNS. Any proxy server is suitable; Caddy would be my preferred choice as it provides automatic certificate provisioning and renewal. The current setup is that DNS is managed manually by @avsm via Gandi's web GUI, thus restricting us to using HTTP-01 challenges. We have a working solution to this problem using NGINX. This solution redirects missed HTTP-01 challenges to the alternate server.
Disadvantages of the current solution
Looking at the disadvantages identified with this solution:
DNS-01 Challenges
A typical solution to this issue is to use DNS-01 challenges instead of HTTP-01. This solution requires the DNS records needed by Let’s Encrypt to be provisioned by the client using a shared secret token. We would require approval from Xavier via @avsm to allow automatic updates to the ocaml.org domain. With that approval, we then have multiple approaches available to us on how to implement the solution.
Gandi API
We could use the Gandi API for DNS-01 challenges. This solution would not require scripts or cron jobs and could be handled entirely with Caddy. See https://github.com/caddy-dns/gandi
MirageOS DNS with nsupdate
We could use MirageOS to provide a DNS server which could be updated using HMAC keys to publish the necessary records using nsupdate. Once the certificates are provisioned and downloaded, NGINX can be signalled to read the new certificates.
MirageOS DNS with Caddy
We could complete the deployment with a MirageOS DNS server and use RFC2136 integrated into Caddy: https://github.com/caddy-dns/rfc2136. This was implemented last week on ci.mirage.io and deploy.mirage.io. The implementation was very straightforward.
Summary
As we have a solution, is a change needed? Are Xavier/Anil happy to allow automated updates to ocaml.org? With automated updates, should we use Gandi API or RFC2136? Is further testing required, in which case, what is needed? It is good public relations, PR, to use an OCaml/MirageOS DNS server for ocaml.org. However, the counterargument is that we are not a DNS provider, and therefore running a DNS service is a distraction from our core purpose. |
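For readers unfamiliar with what a DNS-01 client has to publish: per the ACME specification (RFC 8555), the challenge is answered with a TXT record at `_acme-challenge.<domain>` whose value is the base64url-encoded SHA-256 digest of the key authorization (the challenge token, a dot, and the account key thumbprint). The sketch below illustrates that computation, assuming OpenSSL is available; the function names are hypothetical, and in practice Caddy or another ACME client does this internally.

```shell
#!/bin/sh
# Hypothetical helper: compute the TXT record value for a DNS-01 challenge.
#   $1: challenge token handed out by the ACME server
#   $2: base64url-encoded JWK thumbprint of the ACME account key
dns01_txt_value () {
    key_authorization="$1.$2"
    # SHA-256 digest, base64 encoded, then converted to the URL-safe
    # alphabet with the trailing '=' padding and the newline removed.
    printf '%s' "$key_authorization" \
        | openssl dgst -sha256 -binary \
        | openssl base64 \
        | tr '+/' '-_' \
        | tr -d '=\n'
}

# Hypothetical helper: the owner name the TXT record is published under.
dns01_record_name () {
    printf '_acme-challenge.%s' "$1"
}
```

Whichever approach is chosen (Gandi API or RFC2136), the credential given to the client only needs to permit creating and deleting this one TXT record.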
Thanks for the extensive explanation @mtelvers. So we could have taken a shortcut back on January 16th if you had replied with "no, there's no interest in running our own DNS services". |
I haven't even had a chance to digest all this excellent analysis yet, and the issue got closed?! I certainly didn't interpret Mark's analysis above as "no there's no interest", Hannes. However, it'll take some time to digest all the implications rather than running quickly into any such switch for a domain as large and as important to the ecosystem as ocaml.org. |
Excellent summary thanks @mtelvers. Additionally if we run our own DNS services, we need a plan for providing support for the MirageOS based stack. We (Tarides people working on this) have limited time to cover the work already on our roadmap (#26, #25, docs-ci, and supporting ocaml.org website development). |
Taking a step back, we can see that there's been fruitful discussion on how we could start dogfooding OCaml-DNS. It is clear that everyone's intent here is to ensure we do it right. I must second @avsm that this process must be deliberate. We're all very grateful for your patience, @hannesm, as this will take time, and your continued collaboration and contributions are very much appreciated by all. |
A point in favour of switching authoritative DNS server to one we control (while debugging #42) is that Gandi offers secondary DNS hosting, so we would be protected against our primary DNS temporarily going down. |
Dear Madam or Sir,
with huge interest I read through some of the issues in this repository. Thanks for being open and transparent about what you would like to achieve.
Every other issue when it comes to migrations, I see that there are issues related to let's encrypt certificates and migration of services. The underlying reason, as far as I can tell, stems from the methodology of retrieving let's encrypt certificates: run a "certbot" locally, which requires (a) a web server on port 80, and (b) some ad-hoc configuration to serve static files, and (c) DNS changes being propagated for the desired hostname(s). This means, only once the actual service is deployed to live it can retrieve its certificate. This also makes moving services hard (without downtime).
Over the years, I worked on (fully open source, fully developed in OCaml as MirageOS unikernels) automation to push the whole let's encrypt interaction into DNS (a secondary server on steroids), and thus decoupling the actual service deployment from the certificate provisioning.
The idea is pretty simple: both certificates and signing requests are public data anyways (they're stored in the certificate transparency log, ...). DNS is a fault-tolerant key-value store. Each CSR and certificate is embedded as TLSA (https://www.rfc-editor.org/rfc/rfc6698.html) record (in DER encoding, i.e. no base64/pem, just the bare minimal stuff). Thanks to DNS TSIG we also have authentication (so not everyone may upload CSR) ;).
The mechanism is as follows: the primary DNS sends out DNS NOTIFY whenever the zone changes. The dns-letsencrypt-secondary observes zone(s), and whenever a fresh CSR is detected (or a soon expiring certificate, or a CSR without matching certificate (i.e. key rollover)), the let's encrypt DNS challenge is used to provision a new certificate.
The services behind just download (dig tlsa _letsencrypt._tcp.robur.coop) the certificate, and only need to have their private key distributed.
The operator can use nsupdate to upload a new certificate signing request.
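The upload step can be sketched as follows. The record name `_letsencrypt._tcp.<hostname>` is the one used for downloads in this thread, and `server`, `zone`, `update add` and `send` are standard nsupdate commands. The function names, the TTL of 3600, and the TLSA parameter fields passed in are assumptions for illustration; the exact values the server expects for a CSR are not stated here.

```shell
#!/bin/sh
# Hypothetical helper: hex-encode stdin (upper case, no separators),
# which is how TLSA record data is written in zone file syntax.
to_hex () {
    od -An -v -tx1 | tr -d ' \n' | tr 'a-f' 'A-F'
}

# Hypothetical helper: print nsupdate commands that add a TLSA record.
#   $1: IP address of the primary name server
#   $2: zone name
#   $3: host name the certificate is for
#   $4: the three TLSA parameter fields (usage, selector, matching type);
#       which values mark a CSR is an assumption left to the caller
#   $5: hex-encoded DER data
tlsa_update_commands () {
    printf 'server %s\n' "$1"
    printf 'zone %s\n' "$2"
    printf 'update add _letsencrypt._tcp.%s 3600 IN TLSA %s %s\n' "$3" "$4" "$5"
    printf 'send\n'
}
```

The hex data could come from `openssl req -in host.csr -outform der | to_hex`, and the printed commands could be piped into `nsupdate -y hmac-sha256:<keyname>:<secret>`, using the key name and secret enrolled with the primary server.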
If you're interested in using such a system (and run your own DNS servers - of course you can keep gandi's as advertised ones / public ones), don't hesitate to reach out. I'm happy to help figuring out how to work in that area. :)