Add a clustering example with Docker Swarm #2589
Great job 👍 👍 👍
Could you change the base branch to v1.5?
For listening on different ports, we need to create an entrypoint for each. The CLI syntaxe is `--entrypoints=Name:a_name Address:an_ip_or_empty:a_port options`.
If you want to redirect traffic from one entrypoint to another, it's the option Redirect.EntryPoint:entrypoint_name`.
missing `
For more information about challenge: [Automatic Certificate Management Environment (ACME)](https://github.com/ietf-wg-acme/acme/blob/master/draft-ietf-acme-acme.md#tls-with-server-name-indication-tls-sni)

## Prerequisites
You will need a working Docker Swarm cluster.
Could you add one empty line between each title and its content?
93ffbe4 to 1d536aa
@ldez Done!
Hello @jmaitrehenry.
Many thanks for this really useful PR! 👍
This kind of example will surely help a lot of users!!!
I have a few remarks, in particular on the part about ACME + KV store.
A PR fixes the behavior and is enabled in the v1.5-rc3 version.
Why we need Traefik in cluster mode? Running multiple instances should work out of the box?

If you don't use Let's Encrypt with Traefik, you may not need Traefik cluster/HA. But, if you use Let's Encrypt, you need to store certificates somewhere shared by all the Traefik instances.
I don't agree with "If you don't use Let's Encrypt with Traefik, you may not need Traefik cluster/HA."
IMHO, you can use cluster mode to share configuration and TLS certificates, not only ACME certificates.
WDYT? Can you change this sentence?
Can you split it into two lines, please? One line = one sentence.
What Traefik should do:
- Listen to 80 and 443
- Redirect HTTP traffic to HTTPs
HTTPS
```
--acme.email=contact@mydomain.ca
```

Let's Encrypt need 3 parameters: en entrypoint to listen on, a storage for certificates, and en email for the registration.
needs, s/en entrypoint to listen on/an entryPoint to listen to/
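For reference, a minimal sketch of the entrypoint definitions this part of the guide depends on, written as a docker-compose `command` list. The `Name:` / `Address:` / `Redirect.EntryPoint:` syntax is the one quoted earlier in this review. The entrypoint names `http` and `https`, the `TLS` option, the `--defaultentrypoints` flag, the service name and the `traefik:1.5` tag do not appear in the quoted hunks and are assumptions for illustration.

```yaml
# Sketch only: service name, image tag and entrypoint names are illustrative.
version: "3.4"
services:
  traefik:
    image: traefik:1.5
    command:
      # One entrypoint per port; plain HTTP is redirected to the HTTPS entrypoint.
      - "--entrypoints=Name:http Address::80 Redirect.EntryPoint:https"
      - "--entrypoints=Name:https Address::443 TLS"
      # Assumed: route on both entrypoints by default.
      - "--defaultentrypoints=http,https"
    ports:
      - "80:80"
      - "443:443"
```

This matches the two goals listed in the guide: listen on 80 and 443, and redirect HTTP traffic to HTTPS.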
Let's Encrypt need 3 parameters: en entrypoint to listen on, a storage for certificates, and en email for the registration.

For activing Let's Encrypt support, you need to add `--acme` flag.
s/For activing/To enable
For activing Let's Encrypt support, you need to add `--acme` flag.

Now, Traefik need to know where to store the certificates, we can choose between a key in a Key-Value store, or a file path: `--acme.storage=my/key` or `--acme.storage=/path/to/acme.json`.
needs
```
[...]
```

If you have some update to do, update the initializer service and re-deploy it. The new configuration will be store on Consul, and you need to restart the Traefik node: `docker service update --force traefik_traefik`.
s/will be store on Consul/will be stored in Consul
s/Traefik/Træfik
Can you split it into two lines, please? One line = one sentence.
If you have some update to do, update the initializer service and re-deploy it. The new configuration will be store on Consul, and you need to restart the Traefik node: `docker service update --force traefik_traefik`.

## Complete Docker compose file
WDYT of Full docker-compose file?
version: "3.4"
services:
  traefik_init:
    image: traefik:1.4@sha256:9c299d9613cb01564c8219f4bc56ecc55f30d8f06d35cf3ecf83a85426c13225
Can you use træfik 1.5-rc3 please?
    depends_on:
      - consul
  traefik:
    image: traefik:1.4@sha256:9c299d9613cb01564c8219f4bc56ecc55f30d8f06d35cf3ecf83a85426c13225
Can you use træfik 1.5-rc3 please?
      - "--consul"
      - "--consul.endpoint=consul:8500"
      - "--consul.prefix=traefik"
      - "--acme.storage=traefik/acme/account"
Can you delete this argument?
Hello @nmengin, thanks for your feedback. Otherwise, we will need to update the documentation once version 1.5 is out.
1d536aa to 114fe43
Yes, you're right, 1.5 is better than the full version name 👍
114fe43 to f631e4e
I made the requested changes. One of the changes is:
-If you don't use Let's Encrypt with Traefik, you may not need Traefik cluster/HA.
-But, if you use Let's Encrypt, you need to store certificates somewhere shared by all the Traefik instances.
+If you want to use Let's Encrypt with Traefik, sharing configuration or TLS certificates, you need Traefik cluster/HA.
What do you think?
Can you add
f631e4e to a7920a2
@nmandery done!
Hello @jmaitrehenry and happy new year ;)
The PR SGTM, but I still have a few small comments.
This guide explains how to use Træfik in high availability mode in a Docker Swarm and with Let's Encrypt.

Why we need Traefik in cluster mode? Running multiple instances should work out of the box?
s/Why we need/Why do we need/
"Running multiple instances should work out of the box?"
I'm not sure I understand this sentence. Do you mean "How should a cluster work out of the box?"
Running multiple instances should work out of the box?
It's because you can run multiple instances of Traefik and it works, but you miss some things, like sharing the challenge for LE, and more. That's what I want to explain in this guide: you can't just start Traefik instances and hope it will magically work with LE.
```
--acme.email=contact@mydomain.ca
```

Let's Encrypt needs 3 parameters: an entryPoint to listen to, a storage for certificates, and en email for the registration.
s/en email/an email
Now, Traefik needs to know where to store the certificates, we can choose between a key in a Key-Value store, or a file path: `--acme.storage=my/key` or `--acme.storage=/path/to/acme.json`.

For your email and the entrypoints, it's `--acme.entryPoint` and `--acme.email` flags.
s/entrypoints/entryPoint/
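Putting the ACME flags from these hunks together, a minimal sketch as a docker-compose `command` list could look like this. The four flags and the email and storage values are the ones quoted in this review. The entrypoint name `https`, the service name and the image tag are assumptions, and the `https` entrypoint is assumed to be defined elsewhere.

```yaml
# Sketch only: placeholder values taken from the hunks under review.
services:
  traefik:
    image: traefik:1.5
    command:
      # Enable Let's Encrypt support.
      - "--acme"
      # Where certificates are stored: a key in the KV store here.
      # A file path such as /path/to/acme.json is the other option.
      - "--acme.storage=traefik/acme/account"
      # Entrypoint used by ACME (the name "https" is an assumption).
      - "--acme.entryPoint=https"
      # Email used for the Let's Encrypt registration.
      - "--acme.email=contact@mydomain.ca"
```

With several Traefik instances, the KV-store key is the variant that lets all of them share the same certificates.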
```
--docker.domain=mydomain.ca \
--docker.watch
```
To enable docker and support, you need to add `--docker` and `--docker.swarmmode` flags.
I guess the word swarm-mode is missing?
```
--docker.watch
```
To enable docker and support, you need to add `--docker` and `--docker.swarmmode` flags.
To enable watch docker changes, add `--docker.watch`.
WDYT about simplifying the sentence: "To watch docker changes"?
WDYT of: To watch docker events ?
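As a rough illustration of the Docker provider flags discussed in this thread, here is a sketch in docker-compose form. The four `--docker*` flags and the `mydomain.ca` value come from the hunks above. The socket mount and the manager placement constraint are not part of the quoted text; they are assumptions about a typical Swarm setup, where Traefik reads services from the Docker API on a manager node.

```yaml
# Sketch only: volume and placement settings are assumptions, not quoted from the PR.
services:
  traefik:
    image: traefik:1.5
    command:
      # Enable the Docker provider, in swarm mode.
      - "--docker"
      - "--docker.swarmmode"
      # Default domain used for frontend rules.
      - "--docker.domain=mydomain.ca"
      # Watch Docker events and update the configuration on changes.
      - "--docker.watch"
    volumes:
      # Assumed: Traefik talks to the local Docker socket.
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      placement:
        constraints:
          # Assumed: swarm service information is only available on managers.
          - node.role == manager
```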
@nmengin Happy new year 🎉! I made the requested changes, but for the last one I added a comment:
WDYT of: To watch docker events?
@wryfi you're right, it's really focused on Docker Swarm. It's in the name of the menu, and the title of the guide shows it clearly.
Could we make another guide for another use case? Sure we can, but I have never built it without Docker, and I prefer to write about something I have built and tested :) .
I don't talk about the virtual IP, how Traefik is exposed, or why I expose the ports in host mode, and I could complete the guide with that. I wrote it based on my blog post, where I give more detail about the architecture. You can read it here: https://jmaitrehenry.ca/2017/12/15/using-traefik-with-docker-swarm-and-consul-as-your-load-balancer/
But maybe we can merge this one first and complete it afterwards. What do you think, @nmengin @mmatur @ldez?
@jmaitrehenry sure, no arguments from me. I put my comments here because the title of the issue is "DOCS - Add a clustering example." So maybe if this ticket is about a specific docker example, the title could be updated to reflect reality. ;)
@wryfi Right, I just updated the title :)
@wryfi I understand why you would think that the docker method would be different. However, there are more similarities than you might think:
The issues you are raising concern the configuration of the network upstream of the binary (aka the IP/etc that traefik listens on). Whether you choose containerization (and use docker/kubernetes), virtualization (and use VMs), or baremetal (and use VRRP [keepalived/pacemaker etc]), the upstream networking is up to you or your devops/sysadmin. Being that Traefik is at its core
To answer your question in regards to pacemaker: sure, you could float an IP through that. You could also round robin between nodes, as traefik clusters in an active-active state. You could also use VRRP and use keepalived to float IPs from node to node. All will work, but again, we will probably not come out with any "recommended" guide for baremetal installations, due to the specificity of baremetal installations and networking requirements that are abstracted when using containers.
Finally, I don't feel that there is any reason that Traefik cannot or should not be marketed as having built-in HA. We understand that currently, to implement the full featureset (including letsencrypt), you will require a KV store to handle the transactional creation and updating of letsencrypt certificates, but for those people that want HA, running a KV store should not be out of the realm of possibility.
I would like to give a huge 🎉 to @jmaitrehenry for writing this, as it has been a much requested document.
LGTM
@dtomcej I think you take my comments a bit too critically. I just came here trying to understand how your software works. IMHO, more precise use of the terms "highly available" and "clustered" would be helpful for new users trying to understand what your software does (and does not). Those terms are not interchangeable. The concept of HA encompasses the entire service architecture, including the networking layer. Clustering or shared configuration is a requirement for an HA architecture, but does not itself provide high availability.
I did not come here to disparage your project, express any opinion about Docker, or tell @jmaitrehenry to change his documentation. I posted here to ask what features your code supports, because it is not clear in the existing docs. That github does not provide a better forum for this type of question is another (unfortunate) issue. Thanks!
@wryfi I apologize for the trite response, we get a fair bit of naysaying through these tickets sometimes, and I might have misinterpreted your intentions for your comments. You would not believe how many times we get told "traefik doesn't work for my use case X, therefore you can't say it has feature Y". You are correct on all accounts about the difference between HA and clustering. Hopefully having better documentation (such as the current ticket) will allow us to move forward to better implementations, and use cases for further development. Sorry again for the argumentative tone of my previous response. Please hit us up on slack if you would like to further discuss your use case (I have run Traefik in production without docker as well). Thanks!
@dtomcej thanks for your reply, and no worries. I understand the trolling that projects get. Assuming my preliminary tests are positive, you will likely see more of me (maybe even some contributions). Cheers! :-)
LGTM
36c50da to 434df61
Hi there, thanks for the docs on the integration Swarm/Consul/Traefik. I have set up Traefik on Docker Swarm in HA mode with Consul as KV-Store as described. It works fine when starting the setup for the first time.
However, when I need to reboot the host that runs the consul container, e.g. on critical system updates, consul is not able to find a leader after reboot, remains in a loop and does not recover. In that case Traefik is not able to fetch the ACME certificates anymore and all my https-clients are unreachable. I can reset everything by deleting all my consul data and rebuilding the stack, but then the previously stored traefik configuration-key in consul, including the certificates, is of course empty, and traefik restarts requesting all the Let's Encrypt certificates one by one once again.
So my question is: with the documented setup, how do I properly restart a consul container on docker swarm so that consul actually gets back into action when restarted, that is, selects itself as a leader and serves the existing kv-store? Greets -act
@actraiser You're right, the problem you have is that consul is not HA in this example. You need to have a 3-node cluster and restart one consul at a time. I had this problem myself in a non-HA cluster (1 consul node).
@actraiser check this compose file for consul: https://gist.github.com/jmaitrehenry/40d8272f622a45ecca53cefa16362fb5 It creates a 3-node cluster, but each Consul instance is restricted to a single node with a local volume. If you already have a distributed volume driver like rexray, ceph, nfs, or something else, you can change the volume definition and the placement constraint.
@jmaitrehenry Thank you, but I switched to etcd, where the cluster deployment process into Docker Swarm went much more smoothly out of the box than with Consul.
Can this be updated to 1.7?
Why do we still need this traefik_init? We don't want that. Why can't it simply work with the K/V store directly, instead of starting Traefik with a JSON file and a Traefik "init" that pushes the JSON file into the KV? It's so complicated. Is there a way to simplify that, binding the K/V store directly to Traefik without that init voodoo hack? Couldn't Traefik init itself when booting with KV options enabled?
I wouldn't mind the hack, provided it works in swarm mode. To date I haven't gotten Traefik working in HA mode in Swarm.
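For readers trying to picture the `traefik_init` pattern being questioned here, a minimal sketch follows. The service names, `depends_on`, and the three `--consul*` flags are quoted from the compose file under review. The `storeconfig` subcommand is not named in this thread; it is, as far as I know, the Traefik 1.x subcommand that writes the configuration into the KV store, so treat it and the restart policy, the `traefik:1.5` tag and the Consul `command` line as assumptions.

```yaml
# Sketch only: storeconfig, restart_policy, image tags and the Consul command
# are assumptions, not quoted from this PR.
version: "3.4"
services:
  traefik_init:
    image: traefik:1.5
    command:
      # Push the static configuration into the KV store, then exit.
      - "storeconfig"
      - "--consul"
      - "--consul.endpoint=consul:8500"
      - "--consul.prefix=traefik"
    deploy:
      restart_policy:
        # Retry until Consul is reachable, then stay stopped.
        condition: on-failure
    depends_on:
      - consul
  traefik:
    image: traefik:1.5
    command:
      # Read the configuration that traefik_init stored.
      - "--consul"
      - "--consul.endpoint=consul:8500"
      - "--consul.prefix=traefik"
    depends_on:
      - consul
  consul:
    image: consul
    # Single server: not HA, so it can fail to elect a leader after a reboot.
    command: agent -server -bootstrap-expect=1
```

After changing the initializer's flags and re-deploying it, the Traefik service still has to be restarted, as the guide says: `docker service update --force traefik_traefik`.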
What does this PR do?
Add a clustering example in the user-guide documentation
Motivation
The first time I looked into what cluster mode does, I couldn't find the information.
I think it's a really cool feature, and adding more information about it is important.
More
Additional Notes
It's based on my blog post: https://jmaitrehenry.ca/2017/12/15/using-traefik-with-docker-swarm-and-consul-as-your-load-balancer/ but I changed the format and rewrote a good part of it for the documentation.
Initially, I wanted to improve the Clustering/HA user guide, but as this example uses Docker Swarm and Consul with a docker-compose file specific to Docker Swarm, I think it's better to have a page just for it.
Fixes #1200, #736