TraefikEE's architecture consists of nodes spread across two planes: the control plane and the data plane. Splitting a system into a control plane and a data plane is a well-known pattern and is not specific to Traefik.
As explained in the introduction:
- The data plane hosts horizontally scalable nodes that forward ingress traffic to your services.
- The control plane hosts distributed nodes that watch your platform and its services to configure the data plane.
This distributed architecture is the cornerstone of TraefikEE’s strengths: natively highly available, scalable, and secure.
Of course, every (non-deprecated) feature that is available for Traefik is also available for TraefikEE.
Nodes & Cluster
TraefikEE consists of a cluster of nodes communicating with one another. Each node is independent and belongs to either the control plane or the data plane.
Nodes in the Data Plane
These nodes are the workers that route the incoming requests. You may view these nodes as regular Traefik instances that are automatically configured by other nodes from the control plane.
These nodes know nothing about your other infrastructure components and won't need to communicate with them.
Nodes in the Control Plane
These nodes are responsible for querying your infrastructure components, creating the resulting configuration, and sending the routing instructions to the nodes in the data plane. These nodes will not handle incoming traffic.
Their main responsibilities are:
- Storing the data of the cluster, including events, certificates, and Traefik configuration.
- Sharing the data of the cluster as a distributed key-value store with all other nodes in the control plane.
- Propagating the latest configuration to the nodes in the data plane.
- Providing an API to configure the cluster and to query its state.
- Managing the dynamic configuration.
TraefikEE's nodes work together to form a high-availability cluster.
HA in the Data Plane
Each node in the data plane can handle the routing for every configured request that arrives in your cluster. If a node becomes unavailable, the other nodes will take over and accept more incoming requests.
The data plane only needs one healthy node to be functional.
HA in the Control Plane
Nodes in the control plane are responsible for the routing configuration and applying changes to the state of the cluster. Nodes in the control plane can have three different roles:
- Leader: responsible for updating/querying the state of the cluster
- Controller: responsible for querying your infrastructure components
- Agent: accepts requests on behalf of the leader
When a leader node becomes unavailable, the other nodes elect a new leader among the healthy nodes.
When a controller node becomes unavailable, the leader gives a new node the responsibility of being the new controller.
When an agent node becomes unavailable, the cluster state doesn't change and it doesn't need to react. If a new healthy node wants to join the cluster, it will participate as an agent.
A node in the control plane is either the leader or an agent, never both at the same time. Any node can be the controller. If the controller node is an agent, it sends the configuration change requests to the leader, which then updates the cluster state. The cluster can have many agents, but only one controller and one leader.
TraefikEE uses the Raft protocol to elect its leader.
If a node temporarily leaves the cluster, it will have to update its state (this is done automatically) before being considered healthy again.
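The role rules above can be summarized in a toy model. This is only a sketch: the `ControlPlane` class and its method names are hypothetical, the "election" simply picks the first healthy node by name instead of running Raft, and quorum checks are left out.

```python
from dataclasses import dataclass


@dataclass
class ControlPlane:
    """Toy model of the control plane roles (not TraefikEE code)."""

    healthy: set[str]
    leader: str
    controller: str

    def agents(self) -> set[str]:
        # Every healthy node that is not the leader acts as an agent.
        return self.healthy - {self.leader}

    def join(self, node: str) -> None:
        # A new healthy node participates as an agent; no role changes.
        self.healthy.add(node)

    def fail(self, node: str) -> None:
        self.healthy.discard(node)
        if not self.healthy:
            raise RuntimeError("no healthy control plane node left")
        if node == self.leader:
            # Stand-in for a Raft election among the healthy nodes.
            self.leader = min(self.healthy)
        if node == self.controller:
            # The (possibly new) leader hands the controller role to a healthy node.
            self.controller = min(self.healthy)
        # Losing a plain agent requires no reaction.


cp = ControlPlane(healthy={"n1", "n2", "n3", "n4", "n5"}, leader="n1", controller="n2")
cp.fail("n5")  # an agent is lost: roles stay the same
cp.fail("n1")  # the leader is lost: a new leader is chosen
print(cp.leader, cp.controller, sorted(cp.agents()))
```

Note that the controller role is tracked separately from the leader/agent split, which is why the same node can end up holding both the leader and the controller role.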
How Many Nodes Do You Need in the Control Plane?
The cluster is healthy as long as the nodes in the control plane can reach a quorum to elect a new leader in case of failure.
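The quorum is (N/2)+1, rounded down, where N is the initial number of nodes in the control plane. In other words, a strict majority of nodes must stay healthy, so the cluster tolerates the loss of up to (N-1)/2 nodes (rounded down).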
If the number of healthy nodes goes below this number, the cluster must recover before it resumes normal operations.
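As a quick sizing aid, the arithmetic can be sketched as follows, assuming the usual Raft majority rule in which a quorum is a strict majority of the initial node count. The function names are illustrative only.

```python
def quorum(initial_nodes: int) -> int:
    """Smallest strict majority of the initial control plane size."""
    if initial_nodes < 1:
        raise ValueError("a control plane needs at least one node")
    return initial_nodes // 2 + 1


def tolerated_failures(initial_nodes: int) -> int:
    """Number of nodes that can be lost while a majority remains."""
    return initial_nodes - quorum(initial_nodes)


for n in (1, 3, 4, 5, 7):
    print(f"N={n}: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

Under this rule an even node count tolerates no more failures than the odd count just below it (4 nodes tolerate one failure, the same as 3), which is why odd control plane sizes are the common choice.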
What Happens if the Control Plane is Unhealthy?
The data plane will continue to route the requests based on the latest known configuration.
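The idea of "latest known configuration" can be illustrated with a minimal sketch. The `DataPlaneNode` class, its `sync` method, and the host and service names are hypothetical and only model the behavior described here; they do not reflect TraefikEE internals.

```python
from typing import Optional


class DataPlaneNode:
    """Hypothetical stand-in for a data plane node (not TraefikEE code)."""

    def __init__(self) -> None:
        # Maps a request host to a backend service name.
        self.routes: dict[str, str] = {}

    def sync(self, pushed: Optional[dict[str, str]]) -> None:
        # `pushed` is None when the control plane could not deliver an update.
        if pushed is not None:
            self.routes = dict(pushed)
        # Otherwise the last known configuration is kept untouched.

    def route(self, host: str) -> Optional[str]:
        return self.routes.get(host)


node = DataPlaneNode()
node.sync({"app.example.com": "app-service"})
node.sync(None)  # control plane unhealthy: nothing new arrives
print(node.route("app.example.com"))  # prints app-service
```

The trade-off is that routing stays available, but changes made to your services while the control plane is unhealthy are not reflected until it recovers.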
Each node in the data plane can handle every configured request to your cluster. Load balancing the incoming requests between TraefikEE's data plane nodes is achieved by your infrastructure components (e.g., using Services in Kubernetes).
Note about Auto Scaling
Depending on your infrastructure components, you may benefit from auto-scaling tools that automatically scale your nodes up or down as needed. For instance, on Kubernetes you can set up a Horizontal Pod Autoscaler.
TraefikEE makes sure your data is safe. Nodes communicate over gRPC and encrypt their traffic using two-way SSL/TLS authentication (each side verifies the other's certificate).
On top of that, having two separate planes with different responsibilities improves not only the resilience of your cluster but also its security: nodes in the control plane are not exposed to the outside, which makes an attack against your infrastructure more difficult.