---
layout: "docs"
page_title: "Consul Architecture"
sidebar_current: "docs-internals-architecture"
description: |-
  Consul is a complex system that has many different moving parts. To help users and developers of Consul form a mental model of how it works, this page documents the system architecture.
---

# Consul Architecture

Consul is a complex system that has many different moving parts. To help
users and developers of Consul form a mental model of how it works, this
page documents the system architecture.

-> Before describing the architecture, we recommend reading the
[glossary](/docs/glossary.html) of terms to help
clarify what is being discussed.

The architecture concepts in this document can be used with the [Reference Architecture guide](https://learn.hashicorp.com/consul/datacenter-deploy/reference-architecture?utm_source=consul.io&utm_medium=docs) when deploying Consul in production.

## 10,000 foot view

From a 10,000 foot altitude the architecture of Consul looks like this:

<div class="center">
[![Consul Architecture](/assets/images/consul-arch.png)](/assets/images/consul-arch.png)
</div>

Let's break down this image and describe each piece. First of all, we can see
that there are two datacenters, labeled "one" and "two". Consul has
first-class support for [multiple datacenters](https://learn.hashicorp.com/consul/security-networking/datacenters) and
expects this to be the common case.

Within each datacenter, we have a mixture of clients and servers. We expect
between three and five servers per datacenter. This strikes a balance between
availability in the case of failure and performance, as consensus gets progressively
slower as more machines are added. However, there is no limit to the number of clients,
and they can easily scale into the thousands or tens of thousands.

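The availability side of this trade-off can be illustrated with quorum arithmetic. The following is a minimal sketch, assuming the usual Raft majority rule (a quorum is a strict majority of the servers). The function names `quorum_size` and `failure_tolerance` are hypothetical and are not part of Consul. Note that moving from three to four servers raises the quorum size without tolerating any additional failures.

```python
def quorum_size(servers: int) -> int:
    """Smallest strict majority of a peer set with `servers` members."""
    if servers < 1:
        raise ValueError("a peer set needs at least one server")
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """Number of servers that can fail while a quorum still remains."""
    return servers - quorum_size(servers)


if __name__ == "__main__":
    # Print the quorum size and failure tolerance for small peer sets.
    for n in range(1, 8):
        print(f"servers={n} quorum={quorum_size(n)} tolerates={failure_tolerance(n)}")
```
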
All the agents in a datacenter participate in a [gossip protocol](/docs/internals/gossip.html).
This means there is a gossip pool that contains all the agents for a given datacenter. This serves
a few purposes. First, there is no need to configure clients with the addresses of servers;
discovery is done automatically. Second, the work of detecting agent failures
is not placed on the servers but is distributed. This makes failure detection much more
scalable than naive heartbeating schemes. It also provides failure detection for the nodes;
if the agent is not reachable, then the node may have experienced a failure. Third, the gossip
pool is used as a messaging layer to deliver notifications when important events such as
leader election take place.

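To give a feel for why gossip works well as a messaging layer for large pools, here is a toy simulation of epidemic-style dissemination. It is only a sketch under stated assumptions and is not the protocol Consul uses: the function `rounds_to_spread`, the `fanout` parameter, and the round-based model are all hypothetical simplifications. Each round, every agent that already knows about an event tells a few randomly chosen peers, so the number of informed agents grows quickly and the round count stays small even for a pool of thousands.

```python
import random


def rounds_to_spread(agents: int, fanout: int = 3, seed: int = 0) -> int:
    """Count gossip rounds until every agent has heard about one event.

    Toy model: each round, every informed agent tells `fanout` peers chosen
    uniformly at random (a peer may be chosen twice or may already know).
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    informed = {0}  # agent 0 originates the event
    rounds = 0
    while len(informed) < agents:
        newly_informed = set()
        for _ in informed:
            newly_informed.update(rng.randrange(agents) for _ in range(fanout))
        informed |= newly_informed
        rounds += 1
    return rounds


if __name__ == "__main__":
    for size in (10, 100, 1000, 10000):
        print(f"agents={size} rounds={rounds_to_spread(size)}")
```
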
The servers in each datacenter are all part of a single Raft peer set. This means that
they work together to elect a single leader, a selected server that has extra duties. The leader
is responsible for processing all queries and transactions. Transactions must also be replicated to
all peers as part of the [consensus protocol](/docs/internals/consensus.html). Because of this
requirement, when a non-leader server receives an RPC request, it forwards it to the cluster leader.

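The forwarding behavior described above can be sketched with a small in-memory model. This is an illustration only: the `Server` class, its `handle_rpc` method, and the `make_peer_set` helper are hypothetical names, and leader election and replication are not modeled (the leader is simply assigned).

```python
class Server:
    """Toy model of a server in a single Raft peer set."""

    def __init__(self, name, is_leader=False):
        self.name = name
        self.is_leader = is_leader
        self.peers = []  # the other servers in the same peer set

    def leader(self):
        """Return the server currently marked as leader."""
        for server in [self, *self.peers]:
            if server.is_leader:
                return server
        raise RuntimeError("no leader elected")

    def handle_rpc(self, request):
        """Process the request on the leader, forwarding it if necessary."""
        if not self.is_leader:
            return self.leader().handle_rpc(request)
        return f"{self.name} processed {request}"


def make_peer_set(names, leader_name):
    """Build fully connected servers with one of them assigned as leader."""
    servers = [Server(n, is_leader=(n == leader_name)) for n in names]
    for s in servers:
        s.peers = [p for p in servers if p is not s]
    return servers


if __name__ == "__main__":
    s1, s2, s3 = make_peer_set(["s1", "s2", "s3"], leader_name="s2")
    print(s1.handle_rpc("kv put"))  # forwarded from s1 to the leader s2
```
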
The server agents also operate as part of a WAN gossip pool. This pool is different from the LAN pool
as it is optimized for the higher latency of the internet and is expected to contain only
other Consul server agents. The purpose of this pool is to allow datacenters to discover each
other in a low-touch manner. Bringing a new datacenter online is as easy as joining the existing
WAN gossip pool. Because the servers are all operating in this pool, it also enables cross-datacenter
requests. When a server receives a request for a different datacenter, it forwards the request to a random
server in the correct datacenter. That server may then forward the request to its local leader.

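The hops a cross-datacenter request takes can be sketched as follows. This is a minimal illustration, assuming the WAN pool can be represented as a mapping from datacenter name to server names and that each datacenter's leader is known. The function `route_request` and all server and datacenter names are hypothetical.

```python
import random


def route_request(target_dc, local_server, local_dc, wan_pool, leaders, rng=random):
    """Return the list of servers a request passes through, in order."""
    hops = [local_server]
    if target_dc != local_dc:
        if target_dc not in wan_pool:
            raise LookupError(f"unknown datacenter: {target_dc}")
        # Forward to a random server in the target datacenter.
        hops.append(rng.choice(wan_pool[target_dc]))
    # The receiving server forwards to its local leader unless it is the leader.
    leader = leaders[target_dc]
    if hops[-1] != leader:
        hops.append(leader)
    return hops


if __name__ == "__main__":
    wan_pool = {"dc1": ["a1", "a2", "a3"], "dc2": ["b1", "b2", "b3"]}
    leaders = {"dc1": "a1", "dc2": "b2"}
    print(route_request("dc2", "a3", "dc1", wan_pool, leaders))
```
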
This results in very low coupling between datacenters, but because of failure detection,
connection caching, and multiplexing, cross-datacenter requests are relatively fast and reliable.

In general, data is not replicated between different Consul datacenters. When a
request is made for a resource in another datacenter, the local Consul servers forward
an RPC request to the remote Consul servers for that resource and return the results.
If the remote datacenter is not available, then those resources will also not be
available, but that won't otherwise affect the local datacenter. There are some special
situations where a limited subset of data can be replicated, such as with Consul's built-in
[ACL replication](https://learn.hashicorp.com/consul/day-2-operations/acl-replication) capability, or
external tools like [consul-replicate](https://github.com/hashicorp/consul-replicate).

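From a client's point of view, asking for a resource in another datacenter is typically a matter of naming that datacenter in the request. The sketch below assumes the HTTP API's `dc` query parameter on the catalog endpoint and a local agent listening on the default address `http://127.0.0.1:8500`. The helper `catalog_service_url`, the service name `web`, and the datacenter name `dc2` are hypothetical. The code only builds the URL and makes no network call.

```python
from urllib.parse import quote, urlencode


def catalog_service_url(service, datacenter=None, agent="http://127.0.0.1:8500"):
    """Build a catalog lookup URL, optionally targeting another datacenter."""
    url = f"{agent}/v1/catalog/service/{quote(service, safe='')}"
    if datacenter is not None:
        # The local servers forward the request to the named datacenter.
        url += "?" + urlencode({"dc": datacenter})
    return url


if __name__ == "__main__":
    print(catalog_service_url("web"))          # answered by the local datacenter
    print(catalog_service_url("web", "dc2"))   # forwarded to datacenter "dc2"
```
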
In some places, client agents may cache data from the servers to make it
available locally for performance and reliability. Examples include Connect
certificates and intentions, which allow the client agent to make local decisions
about inbound connection requests without a round trip to the servers. Some API
endpoints also support optional result caching. This helps reliability because
the local agent can continue to respond to some queries, such as service discovery
or Connect authorization, from cache even if the connection to the servers is
disrupted or the servers are temporarily unavailable.

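The reliability benefit of caching can be sketched with a toy agent that falls back to its last known result when the servers cannot be reached. This is an assumption-laden illustration, not Consul's cache: the `CachingAgent` class and its `query` method are hypothetical, and for simplicity the sketch always tries the servers first, so it models only the fallback behavior and not the saved round trips.

```python
class CachingAgent:
    """Toy client agent that serves cached results when servers are down."""

    def __init__(self, fetch_from_servers):
        # Callable taking a key and returning a value; may raise ConnectionError.
        self._fetch = fetch_from_servers
        self._cache = {}

    def query(self, key):
        """Return (value, source), where source is "servers" or "cache"."""
        try:
            value = self._fetch(key)
        except ConnectionError:
            if key in self._cache:
                return self._cache[key], "cache"
            raise  # nothing cached for this key, so surface the failure
        self._cache[key] = value
        return value, "servers"


if __name__ == "__main__":
    servers_up = {"value": True}

    def fetch(key):
        if not servers_up["value"]:
            raise ConnectionError("servers unreachable")
        return ["10.0.0.1:80"]

    agent = CachingAgent(fetch)
    print(agent.query("web"))      # fetched from the servers and cached
    servers_up["value"] = False
    print(agent.query("web"))      # served from cache during the outage
```
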
## Getting in depth

At this point we've covered the high level architecture of Consul, but there are many
more details for each of the subsystems. The [consensus protocol](/docs/internals/consensus.html) is
documented in detail, as is the [gossip protocol](/docs/internals/gossip.html). The [documentation](/docs/internals/security.html)
for the security model and protocols used is also available.

For other details, either consult the code, ask in IRC, or reach out to the mailing list.