Load Balancing in gRPC
======================

# Scope

This document explains the design for load balancing within gRPC.

# Background

## Per-Call Load Balancing

Load balancing within gRPC happens on a per-call basis, not a
per-connection basis.  In other words, even if all requests come from a
single client, we still want them to be load-balanced across all servers.

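To make the distinction concrete, the following minimal sketch compares the two granularities for a single client. The backend addresses and the helper names `per_connection` and `per_call` are illustrative assumptions, not part of gRPC, and round robin merely stands in for whichever policy is in use.

```python
from collections import Counter
from itertools import cycle

# Hypothetical backend addresses, used only for illustration.
BACKENDS = ["10.0.0.1:50051", "10.0.0.2:50051", "10.0.0.3:50051"]


def per_connection(backends, num_calls):
    """Choose a backend once, at connection time; every call reuses it."""
    chosen = backends[0]
    return Counter(chosen for _ in range(num_calls))


def per_call(backends, num_calls):
    """Choose a backend separately for each call (round robin here)."""
    picker = cycle(backends)
    return Counter(next(picker) for _ in range(num_calls))


if __name__ == "__main__":
    print("per-connection:", dict(per_connection(BACKENDS, 9)))
    print("per-call:", dict(per_call(BACKENDS, 9)))
```

With nine calls from one client, the per-connection variant sends all nine to a single backend, while the per-call variant sends three to each backend.
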
## Approaches to Load Balancing

Before getting into gRPC specifics, we explore some common approaches to
load balancing.

### Proxy Model

Using a proxy provides a solid, trustworthy client that can report load to the
load balancing system. Proxies typically require more resources to operate
since they hold temporary copies of the RPC request and response. This model
also adds latency to the RPCs.

The proxy model was deemed inefficient for request-heavy services like
storage.

### Balancing-aware Client

This thicker client places more of the load balancing logic in the client. For
example, the client could contain many load balancing policies (Round Robin,
Random, etc.) used to select servers from a list. In this model, the list of
servers would be statically configured in the client, provided by the name
resolution system, supplied by an external load balancer, and so on. In any
case, the client is responsible for choosing the preferred server from the
list.

One of the drawbacks of this approach is the need to write and maintain the
load balancing policies in multiple languages and/or versions of the clients.
These policies can be fairly complicated. Some of the algorithms also require
client-to-server communication, so the client would need to get thicker to
support additional RPCs that fetch health or load information, in addition to
sending RPCs for user requests.

It would also significantly complicate the client's code. The new design
instead hides the load balancing complexity of multiple layers and presents it
to the client as a simple list of servers.

### External Load Balancing Service

The client load balancing code is kept simple and portable, implementing
well-known algorithms (e.g., Round Robin) for server selection.
Complex load balancing algorithms are instead provided by the load
balancer. The client relies on the load balancer to provide _load
balancing configuration_ and _the list of servers_ to which the client
should send requests. The balancer updates the server list as needed
to balance the load and to handle server unavailability or health
issues. The load balancer makes any necessary complex decisions and
informs the client. It may also communicate with the backend
servers to collect load and health information.

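The division of labour described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `SimpleClient` class and its `update_server_list` and `pick` methods are hypothetical names, not the gRPC API. The client only runs round robin over whatever list it was last given, and every decision about which servers belong in that list is left to the balancer.

```python
import threading


class SimpleClient:
    """Hypothetical thin client: round robin over a balancer-supplied list."""

    def __init__(self):
        self._lock = threading.Lock()
        self._servers = []
        self._next = 0

    def update_server_list(self, servers):
        """Called whenever the balancer sends a new server list."""
        with self._lock:
            self._servers = list(servers)
            self._next = 0  # restart the rotation on the new list

    def pick(self):
        """Return the server for the next RPC, or None if no list has arrived."""
        with self._lock:
            if not self._servers:
                return None
            server = self._servers[self._next % len(self._servers)]
            self._next += 1
            return server


if __name__ == "__main__":
    client = SimpleClient()
    client.update_server_list(["a:443", "b:443", "c:443"])
    print([client.pick() for _ in range(4)])
    # The balancer drops a server; the client simply follows the new list.
    client.update_server_list(["a:443", "c:443"])
    print([client.pick() for _ in range(3)])
```

Note that the client never inspects load or health itself. When the balancer removes `b:443` from the list, the client stops sending to it without knowing why.
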
# Requirements

## Simple API and client

The gRPC client load balancing code must be simple and portable. The
client should only contain simple algorithms (e.g., Round Robin) for
server selection.  For complex algorithms, the client should rely on
a load balancer to provide load balancing configuration and the list of
servers to which the client should send requests. The balancer will update
the server list as needed to balance the load as well as handle server
unavailability or health issues. The load balancer will make any necessary
complex decisions and inform the client. The load balancer may communicate
with the backend servers to collect load and health information.

## Security

The load balancer may be separate from the actual server backends, and a
compromise of the load balancer should only lead to a compromise of the
load balancing functionality. In other words, a compromised load balancer
should not be able to cause a client to trust a (potentially malicious) backend
server any more than in a comparable situation without load balancing.

# Architecture

## Overview

The primary mechanism for load-balancing in gRPC is external
load-balancing, where an external load balancer provides simple clients
with an up-to-date list of servers.

The gRPC client does support an API for built-in load balancing policies.
However, there are only a small number of these (one of which is the
`grpclb` policy, which implements external load balancing), and users
are discouraged from trying to extend gRPC by adding more.  Instead, new
load balancing policies should be implemented in external load balancers.

## Workflow

Load-balancing policies fit into the gRPC client workflow in between
name resolution and the connection to the server.  Here's how it all
works:

![image](images/load-balancing.png)

1. On startup, the gRPC client issues a [name resolution](naming.md) request
   for the server name.  The name will resolve to one or more IP addresses,
   each marked as either a server address or a load balancer address,
   along with a [service config](service_config.md)
   that indicates which client-side load-balancing policy to use (e.g.,
   `round_robin` or `grpclb`).
2. The client instantiates the load balancing policy.
   - Note: If any one of the addresses returned by the resolver is a balancer
     address, then the client will use the `grpclb` policy, regardless
     of what load-balancing policy was requested by the service config.
     Otherwise, the client will use the load-balancing policy requested
     by the service config.  If no load-balancing policy is requested
     by the service config, then the client will default to a policy
     that picks the first available server address.
3. The load balancing policy creates a subchannel to each server address.
   - For all policies *except* `grpclb`, this means one subchannel for each
     address returned by the resolver. Note that these policies
     ignore any balancer addresses returned by the resolver.
   - In the case of the `grpclb` policy, the workflow is as follows:
     1. The policy opens a stream to one of the balancer addresses returned
        by the resolver. It asks the balancer for the server addresses to
        use for the server name originally requested by the client (i.e.,
        the same one originally passed to the name resolver).
        - Note: In the `grpclb` policy, the non-balancer addresses returned
          by the resolver are used as a fallback in case no balancers can be
          contacted when the LB policy is started.
     2. The gRPC servers to which the load balancer is directing the client
        may report load to the load balancers, if that information is needed
        by the load balancer's configuration.
     3. The load balancer returns a server list to the gRPC client's `grpclb`
        policy. The `grpclb` policy will then create a subchannel to each
        server in the list.
4. For each RPC sent, the load balancing policy decides which
   subchannel (i.e., which server) the RPC should be sent to.
   - In the case of the `grpclb` policy, the client will send requests
     to the servers in the order in which they were returned by the load
     balancer.  If the server list is empty, the call will block until a
     non-empty one is received.

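The policy selection rule in step 2 can be sketched as a small function. This is an illustration under stated assumptions rather than the client's actual implementation: `ResolvedAddress`, `choose_policy`, and `backend_addresses` are hypothetical names. The default policy appears as `pick_first`, the name gRPC uses for the policy that picks the first available server address.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResolvedAddress:
    """Hypothetical stand-in for one address returned by the resolver."""
    address: str
    is_balancer: bool = False


def choose_policy(addresses: List[ResolvedAddress],
                  requested_policy: Optional[str]) -> str:
    """Return the name of the policy to instantiate, following step 2."""
    # Any balancer address forces grpclb, whatever the service config says.
    if any(a.is_balancer for a in addresses):
        return "grpclb"
    # Otherwise honour the policy requested by the service config, if any.
    if requested_policy:
        return requested_policy
    # Default: a policy that picks the first available server address.
    return "pick_first"


def backend_addresses(addresses: List[ResolvedAddress]) -> List[str]:
    """Non-balancer addresses: used directly by every policy except grpclb,
    and kept by grpclb only as a fallback."""
    return [a.address for a in addresses if not a.is_balancer]


if __name__ == "__main__":
    servers = [ResolvedAddress("10.0.0.1:443"), ResolvedAddress("10.0.0.2:443")]
    mixed = servers + [ResolvedAddress("lb.example:443", is_balancer=True)]
    print(choose_policy(mixed, "round_robin"))
    print(choose_policy(servers, "round_robin"))
    print(choose_policy(servers, None))
```

The ordering of the checks matters: the balancer-address test comes first, so a service config that requests `round_robin` is overridden as soon as the resolver returns even one balancer address.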