---
layout: "guides"
page_title: "Spread"
sidebar_current: "guides-operating-a-job-spread"
description: |-
  The following guide walks the user through using the spread stanza in Nomad.
---

# Increasing Failure Tolerance with Spread

The Nomad scheduler uses a bin packing algorithm when making job placements on nodes to optimize resource utilization and density of applications. Although bin packing ensures optimal resource utilization, it can lead to some nodes carrying a majority of allocations for a given job. This can cause cascading failures where the failure of a single node or a single data center leads to application unavailability.

The [spread stanza][spread-stanza] solves this problem by allowing operators to distribute their workloads in a customized way based on [attributes][attributes] and/or [client metadata][client-metadata]. By using spread criteria in their job specification, Nomad job operators can ensure that failures across a domain such as datacenter or rack don't affect application availability.

## Reference Material

- The [spread][spread-stanza] stanza documentation
- [Scheduling][scheduling] with Nomad

## Estimated Time to Complete

20 minutes

## Challenge

Consider a Nomad application that needs to be deployed to multiple datacenters within a region. Datacenter `dc1` has four nodes while `dc2` has one node. The application has 10 instances, and 7 of them must be deployed to `dc1` since it receives more user traffic; we need to make sure the application doesn't suffer downtime from having too few running instances to process requests. The remaining 3 allocations can be deployed to `dc2`.

## Solution

Use the `spread` stanza in the Nomad [job specification][job-specification] to ensure that 70% of the workload is placed in datacenter `dc1` and 30% is placed in `dc2`. The Nomad operator can use the [percent][percent] option with a [target][target] to customize the spread.

## Prerequisites

To perform the tasks described in this guide, you need to have a Nomad environment with Consul installed. You can use this [repo](https://github.com/hashicorp/nomad/tree/master/terraform#provision-a-nomad-cluster-in-the-cloud) to easily provision a sandbox environment. This guide will assume a cluster with one server node and five client nodes.

-> **Please Note:** This guide is for demo purposes and is only using a single server node. In a production cluster, 3 or 5 server nodes are recommended.

## Steps

### Step 1: Place One of the Client Nodes in a Different Datacenter

We are going to customize the spread for our job placement between the datacenters our nodes are located in. Choose one of your client nodes and edit `/etc/nomad.d/nomad.hcl` to change its location to `dc2`. A snippet of an example configuration file with the required change is shown below.

```hcl
data_dir = "/opt/nomad/data"
bind_addr = "0.0.0.0"
datacenter = "dc2"

# Enable the client
client {
  enabled = true
...
```

After making the change on your chosen client node, restart the Nomad service:

```shell
$ sudo systemctl restart nomad
```

If everything worked correctly, you should be able to run the `nomad` [node status][node-status] command and see that one of your nodes is now in datacenter `dc2`.

```shell
$ nomad node status
ID        DC   Name              Class   Drain  Eligibility  Status
5d16d949  dc2  ip-172-31-62-240  <none>  false  eligible     ready
7b381152  dc1  ip-172-31-59-115  <none>  false  eligible     ready
10cc48cc  dc1  ip-172-31-58-46   <none>  false  eligible     ready
93f1e628  dc1  ip-172-31-58-113  <none>  false  eligible     ready
12894b80  dc1  ip-172-31-62-90   <none>  false  eligible     ready
```
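If you want to double-check the change on the node itself, you can also inspect the agent configuration file directly. A minimal sketch, assuming shell access to the client node and the configuration path used above:

```shell
# On the client node you edited, confirm the new datacenter value is in place
$ grep datacenter /etc/nomad.d/nomad.hcl
datacenter = "dc2"
```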
### Step 2: Create a Job with the `spread` Stanza

Create a file with the name `redis.nomad` and place the following content in it:

```hcl
job "redis" {
  datacenters = ["dc1", "dc2"]
  type = "service"

  spread {
    attribute = "${node.datacenter}"
    weight = 100

    target "dc1" {
      percent = 70
    }

    target "dc2" {
      percent = 30
    }
  }

  group "cache1" {
    count = 10

    task "redis" {
      driver = "docker"

      config {
        image = "redis:latest"

        port_map {
          db = 6379
        }
      }

      resources {
        network {
          port "db" {}
        }
      }

      service {
        name = "redis-cache"
        port = "db"

        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
```

Note that we used the `spread` stanza and specified the [datacenter][attributes] attribute while targeting `dc1` and `dc2` with the `percent` option. This tells the Nomad scheduler to attempt to distribute 70% of the workload on `dc1` and 30% of the workload on `dc2`.

### Step 3: Register the Job `redis.nomad`

Run the Nomad job with the following command:

```shell
$ nomad run redis.nomad
==> Monitoring evaluation "c3dc5ebd"
    Evaluation triggered by job "redis"
    Allocation "7a374183" created: node "5d16d949", group "cache1"
    Allocation "f4361df1" created: node "7b381152", group "cache1"
    Allocation "f7af42dc" created: node "5d16d949", group "cache1"
    Allocation "0638edf2" created: node "10cc48cc", group "cache1"
    Allocation "49bc6038" created: node "12894b80", group "cache1"
    Allocation "c7e5679a" created: node "5d16d949", group "cache1"
    Allocation "cf91bf65" created: node "7b381152", group "cache1"
    Allocation "d16b606c" created: node "12894b80", group "cache1"
    Allocation "27866df0" created: node "93f1e628", group "cache1"
    Allocation "8531a6fc" created: node "7b381152", group "cache1"
    Evaluation status changed: "pending" -> "complete"
```

Note that three of the ten allocations have been placed on node `5d16d949`. This is the node we configured to be in datacenter `dc2`. The Nomad scheduler has distributed 30% of the workload to `dc2`, as we specified in the `spread` stanza.

Keep in mind that the Nomad scheduler still factors other components into the overall scoring of nodes when making placements, so you should not expect the `spread` stanza to enforce your distribution preferences strictly the way a [constraint][constraint-stanza] would. We will take a detailed look at the scoring in the next few steps.
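For contrast, a `constraint` stanza is a hard filter rather than a scoring preference. A minimal sketch (not part of this guide's job file) that would strictly exclude `dc2` from consideration looks like this:

```hcl
# Unlike spread, a constraint rejects every node whose datacenter is not dc1;
# allocations that cannot be placed in dc1 stay queued rather than spilling over.
constraint {
  attribute = "${node.datacenter}"
  value     = "dc1"
}
```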
### Step 4: Check the Status of the `redis` Job

At this point, we are going to check the status of our job and verify where our allocations have been placed. Run the following command:

```shell
$ nomad status redis
```

You should see 10 instances of your job running in the `Summary` section of the output, as shown below:

```shell
...
Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache1      0       0         10       0       0         0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
0638edf2  10cc48cc  cache1      0        run      running  2m20s ago  2m ago
27866df0  93f1e628  cache1      0        run      running  2m20s ago  1m57s ago
49bc6038  12894b80  cache1      0        run      running  2m20s ago  1m58s ago
7a374183  5d16d949  cache1      0        run      running  2m20s ago  2m1s ago
8531a6fc  7b381152  cache1      0        run      running  2m20s ago  2m2s ago
c7e5679a  5d16d949  cache1      0        run      running  2m20s ago  1m55s ago
cf91bf65  7b381152  cache1      0        run      running  2m20s ago  1m57s ago
d16b606c  12894b80  cache1      0        run      running  2m20s ago  2m1s ago
f4361df1  7b381152  cache1      0        run      running  2m20s ago  2m3s ago
f7af42dc  5d16d949  cache1      0        run      running  2m20s ago  1m54s ago
```

You can cross-check this output against the results of the `nomad node status` command to verify that 30% of your workload has been placed on the node in `dc2` (in our case, that node is `5d16d949`).

### Step 5: Obtain Detailed Scoring Information on Job Placement

The Nomad scheduler will not always spread your workload exactly as specified in the `spread` stanza, even if the resources are available. This is because spread scoring is factored in with other metrics before a scheduling decision is made. In this step, we will take a look at some of those other factors.

Using the output from the previous step, take any allocation that has been placed on a node and use the `nomad` [alloc status][alloc status] command with the [verbose][verbose] option to obtain detailed scoring information on it. In this example, we will use the allocation ID `0638edf2` (your allocation IDs will be different).

```shell
$ nomad alloc status -verbose 0638edf2
```

The resulting output will show the `Placement Metrics` section at the bottom.

```shell
...
Placement Metrics
Node                                  node-affinity  allocation-spread  binpack  job-anti-affinity  node-reschedule-penalty  final score
10cc48cc-2913-af54-74d5-d7559f373ff2  0              0.429              0.33     0                  0                        0.379
93f1e628-e509-b1ab-05b7-0944056f781d  0              0.429              0.515    -0.2               0                        0.248
12894b80-4943-4d5c-5716-c626c6b99be3  0              0.429              0.515    -0.2               0                        0.248
7b381152-3802-258b-4155-6d7dfb344dd4  0              0.429              0.515    -0.2               0                        0.248
5d16d949-85aa-3fd3-b5f4-51094cbeb77a  0              0.333              0.515    -0.2               0                        0.216
```

Note that the results from the `allocation-spread`, `binpack`, `job-anti-affinity`, `node-reschedule-penalty`, and `node-affinity` columns are combined to produce the numbers listed in the `final score` column for each node. In this output, each final score works out to the average of that node's non-zero components; for node `93f1e628`, for example, (0.429 + 0.515 - 0.2) / 3 ≈ 0.248. The Nomad scheduler uses the final score for each node when deciding where to make placements.

## Next Steps

Change the values of the `percent` option on your targets in the `spread` stanza and observe how the placement behavior, along with the final score given to each node, changes (use the `nomad alloc status` command as shown in the previous step).
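As you experiment with different `percent` values, it can help to tally how many allocations landed on each node. A minimal sketch using standard shell tools; the `cache1`/`running` pattern assumes allocation rows shaped like the ones in Step 4:

```shell
# Print the Node ID column for each running allocation row, then count duplicates
$ nomad status redis | awk '/cache1/ && /running/ {print $2}' | sort | uniq -c
```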
[alloc status]: /docs/commands/alloc/status.html
[attributes]: /docs/runtime/interpolation.html#node-variables-
[client-metadata]: /docs/configuration/client.html#meta
[constraint-stanza]: /docs/job-specification/constraint.html
[job-specification]: /docs/job-specification/index.html
[node-status]: /docs/commands/node/status.html
[percent]: /docs/job-specification/spread.html#percent
[spread-stanza]: /docs/job-specification/spread.html
[scheduling]: /docs/internals/scheduling/scheduling.html
[target]: /docs/job-specification/spread.html#target
[verbose]: /docs/commands/alloc/status.html#verbose