## Disaster recovery

etcd is designed to withstand machine failures. An etcd cluster automatically recovers from temporary failures (e.g., machine reboots) and tolerates up to *(N-1)/2* permanent failures for a cluster of N members. When a member permanently fails, whether due to hardware failure or disk corruption, it loses access to the cluster. If the cluster permanently loses more than *(N-1)/2* members then it disastrously fails, irrevocably losing quorum. Once quorum is lost, the cluster cannot reach consensus and therefore cannot continue accepting updates.

To recover from disastrous failure, etcd v3 provides snapshot and restore facilities to recreate the cluster without v3 key data loss. To recover v2 keys, refer to the [v2 admin guide][v2_recover].

[v2_recover]: ../v2/admin_guide.md#disaster-recovery

### Snapshotting the keyspace

Recovering a cluster first requires a snapshot of the keyspace from an etcd member. A snapshot may be taken either from a live member with the `etcdctl snapshot save` command or by copying the `member/snap/db` file from an etcd data directory. For example, the following command snapshots the keyspace served by `$ENDPOINT` to the file `snapshot.db`:

```sh
$ ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT snapshot save snapshot.db
```

### Restoring a cluster

To restore a cluster, all that is needed is a single snapshot "db" file. A cluster restore with `etcdctl snapshot restore` creates new etcd data directories; all members should restore using the same snapshot. Restoring overwrites some snapshot metadata (specifically, the member ID and cluster ID); the member loses its former identity. This metadata overwrite prevents the new member from inadvertently joining an existing cluster. Therefore, in order to start a cluster from a snapshot, the restore must start a new logical cluster.

Snapshot integrity may optionally be verified at restore time. If the snapshot was taken with `etcdctl snapshot save`, it will have an integrity hash that is checked by `etcdctl snapshot restore`. If the snapshot was copied from the data directory, there is no integrity hash and it will only restore with `--skip-hash-check`.
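Before restoring, a snapshot file can be sanity-checked with `etcdctl snapshot status`. A minimal sketch, assuming the copied `db` file sits in the working directory and using the same cluster flags as the example below:

```sh
# Print the snapshot's integrity hash, revision, total keys, and total size.
$ ETCDCTL_API=3 etcdctl snapshot status snapshot.db

# A db file copied from a data directory carries no integrity hash, so
# restoring it requires skipping the hash check explicitly.
$ ETCDCTL_API=3 etcdctl snapshot restore db --skip-hash-check \
  --name m1 \
  --initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls http://host1:2380
```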
A restore initializes a new member of a new cluster, with a fresh cluster configuration using `etcd`'s cluster configuration flags, but preserves the contents of the etcd keyspace. Continuing from the previous example, the following creates new etcd data directories (`m1.etcd`, `m2.etcd`, `m3.etcd`) for a three member cluster:

```sh
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
  --name m1 \
  --initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls http://host1:2380
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
  --name m2 \
  --initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls http://host2:2380
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
  --name m3 \
  --initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls http://host3:2380
```

Next, start `etcd` with the new data directories:

```sh
$ etcd \
  --name m1 \
  --listen-client-urls http://host1:2379 \
  --advertise-client-urls http://host1:2379 \
  --listen-peer-urls http://host1:2380 &
$ etcd \
  --name m2 \
  --listen-client-urls http://host2:2379 \
  --advertise-client-urls http://host2:2379 \
  --listen-peer-urls http://host2:2380 &
$ etcd \
  --name m3 \
  --listen-client-urls http://host3:2379 \
  --advertise-client-urls http://host3:2379 \
  --listen-peer-urls http://host3:2380 &
```

Now the restored etcd cluster should be available and serving the keyspace given by the snapshot.
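As a final sanity check, cluster health and the restored data can be verified with ordinary `etcdctl` queries. The endpoints below match the example above, and `foo` stands in for any key known to exist in the snapshot:

```sh
# Verify that every member is up and answering health checks.
$ ETCDCTL_API=3 etcdctl \
  --endpoints http://host1:2379,http://host2:2379,http://host3:2379 \
  endpoint health

# Spot-check a key that existed when the snapshot was taken.
$ ETCDCTL_API=3 etcdctl --endpoints http://host1:2379 get foo
```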