Kubernetes and Helm
===================

It is easy to launch a Dask cluster and a Jupyter_ notebook server on cloud
resources using Kubernetes_ and Helm_.

.. _Kubernetes: https://kubernetes.io/
.. _Helm: https://helm.sh/
.. _Jupyter: https://jupyter.org/

This is particularly useful when you want to deploy a fresh Python environment
on cloud services like Amazon Web Services, Google Compute Engine, or
Microsoft Azure.

If you already have Python environments running in a pre-existing Kubernetes
cluster, then you may prefer the :doc:`Kubernetes native<kubernetes-native>`
documentation, which is a bit lighter weight.

Launch Kubernetes Cluster
-------------------------

This document assumes that you have a Kubernetes cluster and Helm installed.

If this is not the case, then you might consider setting up a Kubernetes
cluster on one of the common cloud providers like Google, Amazon, or
Microsoft.  We recommend the first part of the
`Zero to JupyterHub <https://zero-to-jupyterhub.readthedocs.io/en/latest/>`_
guide, which covers setting up Kubernetes and Helm (you do not need to follow
all of its instructions; in particular, you do not need to install
JupyterHub):

- `Creating a Kubernetes Cluster <https://zero-to-jupyterhub.readthedocs.io/en/latest/create-k8s-cluster.html>`_
- `Setting up Helm <https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-helm.html>`_

Alternatively, you may want to experiment with Kubernetes locally using
`Minikube <https://kubernetes.io/docs/getting-started-guides/minikube/>`_.
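
If you go the Minikube route, a minimal local sanity check might look like
this (``minikube start`` and ``kubectl get nodes`` are standard commands; the
rest of this document applies unchanged)::

   minikube start
   kubectl get nodes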

Which Chart is Right for You?
-----------------------------

Dask maintains a Helm chart repository containing various charts for the Dask
community at https://helm.dask.org/.
You will need to add this to your known channels and update your local charts::

   helm repo add dask https://helm.dask.org/
   helm repo update
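
To confirm that the repository was added, you can list the charts it provides
(``helm search repo`` is a standard Helm 3 command)::

   helm search repo dask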

We provide two Helm charts. The right one to choose depends on whether you're
deploying Dask for a single user or for many users.


================  =====================================================================
Helm Chart        Use Case
================  =====================================================================
``dask/dask``     Single-user deployment with one notebook server and one Dask Cluster.
``dask/daskhub``  Multi-user deployment with JupyterHub and Dask Gateway.
================  =====================================================================

See :ref:`kubernetes-helm.single` or :ref:`kubernetes-helm.multi` for detailed
instructions on deploying either of these.
As you might suspect, deploying ``dask/daskhub`` is a bit more complicated since
there are more components. If you're just deploying for a single user we'd
recommend using ``dask/dask``.

.. _kubernetes-helm.single:

Helm Install Dask for a Single User
-----------------------------------

Once your Kubernetes cluster is ready, you can deploy Dask using the Dask Helm_ chart::

   helm install my-dask dask/dask

This deploys a ``dask-scheduler``, several ``dask-worker`` processes, and
also an optional Jupyter server.


Verify Deployment
^^^^^^^^^^^^^^^^^

This might take a minute to deploy.  You can check the status with
``kubectl``::

   $ kubectl get pods
   NAME                                  READY     STATUS              RESTARTS   AGE
   bald-eel-jupyter-924045334-twtxd      0/1       ContainerCreating   0          1m
   bald-eel-scheduler-3074430035-cn1dt   1/1       Running             0          1m
   bald-eel-worker-3032746726-202jt      1/1       Running             0          1m
   bald-eel-worker-3032746726-b8nqq      1/1       Running             0          1m
   bald-eel-worker-3032746726-d0chx      0/1       ContainerCreating   0          1m

   $ kubectl get services
   NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                       AGE
   bald-eel-jupyter     LoadBalancer   10.11.247.201   35.226.183.149   80:30173/TCP                  2m
   bald-eel-scheduler   LoadBalancer   10.11.245.241   35.202.201.129   8786:31166/TCP,80:31626/TCP   2m
   kubernetes           ClusterIP      10.11.240.1     <none>           443/TCP                       48m

You can use the addresses under ``EXTERNAL-IP`` to connect to your now-running
Jupyter and Dask systems.
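
If ``EXTERNAL-IP`` still shows ``<pending>``, the cloud provider is still
provisioning a load balancer; you can watch until addresses are assigned::

   kubectl get services --watch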

Notice the name ``bald-eel``.  This is the release name of this particular
deployment of Dask (the example outputs in this document come from a release
with that name; with the install command above, yours would be ``my-dask``).
You could, for example, have multiple Dask-and-Jupyter clusters running at
once under different release names.  You will need this name to refer to your
deployment in the future.  Additionally, you can list all active Helm
deployments with::

   helm list

   NAME            REVISION        UPDATED                         STATUS      CHART           NAMESPACE
   bald-eel        1               Wed Dec  6 11:19:54 2017        DEPLOYED    dask-0.1.0      default


Connect to Dask and Jupyter
^^^^^^^^^^^^^^^^^^^^^^^^^^^

When we ran ``kubectl get services``, we saw some externally visible IPs:

.. code-block:: bash

   mrocklin@pangeo-181919:~$ kubectl get services
   NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                       AGE
   bald-eel-jupyter     LoadBalancer   10.11.247.201   35.226.183.149   80:30173/TCP                  2m
   bald-eel-scheduler   LoadBalancer   10.11.245.241   35.202.201.129   8786:31166/TCP,80:31626/TCP   2m
   kubernetes           ClusterIP      10.11.240.1     <none>           443/TCP                       48m

We can navigate to these services from any web browser.  Here, one is the Dask
diagnostic dashboard, and the other is the Jupyter server.  You can log into
the Jupyter notebook server with the password ``dask``.

You can create a notebook and create a Dask client from there.  The
``DASK_SCHEDULER_ADDRESS`` environment variable has been populated with the
address of the Dask scheduler.  This is available in Python from the
``dask.config`` object.

.. code-block:: python

   >>> import dask
   >>> dask.config.get('scheduler_address')
   'bald-eel-scheduler:8786'

You don't need to use this address directly; the Dask client will find the
variable automatically.

.. code-block:: python

   from dask.distributed import Client
   client = Client()
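
To check that work actually runs on the cluster, you can try a small
computation (a minimal sketch using ``dask.array``):

.. code-block:: python

   import dask.array as da

   # Build a 10,000 x 10,000 random array in 1,000 x 1,000 chunks and
   # compute its mean on the Dask workers
   x = da.random.random((10000, 10000), chunks=(1000, 1000))
   x.mean().compute()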


Configure Environment
^^^^^^^^^^^^^^^^^^^^^

By default, the Helm deployment launches three workers using one core each and
a standard conda environment.  We can customize this environment by creating a
small yaml file that implements a subset of the values in the
`dask helm chart values.yaml file <https://github.com/dask/helm-chart/blob/main/dask/values.yaml>`_.

For example, we can increase the number of workers, and include extra conda and
pip packages to install on both the workers and Jupyter server (these two
environments should be matched).

.. code-block:: yaml

   # config.yaml

   worker:
     replicas: 8
     resources:
       limits:
         cpu: 2
         memory: 7.5G
       requests:
         cpu: 2
         memory: 7.5G
     env:
       - name: EXTRA_CONDA_PACKAGES
         value: numba xarray -c conda-forge
       - name: EXTRA_PIP_PACKAGES
         value: s3fs dask-ml --upgrade

   # We want to keep the same packages on the worker and jupyter environments
   jupyter:
     enabled: true
     env:
       - name: EXTRA_CONDA_PACKAGES
         value: numba xarray matplotlib -c conda-forge
       - name: EXTRA_PIP_PACKAGES
         value: s3fs dask-ml --upgrade

This config file overrides the configuration for the number and size of workers
and the conda and pip packages installed on the worker and Jupyter containers.
In general, we will want to make sure that these two software environments
match.

Update your deployment to use this configuration file.  Note that *you will not
use helm install* for this stage: that would create a *new* deployment on the
same Kubernetes cluster.  Instead, you will upgrade your existing deployment by
using the current name::

    helm upgrade bald-eel dask/dask -f config.yaml

This will update those containers that need to be updated.  It may take a minute or so.

As a reminder, you can list the names of deployments you have using
``helm list``.
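
You can also inspect the current state of a release with ``helm status``
(another standard Helm command)::

   helm status bald-eel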


Check status and logs
^^^^^^^^^^^^^^^^^^^^^

For routine issues, you should be able to see the worker status and logs using
the Dask dashboard (in particular, you can see the worker links from the
``info/`` page).  However, if your workers aren't starting, you can check the
status of pods and their logs with the following commands:

.. code-block:: bash

   kubectl get pods
   kubectl logs <PODNAME>

.. code-block:: bash

   mrocklin@pangeo-181919:~$ kubectl get pods
   NAME                                  READY     STATUS    RESTARTS   AGE
   bald-eel-jupyter-3805078281-n1qk2     1/1       Running   0          18m
   bald-eel-scheduler-3074430035-cn1dt   1/1       Running   0          58m
   bald-eel-worker-1931881914-1q09p      1/1       Running   0          18m
   bald-eel-worker-1931881914-856mm      1/1       Running   0          18m
   bald-eel-worker-1931881914-9lgzb      1/1       Running   0          18m
   bald-eel-worker-1931881914-bdn2c      1/1       Running   0          16m
   bald-eel-worker-1931881914-jq70m      1/1       Running   0          17m
   bald-eel-worker-1931881914-qsgj7      1/1       Running   0          18m
   bald-eel-worker-1931881914-s2phd      1/1       Running   0          17m
   bald-eel-worker-1931881914-srmmg      1/1       Running   0          17m

   mrocklin@pangeo-181919:~$ kubectl logs bald-eel-worker-1931881914-856mm
   EXTRA_CONDA_PACKAGES environment variable found.  Installing.
   Fetching package metadata ...........
   Solving package specifications: .
   Package plan for installation in environment /opt/conda/envs/dask:
   The following NEW packages will be INSTALLED:
       fasteners: 0.14.1-py36_2 conda-forge
       monotonic: 1.3-py36_0    conda-forge
       zarr:      2.1.4-py36_0  conda-forge
   Proceed ([y]/n)?
   monotonic-1.3- 100% |###############################| Time: 0:00:00  11.16 MB/s
   fasteners-0.14 100% |###############################| Time: 0:00:00 576.56 kB/s
   ...
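
If a pod is stuck in ``Pending`` or ``ContainerCreating``, then
``kubectl describe`` usually explains why (for example, insufficient CPU or
memory in the cluster)::

   kubectl describe pod <PODNAME>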


Delete a Helm deployment
^^^^^^^^^^^^^^^^^^^^^^^^

You can always delete a Helm deployment using its name::

   helm delete bald-eel

Note that this does not destroy any clusters that you may have allocated on a
cloud service (you will need to delete those explicitly).
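
How you delete the underlying Kubernetes cluster depends on your provider.
For example, a cluster created on Google Kubernetes Engine with ``gcloud``
could be removed with (the name and zone below are placeholders)::

   gcloud container clusters delete <cluster-name> --zone <zone>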


Avoid the Jupyter Server
^^^^^^^^^^^^^^^^^^^^^^^^

Sometimes you do not need to run a Jupyter server alongside your Dask cluster.
You can disable it in your configuration file and upgrade as above:

.. code-block:: yaml

   jupyter:
     enabled: false

.. _kubernetes-helm.multi:

Helm Install Dask for Multiple Users
------------------------------------

The ``dask/daskhub`` Helm Chart deploys JupyterHub_, `Dask Gateway`_, and
configures the two to work well together.  In particular, Dask Gateway is
registered as a JupyterHub service so that Dask Gateway can re-use JupyterHub's
authentication, and the JupyterHub environment is configured to connect to the
Dask Gateway without any arguments.

.. note::

   The ``dask/daskhub`` helm chart came out of the `Pangeo`_ project, a community
   platform for big data geoscience.

.. _Pangeo: http://pangeo.io/
.. _Dask Gateway: https://gateway.dask.org/
.. _JupyterHub: https://jupyterhub.readthedocs.io/en/stable/

The ``dask/daskhub`` helm chart uses the JupyterHub and Dask-Gateway helm
charts.  You'll want to consult the
`JupyterHub helm documentation <https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-jupyterhub/setup-jupyterhub.html>`_
and `Dask Gateway helm documentation <https://gateway.dask.org/install-kube.html>`_
for further customization.  The default values are at
https://github.com/dask/helm-chart/blob/main/daskhub/values.yaml.
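
You can also dump the chart's default values to a local file to use as a
starting point for your own configuration (``helm show values`` is a standard
Helm 3 command)::

   helm show values dask/daskhub > daskhub-values.yaml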

Verify that you've set up a Kubernetes cluster and added Dask's helm charts:

.. code-block:: console

   $ helm repo add dask https://helm.dask.org/
   $ helm repo update

JupyterHub and Dask Gateway require a few secret tokens.  We'll generate them
on the command line and insert the tokens in a ``secrets.yaml`` file that will
be passed to Helm.

Run the following command, and copy the output.  This is our ``token-1``.

.. code-block:: console

   $ openssl rand -hex 32  # generate token-1

Run the command again and copy the output.  This is our ``token-2``.

.. code-block:: console

   $ openssl rand -hex 32  # generate token-2

Now substitute those two values for ``<token-1>`` and ``<token-2>`` below.
Note that ``<token-2>`` is used twice: once for
``jupyterhub.hub.services.dask-gateway.apiToken``, and a second time for
``dask-gateway.gateway.auth.jupyterhub.apiToken``.

.. code-block:: yaml

   # file: secrets.yaml
   jupyterhub:
     proxy:
       secretToken: "<token-1>"
     hub:
       services:
         dask-gateway:
           apiToken: "<token-2>"

   dask-gateway:
     gateway:
       auth:
         jupyterhub:
           apiToken: "<token-2>"

Now we're ready to install DaskHub:

.. code-block:: console

   $ helm upgrade --wait --install --render-subchart-notes \
       dhub dask/daskhub \
       --values=secrets.yaml


The output explains how to find the IPs for your JupyterHub deployment.

.. code-block:: console

   $ kubectl get service proxy-public
   NAME           TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                      AGE
   proxy-public   LoadBalancer   10.43.249.239   35.202.158.223   443:31587/TCP,80:30500/TCP   2m40s


Creating a Dask Cluster
^^^^^^^^^^^^^^^^^^^^^^^

To create a Dask cluster on this deployment, users need to connect to the Dask
Gateway:

.. code-block:: python

   >>> from dask_gateway import GatewayCluster
   >>> cluster = GatewayCluster()
   >>> client = cluster.get_client()
   >>> cluster

Depending on the configuration, users may need to call ``cluster.scale(n)`` to
get workers.  See https://gateway.dask.org/ for more on Dask Gateway.
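
For example, scaling manually or adaptively (``scale`` and ``adapt`` are both
part of the ``dask_gateway`` cluster API):

.. code-block:: python

   >>> cluster.scale(4)  # ask for four workers
   >>> cluster.adapt(minimum=1, maximum=10)  # or let the cluster resize itself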

Matching the User Environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Dask clients will be running in the JupyterHub singleuser environment.  To
ensure that the same environment is used for the scheduler and workers, you can
provide it as a Gateway option and configure the ``singleuser`` environment to
default to the value set by JupyterHub.

.. code-block:: yaml

   # config.yaml
   jupyterhub:
     singleuser:
       extraEnv:
         DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE: '{JUPYTER_IMAGE_SPEC}'

   dask-gateway:
     gateway:
       extraConfig:
         optionHandler: |
           from dask_gateway_server.options import Options, Integer, Float, String
           def option_handler(options):
               if ":" not in options.image:
                   raise ValueError("When specifying an image you must also provide a tag")
               return {
                   "image": options.image,
               }
           c.Backend.cluster_options = Options(
               String("image", default="pangeo/base-notebook:2020.07.28", label="Image"),
               handler=option_handler,
           )

The user environment will need to include ``dask-gateway``.  Any packages
installed manually after the ``singleuser`` pod starts will not be included in
the worker environment.
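
With this configuration in place, users can still inspect and override the
image option from a notebook (a sketch using the standard ``dask_gateway``
client API; the image name is just an example):

.. code-block:: python

   >>> from dask_gateway import Gateway
   >>> gateway = Gateway()
   >>> options = gateway.cluster_options()
   >>> options.image = "pangeo/base-notebook:2020.07.28"  # must include a tag
   >>> cluster = gateway.new_cluster(options)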