1*********************
2Hacking on cloud-init
3*********************
4
5This document describes how to contribute changes to cloud-init.
6It assumes you have a `GitHub`_ account, and refers to your GitHub user
7as ``GH_USER`` throughout.
8
9Submitting your first pull request
10==================================
11
12Follow these steps to submit your first pull request to cloud-init:
13
14* To contribute to cloud-init, you must sign the Canonical `contributor
15  license agreement`_
16
17  * If you have already signed it as an individual, your Launchpad user
18    will be listed in the `contributor-agreement-canonical`_ group.
19    (Unfortunately there is no easy way to check if an organization or
20    company you are doing work for has signed.)
21
22  * When signing it:
23
24    * ensure that you fill in the GitHub username field.
25    * when prompted for 'Project contact' or 'Canonical Project
26      Manager', enter 'James Falcon'.
27
28  * If your company has signed the CLA for you, please contact us to
29    help in verifying which Launchpad/GitHub accounts are associated
30    with the company.
31
32  * For any questions or help with the process, please email `James
33    Falcon <mailto:james.falcon@canonical.com>`_ with the subject,
34    "Cloud-Init CLA"
35
36  * You also may contact user ``falcojr`` in the ``#cloud-init``
37    channel on the Libera IRC network.
38
39* Configure git with your email and name for commit messages.
40
41  Your name will appear in commit messages and will also be used in
42  changelogs or release notes.  Give yourself credit!::
43
44    git config user.name "Your Name"
45    git config user.email "Your Email"
46
47* Sign into your `GitHub`_ account
48
49* Fork the upstream `repository`_ on Github and clicking on the ``Fork`` button
50
51* Create a new remote pointing to your personal GitHub repository.
52
53  .. code:: sh
54
55    git clone git://github.com/canonical/cloud-init
56    cd cloud-init
57    git remote add GH_USER git@github.com:GH_USER/cloud-init.git
58    git push GH_USER main
59
60* Read through the cloud-init `Code Review Process`_, so you understand
61  how your changes will end up in cloud-init's codebase.
62
63* Submit your first cloud-init pull request, adding yourself to the
64  in-repository list that we use to track CLA signatures:
65  `tools/.github-cla-signers`_
66
67  * See `PR #344`_ and `PR #345`_ for examples of what this pull
68    request should look like.
69
70  * Note that ``.github-cla-signers`` is sorted alphabetically.
71
72  * (If you already have a change that you want to submit, you can
73    also include the change to ``tools/.github-cla-signers`` in that
74    pull request, there is no need for two separate PRs.)
75
76.. _GitHub: https://github.com
77.. _Launchpad: https://launchpad.net
78.. _repository: https://github.com/canonical/cloud-init
79.. _contributor license agreement: https://ubuntu.com/legal/contributors
80.. _contributor-agreement-canonical: https://launchpad.net/%7Econtributor-agreement-canonical/+members
81.. _tools/.github-cla-signers: https://github.com/canonical/cloud-init/blob/main/tools/.github-cla-signers
82.. _PR #344: https://github.com/canonical/cloud-init/pull/344
83.. _PR #345: https://github.com/canonical/cloud-init/pull/345
84
85Transferring CLA Signatures from Launchpad to Github
86----------------------------------------------------
87
88For existing contributors who have signed the agreement in Launchpad
89before the Github username field was included, we need to verify the
90link between your `Launchpad`_ account and your `GitHub`_ account.  To
91enable us to do this, we ask that you create a branch with both your
92Launchpad and GitHub usernames against both the Launchpad and GitHub
93cloud-init repositories.  We've added a tool
94(``tools/migrate-lp-user-to-github``) to the cloud-init repository to
95handle this migration as automatically as possible.
96
97The cloud-init team will review the two merge proposals and verify that
98the CLA has been signed for the Launchpad user and record the
99associated GitHub account.
100
101.. note::
102   If you are a first time contributor, you will not need to touch
103   Launchpad to contribute to cloud-init: all new CLA signatures are
104   handled as part of the GitHub pull request process described above.
105
106Do these things for each feature or bug
107=======================================
108
109* Create a new topic branch for your work::
110
111    git checkout -b my-topic-branch
112
113* Make and commit your changes (note, you can make multiple commits,
114  fixes, more commits.)::
115
116    git commit
117
118* Run unit tests and lint/formatting checks with `tox`_::
119
120    tox
121
122* Push your changes to your personal GitHub repository::
123
124    git push -u GH_USER my-topic-branch
125
126* Use your browser to create a merge request:
127
128  - Open the branch on GitHub
129
130    - You can see a web view of your repository and navigate to the branch at:
131
132      ``https://github.com/GH_USER/cloud-init/tree/my-topic-branch``
133
134  - Click 'Pull Request`
135  - Fill out the pull request title, summarizing the change and a longer
136    message indicating important details about the changes included, like ::
137
138      Activate the frobnicator.
139
140      The frobnicator was previously inactive and now runs by default.
141      This may save the world some day.  Then, list the bugs you fixed
142      as footers with syntax as shown here.
143
144      The commit message should be one summary line of less than
145      74 characters followed by a blank line, and then one or more
146      paragraphs describing the change and why it was needed.
147
148      This is the message that will be used on the commit when it
149      is sqaushed and merged into trunk.
150
151      LP: #1
152
153    Note that the project continues to use LP: #NNNNN format for closing
154    launchpad bugs rather than GitHub Issues.
155
156  - Click 'Create Pull Request`
157
158Then, someone in the `Ubuntu Server`_ team will review your changes and
159follow up in the pull request.  Look at the `Code Review Process`_ doc
160to understand the following steps.
161
162Feel free to ping and/or join ``#cloud-init`` on Libera irc if you
163have any questions.
164
165.. _tox: https://tox.readthedocs.io/en/latest/
166.. _Ubuntu Server: https://github.com/orgs/canonical/teams/ubuntu-server
167.. _Code Review Process: https://cloudinit.readthedocs.io/en/latest/topics/code_review.html
168
169Design
170======
171
172This section captures design decisions that are helpful to know when
173hacking on cloud-init.
174
175Cloud Config Modules
176--------------------
177
178* Any new modules should use underscores in any new config options and not
179  hyphens (e.g. `new_option` and *not* `new-option`).
180
181Tests
182-----
183
184Submissions to cloud-init must include testing.  See :ref:`testing` for
185details on these requirements.
186
187Type Annotations
188----------------
189
190The cloud-init codebase uses Python's annotation support for storing
191type annotations in the style specified by `PEP-484`_.  Their use in
192the codebase is encouraged but with one important caveat: types from
193the ``typing`` module cannot be used.
194
195cloud-init still supports Python 3.4, which doesn't have the ``typing``
196module in the stdlib.  This means that the use of any types from the
197``typing`` module in the codebase would require installation of an
198additional Python module on platforms using Python 3.4.  As such
199platforms are generally in maintenance mode, the introduction of a new
200dependency may act as a break in compatibility in practical terms.
201
202Similarly, only function annotations are appropriate for use, as the
203variable annotations specified in `PEP-526`_ were introduced in Python
2043.6.
205
206.. _PEP-484: https://www.python.org/dev/peps/pep-0484/
207.. _PEP-526: https://www.python.org/dev/peps/pep-0526/
208
209Feature Flags
210-------------
211
212.. automodule:: cloudinit.features
213   :members:
214
215
216Ongoing Refactors
217=================
218
219This captures ongoing refactoring projects in the codebase.  This is
220intended as documentation for developers involved in the refactoring,
221but also for other developers who may interact with the code being
222refactored in the meantime.
223
224``cloudinit.net`` -> ``cloudinit.distros.networking`` Hierarchy
225---------------------------------------------------------------
226
227``cloudinit.net`` was imported from the curtin codebase as a chunk, and
228then modified enough that it integrated with the rest of the cloud-init
229codebase.  Over the ~4 years since, the fact that it is not fully
230integrated into the ``Distro`` hierarchy has caused several issues.
231
232The common pattern of these problems is that the commands used for
233networking are different across distributions and operating systems.
234This has lead to ``cloudinit.net`` developing its own "distro
235determination" logic: `get_interfaces_by_mac`_ is probably the clearest
236example of this.  Currently, these differences are primarily split
237along Linux/BSD lines.  However, it would be short-sighted to only
238refactor in a way that captures this difference: we can anticipate that
239differences will develop between Linux-based distros in future, or
240there may already be differences in tooling that we currently
241work around in less obvious ways.
242
243The high-level plan is to introduce a hierarchy of networking classes
244in ``cloudinit.distros.networking``, which each ``Distro`` subclass
245will reference.  These will capture the differences between networking
246on our various distros, while still allowing easy reuse of code between
247distros that share functionality (e.g. most of the Linux networking
248behaviour).  ``Distro`` objects will instantiate the networking classes
249at ``self.networking``, so callers will call
250``distro.networking.<func>`` instead of ``cloudinit.net.<func>``; this
251will necessitate access to an instantiated ``Distro`` object.
252
253An implementation note: there may be external consumers of the
254``cloudinit.net`` module.  We don't consider this a public API, so we
255will be removing it as part of this refactor.  However, we will ensure
256that the new API is complete from its introduction, so that any such
257consumers can move over to it wholesale.  (Note, however, that this new
258API is still not considered public or stable, and may not replicate the
259existing API exactly.)
260
261In more detail:
262
263* The root of this hierarchy will be the
264  ``cloudinit.distros.networking.Networking`` class.  This class will
265  have a corresponding method for every ``cloudinit.net`` function that
266  we identify to be involved in refactoring.  Initially, these methods'
267  implementations will simply call the corresponding ``cloudinit.net``
268  function.  (This gives us the complete API from day one, for existing
269  consumers.)
270* As the biggest differentiator in behaviour, the next layer of the
271  hierarchy will be two subclasses: ``LinuxNetworking`` and
272  ``BSDNetworking``.  These will be introduced in the initial PR.
273* When a difference in behaviour for a particular distro is identified,
274  a new ``Networking`` subclass will be created.  This new class should
275  generally subclass either ``LinuxNetworking`` or ``BSDNetworking``.
276* To be clear: ``Networking`` subclasses will only be created when
277  needed, we will not create a full hierarchy of per-``Distro``
278  subclasses up-front.
279* Each ``Distro`` class will have a class variable
280  (``cls.networking_cls``) which points at the appropriate
281  networking class (initially this will be either ``LinuxNetworking``
282  or ``BSDNetworking``).
283* When ``Distro`` classes are instantiated, they will instantiate
284  ``cls.networking_cls`` and store the instance at ``self.networking``.
285  (This will be implemented in ``cloudinit.distros.Distro.__init__``.)
286* A helper function will be added which will determine the appropriate
287  ``Distro`` subclass for the current system, instantiate it and return
288  its ``networking`` attribute.  (This is the entry point for existing
289  consumers to migrate to.)
290* Callers of refactored functions will change from calling
291  ``cloudinit.net.<func>`` to ``distro.networking.<func>``, where
292  ``distro`` is an instance of the appropriate ``Distro`` class for
293  this system.  (This will require making such an instance available to
294  callers, which will constitute a large part of the work in this
295  project.)
296
297After the initial structure is in place, the work in this refactor will
298consist of replacing the ``cloudinit.net.some_func`` call in each
299``cloudinit.distros.networking.Networking`` method with the actual
300implementation.  This can be done incrementally, one function at a
301time:
302
303* pick an unmigrated ``cloudinit.distros.networking.Networking`` method
304* find it in the `the list of bugs tagged net-refactor`_ and assign
305  yourself to it (see :ref:`Managing Work/Tracking Progress` below for
306  more details)
307* refactor all of its callers to call the ``distro.networking.<func>``
308  method on ``Distro`` instead of the ``cloudinit.net.<func>``
309  function. (This is likely to be the most time-consuming step, as it
310  may require plumbing ``Distro`` objects through to places that
311  previously have not consumed them.)
312* refactor its implementation from ``cloudinit.net`` into the
313  ``Networking`` hierarchy (e.g. if it has an if/else on BSD, this is
314  the time to put the implementations in their respective subclasses)
315
316  * if part of the method contains distro-independent logic, then you
317    may need to create new methods to capture this distro-specific
318    logic; we don't want to replicate common logic in different
319    ``Networking`` subclasses
320  * if after the refactor, the method on the root ``Networking`` class
321    no longer has any implementation, it should be converted to an
322    `abstractmethod`_
323
324* ensure that the new implementation has unit tests (either by moving
325  existing tests, or by writing new ones)
326* ensure that the new implementation has a docstring
327* add any appropriate type annotations
328
329  * note that we must follow the constraints described in the "Type
330    Annotations" section above, so you may not be able to write
331    complete annotations
332  * we have `type aliases`_ defined in ``cloudinit.distros.networking``
333    which should be used when applicable
334
335* finally, remove it (and any other now-unused functions) from
336  cloudinit.net (to avoid having two parallel implementations)
337
338``cloudinit.net`` Functions/Classes
339~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
340
341The functions/classes that need refactoring break down into some broad
342categories:
343
344* helpers for accessing ``/sys`` (that should not be on the top-level
345  ``Networking`` class as they are Linux-specific):
346
347  * ``get_sys_class_path``
348  * ``sys_dev_path``
349  * ``read_sys_net``
350  * ``read_sys_net_safe``
351  * ``read_sys_net_int``
352
353* those that directly access ``/sys`` (via helpers) and should (IMO) be
354  included in the API of the ``Networking`` class:
355
356  * ``generate_fallback_config``
357
358    * the ``config_driver`` parameter is used and passed as a boolean,
359      so we can change the default value to ``False`` (instead of
360      ``None``)
361
362  * ``get_ib_interface_hwaddr``
363  * ``get_interface_mac``
364  * ``interface_has_own_mac``
365  * ``is_bond``
366  * ``is_bridge``
367  * ``is_physical``
368  * ``is_renamed``
369  * ``is_up``
370  * ``is_vlan``
371  * ``wait_for_physdevs``
372
373* those that directly access ``/sys`` (via helpers) but may be
374  Linux-specific concepts or names:
375
376  * ``get_master``
377  * ``device_devid``
378  * ``device_driver``
379
380* those that directly use ``ip``:
381
382  * ``_get_current_rename_info``
383
384    * this has non-distro-specific logic so should potentially be
385      refactored to use helpers on ``self`` instead of ``ip`` directly
386      (rather than being wholesale reimplemented in each of
387      ``BSDNetworking`` or ``LinuxNetworking``)
388    * we can also remove the ``check_downable`` argument, it's never
389      specified so is always ``True``
390
391  * ``_rename_interfaces``
392
393    * this has several internal helper functions which use ``ip``
394      directly, and it calls ``_get_current_rename_info``.  That said,
395      there appears to be a lot of non-distro-specific logic that could
396      live in a function on ``Networking``, so this will require some
397      careful refactoring to avoid duplicating that logic in each of
398      ``BSDNetworking`` and ``LinuxNetworking``.
399    * only the ``renames`` and ``current_info`` parameters are ever
400      passed in (and ``current_info`` only by tests), so we can remove
401      the others from the definition
402
403  * ``EphemeralIPv4Network``
404
405    * this is another case where it mixes distro-specific and
406      non-specific functionality.  Specifically, ``__init__``,
407      ``__enter__`` and ``__exit__`` are non-specific, and the
408      remaining methods are distro-specific.
409    * when refactoring this, the need to track ``cleanup_cmds`` likely
410      means that the distro-specific behaviour cannot be captured only
411      in the ``Networking`` class.  See `this comment in PR #363`_ for
412      more thoughts.
413
414* those that implicitly use ``/sys`` via their call dependencies:
415
416  * ``master_is_bridge_or_bond``
417
418    * appends to ``get_master`` return value, which is a ``/sys`` path
419
420  * ``extract_physdevs``
421
422    * calls ``device_driver`` and ``device_devid`` in both
423      ``_version_*`` impls
424
425  * ``apply_network_config_names``
426
427    * calls ``extract_physdevs``
428    * there is already a ``Distro.apply_network_config_names`` which in
429      the default implementation calls this function; this and its BSD
430      subclass implementations should be refactored at the same time
431    * the ``strict_present`` and ``strict_busy`` parameters are never
432      passed, nor are they used in the function definition, so they can
433      be removed
434
435  * ``get_interfaces``
436
437    * calls ``device_driver``, ``device_devid`` amongst others
438
439  * ``get_ib_hwaddrs_by_interface``
440
441    * calls ``get_interfaces``
442
443* those that may fall into the above categories, but whose use is only
444  related to netfailover (which relies on a Linux-specific network
445  driver, so is unlikely to be relevant elsewhere without a substantial
446  refactor; these probably only need implementing in
447  ``LinuxNetworking``):
448
449  * ``get_dev_features``
450
451  * ``has_netfail_standby_feature``
452
453    * calls ``get_dev_features``
454
455  * ``is_netfailover``
456  * ``is_netfail_master``
457
458    * this is called from ``generate_fallback_config``
459
460  * ``is_netfail_primary``
461  * ``is_netfail_standby``
462
463  * N.B. all of these take an optional ``driver`` argument which is
464    used to pass around a value to avoid having to look it up by
465    calling ``device_driver`` every time.  This is something of a leaky
466    abstraction, and is better served by caching on ``device_driver``
467    or storing the cached value on ``self``, so we can drop the
468    parameter from the new API.
469
470* those that use ``/sys`` (via helpers) and have non-exhaustive BSD
471  logic:
472
473  * ``get_devicelist``
474
475* those that already have separate Linux/BSD implementations:
476
477  * ``find_fallback_nic``
478  * ``get_interfaces_by_mac``
479
480* those that have no OS-specific functionality (so do not need to be
481  refactored):
482
483  * ``ParserError``
484  * ``RendererNotFoundError``
485  * ``has_url_connectivity``
486  * ``is_ip_address``
487  * ``is_ipv4_address``
488  * ``natural_sort_key``
489
490Note that the functions in ``cloudinit.net`` use inconsistent parameter
491names for "string that contains a device name"; we can standardise on
492``devname`` (the most common one) in the refactor.
493
494Managing Work/Tracking Progress
495~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
496
497To ensure that we won't have multiple people working on the same part
498of the refactor at the same time, there is a bug for each function.
499You can see the current status by looking at `the list of bugs tagged
500net-refactor`_.
501
502When you're working on refactoring a particular method, ensure that you
503have assigned yourself to the corresponding bug, to avoid duplicate
504work.
505
506Generally, when considering what to pick up to refactor, it is best to
507start with functions in ``cloudinit.net`` which are not called by
508anything else in ``cloudinit.net``.  This allows you to focus only on
509refactoring that function and its callsites, rather than having to
510update the other ``cloudinit.net`` function also.
511
512References
513~~~~~~~~~~
514
515* `Mina Galić's email the the cloud-init ML in 2018`_ (plus its thread)
516* `Mina Galić's email to the cloud-init ML in 2019`_ (plus its thread)
517* `PR #363`_, the discussion which prompted finally starting this
518  refactor (and where a lot of the above details were hashed out)
519
520.. _get_interfaces_by_mac: https://github.com/canonical/cloud-init/blob/961239749106daead88da483e7319e9268c67cde/cloudinit/net/__init__.py#L810-L818
521.. _Mina Galić's email the the cloud-init ML in 2018: https://lists.launchpad.net/cloud-init/msg00185.html
522.. _Mina Galić's email to the cloud-init ML in 2019: https://lists.launchpad.net/cloud-init/msg00237.html
523.. _PR #363: https://github.com/canonical/cloud-init/pull/363
524.. _this comment in PR #363: https://github.com/canonical/cloud-init/pull/363#issuecomment-628829489
525.. _abstractmethod: https://docs.python.org/3/library/abc.html#abc.abstractmethod
526.. _type aliases: https://docs.python.org/3/library/typing.html#type-aliases
527.. _the list of bugs tagged net-refactor: https://bugs.launchpad.net/cloud-init/+bugs?field.tag=net-refactor
528