.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements. See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership. The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License. You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
.. software distributed under the License is distributed on an
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
.. KIND, either express or implied. See the License for the
.. specific language governing permissions and limitations
.. under the License.

Packaging and Testing with Crossbow
===================================

The contents of the ``arrow/dev/tasks`` directory automate Arrow packaging
and integration testing.

Packages:
  - C++ and Python `conda-forge packages`_ for Linux, Mac and Windows
  - Python `Wheels`_ for Linux, Mac and Windows
  - C++ and GLib `Linux packages`_ for multiple distributions
  - Java for Gandiva

Integration tests:
  - Various docker tests
  - Pandas
  - Dask
  - Turbodbc
  - HDFS
  - Spark

Architecture
------------

Executors
~~~~~~~~~

Individual jobs are executed on public CI services, currently:

- Linux: TravisCI, CircleCI, Azure Pipelines
- Mac: TravisCI, Azure Pipelines
- Windows: AppVeyor, Azure Pipelines

Queue
~~~~~

Because of how the CI services work, jobs are scheduled through an
additional git repository, which acts as a job queue for the tasks. Anyone
can host a ``queue`` repository, which is usually called ``crossbow``.


A job is a git commit on a particular git branch, containing only the
configuration file required to run the requested build (such as
``.travis.yml``, ``appveyor.yml`` or ``azure-pipelines.yml``).

Scheduler
~~~~~~~~~

`Crossbow.py`_ handles version generation, task rendering and
submission. The tasks are defined in ``tasks.yml``.

Install
-------

The following guide depends on GitHub, but theoretically any git
server can be used.

1. `Create the queue repository`_

2. Enable the `TravisCI`_, `Appveyor`_, `Azure Pipelines`_ and `CircleCI`_
   integrations for the newly created queue repository.

   - Turn off Travis’ `auto cancellation`_ feature on branches.

3. Clone the newly created repository next to the arrow repository:

   By default the scripts look for ``crossbow`` next to the arrow repository,
   but this can be configured through command line arguments.

   .. code:: bash

      git clone https://github.com/<user>/crossbow crossbow

   **Important note:** Crossbow only supports GitHub token based
   authentication. Although it overwrites repository URLs given with the SSH
   protocol, it is advisable to use the HTTPS repository URLs.

4. `Create a Personal Access Token`_ with ``repo`` permissions (other
   permissions are not needed)

5. Locally export the token as an environment variable:

   .. code:: bash

      export CROSSBOW_GITHUB_TOKEN=<token>

   or pass it to the CLI script with the ``--github-token`` argument.

6. Export the previously created GitHub token on the CI services:

   Use the ``CROSSBOW_GITHUB_TOKEN`` encrypted environment variable.
   You can set it at the following URLs, where ``ghuser`` is the GitHub
   username and ``ghrepo`` is the GitHub repository name (typically
   ``crossbow``):

   - TravisCI: ``https://travis-ci.org/<ghuser>/<ghrepo>/settings``
   - Appveyor:
     ``https://ci.appveyor.com/project/<ghuser>/<ghrepo>/settings/environment``
   - CircleCI:
     ``https://circleci.com/gh/<ghuser>/<ghrepo>/edit#env-vars``

   On Appveyor, check the ``skip branches without appveyor.yml`` checkbox
   on the web UI under the crossbow repository’s settings.

7. Install Python (minimum supported version is 3.6):

   Miniconda is preferred, see installation instructions:
   https://conda.io/docs/user-guide/install/index.html

8. Install the Python dependencies for the script:

   .. code:: bash

      conda install -c conda-forge -y --file arrow/ci/conda_env_crossbow.txt

   Alternatively, with pip:

   .. code:: bash

      # pygit2 requires libgit2: http://www.pygit2.org/install.html
      pip install \
          jinja2 \
          pygit2 \
          click \
          ruamel.yaml \
          setuptools_scm \
          github3.py \
          toolz \
          jira

9. Try running it:

   .. code:: bash

      $ python crossbow.py --help

Usage
-----

The script does the following:

1. Detects the current repository, and thus supports forks. The following
   snippet will build kszucs’s fork instead of the upstream apache/arrow
   repository.

   .. code:: bash

      $ git clone https://github.com/kszucs/arrow
      $ git clone https://github.com/kszucs/crossbow

      $ cd arrow/dev/tasks
      $ python crossbow.py submit --help  # show the available options
      $ python crossbow.py submit conda-win conda-linux conda-osx

2. Gets the HEAD commit of the currently checked out branch and
   generates the version number based on `setuptools_scm`_. To build
   a particular branch, check it out before running the script:

   .. code:: bash

      git checkout ARROW-<ticket number>
      python dev/tasks/crossbow.py submit --dry-run conda-linux conda-osx

   Note that the arrow branch must be pushed beforehand, because the
   script will clone the selected branch.

3. Reads and renders the required build configurations with the
   parameters substituted.

4. Creates a branch per task, prefixed with the job id. For example, to
   build conda recipes on Linux it will create a new branch:
   ``crossbow@build-<id>-conda-linux``.

5. Pushes the modified branches to GitHub, which triggers the builds. For
   authentication it uses the GitHub OAuth token described in the install
   section.

Query the build status
~~~~~~~~~~~~~~~~~~~~~~

The build id (which has a corresponding branch in the queue repository) is
returned by the ``submit`` command.

.. code:: bash

   python crossbow.py status <build id / branch name>

Download the build artifacts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: bash

   python crossbow.py artifacts <build id / branch name>

Examples
~~~~~~~~

The ``submit`` command accepts a list of task names and/or a list of
task-group names to select which tasks to build.

Run multiple builds:

.. code:: bash

   $ python crossbow.py submit debian-stretch conda-linux-gcc-py37
   Repository: https://github.com/kszucs/arrow@tasks
   Commit SHA: 810a718836bb3a8cefc053055600bdcc440e6702
   Version: 0.9.1.dev48+g810a7188.d20180414
   Pushed branches:
    - debian-stretch
    - conda-linux-gcc-py37

Just render without applying or committing the changes:

.. code:: bash

   $ python crossbow.py submit --dry-run task_name

Run only the ``conda`` package builds plus one Linux package build:

.. code:: bash

   $ python crossbow.py submit --group conda centos-7

Run ``wheel`` builds:

.. code:: bash

   $ python crossbow.py submit --group wheel

``tasks.yml`` defines multiple task groups, such as ``docker``,
``integration`` and ``cpp-python``, for running docker based tests.

``python crossbow.py submit`` supports multiple options and arguments; for
more, see its help page:

.. code:: bash

   $ python crossbow.py submit --help


.. _conda-forge packages: conda-recipes
.. _Wheels: python-wheels
.. _Linux packages: linux-packages
.. _Crossbow.py: crossbow.py
.. _Create the queue repository: https://help.github.com/articles/creating-a-new-repository
.. _TravisCI: https://travis-ci.org/getting_started
.. _Appveyor: https://www.appveyor.com/docs/
.. _CircleCI: https://circleci.com/docs/2.0/getting-started/
.. _Azure Pipelines: https://docs.microsoft.com/en-us/azure/devops/pipelines/get-started/pipelines-sign-up
.. _auto cancellation: https://docs.travis-ci.com/user/customizing-the-build/#Building-only-the-latest-commit
.. _Create a Personal Access Token: https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/
.. _setuptools_scm: https://pypi.python.org/pypi/setuptools_scm
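
Before running ``submit``, it can help to check the prerequisites from the
install section: an exported ``CROSSBOW_GITHUB_TOKEN`` and a ``crossbow``
clone next to the ``arrow`` clone. The following is only a minimal sketch and
is not part of Crossbow; the ``preflight`` function name and its messages are
hypothetical, and it assumes both repositories are regular clones in the same
parent directory.

.. code:: bash

   # Hypothetical helper, not provided by crossbow.py.
   # Usage: preflight <directory that holds the arrow and crossbow clones>
   preflight() {
       local root="${1:-.}"
       local status=0
       local repo

       # The GitHub token must be exported (or passed via --github-token).
       if [ -z "${CROSSBOW_GITHUB_TOKEN:-}" ]; then
           echo "CROSSBOW_GITHUB_TOKEN is not set" >&2
           status=1
       fi

       # By default the scripts look for crossbow next to arrow;
       # a regular clone contains a .git directory.
       for repo in arrow crossbow; do
           if [ ! -d "${root}/${repo}/.git" ]; then
               echo "missing git clone: ${root}/${repo}" >&2
               status=1
           fi
       done

       return "${status}"
   }

For example, ``preflight .. && python crossbow.py submit --dry-run task_name``
would only attempt the dry run when both checks pass, assuming the command is
run from inside the ``arrow`` clone's parent layout described above.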