.. _scipy-roadmap:

SciPy Roadmap
=============

This roadmap page contains only the most important ideas and needs for SciPy
going forward.  For a more detailed roadmap, including per-subpackage status,
many more ideas, API stability and more, see :ref:`scipy-roadmap-detailed`.


Support for distributed arrays and GPU arrays
---------------------------------------------

NumPy has split its API from its execution engine with
``__array_function__`` and ``__array_ufunc__``.  This will enable parts of SciPy
to accept distributed arrays (e.g. ``dask.array.Array``) and GPU arrays (e.g.
``cupy.ndarray``) that implement the ``ndarray`` interface.  It is not yet
clear which algorithms will work out of the box, or whether there are
significant performance gains when they do.  We want to create a map of which
parts of the SciPy API work, and improve support over time.

In addition to benefiting from NumPy protocols like ``__array_function__``, we
can adopt these protocols in SciPy itself.  That will make it possible to
(re)implement SciPy functions, e.g. those in ``scipy.signal``, for Dask
or GPU arrays (see
`NEP 18 - use outside of NumPy <http://www.numpy.org/neps/nep-0018-array-function-protocol.html#use-outside-of-numpy>`__).
NumPy's features in this area are still evolving
(see e.g. `NEP 37 - A dispatch protocol for NumPy-like modules <https://numpy.org/neps/nep-0037-array-module.html>`__),
and SciPy is an important "client" for those features.


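As a rough illustration of how ``__array_function__`` lets a non-NumPy
container intercept NumPy API calls, here is a minimal sketch.  The
``LoggedArray`` class is hypothetical and exists only for this example; real
libraries such as Dask or CuPy implement the protocol with far more care.

```python
import numpy as np


class LoggedArray:
    """Hypothetical duck array: wraps an ndarray and records dispatched calls."""

    def __init__(self, data):
        self.data = np.asarray(data)
        self.calls = []

    def __array_function__(self, func, types, args, kwargs):
        # Decline if an unrelated overriding type takes part in the call.
        if not all(issubclass(t, LoggedArray) for t in types):
            return NotImplemented
        self.calls.append(func.__name__)
        # Unwrap top-level LoggedArray arguments and defer to NumPy itself.
        unwrapped = [a.data if isinstance(a, LoggedArray) else a for a in args]
        result = func(*unwrapped, **kwargs)
        # Re-wrap array results so the container type is preserved.
        return LoggedArray(result) if isinstance(result, np.ndarray) else result


x = LoggedArray([1.0, 2.0, 3.0])
total = np.sum(x)        # routed through LoggedArray.__array_function__
running = np.cumsum(x)   # array result comes back as a LoggedArray
```

The sketch only unwraps top-level positional arguments, so functions that take
sequences of arrays (such as ``np.concatenate``) would need extra handling.
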
Performance improvements
------------------------

Speed improvements, lower memory usage and the ability to parallelize
algorithms are beneficial to most science domains and use cases.  We have
established an API design pattern for multiprocessing (the ``workers``
keyword) that can be adopted in many more functions.

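To make the ``workers`` pattern concrete, the following is a minimal
standard-library sketch of a function that accepts either an integer or a
map-like callable.  The helper name ``map_with_workers`` is hypothetical, and a
thread pool is used here only to keep the example self-contained; it is not a
description of how any particular SciPy function implements the keyword.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def map_with_workers(func, items, workers=1):
    """Hypothetical helper illustrating a ``workers`` keyword.

    ``workers=1`` runs serially, ``workers=-1`` uses all available CPUs, any
    other positive integer sets the pool size, and a callable is treated as a
    map-like function supplied by the caller.
    """
    if callable(workers):
        return list(workers(func, items))
    if workers == -1:
        workers = os.cpu_count() or 1
    if not isinstance(workers, int) or workers < 1:
        raise ValueError("workers must be -1, a positive int, or a callable")
    if workers == 1:
        return [func(item) for item in items]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, like the built-in map.
        return list(pool.map(func, items))


def square(x):
    return x * x


serial = map_with_workers(square, range(5))
threaded = map_with_workers(square, range(5), workers=2)
custom = map_with_workers(square, range(5), workers=map)
```

Accepting a map-like callable lets callers plug in their own pool (for example
the ``map`` method of a process pool) without the function needing to know
about it.
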
Enabling the use of an accelerator like Pythran, possibly via Transonic, and
making it easier for users to use Numba's ``@njit`` in code that relies
on SciPy functionality would unlock significant performance gains.  That needs
a strategy though, since all of these solutions are still maturing (see for
example
`this overview <https://fluiddyn.bitbucket.io/transonic-vision.html>`__).

Finally, many individual functions can be optimized for performance.
Speed-ups for ``scipy.optimize`` and ``scipy.interpolate`` functions are
requested particularly often.


Statistics enhancements
-----------------------

The `scipy.stats` enhancements listed in the :ref:`scipy-roadmap-detailed` are of
particularly high importance to the project.

- Improve the options for fitting a probability distribution to data.
- Expand the set of hypothesis tests.  In particular, include all the basic
  variations of analysis of variance.
- Add confidence intervals for all statistical tests.


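For context on the first item above, distribution fitting in ``scipy.stats``
goes through the ``fit`` method of continuous distributions, which by default
returns maximum-likelihood estimates.  A minimal sketch, assuming a synthetic
normally distributed sample (the seed, sample size and parameters are arbitrary
choices for illustration):

```python
import numpy as np
from scipy import stats

# Synthetic data: the true location is 5.0 and the true scale is 2.0.
rng = np.random.default_rng(12345)
sample = rng.normal(loc=5.0, scale=2.0, size=1000)

# Estimate both the location and the scale from the data.
loc_hat, scale_hat = stats.norm.fit(sample)

# Hold the location fixed with ``floc`` and estimate only the scale.
loc_fixed, scale_fixed = stats.norm.fit(sample, floc=5.0)
```
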
Support for more hardware platforms
-----------------------------------

SciPy now has continuous integration for ARM64 (or ``aarch64``) and POWER8/9
(or ``ppc64le``), and binaries are available via
`Miniforge <https://github.com/conda-forge/miniforge>`__.  Wheels on PyPI for
these platforms are now also possible (with the ``manylinux2014`` standard),
and requests for those are becoming more frequent.

Additionally, running CI on IBM Z (or ``s390x``) is now possible with TravisCI
but has not yet been set up.  Once it is, ``manylinux2014`` wheels for that
platform become possible as well.  Finally, resolving open AIX build issues
would help users.


Implement sparse arrays in addition to sparse matrices
------------------------------------------------------

The sparse matrix formats are mostly feature-complete; however, the main issue
is that they behave like ``numpy.matrix`` (which will be deprecated in NumPy at
some point).  What we want is sparse *arrays* that behave like
``numpy.ndarray``.  This is being worked on in
https://github.com/pydata/sparse, which is quite far along.  The tentative
plan is:

- Start depending on ``pydata/sparse`` once it's feature-complete enough (it
  still needs a CSC/CSR equivalent) and okay performance-wise.
- Indicate in the documentation that for new code users should prefer
  ``pydata/sparse`` over sparse matrices.
- When NumPy deprecates ``numpy.matrix``, vendor it or maintain it as a
  stand-alone package.

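The difference between ``numpy.matrix`` and ``numpy.ndarray`` semantics
referred to above shows up in operators and indexing.  The following is a
small NumPy-only sketch; dense arrays are used purely for illustration and no
sparse API is exercised.

```python
import warnings

import numpy as np

a = np.array([[1, 2], [3, 4]])

with warnings.catch_warnings():
    # numpy.matrix may emit a PendingDeprecationWarning; silence it here.
    warnings.simplefilter("ignore")
    m = np.matrix(a)
    matrix_product = m * m   # matrix semantics: ``*`` is matrix multiplication
    matrix_row = m[0]        # indexing a row keeps two dimensions

array_product = a * a        # ndarray semantics: ``*`` is elementwise
array_row = a[0]             # indexing a row drops to one dimension
```

With array semantics, matrix multiplication is spelled ``a @ a`` instead, which
keeps ``*`` consistent with the rest of the NumPy API.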