1.. include:: _contributors.rst
2
3.. currentmodule:: sklearn
4
5.. _changes_1_0_2:
6
7Version 1.0.2
8=============
9
10**December 2021**
11
12- |Fix| :class:`cluster.Birch`,
13  :class:`feature_selection.RFECV`, :class:`ensemble.RandomForestRegressor`,
14  :class:`ensemble.RandomForestClassifier`,
15  :class:`ensemble.GradientBoostingRegressor`, and
  :class:`ensemble.GradientBoostingClassifier` no longer raise a warning when
  fitted on a pandas DataFrame. :pr:`21578` by `Thomas Fan`_.
18
19Changelog
20---------
21
22:mod:`sklearn.cluster`
23......................
24
- |Fix| Fixed an infinite loop in :class:`cluster.SpectralClustering` by
26  moving an iteration counter from try to except.
27  :pr:`21271` by :user:`Tyler Martin <martintb>`.
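
As a minimal, hypothetical sketch of this kind of fix (not the actual
scikit-learn code): a retry loop is only guaranteed to terminate if its counter
advances on the failure path. The names ``FlakyError`` and
``run_with_restarts`` are assumptions made for illustration.

```python
class FlakyError(Exception):
    """Hypothetical stand-in for a numerical failure, e.g. a non-converging SVD."""


def run_with_restarts(step, max_restarts=3):
    """Call ``step`` until it succeeds or ``max_restarts`` failures occurred."""
    restarts = 0
    while restarts < max_restarts:
        try:
            return step()
        except FlakyError:
            # Counting the failure here bounds the loop. If the increment sat
            # in the ``try`` body after the failing call, it would never run.
            restarts += 1
    raise RuntimeError(f"no success after {max_restarts} restarts")


def always_fails():
    """A step that never succeeds, used to show that the loop still ends."""
    raise FlakyError("did not converge")


try:
    run_with_restarts(always_fails)
except RuntimeError as exc:
    outcome = str(exc)
```

With the counter in the ``except`` branch, a step that always fails ends in a
``RuntimeError`` instead of looping forever.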
28
29:mod:`sklearn.datasets`
30.......................
31
32- |Fix| :func:`datasets.fetch_openml` is now thread safe. Data is first
33  downloaded to a temporary subfolder and then renamed.
34  :pr:`21833` by :user:`Siavash Rezazadeh <siavrez>`.
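
The download-then-rename pattern can be sketched with the standard library
alone. This is an illustration under stated assumptions, not the code used by
``fetch_openml``: the helper name ``cache_atomically`` and its arguments are
hypothetical.

```python
import os
import tempfile
from pathlib import Path


def cache_atomically(cache_dir, filename, payload):
    """Write ``payload`` into ``cache_dir/filename`` without exposing partial files."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    final_path = cache_dir / filename
    # The temporary subfolder lives inside the cache directory so that the
    # final rename stays on the same filesystem.
    with tempfile.TemporaryDirectory(dir=cache_dir) as tmp:
        tmp_path = Path(tmp) / filename
        tmp_path.write_bytes(payload)
        # os.replace swaps the file in atomically on the same filesystem, so
        # concurrent readers see the old file or the new one, never a partial one.
        os.replace(tmp_path, final_path)
    return final_path
```

Because each writer works in its own temporary subfolder, two threads fetching
the same dataset never write to the same partially downloaded file.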
35
36:mod:`sklearn.decomposition`
37............................
38
39- |Fix| Fixed the constraint on the objective function of
40  :class:`decomposition.DictionaryLearning`,
41  :class:`decomposition.MiniBatchDictionaryLearning`, :class:`decomposition.SparsePCA`
42  and :class:`decomposition.MiniBatchSparsePCA` to be convex and match the referenced
43  article. :pr:`19210` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
44
45:mod:`sklearn.ensemble`
46.......................
47
48- |Fix| :class:`ensemble.RandomForestClassifier`,
49  :class:`ensemble.RandomForestRegressor`,
50  :class:`ensemble.ExtraTreesClassifier`, :class:`ensemble.ExtraTreesRegressor`,
51  and :class:`ensemble.RandomTreesEmbedding` now raise a ``ValueError`` when
52  ``bootstrap=False`` and ``max_samples`` is not ``None``.
  :pr:`21295` by :user:`Haoyin Xu <PSSF23>`.
54
- |Fix| Fixed a bug in :class:`ensemble.GradientBoostingClassifier` where the
56  exponential loss was computing the positive gradient instead of the
57  negative one.
58  :pr:`22050` by :user:`Guillaume Lemaitre <glemaitre>`.
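
The sign issue in the exponential loss can be checked numerically. The sketch
below assumes the usual form of the loss for labels in ``{0, 1}``,
``exp(-(2 * y - 1) * raw)``, and compares an analytic negative gradient with a
finite-difference estimate. The function names are illustrative and do not
mirror scikit-learn internals.

```python
import math


def exponential_loss(y, raw):
    """Exponential loss for a label y in {0, 1} and a raw prediction."""
    signed = 2.0 * y - 1.0  # map {0, 1} to {-1, +1}
    return math.exp(-signed * raw)


def negative_gradient(y, raw):
    """Negative derivative of the loss with respect to the raw prediction."""
    signed = 2.0 * y - 1.0
    return signed * math.exp(-signed * raw)


def numeric_gradient(y, raw, eps=1e-6):
    """Central finite-difference estimate of d(loss) / d(raw)."""
    return (exponential_loss(y, raw + eps) - exponential_loss(y, raw - eps)) / (2 * eps)


for y in (0.0, 1.0):
    for raw in (-1.5, 0.0, 0.7):
        # The negative gradient must cancel the (positive) numeric gradient.
        assert abs(negative_gradient(y, raw) + numeric_gradient(y, raw)) < 1e-6
```

Boosting fits each new tree to the negative gradient, so returning the positive
gradient pushes predictions in the wrong direction.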
59
60:mod:`sklearn.feature_selection`
61................................
62
63- |Fix| Fixed :class:`feature_selection.SelectFromModel` by improving support
64  for base estimators that do not set `feature_names_in_`. :pr:`21991` by
65  `Thomas Fan`_.
66
:mod:`sklearn.linear_model`
...........................

- |Fix| Fixed a bug in :class:`linear_model.RidgeClassifierCV` where the method
  `predict` was performing an `argmax` on the scores obtained from
  `decision_function` instead of returning the multilabel indicator matrix.
  :pr:`19869` by :user:`Guillaume Lemaitre <glemaitre>`.
77
78- |Fix| :class:`linear_model.LassoLarsIC` now correctly computes AIC
79  and BIC. An error is now raised when `n_features > n_samples` and
80  when the noise variance is not provided.
81  :pr:`21481` by :user:`Guillaume Lemaitre <glemaitre>` and
82  :user:`Andrés Babino <ababino>`.
83
84:mod:`sklearn.manifold`
85.......................
86
87- |Fix| Fixed an unnecessary error when fitting :class:`manifold.Isomap` with a
88  precomputed dense distance matrix where the neighbors graph has multiple
89  disconnected components. :pr:`21915` by `Tom Dupre la Tour`_.
90
91:mod:`sklearn.metrics`
92......................
93
94- |Fix| All :class:`sklearn.metrics.DistanceMetric` subclasses now correctly support
95  read-only buffer attributes.
96  This fixes a regression introduced in 1.0.0 with respect to 0.24.2.
97  :pr:`21694` by :user:`Julien Jerphanion <jjerphan>`.
98
- |Fix| :class:`sklearn.metrics.MinkowskiDistance` now accepts a weight
  parameter that makes it possible to write code that behaves consistently both
  with scipy 1.8 and earlier versions. In turn, this means that all
  neighbors-based estimators (except those that use `algorithm="kd_tree"`) now
  accept a weight parameter with `metric="minkowski"` to yield results that
  are always consistent with `scipy.spatial.distance.cdist`.
105  :pr:`21741` by :user:`Olivier Grisel <ogrisel>`.
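
For orientation, the weighted Minkowski distance can be written out in plain
Python. The sketch assumes the convention documented for
``scipy.spatial.distance.minkowski``, where the weights multiply the powered
differences, and contrasts it with the convention of the older ``wminkowski``
function, where the weights multiply the differences before the power is taken.
Both helper names are illustrative.

```python
def weighted_minkowski(x, y, w, p=2.0):
    """(sum_i w_i * |x_i - y_i| ** p) ** (1 / p)."""
    total = sum(wi * abs(xi - yi) ** p for xi, yi, wi in zip(x, y, w))
    return total ** (1.0 / p)


def legacy_wminkowski(x, y, w, p=2.0):
    """(sum_i |w_i * (x_i - y_i)| ** p) ** (1 / p)."""
    total = sum(abs(wi * (xi - yi)) ** p for xi, yi, wi in zip(x, y, w))
    return total ** (1.0 / p)


x, y = [0.0, 0.0], [3.0, 4.0]
p = 2.0
legacy_weights = [2.0, 1.0]
# The two conventions agree once the legacy weights are raised to the power p.
new_weights = [wi ** p for wi in legacy_weights]
d_new = weighted_minkowski(x, y, new_weights, p)
d_legacy = legacy_wminkowski(x, y, legacy_weights, p)
```

Code that has to give the same result under both conventions can convert the
weights explicitly, as done for ``new_weights`` above.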
106
107:mod:`sklearn.neighbors`
108........................
109
- |Fix| :class:`neighbors.KDTree` and :class:`neighbors.BallTree` correctly support
111  read-only buffer attributes. :pr:`21845` by `Thomas Fan`_.
112
113:mod:`sklearn.preprocessing`
114............................
115
- |Fix| Fixes a compatibility bug with NumPy 1.22 in :class:`preprocessing.OneHotEncoder`.
117  :pr:`21517` by `Thomas Fan`_.
118
119:mod:`sklearn.tree`
120...................
121
122- |Fix| Prevents :func:`tree.plot_tree` from drawing out of the boundary of
123  the figure. :pr:`21917` by `Thomas Fan`_.
124
125- |Fix| Support loading pickles of decision tree models when the pickle has
126  been generated on a platform with a different bitness. A typical example is
  to train and pickle the model on a 64 bit machine and load the model on a 32
128  bit machine for prediction. :pr:`21552` by :user:`Loïc Estève <lesteve>`.
129
130:mod:`sklearn.utils`
131....................
132
133- |Fix| :func:`utils.estimator_html_repr` now escapes all the estimator
134  descriptions in the generated HTML. :pr:`21493` by
135  :user:`Aurélien Geron <ageron>`.
136
137.. _changes_1_0_1:
138
139Version 1.0.1
140=============
141
142**October 2021**
143
144Changelog
145---------
146
147Fixed models
148------------
149
150- |Fix| Non-fit methods in the following classes do not raise a UserWarning
151  when fitted on DataFrames with valid feature names:
152  :class:`covariance.EllipticEnvelope`, :class:`ensemble.IsolationForest`,
153  :class:`ensemble.AdaBoostClassifier`, :class:`neighbors.KNeighborsClassifier`,
154  :class:`neighbors.KNeighborsRegressor`,
155  :class:`neighbors.RadiusNeighborsClassifier`,
156  :class:`neighbors.RadiusNeighborsRegressor`. :pr:`21199` by `Thomas Fan`_.
157
158:mod:`sklearn.calibration`
159..........................
160
161- |Fix| Fixed :class:`calibration.CalibratedClassifierCV` to take into account
162  `sample_weight` when computing the base estimator prediction when
163  `ensemble=False`.
164  :pr:`20638` by :user:`Julien Bohné <JulienB-78>`.
165
166- |Fix| Fixed a bug in :class:`calibration.CalibratedClassifierCV` with
  `method="sigmoid"` that was ignoring the `sample_weight` when computing the
  Bayesian priors.
169  :pr:`21179` by :user:`Guillaume Lemaitre <glemaitre>`.
170
171:mod:`sklearn.cluster`
172......................
173
174- |Fix| Fixed a bug in :class:`cluster.KMeans`, ensuring reproducibility and equivalence
175  between sparse and dense input. :pr:`21195`
176  by :user:`Jérémie du Boisberranger <jeremiedbb>`.
177
178:mod:`sklearn.ensemble`
179.......................
180
181- |Fix| Fixed a bug that could produce a segfault in rare cases for
182  :class:`ensemble.HistGradientBoostingClassifier` and
183  :class:`ensemble.HistGradientBoostingRegressor`.
  :pr:`21130` by :user:`Christian Lorentzen <lorentzenchr>`.
185
186:mod:`sklearn.gaussian_process`
187...............................
188
189- |Fix| Compute `y_std` properly with multi-target in
190  :class:`sklearn.gaussian_process.GaussianProcessRegressor` allowing
  proper normalization in the multi-target setting.
192  :pr:`20761` by :user:`Patrick de C. T. R. Ferreira <patrickctrf>`.
193
194:mod:`sklearn.feature_extraction`
195.................................
196
197- |Efficiency| Fixed an efficiency regression introduced in version 1.0.0 in the
198  `transform` method of :class:`feature_extraction.text.CountVectorizer` which no
199  longer checks for uppercase characters in the provided vocabulary. :pr:`21251`
200  by :user:`Jérémie du Boisberranger <jeremiedbb>`.
201
- |Fix| Fixed a bug in :class:`feature_extraction.text.CountVectorizer` and
  :class:`feature_extraction.text.TfidfVectorizer` by raising an
  error when `min_df` or `max_df` are floating-point numbers greater than 1.
205  :pr:`20752` by :user:`Alek Lefebvre <AlekLefebvre>`.
206
207:mod:`sklearn.linear_model`
208...........................
209
- |Fix| Improves stability of :class:`linear_model.LassoLars` for different
  versions of OpenBLAS. :pr:`21340` by `Thomas Fan`_.
212
213- |Fix| :class:`linear_model.LogisticRegression` now raises a better error
214  message when the solver does not support sparse matrices with int64 indices.
215  :pr:`21093` by `Tom Dupre la Tour`_.
216
217:mod:`sklearn.neighbors`
218........................
219
220- |Fix| :class:`neighbors.KNeighborsClassifier`,
221  :class:`neighbors.KNeighborsRegressor`,
222  :class:`neighbors.RadiusNeighborsClassifier`,
  :class:`neighbors.RadiusNeighborsRegressor` with `metric="precomputed"` raise
224  an error for `bsr` and `dok` sparse matrices in methods: `fit`, `kneighbors`
225  and `radius_neighbors`, due to handling of explicit zeros in `bsr` and `dok`
226  :term:`sparse graph` formats. :pr:`21199` by `Thomas Fan`_.
227
228:mod:`sklearn.pipeline`
229.......................
230
231- |Fix| :meth:`pipeline.Pipeline.get_feature_names_out` correctly passes feature
232  names out from one step of a pipeline to the next. :pr:`21351` by
233  `Thomas Fan`_.
234
235:mod:`sklearn.svm`
236..................
237
238- |Fix| :class:`svm.SVC` and :class:`svm.SVR` check for an inconsistency
  in their internal representation and raise an error instead of segfaulting.
240  This fix also resolves
241  `CVE-2020-28975 <https://nvd.nist.gov/vuln/detail/CVE-2020-28975>`__.
242  :pr:`21336` by `Thomas Fan`_.
243
244:mod:`sklearn.utils`
245....................
246
247- |Enhancement| :func:`utils.validation._check_sample_weight` can perform a
248  non-negativity check on the sample weights. It can be turned on
  using the `only_non_negative` bool parameter.
  Estimators that check for non-negative weights are updated:
  :class:`linear_model.LinearRegression` (here the previous
  error message was misleading),
  :class:`ensemble.AdaBoostClassifier`,
  :class:`ensemble.AdaBoostRegressor`,
  :class:`neighbors.KernelDensity`.
256  :pr:`20880` by :user:`Guillaume Lemaitre <glemaitre>`
257  and :user:`András Simon <simonandras>`.
258
259- |Fix| Solve a bug in :func:`~sklearn.utils.metaestimators.if_delegate_has_method`
260  where the underlying check for an attribute did not work with NumPy arrays.
261  :pr:`21145` by :user:`Zahlii <Zahlii>`.
262
263Miscellaneous
264.............
265
- |Fix| Refitting an estimator on a dataset without feature names, after it was
  previously fitted on a dataset with feature names, no longer keeps the old
  feature names stored in the `feature_names_in_` attribute. :pr:`21389` by
269  :user:`Jérémie du Boisberranger <jeremiedbb>`.
270
271.. _changes_1_0:
272
273Version 1.0.0
274=============
275
276**September 2021**
277
278For a short description of the main highlights of the release, please
279refer to
280:ref:`sphx_glr_auto_examples_release_highlights_plot_release_highlights_1_0_0.py`.
281
282.. include:: changelog_legend.inc
283
284Minimal dependencies
285--------------------
286
Version 1.0.0 of scikit-learn requires Python 3.7+, NumPy 1.14.6+ and
SciPy 1.1.0+. The optional minimal dependency is matplotlib 2.2.2+.
289
290Enforcing keyword-only arguments
291--------------------------------
292
293In an effort to promote clear and non-ambiguous use of the library, most
294constructor and function parameters must now be passed as keyword arguments
295(i.e. using the `param=value` syntax) instead of positional. If a keyword-only
296parameter is used as positional, a `TypeError` is now raised.
297:issue:`15005` :pr:`20002` by `Joel Nothman`_, `Adrin Jalali`_, `Thomas Fan`_,
298`Nicolas Hug`_, and `Tom Dupre la Tour`_. See `SLEP009
299<https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest/slep009/proposal.html>`_
300for more details.
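
The mechanism behind keyword-only parameters is plain Python: every parameter
declared after a bare ``*`` in a signature can only be passed by name. The
class below is a hypothetical toy, not a scikit-learn estimator, and only
illustrates the resulting behavior.

```python
class ToyEstimator:
    """Hypothetical estimator; every parameter after ``*`` is keyword-only."""

    def __init__(self, *, alpha=1.0, fit_intercept=True):
        self.alpha = alpha
        self.fit_intercept = fit_intercept


est = ToyEstimator(alpha=0.5)  # the keyword form is accepted

try:
    ToyEstimator(0.5)  # the positional form is rejected by Python itself
except TypeError as exc:
    message = str(exc)
```

Calls that previously relied on argument order need to be rewritten with
explicit ``param=value`` pairs.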
301
302Changed models
303--------------
304
305The following estimators and functions, when fit with the same data and
306parameters, may produce different models from the previous version. This often
307occurs due to changes in the modelling logic (bug fixes or enhancements), or in
308random sampling procedures.
309
310- |Fix| :class:`manifold.TSNE` now avoids numerical underflow issues during
311  affinity matrix computation.
312
313- |Fix| :class:`manifold.Isomap` now connects disconnected components of the
314  neighbors graph along some minimum distance pairs, instead of changing
  every infinite distance to zero.
316
317- |Fix| The splitting criterion of :class:`tree.DecisionTreeClassifier` and
318  :class:`tree.DecisionTreeRegressor` can be impacted by a fix in the handling
319  of rounding errors. Previously some extra spurious splits could occur.
320
321Details are listed in the changelog below.
322
323(While we are trying to better inform users by providing this information, we
324cannot assure that this list is complete.)
325
326
327Changelog
328---------
329
330..
331    Entries should be grouped by module (in alphabetic order) and prefixed with
332    one of the labels: |MajorFeature|, |Feature|, |Efficiency|, |Enhancement|,
333    |Fix| or |API| (see whats_new.rst for descriptions).
334    Entries should be ordered by those labels (e.g. |Fix| after |Efficiency|).
335    Changes not specific to a module should be listed under *Multiple Modules*
336    or *Miscellaneous*.
337    Entries should end with:
338    :pr:`123456` by :user:`Joe Bloggs <joeongithub>`.
339    where 123456 is the *pull request* number, not the issue number.
340
341- |API| The option for using the squared error via ``loss`` and
342  ``criterion`` parameters was made more consistent. The preferred way is by
343  setting the value to `"squared_error"`. Old option names are still valid,
344  produce the same models, but are deprecated and will be removed in version
345  1.2.
346  :pr:`19310` by :user:`Christian Lorentzen <lorentzenchr>`.
347
348  - For :class:`ensemble.ExtraTreesRegressor`, `criterion="mse"` is deprecated,
349    use `"squared_error"` instead which is now the default.
350
351  - For :class:`ensemble.GradientBoostingRegressor`, `loss="ls"` is deprecated,
352    use `"squared_error"` instead which is now the default.
353
354  - For :class:`ensemble.RandomForestRegressor`, `criterion="mse"` is deprecated,
355    use `"squared_error"` instead which is now the default.
356
357  - For :class:`ensemble.HistGradientBoostingRegressor`, `loss="least_squares"`
358    is deprecated, use `"squared_error"` instead which is now the default.
359
360  - For :class:`linear_model.RANSACRegressor`, `loss="squared_loss"` is
361    deprecated, use `"squared_error"` instead.
362
363  - For :class:`linear_model.SGDRegressor`, `loss="squared_loss"` is
364    deprecated, use `"squared_error"` instead which is now the default.
365
366  - For :class:`tree.DecisionTreeRegressor`, `criterion="mse"` is deprecated,
367    use `"squared_error"` instead which is now the default.
368
369  - For :class:`tree.ExtraTreeRegressor`, `criterion="mse"` is deprecated,
370    use `"squared_error"` instead which is now the default.
371
372- |API| The option for using the absolute error via ``loss`` and
373  ``criterion`` parameters was made more consistent. The preferred way is by
374  setting the value to `"absolute_error"`. Old option names are still valid,
375  produce the same models, but are deprecated and will be removed in version
376  1.2.
377  :pr:`19733` by :user:`Christian Lorentzen <lorentzenchr>`.
378
379  - For :class:`ensemble.ExtraTreesRegressor`, `criterion="mae"` is deprecated,
380    use `"absolute_error"` instead.
381
382  - For :class:`ensemble.GradientBoostingRegressor`, `loss="lad"` is deprecated,
383    use `"absolute_error"` instead.
384
385  - For :class:`ensemble.RandomForestRegressor`, `criterion="mae"` is deprecated,
386    use `"absolute_error"` instead.
387
388  - For :class:`ensemble.HistGradientBoostingRegressor`,
389    `loss="least_absolute_deviation"` is deprecated, use `"absolute_error"`
390    instead.
391
392  - For :class:`linear_model.RANSACRegressor`, `loss="absolute_loss"` is
393    deprecated, use `"absolute_error"` instead which is now the default.
394
395  - For :class:`tree.DecisionTreeRegressor`, `criterion="mae"` is deprecated,
396    use `"absolute_error"` instead.
397
398  - For :class:`tree.ExtraTreeRegressor`, `criterion="mae"` is deprecated,
399    use `"absolute_error"` instead.
400
401- |API| `np.matrix` usage is deprecated in 1.0 and will raise a `TypeError` in
402  1.2. :pr:`20165` by `Thomas Fan`_.
403
404- |API| :term:`get_feature_names_out` has been added to the transformer API
405  to get the names of the output features. :term:`get_feature_names` has in
406  turn been deprecated. :pr:`18444` by `Thomas Fan`_.
407
- |API| All estimators store `feature_names_in_` when fitted on pandas DataFrames.
409  These feature names are compared to names seen in non-`fit` methods, e.g.
410  `transform` and will raise a `FutureWarning` if they are not consistent.
411  These ``FutureWarning`` s will become ``ValueError`` s in 1.2. :pr:`18010` by
412  `Thomas Fan`_.
413
414:mod:`sklearn.base`
415...................
416
417- |Fix| :func:`config_context` is now threadsafe. :pr:`18736` by `Thomas Fan`_.
418
419:mod:`sklearn.calibration`
420..........................
421
- |Feature| :class:`calibration.CalibrationDisplay` added to plot
423  calibration curves. :pr:`17443` by :user:`Lucy Liu <lucyleeow>`.
424
425- |Fix| The ``predict`` and ``predict_proba`` methods of
426  :class:`calibration.CalibratedClassifierCV` can now properly be used on
427  prefitted pipelines. :pr:`19641` by :user:`Alek Lefebvre <AlekLefebvre>`.
428
429- |Fix| Fixed an error when using a :class:`ensemble.VotingClassifier`
430  as `base_estimator` in :class:`calibration.CalibratedClassifierCV`.
431  :pr:`20087` by :user:`Clément Fauchereau <clement-f>`.
432
433
434:mod:`sklearn.cluster`
435......................
436
437- |Efficiency| The ``"k-means++"`` initialization of :class:`cluster.KMeans`
438  and :class:`cluster.MiniBatchKMeans` is now faster, especially in multicore
439  settings. :pr:`19002` by :user:`Jon Crall <Erotemic>` and :user:`Jérémie du
440  Boisberranger <jeremiedbb>`.
441
442- |Efficiency| :class:`cluster.KMeans` with `algorithm='elkan'` is now faster
443  in multicore settings. :pr:`19052` by
444  :user:`Yusuke Nagasaka <YusukeNagasaka>`.
445
446- |Efficiency| :class:`cluster.MiniBatchKMeans` is now faster in multicore
447  settings. :pr:`17622` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
448
449- |Efficiency| :class:`cluster.OPTICS` can now cache the output of the
450  computation of the tree, using the `memory` parameter.  :pr:`19024` by
451  :user:`Frankie Robertson <frankier>`.
452
453- |Enhancement| The `predict` and `fit_predict` methods of
454  :class:`cluster.AffinityPropagation` now accept sparse data type for input
455  data.
  :pr:`20117` by :user:`Venkatachalam Natchiappan <venkyyuvy>`.
457
458- |Fix| Fixed a bug in :class:`cluster.MiniBatchKMeans` where the sample
459  weights were partially ignored when the input is sparse. :pr:`17622` by
460  :user:`Jérémie du Boisberranger <jeremiedbb>`.
461
462- |Fix| Improved convergence detection based on center change in
463  :class:`cluster.MiniBatchKMeans` which was almost never achievable.
464  :pr:`17622` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
465
- |Fix| :class:`cluster.AgglomerativeClustering` now supports readonly
467  memory-mapped datasets.
468  :pr:`19883` by :user:`Julien Jerphanion <jjerphan>`.
469
470- |Fix| :class:`cluster.AgglomerativeClustering` correctly connects components
471  when connectivity and affinity are both precomputed and the number
472  of connected components is greater than 1. :pr:`20597` by
473  `Thomas Fan`_.
474
475- |Fix| :class:`cluster.FeatureAgglomeration` does not accept a ``**params`` kwarg in
476  the ``fit`` function anymore, resulting in a more concise error message. :pr:`20899`
477  by :user:`Adam Li <adam2392>`.
478
479- |Fix| Fixed a bug in :class:`cluster.KMeans`, ensuring reproducibility and equivalence
480  between sparse and dense input. :pr:`20200`
481  by :user:`Jérémie du Boisberranger <jeremiedbb>`.
482
483- |API| :class:`cluster.Birch` attributes, `fit_` and `partial_fit_`, are
484  deprecated and will be removed in 1.2. :pr:`19297` by `Thomas Fan`_.
485
- |API| The default value for the `batch_size` parameter of
487  :class:`cluster.MiniBatchKMeans` was changed from 100 to 1024 due to
488  efficiency reasons. The `n_iter_` attribute of
489  :class:`cluster.MiniBatchKMeans` now reports the number of started epochs and
490  the `n_steps_` attribute reports the number of mini batches processed.
491  :pr:`17622` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
492
493- |API| :func:`cluster.spectral_clustering` raises an improved error when passed
494  a `np.matrix`. :pr:`20560` by `Thomas Fan`_.
495
496:mod:`sklearn.compose`
497......................
498
499- |Enhancement| :class:`compose.ColumnTransformer` now records the output
500  of each transformer in `output_indices_`. :pr:`18393` by
501  :user:`Luca Bittarello <lbittarello>`.
502
503- |Enhancement| :class:`compose.ColumnTransformer` now allows DataFrame input to
504  have its columns appear in a changed order in `transform`. Further, columns that
505  are dropped will not be required in transform, and additional columns will be
506  ignored if `remainder='drop'`. :pr:`19263` by `Thomas Fan`_.
507
508- |Enhancement| Adds `**predict_params` keyword argument to
509  :meth:`compose.TransformedTargetRegressor.predict` that passes keyword
510  argument to the regressor.
511  :pr:`19244` by :user:`Ricardo <ricardojnf>`.
512
- |Fix| :meth:`compose.ColumnTransformer.get_feature_names` supports
514  non-string feature names returned by any of its transformers. However, note
515  that ``get_feature_names`` is deprecated, use ``get_feature_names_out``
516  instead. :pr:`18459` by :user:`Albert Villanova del Moral <albertvillanova>`
517  and :user:`Alonso Silva Allende <alonsosilvaallende>`.
518
519- |Fix| :class:`compose.TransformedTargetRegressor` now takes nD targets with
520  an adequate transformer.
521  :pr:`18898` by :user:`Oras Phongpanagnam <panangam>`.
522
523- |API| Adds `verbose_feature_names_out` to :class:`compose.ColumnTransformer`.
524  This flag controls the prefixing of feature names out in
525  :term:`get_feature_names_out`. :pr:`18444` and :pr:`21080` by `Thomas Fan`_.
526
527:mod:`sklearn.covariance`
528.........................
529
530- |Fix| Adds arrays check to :func:`covariance.ledoit_wolf` and
531  :func:`covariance.ledoit_wolf_shrinkage`. :pr:`20416` by :user:`Hugo Defois
532  <defoishugo>`.
533
534- |API| Deprecates the following keys in `cv_results_`: `'mean_score'`,
  `'std_score'`, and `'split(k)_score'` in favor of `'mean_test_score'`,
536  `'std_test_score'`, and `'split(k)_test_score'`. :pr:`20583` by `Thomas Fan`_.
537
538:mod:`sklearn.datasets`
539.......................
540
541- |Enhancement| :func:`datasets.fetch_openml` now supports categories with
542  missing values when returning a pandas dataframe. :pr:`19365` by
543  `Thomas Fan`_ and :user:`Amanda Dsouza <amy12xx>` and
544  :user:`EL-ATEIF Sara <elateifsara>`.
545
546- |Enhancement| :func:`datasets.fetch_kddcup99` raises a better message
  when the cached file is invalid. :pr:`19669` by `Thomas Fan`_.
548
549- |Enhancement| Replace usages of ``__file__`` related to resource file I/O
550  with ``importlib.resources`` to avoid the assumption that these resource
551  files (e.g. ``iris.csv``) already exist on a filesystem, and by extension
552  to enable compatibility with tools such as ``PyOxidizer``.
553  :pr:`20297` by :user:`Jack Liu <jackzyliu>`.
554
555- |Fix| Shorten data file names in the openml tests to better support
556  installing on Windows and its default 260 character limit on file names.
557  :pr:`20209` by `Thomas Fan`_.
558
559- |Fix| :func:`datasets.fetch_kddcup99` returns dataframes when
560  `return_X_y=True` and `as_frame=True`. :pr:`19011` by `Thomas Fan`_.
561
562- |API| Deprecates :func:`datasets.load_boston` in 1.0 and it will be removed
563  in 1.2. Alternative code snippets to load similar datasets are provided.
  Please refer to the docstring of the function for details.
565  :pr:`20729` by `Guillaume Lemaitre`_.
566
567
568:mod:`sklearn.decomposition`
569............................
570
- |Enhancement| Added a new approximate solver (randomized SVD, available with
572  `eigen_solver='randomized'`) to :class:`decomposition.KernelPCA`. This
573  significantly accelerates computation when the number of samples is much
574  larger than the desired number of components.
575  :pr:`12069` by :user:`Sylvain Marié <smarie>`.
576
577- |Fix| Fixes incorrect multiple data-conversion warnings when clustering
578  boolean data. :pr:`19046` by :user:`Surya Prakash <jdsurya>`.
579
- |Fix| Fixed :func:`decomposition.dict_learning`, used by
581  :class:`decomposition.DictionaryLearning`, to ensure determinism of the
582  output. Achieved by flipping signs of the SVD output which is used to
583  initialize the code. :pr:`18433` by :user:`Bruno Charron <brcharron>`.
584
585- |Fix| Fixed a bug in :class:`decomposition.MiniBatchDictionaryLearning`,
586  :class:`decomposition.MiniBatchSparsePCA` and
587  :func:`decomposition.dict_learning_online` where the update of the dictionary
588  was incorrect. :pr:`19198` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
589
590- |Fix| Fixed a bug in :class:`decomposition.DictionaryLearning`,
591  :class:`decomposition.SparsePCA`,
592  :class:`decomposition.MiniBatchDictionaryLearning`,
593  :class:`decomposition.MiniBatchSparsePCA`,
594  :func:`decomposition.dict_learning` and
595  :func:`decomposition.dict_learning_online` where the restart of unused atoms
596  during the dictionary update was not working as expected. :pr:`19198` by
597  :user:`Jérémie du Boisberranger <jeremiedbb>`.
598
599- |API| In :class:`decomposition.DictionaryLearning`,
600  :class:`decomposition.MiniBatchDictionaryLearning`,
601  :func:`decomposition.dict_learning` and
602  :func:`decomposition.dict_learning_online`, `transform_alpha` will be equal
  to `alpha` instead of 1.0 by default starting from version 1.2. :pr:`19159` by
604  :user:`Benoît Malézieux <bmalezieux>`.
605
- |API| Rename variable names in :class:`decomposition.KernelPCA` to improve
607  readability. `lambdas_` and `alphas_` are renamed to `eigenvalues_`
608  and `eigenvectors_`, respectively. `lambdas_` and `alphas_` are
609  deprecated and will be removed in 1.2.
610  :pr:`19908` by :user:`Kei Ishikawa <kstoneriv3>`.
611
612- |API| The `alpha` and `regularization` parameters of :class:`decomposition.NMF` and
613  :func:`decomposition.non_negative_factorization` are deprecated and will be removed
614  in 1.2. Use the new parameters `alpha_W` and `alpha_H` instead. :pr:`20512` by
615  :user:`Jérémie du Boisberranger <jeremiedbb>`.
616
617:mod:`sklearn.dummy`
618....................
619
- |API| Attribute `n_features_in_` in :class:`dummy.DummyClassifier` and
621  :class:`dummy.DummyRegressor` is deprecated and will be removed in 1.2.
622  :pr:`20960` by `Thomas Fan`_.
623
624:mod:`sklearn.ensemble`
625.......................
626
627- |Enhancement| :class:`~sklearn.ensemble.HistGradientBoostingClassifier` and
628  :class:`~sklearn.ensemble.HistGradientBoostingRegressor` take cgroups quotas
629  into account when deciding the number of threads used by OpenMP. This
630  avoids performance problems caused by over-subscription when using those
631  classes in a docker container for instance. :pr:`20477`
632  by `Thomas Fan`_.
633
634- |Enhancement| :class:`~sklearn.ensemble.HistGradientBoostingClassifier` and
635  :class:`~sklearn.ensemble.HistGradientBoostingRegressor` are no longer
636  experimental. They are now considered stable and are subject to the same
637  deprecation cycles as all other estimators. :pr:`19799` by `Nicolas Hug`_.
638
639- |Enhancement| Improve the HTML rendering of the
640  :class:`ensemble.StackingClassifier` and :class:`ensemble.StackingRegressor`.
641  :pr:`19564` by `Thomas Fan`_.
642
643- |Enhancement| Added Poisson criterion to
644  :class:`ensemble.RandomForestRegressor`. :pr:`19836` by :user:`Brian Sun
645  <bsun94>`.
646
- |Fix| Disallow computing the out-of-bag (OOB) score in
648  :class:`ensemble.RandomForestClassifier` and
649  :class:`ensemble.ExtraTreesClassifier` with multiclass-multioutput target
650  since scikit-learn does not provide any metric supporting this type of
651  target. Additional private refactoring was performed.
652  :pr:`19162` by :user:`Guillaume Lemaitre <glemaitre>`.
653
654- |Fix| Improve numerical precision for weights boosting in
655  :class:`ensemble.AdaBoostClassifier` and :class:`ensemble.AdaBoostRegressor`
656  to avoid underflows.
657  :pr:`10096` by :user:`Fenil Suchak <fenilsuchak>`.
658
659- |Fix| Fixed the range of the argument ``max_samples`` to be ``(0.0, 1.0]``
660  in :class:`ensemble.RandomForestClassifier`,
661  :class:`ensemble.RandomForestRegressor`, where `max_samples=1.0` is
662  interpreted as using all `n_samples` for bootstrapping. :pr:`20159` by
663  :user:`murata-yu`.
664
665- |Fix| Fixed a bug in :class:`ensemble.AdaBoostClassifier` and
666  :class:`ensemble.AdaBoostRegressor` where the `sample_weight` parameter
667  got overwritten during `fit`.
668  :pr:`20534` by :user:`Guillaume Lemaitre <glemaitre>`.
669
670- |API| Removes `tol=None` option in
671  :class:`ensemble.HistGradientBoostingClassifier` and
672  :class:`ensemble.HistGradientBoostingRegressor`. Please use `tol=0` for
673  the same behavior. :pr:`19296` by `Thomas Fan`_.
674
675:mod:`sklearn.feature_extraction`
676.................................
677
678- |Fix| Fixed a bug in :class:`feature_extraction.text.HashingVectorizer`
679  where some input strings would result in negative indices in the transformed
680  data. :pr:`19035` by :user:`Liu Yu <ly648499246>`.
681
682- |Fix| Fixed a bug in :class:`feature_extraction.DictVectorizer` by raising an
683  error with unsupported value type.
684  :pr:`19520` by :user:`Jeff Zhao <kamiyaa>`.
685
686- |Fix| Fixed a bug in :func:`feature_extraction.image.img_to_graph`
687  and :func:`feature_extraction.image.grid_to_graph` where singleton connected
688  components were not handled properly, resulting in a wrong vertex indexing.
689  :pr:`18964` by `Bertrand Thirion`_.
690
691- |Fix| Raise a warning in :class:`feature_extraction.text.CountVectorizer`
692  with `lowercase=True` when there are vocabulary entries with uppercase
693  characters to avoid silent misses in the resulting feature vectors.
  :pr:`19401` by :user:`Zito Relova <zitorelova>`.
695
696:mod:`sklearn.feature_selection`
697................................
698
699- |Feature| :func:`feature_selection.r_regression` computes Pearson's R
700  correlation coefficients between the features and the target.
701  :pr:`17169` by :user:`Dmytro Lituiev <DSLituiev>`
702  and :user:`Julien Jerphanion <jjerphan>`.
703
- |Enhancement| :meth:`feature_selection.RFE.fit` accepts additional estimator
705  parameters that are passed directly to the estimator's `fit` method.
706  :pr:`20380` by :user:`Iván Pulido <ijpulidos>`, :user:`Felipe Bidu <fbidu>`,
707  :user:`Gil Rutter <g-rutter>`, and :user:`Adrin Jalali <adrinjalali>`.
708
- |Fix| Fixed a bug in :func:`isotonic.isotonic_regression` where the
  `sample_weight` passed by a user was overwritten during ``fit``.
711  :pr:`20515` by :user:`Carsten Allefeld <allefeld>`.
712
- |Fix| Change :class:`feature_selection.SequentialFeatureSelector` to
714  allow for unsupervised modelling so that the `fit` signature need not
715  do any `y` validation and allow for `y=None`.
716  :pr:`19568` by :user:`Shyam Desai <ShyamDesai>`.
717
718- |API| Raises an error in :class:`feature_selection.VarianceThreshold`
719  when the variance threshold is negative.
720  :pr:`20207` by :user:`Tomohiro Endo <europeanplaice>`
721
722- |API| Deprecates `grid_scores_` in favor of split scores in `cv_results_` in
723  :class:`feature_selection.RFECV`. `grid_scores_` will be removed in
724  version 1.2.
725  :pr:`20161` by :user:`Shuhei Kayawari <wowry>` and :user:`arka204`.
726
727:mod:`sklearn.inspection`
728.........................
729
- |Enhancement| Add `max_samples` parameter in
  :func:`inspection.permutation_importance`. It allows drawing a subset of the
  samples to compute the permutation importance. This is useful to keep the
  method tractable when evaluating feature importance on large datasets.
  :pr:`20431` by :user:`Oliver Pfaffel <o1iv3r>`.
735
736- |Enhancement| Add kwargs to format ICE and PD lines separately in partial
737  dependence plots :func:`inspection.plot_partial_dependence` and
738  :meth:`inspection.PartialDependenceDisplay.plot`. :pr:`19428` by :user:`Mehdi
739  Hamoumi <mhham>`.
740
741- |Fix| Allow multiple scorers input to
742  :func:`inspection.permutation_importance`. :pr:`19411` by :user:`Simona
743  Maggio <simonamaggio>`.
744
- |API| :class:`inspection.PartialDependenceDisplay` exposes a class method:
  :meth:`~inspection.PartialDependenceDisplay.from_estimator`.
  :func:`inspection.plot_partial_dependence` is deprecated in favor of the
  class method and will be removed in 1.2. :pr:`20959` by `Thomas Fan`_.
749
750:mod:`sklearn.kernel_approximation`
751...................................
752
753- |Fix| Fix a bug in :class:`kernel_approximation.Nystroem`
754  where the attribute `component_indices_` did not correspond to the subset of
755  sample indices used to generate the approximated kernel. :pr:`20554` by
756  :user:`Xiangyin Kong <kxytim>`.
757
758:mod:`sklearn.linear_model`
759...........................
760
761- |Feature| Added :class:`linear_model.QuantileRegressor` which implements
762  linear quantile regression with L1 penalty.
763  :pr:`9978` by :user:`David Dale <avidale>` and
764  :user:`Christian Lorentzen <lorentzenchr>`.
765
- |Feature| The new :class:`linear_model.SGDOneClassSVM` provides an SGD
  implementation of the linear One-Class SVM. Combined with kernel
  approximation techniques, this implementation approximates the solution of
  a kernelized One-Class SVM while benefiting from a linear
  complexity in the number of samples.
  :pr:`10027` by :user:`Albert Thomas <albertcthomas>`.
772
773- |Feature| Added `sample_weight` parameter to
774  :class:`linear_model.LassoCV` and :class:`linear_model.ElasticNetCV`.
775  :pr:`16449` by :user:`Christian Lorentzen <lorentzenchr>`.
776
- |Feature| Added new solver `lbfgs` (available with `solver="lbfgs"`)
  and `positive` argument to :class:`linear_model.Ridge`. When `positive` is
  set to `True`, the coefficients are forced to be positive (only supported
  by `lbfgs`). :pr:`20231` by :user:`Toshihiro Nakae <tnakae>`.
781
782- |Efficiency| The implementation of :class:`linear_model.LogisticRegression`
783  has been optimised for dense matrices when using `solver='newton-cg'` and
784  `multi_class!='multinomial'`.
785  :pr:`19571` by :user:`Julien Jerphanion <jjerphan>`.
786
- |Enhancement| The `fit` method preserves dtype for numpy.float32 in
  :class:`linear_model.Lars`, :class:`linear_model.LassoLars`,
  :class:`linear_model.LarsCV` and :class:`linear_model.LassoLarsCV`.
  :pr:`20155` by :user:`Takeshi Oura <takoika>`.
792
793- |Enhancement| Validate user-supplied gram matrix passed to linear models
794  via the `precompute` argument. :pr:`19004` by :user:`Adam Midvidy <amidvidy>`.
795
796- |Fix| :meth:`linear_model.ElasticNet.fit` no longer modifies `sample_weight`
797  in place. :pr:`19055` by `Thomas Fan`_.
798
799- |Fix| :class:`linear_model.Lasso` and :class:`linear_model.ElasticNet` no
800  longer have a `dual_gap_` not corresponding to their objective. :pr:`19172`
801  by :user:`Mathurin Massias <mathurinm>`
802
803- |Fix| `sample_weight` are now fully taken into account in linear models
804  when `normalize=True` for both feature centering and feature
805  scaling.
806  :pr:`19426` by :user:`Alexandre Gramfort <agramfort>` and
807  :user:`Maria Telenczuk <maikia>`.
808
- |Fix| Points with residuals equal to ``residual_threshold`` are now considered
  as inliers for :class:`linear_model.RANSACRegressor`. This allows fitting
  a model perfectly on some datasets when `residual_threshold=0`.
  :pr:`19499` by :user:`Gregory Strubel <gregorystrubel>`.
813
- |Fix| Sample weight invariance for :class:`linear_model.Ridge` was fixed in
  :pr:`19616` by :user:`Olivier Grisel <ogrisel>` and :user:`Christian Lorentzen
  <lorentzenchr>`.
817
- |Fix| The dictionary `params` in :func:`linear_model.enet_path` and
  :func:`linear_model.lasso_path` should only contain parameters of the
  coordinate descent solver. Otherwise, an error will be raised.
  :pr:`19391` by :user:`Shao Yang Hong <hongshaoyang>`.
822
- |API| Raise a warning in :class:`linear_model.RANSACRegressor` that from
  version 1.2, `min_samples` needs to be set explicitly for models other than
  :class:`linear_model.LinearRegression`. :pr:`19390` by :user:`Shao Yang Hong
  <hongshaoyang>`.
827
- |API| The parameter ``normalize`` of :class:`linear_model.LinearRegression`
829  is deprecated and will be removed in 1.2. Motivation for this deprecation:
830  ``normalize`` parameter did not take any effect if ``fit_intercept`` was set
831  to False and therefore was deemed confusing. The behavior of the deprecated
832  ``LinearModel(normalize=True)`` can be reproduced with a
833  :class:`~sklearn.pipeline.Pipeline` with ``LinearModel`` (where
834  ``LinearModel`` is :class:`~linear_model.LinearRegression`,
835  :class:`~linear_model.Ridge`, :class:`~linear_model.RidgeClassifier`,
836  :class:`~linear_model.RidgeCV` or :class:`~linear_model.RidgeClassifierCV`)
837  as follows: ``make_pipeline(StandardScaler(with_mean=False),
838  LinearModel())``. The ``normalize`` parameter in
839  :class:`~linear_model.LinearRegression` was deprecated in :pr:`17743` by
840  :user:`Maria Telenczuk <maikia>` and :user:`Alexandre Gramfort <agramfort>`.
841  Same for :class:`~linear_model.Ridge`,
842  :class:`~linear_model.RidgeClassifier`, :class:`~linear_model.RidgeCV`, and
843  :class:`~linear_model.RidgeClassifierCV`, in: :pr:`17772` by :user:`Maria
844  Telenczuk <maikia>` and :user:`Alexandre Gramfort <agramfort>`. Same for
845  :class:`~linear_model.BayesianRidge`, :class:`~linear_model.ARDRegression`
846  in: :pr:`17746` by :user:`Maria Telenczuk <maikia>`. Same for
847  :class:`~linear_model.Lasso`, :class:`~linear_model.LassoCV`,
848  :class:`~linear_model.ElasticNet`, :class:`~linear_model.ElasticNetCV`,
849  :class:`~linear_model.MultiTaskLasso`,
850  :class:`~linear_model.MultiTaskLassoCV`,
851  :class:`~linear_model.MultiTaskElasticNet`,
852  :class:`~linear_model.MultiTaskElasticNetCV`, in: :pr:`17785` by :user:`Maria
853  Telenczuk <maikia>` and :user:`Alexandre Gramfort <agramfort>`.
854
- |API| The ``normalize`` parameter of
  :class:`~linear_model.OrthogonalMatchingPursuit` and
  :class:`~linear_model.OrthogonalMatchingPursuitCV` will default to False in
  1.2 and will be removed in 1.4. :pr:`17750` by :user:`Maria Telenczuk
  <maikia>` and :user:`Alexandre Gramfort <agramfort>`. Same for
  :class:`~linear_model.Lars`, :class:`~linear_model.LarsCV`,
  :class:`~linear_model.LassoLars`, :class:`~linear_model.LassoLarsCV` and
  :class:`~linear_model.LassoLarsIC`, in :pr:`17769` by :user:`Maria Telenczuk
  <maikia>` and :user:`Alexandre Gramfort <agramfort>`.
864
865- |API| Keyword validation has moved from `__init__` and `set_params` to `fit`
866  for the following estimators conforming to scikit-learn's conventions:
867  :class:`~linear_model.SGDClassifier`,
868  :class:`~linear_model.SGDRegressor`,
869  :class:`~linear_model.SGDOneClassSVM`,
870  :class:`~linear_model.PassiveAggressiveClassifier`, and
871  :class:`~linear_model.PassiveAggressiveRegressor`.
872  :pr:`20683` by `Guillaume Lemaitre`_.
873
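The pipeline suggested above as a replacement for the deprecated ``normalize``
parameter can be sketched as follows. The synthetic data is an assumption for
illustration. For an unpenalized :class:`~linear_model.LinearRegression`,
rescaling the features changes the coefficients but not the predictions, which
is what the final comparison checks; penalized models such as
:class:`~linear_model.Ridge` do depend on the scaling.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with very different feature scales (assumption).
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 3)) * np.array([1.0, 10.0, 100.0])
y = X @ np.array([1.0, 0.2, 0.03]) + 5.0 + rng.normal(scale=0.1, size=100)

# Scaling happens in an explicit pipeline step instead of `normalize=True`.
scaled_model = make_pipeline(
    StandardScaler(with_mean=False), LinearRegression()
).fit(X, y)

# Plain least squares on the raw features, for comparison.
plain_model = LinearRegression().fit(X, y)

same_predictions = np.allclose(scaled_model.predict(X), plain_model.predict(X))
print(same_predictions)
```
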
874:mod:`sklearn.manifold`
875.......................
876
877- |Enhancement| Implement `'auto'` heuristic for the `learning_rate` in
878  :class:`manifold.TSNE`. It will become default in 1.2. The default
879  initialization will change to `pca` in 1.2. PCA initialization will
880  be scaled to have standard deviation 1e-4 in 1.2.
881  :pr:`19491` by :user:`Dmitry Kobak <dkobak>`.
882
883- |Fix| Change numerical precision to prevent underflow issues
884  during affinity matrix computation for :class:`manifold.TSNE`.
885  :pr:`19472` by :user:`Dmitry Kobak <dkobak>`.
886
- |Fix| :class:`manifold.Isomap` now uses `scipy.sparse.csgraph.shortest_path`
  to compute the graph shortest path. It also connects disconnected components
  of the neighbors graph along some minimum distance pairs, instead of changing
  every infinite distance to zero. :pr:`20531` by `Roman Yurchak`_ and `Tom
  Dupre la Tour`_.
892
893- |Fix| Decrease the numerical default tolerance in the lobpcg call
894  in :func:`manifold.spectral_embedding` to prevent numerical instability.
895  :pr:`21194` by :user:`Andrew Knyazev <lobpcg>`.
896
897:mod:`sklearn.metrics`
898......................
899
- |Feature| :func:`metrics.mean_pinball_loss` exposes the pinball loss for
  quantile regression. :pr:`19415` by :user:`Xavier Dupré <sdpython>`
  and :user:`Olivier Grisel <ogrisel>`.
903
904- |Feature| :func:`metrics.d2_tweedie_score` calculates the D^2 regression
905  score for Tweedie deviances with power parameter ``power``. This is a
906  generalization of the `r2_score` and can be interpreted as percentage of
907  Tweedie deviance explained.
908  :pr:`17036` by :user:`Christian Lorentzen <lorentzenchr>`.
909
- |Feature| :func:`metrics.mean_squared_log_error` now supports
  `squared=False`.
  :pr:`20326` by :user:`Uttam kumar <helper-uttam>`.
913
914- |Efficiency| Improved speed of :func:`metrics.confusion_matrix` when labels
915  are integral.
916  :pr:`9843` by :user:`Jon Crall <Erotemic>`.
917
- |Enhancement| :func:`metrics.hinge_loss` now raises an error when
  ``pred_decision`` is 1d in a multiclass classification problem or when the
  ``pred_decision`` parameter is not consistent with the ``labels`` parameter.
  :pr:`19643` by :user:`Pierre Attard <PierreAttard>`.
922
923- |Fix| :meth:`metrics.ConfusionMatrixDisplay.plot` uses the correct max
924  for colormap. :pr:`19784` by `Thomas Fan`_.
925
926- |Fix| Samples with zero `sample_weight` values do not affect the results
927  from :func:`metrics.det_curve`, :func:`metrics.precision_recall_curve`
928  and :func:`metrics.roc_curve`.
929  :pr:`18328` by :user:`Albert Villanova del Moral <albertvillanova>` and
930  :user:`Alonso Silva Allende <alonsosilvaallende>`.
931
- |Fix| Avoid overflow in :func:`metrics.cluster.adjusted_rand_score` with
  large amounts of data. :pr:`20312` by :user:`Divyanshu Deoli
  <divyanshudeoli>`.
935
- |API| :class:`metrics.ConfusionMatrixDisplay` exposes two class methods,
  :func:`~metrics.ConfusionMatrixDisplay.from_estimator` and
  :func:`~metrics.ConfusionMatrixDisplay.from_predictions`, which create
  a confusion matrix plot from an estimator or from predictions.
  :func:`metrics.plot_confusion_matrix` is deprecated in favor of these two
  class methods and will be removed in 1.2.
  :pr:`18543` by `Guillaume Lemaitre`_.
943
- |API| :class:`metrics.PrecisionRecallDisplay` exposes two class methods,
  :func:`~metrics.PrecisionRecallDisplay.from_estimator` and
  :func:`~metrics.PrecisionRecallDisplay.from_predictions`, which create
  a precision-recall curve from an estimator or from predictions.
  :func:`metrics.plot_precision_recall_curve` is deprecated in favor of these
  two class methods and will be removed in 1.2.
  :pr:`20552` by `Guillaume Lemaitre`_.
951
- |API| :class:`metrics.DetCurveDisplay` exposes two class methods,
  :func:`~metrics.DetCurveDisplay.from_estimator` and
  :func:`~metrics.DetCurveDisplay.from_predictions`, which create
  a DET curve plot from an estimator or from predictions.
  :func:`metrics.plot_det_curve` is deprecated in favor of these two
  class methods and will be removed in 1.2.
  :pr:`19278` by `Guillaume Lemaitre`_.
959
960:mod:`sklearn.mixture`
961......................
962
- |Fix| Ensure that the best parameters are set appropriately
  in the case of divergence for :class:`mixture.GaussianMixture` and
  :class:`mixture.BayesianGaussianMixture`.
  :pr:`20030` by :user:`Tingshan Liu <tliu68>` and
  :user:`Benjamin Pedigo <bdpedigo>`.
968
969:mod:`sklearn.model_selection`
970..............................
971
- |Feature| Added :class:`model_selection.StratifiedGroupKFold`, which combines
  :class:`model_selection.StratifiedKFold` and
  :class:`model_selection.GroupKFold`. It splits the data while
  preserving the distribution of classes in each split and keeping each
  group within a single split.
  :pr:`18649` by :user:`Leandro Hermida <hermidalc>` and
  :user:`Rodion Martynov <marrodion>`.
979
- |Enhancement| Warn only once in the main process for per-split fit failures
  in cross-validation. :pr:`20619` by :user:`Loïc Estève <lesteve>`.
982
983- |Enhancement| The :class:`model_selection.BaseShuffleSplit` base class is
984  now public. :pr:`20056` by :user:`pabloduque0`.
985
986- |Fix| Avoid premature overflow in :func:`model_selection.train_test_split`.
987  :pr:`20904` by :user:`Tomasz Jakubek <t-jakubek>`.
988
989:mod:`sklearn.naive_bayes`
990..........................
991
992- |Fix| The `fit` and `partial_fit` methods of the discrete naive Bayes
993  classifiers (:class:`naive_bayes.BernoulliNB`,
994  :class:`naive_bayes.CategoricalNB`, :class:`naive_bayes.ComplementNB`,
995  and :class:`naive_bayes.MultinomialNB`) now correctly handle the degenerate
996  case of a single class in the training set.
997  :pr:`18925` by :user:`David Poznik <dpoznik>`.
998
999- |API| The attribute ``sigma_`` is now deprecated in
1000  :class:`naive_bayes.GaussianNB` and will be removed in 1.2.
1001  Use ``var_`` instead.
1002  :pr:`18842` by :user:`Hong Shao Yang <hongshaoyang>`.
1003
1004:mod:`sklearn.neighbors`
1005........................
1006
- |Enhancement| The worst-case time complexity of creating
  :class:`neighbors.KDTree` and :class:`neighbors.BallTree` has been improved
  from :math:`\mathcal{O}(n^2)` to :math:`\mathcal{O}(n)`.
  :pr:`19473` by :user:`jiefangxuanyan <jiefangxuanyan>` and
  :user:`Julien Jerphanion <jjerphan>`.
1012
- |Fix| :class:`neighbors.DistanceMetric` subclasses now support readonly
  memory-mapped datasets. :pr:`19883` by :user:`Julien Jerphanion <jjerphan>`.
1015
- |Fix| :class:`neighbors.NearestNeighbors`, :class:`neighbors.KNeighborsClassifier`,
  :class:`neighbors.RadiusNeighborsClassifier`, :class:`neighbors.KNeighborsRegressor`
  and :class:`neighbors.RadiusNeighborsRegressor` no longer validate `weights` in
  `__init__` and validate `weights` in `fit` instead. :pr:`20072` by
  :user:`Juan Carlos Alfaro Jiménez <alfaro96>`.
1021
1022- |API| The parameter `kwargs` of :class:`neighbors.RadiusNeighborsClassifier` is
1023  deprecated and will be removed in 1.2.
1024  :pr:`20842` by :user:`Juan Martín Loyola <jmloyola>`.
1025
1026:mod:`sklearn.neural_network`
1027.............................
1028
1029- |Fix| :class:`neural_network.MLPClassifier` and
1030  :class:`neural_network.MLPRegressor` now correctly support continued training
1031  when loading from a pickled file. :pr:`19631` by `Thomas Fan`_.
1032
1033:mod:`sklearn.pipeline`
1034.......................
1035
1036- |API| The `predict_proba` and `predict_log_proba` methods of the
1037  :class:`pipeline.Pipeline` now support passing prediction kwargs to the final
1038  estimator. :pr:`19790` by :user:`Christopher Flynn <crflynn>`.
1039
1040:mod:`sklearn.preprocessing`
1041............................
1042
1043- |Feature| The new :class:`preprocessing.SplineTransformer` is a feature
1044  preprocessing tool for the generation of B-splines, parametrized by the
1045  polynomial ``degree`` of the splines, number of knots ``n_knots`` and knot
1046  positioning strategy ``knots``.
1047  :pr:`18368` by :user:`Christian Lorentzen <lorentzenchr>`.
1048  :class:`preprocessing.SplineTransformer` also supports periodic
1049  splines via the ``extrapolation`` argument.
1050  :pr:`19483` by :user:`Malte Londschien <mlondschien>`.
1051  :class:`preprocessing.SplineTransformer` supports sample weights for
1052  knot position strategy ``"quantile"``.
1053  :pr:`20526` by :user:`Malte Londschien <mlondschien>`.
1054
1055- |Feature| :class:`preprocessing.OrdinalEncoder` supports passing through
1056  missing values by default. :pr:`19069` by `Thomas Fan`_.
1057
1058- |Feature| :class:`preprocessing.OneHotEncoder` now supports
1059  `handle_unknown='ignore'` and dropping categories. :pr:`19041` by
1060  `Thomas Fan`_.
1061
1062- |Feature| :class:`preprocessing.PolynomialFeatures` now supports passing
1063  a tuple to `degree`, i.e. `degree=(min_degree, max_degree)`.
1064  :pr:`20250` by :user:`Christian Lorentzen <lorentzenchr>`.
1065
1066- |Efficiency| :class:`preprocessing.StandardScaler` is faster and more memory
1067  efficient. :pr:`20652` by `Thomas Fan`_.
1068
1069- |Efficiency| Changed ``algorithm`` argument for :class:`cluster.KMeans` in
1070  :class:`preprocessing.KBinsDiscretizer` from ``auto`` to ``full``.
1071  :pr:`19934` by :user:`Gleb Levitskiy <GLevV>`.
1072
1073- |Efficiency| The implementation of `fit` for
1074  :class:`preprocessing.PolynomialFeatures` transformer is now faster. This is
1075  especially noticeable on large sparse input. :pr:`19734` by :user:`Fred
1076  Robinson <frrad>`.
1077
- |Fix| The :meth:`preprocessing.StandardScaler.inverse_transform` method
  now raises an error when the input data is 1D. :pr:`19752` by :user:`Zhehao Liu
  <Max1993Liu>`.
1081
- |Fix| :func:`preprocessing.scale`, :class:`preprocessing.StandardScaler`
  and similar scalers detect near-constant features to avoid scaling them to
  very large values. This problem happens in particular when using a scaler on
  sparse data with a constant column with sample weights, in which case
  centering is typically disabled. :pr:`19527` by :user:`Olivier Grisel
  <ogrisel>` and :user:`Maria Telenczuk <maikia>` and :pr:`19788` by
  :user:`Jérémie du Boisberranger <jeremiedbb>`.
1089
1090- |Fix| :meth:`preprocessing.StandardScaler.inverse_transform` now
1091  correctly handles integer dtypes. :pr:`19356` by :user:`makoeppel`.
1092
- |Fix| :meth:`preprocessing.OrdinalEncoder.inverse_transform` does not
  support sparse matrices and now raises the appropriate error message.
  :pr:`19879` by :user:`Guillaume Lemaitre <glemaitre>`.
1096
- |Fix| The `fit` method of :class:`preprocessing.OrdinalEncoder` will not
  raise an error when `handle_unknown='ignore'` and unknown categories are
  given to `fit`.
  :pr:`19906` by :user:`Zhehao Liu <MaxwellLZH>`.
1101
- |Fix| Fix a regression in :class:`preprocessing.OrdinalEncoder` where large
  Python numerics would raise an error due to overflow when cast to a C type
  (`np.float64` or `np.int64`).
  :pr:`20727` by `Guillaume Lemaitre`_.
1106
1107- |Fix| :class:`preprocessing.FunctionTransformer` does not set `n_features_in_`
1108  based on the input to `inverse_transform`. :pr:`20961` by `Thomas Fan`_.
1109
1110- |API| The `n_input_features_` attribute of
1111  :class:`preprocessing.PolynomialFeatures` is deprecated in favor of
1112  `n_features_in_` and will be removed in 1.2. :pr:`20240` by
1113  :user:`Jérémie du Boisberranger <jeremiedbb>`.
1114
1115:mod:`sklearn.svm`
1116...................
1117
- |API| The parameter `**params` of :meth:`svm.OneClassSVM.fit` is
  deprecated and will be removed in 1.2.
  :pr:`20843` by :user:`Juan Martín Loyola <jmloyola>`.
1121
1122:mod:`sklearn.tree`
1123...................
1124
1125- |Enhancement| Add `fontname` argument in :func:`tree.export_graphviz`
1126  for non-English characters. :pr:`18959` by :user:`Zero <Zeroto521>`
1127  and :user:`wstates <wstates>`.
1128
1129- |Fix| Improves compatibility of :func:`tree.plot_tree` with high DPI screens.
1130  :pr:`20023` by `Thomas Fan`_.
1131
1132- |Fix| Fixed a bug in :class:`tree.DecisionTreeClassifier`,
1133  :class:`tree.DecisionTreeRegressor` where a node could be split whereas it
1134  should not have been due to incorrect handling of rounding errors.
1135  :pr:`19336` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
1136
1137- |API| The `n_features_` attribute of :class:`tree.DecisionTreeClassifier`,
1138  :class:`tree.DecisionTreeRegressor`, :class:`tree.ExtraTreeClassifier` and
1139  :class:`tree.ExtraTreeRegressor` is deprecated in favor of `n_features_in_`
1140  and will be removed in 1.2. :pr:`20272` by
1141  :user:`Jérémie du Boisberranger <jeremiedbb>`.
1142
1143:mod:`sklearn.utils`
1144....................
1145
- |Enhancement| Deprecated the default value `random_state=0` in
  :func:`~sklearn.utils.extmath.randomized_svd`. Starting in 1.2,
  the default value of `random_state` will be set to `None`.
  :pr:`19459` by :user:`Cindy Bezuidenhout <cinbez>` and
  :user:`Clifford Akai-Nettey <cliffordEmmanuel>`.
1151
- |Enhancement| Added helper decorator :func:`utils.metaestimators.available_if`
  to provide flexibility in metaestimators making methods available or
  unavailable on the basis of state, in a more readable way.
  :pr:`19948` by `Joel Nothman`_.
1156
1157- |Enhancement| :func:`utils.validation.check_is_fitted` now uses
1158  ``__sklearn_is_fitted__`` if available, instead of checking for attributes
1159  ending with an underscore. This also makes :class:`pipeline.Pipeline` and
1160  :class:`preprocessing.FunctionTransformer` pass
1161  ``check_is_fitted(estimator)``. :pr:`20657` by `Adrin Jalali`_.
1162
1163- |Fix| Fixed a bug in :func:`utils.sparsefuncs.mean_variance_axis` where the
1164  precision of the computed variance was very poor when the real variance is
1165  exactly zero. :pr:`19766` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
1166
- |Fix| The docstrings of properties that are decorated with
  :func:`utils.deprecated` are now properly wrapped. :pr:`20385` by `Thomas
  Fan`_.
1170
- |Fix| :func:`utils.stats._weighted_percentile` now correctly ignores
  zero-weighted observations smaller than the smallest observation with
  positive weight for ``percentile=0``. Affected classes are
  :class:`dummy.DummyRegressor` for ``quantile=0`` and
  :class:`ensemble.HuberLossFunction` for ``alpha=0``.
  :pr:`20528` by :user:`Malte Londschien <mlondschien>`.
1177
- |Fix| :func:`utils._safe_indexing` explicitly takes a dataframe copy when
  integer indices are provided, which avoids raising a warning from pandas. This
  warning was previously raised in resampling utilities and functions using
  those utilities (e.g. :func:`model_selection.train_test_split`,
  :func:`model_selection.cross_validate`,
  :func:`model_selection.cross_val_score`,
  :func:`model_selection.cross_val_predict`).
  :pr:`20673` by :user:`Joris Van den Bossche <jorisvandenbossche>`.
1186
1187- |Fix| Fix a regression in :func:`utils.is_scalar_nan` where large Python
1188  numbers would raise an error due to overflow in C types (`np.float64` or
1189  `np.int64`).
1190  :pr:`20727` by `Guillaume Lemaitre`_.
1191
1192- |Fix| Support for `np.matrix` is deprecated in
1193  :func:`~sklearn.utils.check_array` in 1.0 and will raise a `TypeError` in
1194  1.2. :pr:`20165` by `Thomas Fan`_.
1195
- |API| :func:`utils._testing.assert_warns` and
  :func:`utils._testing.assert_warns_message` are deprecated in 1.0 and will
  be removed in 1.2. Use the `pytest.warns` context manager instead. Note that
  these functions were not documented and were not part of the public API.
  :pr:`20521` by :user:`Olivier Grisel <ogrisel>`.
1201
1202- |API| Fixed several bugs in :func:`utils.graph.graph_shortest_path`, which is
1203  now deprecated. Use `scipy.sparse.csgraph.shortest_path` instead. :pr:`20531`
1204  by `Tom Dupre la Tour`_.
1205
1206Code and Documentation Contributors
1207-----------------------------------
1208
1209Thanks to everyone who has contributed to the maintenance and improvement of
1210the project since version 0.24, including:
1211
1212Abdulelah S. Al Mesfer, Abhinav Gupta, Abhishek Gupta, Adam J. Stewart, Adam
1213Li, Adam Midvidy, adijohar, Aditya Kumawat, Adrian Garcia Badaracco, Adrian
1214Sadłocha, Adrin Jalali, Agamemnon Krasoulis, AJ Druck, Albert Thomas, Albert
1215Villanova del Moral, Alberto Mario Ceballos-Arroyo, Alberto Rubiales, Alek
1216Lefebvre, Alessia Marcolini, Alexandr Fonari, Alihan Zihna, Aline Ribeiro de
1217Almeida, almeidayoel, Amanda, Amanda Dsouza, Amol Deshmukh, amrcode, Ana
1218Pessoa, Anavelyz, András Simon, Andreas Mueller, Andrew Delong, Andrew Knyazev,
1219Angus L'Herrou, Arisa, Arth, Arturo Amor, Ashish, Ashvith Shetty, Atsushi
1220Nukariya, Aurélien Geron, Avi Gupta, Ayush Singh, baam, BaptBillard, Benjamin
1221Pedigo, Bertrand Thirion, Bharat Raghunathan, bmalezieux, Brian Rice, Brian
1222Sun, Bruno Charron, Bryan Chen, bumblebee, caherrera-meli, Carsten Allefeld,
1223CeeThinwa, Chiara Marmo, Chitteti Srinath Reddy, chrissobel, Christian
1224Lorentzen, Christian Ritter, Christopher Yeh, christopherlim98, Christos
1225Aridas, Chuliang Xiao, Clément Fauchereau, cliffordEmmanuel, Conner Shen,
1226Connor Tann, David Dale, David Katz, David Poznik, Dimitri Papadopoulos
1227Orfanos, Divyanshu Deoli, dmallia17, Dmitry Kobak, DS_anas, Eduardo Jardim,
1228EdwinWenink, EL-ATEIF Sara, Eleni Markou, Eric Fiegel, Eric Larson, Eric
1229Ndirangu, EricEllwanger, Erich Schubert, Estefania Barreto-Ojeda, eyast,
1230Ezri-Mudde, Fatos Morina, Federico Luna, Felipe Rodrigues, Felix Glushchenkov,
1231Felix Hafner, Fenil Suchak, flyingdutchman23, Flynn, Fortune Uwha, FPGAwesome,
1232Francois Berenger, Frankie Robertson, Frans Larsson, Frederick Robinson, Gabor
1233Kertesz, Gabriel S Vicente, Gabriel Stefanini Vicente, Gael Varoquaux, Gauthier
1234I, genvalen, Geoffrey Thomas, geroldcsendes, Giancarlo Pablo, Gleb Levitskiy,
1235Glen, glennfrutiz, Glòria Macià Muñoz, gregorystrubel, groceryheist, Guillaume
1236Lemaitre, guiweber, Haidar Almubarak, Hans Moritz Günther, Haoyin Xu, Harris
1237Mirza, Harry Wei, Harutaka Kawamura, Hassan Alsawadi, Helder Geovane Gomes de
1238Lima, Himanshu Kumar, Hugo DEFOIS, Igor Ilic, Ikko Ashimine, iofall, Isaack
1239Mungui, Ishaan Bhat, Ishan Kumar, Ishan Mishra, Iván Pulido, iwhalvic, J
1240Alexander, Jack Liu, jalexand3r, James Alan Preiss, James Budarz, James Lamb,
1241Jannik, Jauhar, Jeff Hale, Jeff Zhao, Jennifer Maldonado, Jenny Vo, Jérémie du
1242Boisberranger, Jesse Lima, Jianzhu Guo, Jirka Borovec, jnboehm, Joel Nothman,
1243JohanWork, John Paton, Jon Crall, Jon Haitz Legarreta Gorroño, Jonathan
1244Schneider, Jorge Loayza, Joris Van den Bossche, José Manuel Nápoles Duarte,
1245Juan Carlos Alfaro Jiménez, Juan Martin Loyola, Julien Jerphanion, Julio
1246Batista Silva, julyrashchenko, JVM, Kadatatlu Kishore, Karen Palacio, katotten,
1247Kaushik Roy Chowdhury, Kei Ishikawa, Ken4git, KimAYoung, kmatt10, kobaski,
1248Kot271828, Kranthi Sedamaki, krumetoft, Kunj, KurumeYuta, kxytim, lacrosse91,
1249LalliAcqua, Laveen Bagai, Leonardo Rocco, Leonardo Uieda, Leopoldo Corona, Loic
1250Esteve, LSturtew, Luca Bittarello, Luccas Quadros, LucieClair, Lucy Jiménez,
1251Lucy Liu, Luiz Eduardo Amaral, ly648499246, Mabu Manaileng, MaggieChege,
1252makoeppel, mandjevant, Manimaran, Marco Gorelli, Maren Westermann, Maria
1253Telenczuk, Mariangela, marielaraj, Martin Hirzel, Mateo Noreña, Mathieu
1254Blondel, Mathis Batoul, mathurinm, Matthew Calcote, Maxime Prieur, Maxwell,
1255Mehdi Hamoumi, Mehmet Ali Özer, melemo2, Miao Cai, Michal Karbownik,
1256michalkrawczyk, milana2, millawell, Mitzi, mlant, mlondschien, Mohamed Haseeb,
1257Mohamed Khoualed, Mr. Leu, MrinalTyagi, Muhammad Jarir Kanji, murata-yu, N,
1258Nadim Kawwa, Nanshan Li, naozin555, nastegiano, Nate Parsons, Neal Fultz, Nic
1259Annau, Nico Stefani, Nicolas Hug, Nicolas Miller, Nigel Bosch, Niket Jain,
1260Nikita Titov, Nikolay Kondratyev, Nodar Okroshiashvili, Norbert Preining,
1261novaya, Ogbonna Chibuike Stephen, OGordon100, Oliver Pfaffel, Olivier Grisel,
1262Oras Phongpanangam, Pablo Duque, Pablo Ibieta-Jimenez, partev, Patric Lacouth,
1263Patrick Ferreira, Paul, Paulo S. Costa, Paweł Olszewski, pelennor, Peter Dye,
1264Pierre-Yves Le Borgne, PierreAttard, Pinky, Pramod Anantharam, PranayAnchuri,
1265Prince Canuma, puhuk, putschblos, qdeffense, RamyaNP, Randall Boyes,
1266ranjanikrishnan, Ray Bell, Rene Jean Corneille, Reshama Shaikh, ricardojnf,
1267RichardScottOZ, Rishabh, Rodion Martynov, Rohan Paul, Roman Lutz, Roman
1268Yurchak, Ross Barnowski, Samuel Brice, Sandy Khosasi, Sean Benhur J, Sebastian
1269Flores, Sebastian Pölsterl, Shao Yang Hong, shinehide, shinnar, shivamgargsya,
1270Shooter23, Shuhei Kayawari, Shyam Desai, siavrez, simonamaggio, Sina
1271Tootoonian, solosilence, spikebh, sply88, Steve Stagg, Steven Kolawole, Surya
1272Prakash, Sven Eschlbeck, Swapnil Jha, swpease, Sylvain Marié, t-jakubek,
1273t-kusanagi, Takeshi Oura, Tamires Santana, Terence Honles, TFiFiE, Thomas A
1274Caswell, Thomas J. Fan, Tim Gates, Tim Vink, TimotheeMathieu, Timothy Wolodzko,
1275tliu68, Tobias Uhmann, Tom Dupré la Tour, tom1092, Tomás Moreyra, Tomás Ronald
1276Hughes, Tommaso Di Noto, Tomohiro Endo, TONY GEORGE, Toshihiro NAKAE, tsuga,
1277Tyler Martin, Uttam kumar, vadim-ushtanit, Vangelis Gkiastas, Venkatachalam N,
1278Vikas Vishwakarma, Vikrant khedkar, Vilém Zouhar, Vinicius Rios Fuck, Vladimir
1279Chernyy, Vlasovets, waijean, Whidou, xavier dupré, Xiao Yuan, xiaoyuchai, Yar
1280Khine Phyo, Yasmeen Alsaedy, yoch, Yosuke KOBAYASHI, Yu Feng, YusukeNagasaka,
1281yzhenman, Zeel B Patel, Zero, ZeyuSun, ZhaoweiWang, Zito, Zito Relova, Zhao Feng
1282