Commit 0f6cd8f

JoElfner, glemaitre, ogrisel, jjerphan and jeremiedbb committed
ENH add an option to center ICE and PD (scikit-learn#18310)
Co-authored-by: Guillaume Lemaitre <[email protected]>
Co-authored-by: Olivier Grisel <[email protected]>
Co-authored-by: Julien Jerphanion <[email protected]>
Co-authored-by: Jérémie du Boisberranger <[email protected]>
Co-authored-by: Thomas J. Fan <[email protected]>
1 parent cfec91b commit 0f6cd8f

5 files changed: +312 -66 lines changed

doc/modules/partial_dependence.rst

Lines changed: 32 additions & 16 deletions
@@ -11,10 +11,10 @@ Partial dependence plots (PDP) and individual conditional expectation (ICE)
 plots can be used to visualize and analyze interaction between the target
 response [1]_ and a set of input features of interest.
 
-Both PDPs and ICEs assume that the input features of interest are independent
-from the complement features, and this assumption is often violated in practice.
-Thus, in the case of correlated features, we will create absurd data points to
-compute the PDP/ICE.
+Both PDPs [H2009]_ and ICEs [G2015]_ assume that the input features of interest
+are independent from the complement features, and this assumption is often
+violated in practice. Thus, in the case of correlated features, we will
+create absurd data points to compute the PDP/ICE [M2019]_.
 
 Partial dependence plots
 ========================
@@ -164,6 +164,18 @@ PDPs. They can be plotted together with
     ... kind='both')
     <...>
 
+If there are too many lines in an ICE plot, it can be difficult to see
+differences between individual samples and interpret the model. Centering the
+ICE at the first value on the x-axis, produces centered Individual Conditional
+Expectation (cICE) plots [G2015]_. This puts emphasis on the divergence of
+individual conditional expectations from the mean line, thus making it easier
+to explore heterogeneous relationships. cICE plots can be plotted by setting
+`centered=True`:
+
+    >>> PartialDependenceDisplay.from_estimator(clf, X, features,
+    ... kind='both', centered=True)
+    <...>
+
 Mathematical Definition
 =======================
 
@@ -255,15 +267,19 @@ estimators that support it, and 'brute' is used for the rest.
 
 .. topic:: References
 
-    T. Hastie, R. Tibshirani and J. Friedman, `The Elements of
-    Statistical Learning <https://web.stanford.edu/~hastie/ElemStatLearn//>`_,
-    Second Edition, Section 10.13.2, Springer, 2009.
-
-    C. Molnar, `Interpretable Machine Learning
-    <https://christophm.github.io/interpretable-ml-book/>`_, Section 5.1, 2019.
-
-    A. Goldstein, A. Kapelner, J. Bleich, and E. Pitkin, :arxiv:`Peeking Inside the
-    Black Box: Visualizing Statistical Learning With Plots of Individual
-    Conditional Expectation <1309.6392>`,
-    Journal of Computational and Graphical Statistics, 24(1): 44-65, Springer,
-    2015.
+    .. [H2009] T. Hastie, R. Tibshirani and J. Friedman,
+               `The Elements of Statistical Learning
+               <https://web.stanford.edu/~hastie/ElemStatLearn//>`_,
+               Second Edition, Section 10.13.2, Springer, 2009.
+
+    .. [M2019] C. Molnar,
+               `Interpretable Machine Learning
+               <https://christophm.github.io/interpretable-ml-book/>`_,
+               Section 5.1, 2019.
+
+    .. [G2015] :arxiv:`A. Goldstein, A. Kapelner, J. Bleich, and E. Pitkin,
+               "Peeking Inside the Black Box: Visualizing Statistical
+               Learning With Plots of Individual Conditional Expectation"
+               Journal of Computational and Graphical Statistics,
+               24(1): 44-65, Springer, 2015.
+               <1309.6392>`
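
Reviewer note: the centering described in the added documentation text can be illustrated with a small sketch in plain Python. It assumes, as the text states, that each curve is shifted by its own value at the first grid point. The helper names `center_curves` and `average_curve` and the toy numbers are hypothetical; this is not scikit-learn's implementation.

```python
def center_curves(curves):
    """Shift every curve so that its value at the first grid point is zero."""
    return [[value - curve[0] for value in curve] for curve in curves]


def average_curve(curves):
    """Average the curves point by point (this is the partial dependence)."""
    n_curves = len(curves)
    return [sum(column) / n_curves for column in zip(*curves)]


# Toy ICE curves: one row per sample, one column per grid value.
# The numbers are made up purely for illustration.
ice = [
    [1.0, 2.0, 4.0],
    [3.0, 3.0, 3.0],
    [5.0, 4.0, 0.0],
]

centered_ice = center_curves(ice)          # cICE curves, all starting at 0
centered_pd = average_curve(centered_ice)  # centered partial dependence

# Centering is a per-curve shift, so averaging then centering gives the
# same curve as centering then averaging.
pd_then_centered = center_curves([average_curve(ice)])[0]
```

Because every centered curve starts at zero, curves that differ only by a constant offset collapse onto each other, and what remains visible is how each sample's response diverges as the feature moves along the grid.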

doc/whats_new/v1.1.rst

Lines changed: 9 additions & 2 deletions
@@ -581,12 +581,19 @@ Changelog
   :pr:`16061` by `Thomas Fan`_.
 
 - |Enhancement| In
-  :meth:`~sklearn.inspection.PartialDependenceDisplay.from_estimator` and
-  :meth:`~sklearn.inspection.PartialDependenceDisplay.from_predictions`, allow
+  :meth:`inspection.PartialDependenceDisplay.from_estimator`, allow
   `kind` to accept a list of strings to specify which type of
   plot to draw for each feature interaction.
   :pr:`19438` by :user:`Guillaume Lemaitre <glemaitre>`.
 
+- |Enhancement| :meth:`inspection.PartialDependenceDisplay.from_estimator`,
+  :meth:`inspection.PartialDependenceDisplay.plot`, and
+  :func:`inspection.plot_partial_dependence` now support plotting centered
+  Individual Conditional Expectation (cICE) and centered PDP curves controlled
+  by setting the parameter `centered`.
+  :pr:`18310` by :user:`Johannes Elfner <JoElfner>` and
+  :user:`Guillaume Lemaitre <glemaitre>`.
+
 :mod:`sklearn.isotonic`
 .......................
 
examples/inspection/plot_partial_dependence.py

Lines changed: 13 additions & 5 deletions
@@ -113,7 +113,13 @@
 
 from sklearn.inspection import PartialDependenceDisplay
 
-common_params = {"subsample": 50, "n_jobs": 2, "grid_resolution": 20, "random_state": 0}
+common_params = {
+    "subsample": 50,
+    "n_jobs": 2,
+    "grid_resolution": 20,
+    "centered": True,
+    "random_state": 0,
+}
 
 print("Computing partial dependence plots...")
 tic = time()
@@ -188,10 +194,12 @@
 # rooms per household.
 #
 # The ICE curves (light blue lines) complement the analysis: we can see that
-# there are some exceptions, where the house price remain constant with median
-# income and average occupants. On the other hand, while the house age (top
-# right) does not have a strong influence on the median house price on average,
-# there seems to be a number of exceptions where the house price increase when
+# there are some exceptions (which are better highlighted with the option
+# `centered=True`), where the house price remains constant with respect to
+# median income and average occupants variations.
+# On the other hand, while the house age (top right) does not have a strong
+# influence on the median house price on average, there seems to be a number
+# of exceptions where the house price increases when
 # between the ages 15-25. Similar exceptions can be observed for the average
 # number of rooms (bottom left). Therefore, ICE plots show some individual
 # effect which are attenuated by taking the averages.
