BUG: Regression with StandardScaler due to #19527 #19726

@larsoner

Describe the bug

#19527 introduced a regression with StandardScaler when dealing with data with small magnitudes.

Steps/Code to Reproduce

In MNE-Python, some of our data channels have magnitudes around 1e-13. On 638b768 and earlier, the following code (which uses random data at different scales) prints True for every scale, which seems correct:

import numpy as np
from sklearn.preprocessing import StandardScaler

for scale in (1e15, 1e10, 1e5, 1, 1e-5, 1e-10, 1e-15):
    data = np.random.RandomState(0).rand(1000, 4) - 0.5
    data *= scale
    scaler = StandardScaler(with_mean=True, with_std=True)
    X = scaler.fit_transform(data)
    stds = np.std(data, axis=0)
    means = np.mean(data, axis=0)
    print(np.allclose(X, (data - means) / stds, rtol=1e-7, atol=1e-7 * scale))

But on c748e46 (i.e. after #19527), data that is "too small" starts to fail: I get True for the first five scale factors and False for the last two (1e-10, 1e-15). Hence StandardScaler no longer standardizes small-magnitude data.
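
For reference, here is a minimal NumPy-only sketch of the behavior I would expect at every scale. It does not touch scikit-learn internals. The helper names `standardize_reference` and `is_standardized` are hypothetical and exist only for illustration:

```python
import numpy as np


def standardize_reference(data):
    """Column-wise (data - mean) / std, computed directly with NumPy."""
    data = np.asarray(data, dtype=float)
    means = data.mean(axis=0)
    stds = data.std(axis=0)  # population std (ddof=0), same as np.std in the snippet above
    return (data - means) / stds


def is_standardized(X, atol=1e-7):
    """Return True if every column of X has mean ~0 and std ~1."""
    return bool(
        np.allclose(X.mean(axis=0), 0.0, atol=atol)
        and np.allclose(X.std(axis=0), 1.0, atol=atol)
    )


# Same scales and random data as in the reproducer.
for scale in (1e15, 1e10, 1e5, 1, 1e-5, 1e-10, 1e-15):
    data = (np.random.RandomState(0).rand(1000, 4) - 0.5) * scale
    print(scale, is_standardized(standardize_reference(data)))
```

The same `is_standardized` check could be applied to the output of `scaler.fit_transform(data)` to see at which scales a given scikit-learn build stops standardizing.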

cc @ogrisel since this came from your PR and @maikia @rth @agramfort since you approved the PR
