
Corrupted Loaded Dataset Returns Unhelpful Error #19667

@RylanSchaeffer

I ran into an unhelpful error when using sklearn.datasets.fetch_kddcup99. My system crashed while the data was downloading. After I rebooted and reran the code, I got the following error:

Traceback (most recent call last):
  File "/home/rylan/Documents/FieteLab-RCRP/rcrp/lib/python3.6/site-packages/sklearn/datasets/_kddcup99.py", line 351, in _fetch_brute_kddcup99
    X, y
UnboundLocalError: local variable 'X' referenced before assignment

This did not help me identify the true cause of the error, which was that the download had not completed correctly. Following this StackOverflow post (https://stackoverflow.com/questions/59192681/how-can-i-load-sklearn-data-in-jupyter-python-3), I'd recommend adding an integrity check on the downloaded dataset and raising an informative error if the dataset is corrupted.
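
To make the suggestion concrete, here is a minimal sketch of the kind of check I have in mind, assuming the expected SHA-256 checksum of the archive is known in advance. The names CorruptedDownloadError, sha256_of, and verify_download are hypothetical and are not part of scikit-learn; only the standard library is used.

```python
import hashlib
from pathlib import Path


class CorruptedDownloadError(IOError):
    """Hypothetical error type for this sketch; not a scikit-learn class."""


def sha256_of(path, chunk_size=8192):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Raise a descriptive error if the file is missing or its checksum differs.

    `expected_sha256` is assumed to be supplied by the caller.
    """
    path = Path(path)
    if not path.exists():
        raise CorruptedDownloadError(
            f"{path} is missing; the download may not have completed."
        )
    actual = sha256_of(path)
    if actual != expected_sha256:
        raise CorruptedDownloadError(
            f"{path} appears to be corrupted or incomplete "
            f"(SHA-256 {actual}, expected {expected_sha256}). "
            "Delete the file and download it again."
        )
```

Calling something like verify_download(archive_path, expected_checksum) before the cached file is parsed would surface a message that points at the interrupted download rather than at an unassigned local variable.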
