Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
#!/opt/apps/anaconda3/bin/python
import pandas
from io import StringIO
if __name__ == "__main__":
    csv = StringIO('''
Ticker,Last Update Timestamp
AAA,01/29/2024 17:04:19
AAA,01/30/2024 04:19:57
ABEQ,02/08/2024 14:33:51
ABEQ,02/06/2024 15:04:57
ABEQ,02/13/2024 07:53:11
''')
    columns = {'Ticker': str, 'Last Update Timestamp': str}
    df = pandas.read_csv(csv, usecols=columns.keys(), dtype=columns, parse_dates=['Last Update Timestamp'])
    print(pandas.__version__)
    print(df)
Issue Description
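When parse_dates is combined with dtype, read_csv does not return the date column as datetime. Instead, the column comes back as integer values (reported here as int64). They are not valid epoch seconds, though they do match the original timestamps when read as nanoseconds since the Unix epoch.
This used to work correctly with pandas 1.4.0.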
The output of the above example is:
2.2.0
  Ticker Last Update Timestamp
0    AAA   1706547859000000000
1    AAA   1706588397000000000
2   ABEQ   1707402831000000000
3   ABEQ   1707231897000000000
4   ABEQ   1707810791000000000
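As a quick sanity check on the integers in that output, the sketch below decodes them with the standard library only. It assumes the values are nanoseconds since the Unix epoch in UTC; that reading is an inference from the numbers themselves, not something the pandas output states. The names `raw_values` and `decoded` are illustrative.

```python
from datetime import datetime, timezone

# Integer values copied from the reported output.
raw_values = [
    1706547859000000000,
    1706588397000000000,
    1707402831000000000,
    1707231897000000000,
    1707810791000000000,
]

# Assumption: each integer is nanoseconds since the Unix epoch (UTC),
# so integer division by 1e9 yields whole epoch seconds.
decoded = [
    datetime.fromtimestamp(value // 1_000_000_000, tz=timezone.utc).strftime(
        "%m/%d/%Y %H:%M:%S"
    )
    for value in raw_values
]

for value, text in zip(raw_values, decoded):
    print(value, "->", text)
```

Under that assumption the decoded strings line up with the timestamps in the CSV input, which suggests the dates are parsed and then rendered as raw integers rather than being lost.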
Expected Behavior
#!/opt/apps/anaconda3/bin/python
import pandas
from io import StringIO
if __name__ == "__main__":
    csv = StringIO('''
Ticker,Last Update Timestamp
AAA,01/29/2024 17:04:19
AAA,01/30/2024 04:19:57
ABEQ,02/08/2024 14:33:51
ABEQ,02/06/2024 15:04:57
ABEQ,02/13/2024 07:53:11
''')
    columns = {'Ticker': str, 'Last Update Timestamp': str}
    df = pandas.read_csv(csv, parse_dates=['Last Update Timestamp'])
    print(pandas.__version__)
    print(df)
Output:
> ./date.py
2.2.0
  Ticker Last Update Timestamp
0    AAA   2024-01-29 17:04:19
1    AAA   2024-01-30 04:19:57
2   ABEQ   2024-02-08 14:33:51
3   ABEQ   2024-02-06 15:04:57
4   ABEQ   2024-02-13 07:53:11
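A possible workaround while dtype and parse_dates conflict is to read the column as a string and convert it in a separate step. This is only a sketch, not a fix for the underlying behavior: it leaves parse_dates out entirely and assumes every row uses the `%m/%d/%Y %H:%M:%S` layout seen in the sample data. `DATE_COLUMN` is an illustrative name.

```python
from io import StringIO

import pandas

DATE_COLUMN = "Last Update Timestamp"

# A shortened copy of the sample data from the report.
csv = StringIO(
    "Ticker,Last Update Timestamp\n"
    "AAA,01/29/2024 17:04:19\n"
    "AAA,01/30/2024 04:19:57\n"
    "ABEQ,02/08/2024 14:33:51\n"
)

# Read every column as a string; parse_dates is deliberately not passed.
df = pandas.read_csv(csv, dtype={"Ticker": str, DATE_COLUMN: str})

# Convert the date column explicitly, using the format of the sample data.
df[DATE_COLUMN] = pandas.to_datetime(df[DATE_COLUMN], format="%m/%d/%Y %H:%M:%S")

print(df.dtypes)
print(df)
```

Passing an explicit format also avoids any ambiguity between month-first and day-first parsing for values such as 02/06/2024.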
Installed Versions
/opt/apps/anaconda3/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
commit : f538741
python : 3.11.6.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.0-9-amd64
Version : #1 SMP Debian 5.10.70-1 (2021-09-30)
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.0
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.8.2
setuptools : 69.0.3
pip : 24.0
Cython : None
pytest : 8.0.0
hypothesis : None
sphinx : 7.2.6
blosc : None
feather : None
xlsxwriter : 3.1.9
lxml.etree : 4.9.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.21.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2023.10.0
gcsfs : None
matplotlib : 3.8.0
numba : 0.59.0
numexpr : 2.9.0
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : 12.0.1
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.12.0
sqlalchemy : 2.0.24
tables : 3.9.2
tabulate : 0.9.0
xarray : 2024.1.1
xlrd : None
zstandard : 0.22.0
tzdata : 2023.4
qtpy : 2.4.1
pyqt5 : None
None