Description
Problem description:
I query a dataset from a local InfluxDB; the Data Explorer UI queries and plots it in 0.02 seconds. When I load exactly the same dataset into a pandas.DataFrame using the query_data_frame function, it takes 184 seconds, 157 of which cProfile attributes to the pandas.to_datetime function, called by PandasDateTimeHelper for each and every row's timestamp. This performance problem seems to originate from the fact that the query receives the _time column as text in the format %Y-%m-%dT%H:%M:%SZ, which must then be parsed. Complete cProfile output.
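To illustrate the difference between parsing each timestamp individually and parsing the whole column at once, here is a minimal sketch. It is not the client's actual code; the sample values and the variable names raw_times, per_row and column_wise are made up for illustration.

```python
import pandas as pd

# Hypothetical sample of _time values in the text format the query returns.
raw_times = [
    "2021-01-01T00:00:00Z",
    "2021-01-01T00:00:01Z",
    "2021-01-01T00:00:02Z",
]

# Per-row parsing: one pandas.to_datetime call for every timestamp.
per_row = [pd.to_datetime(value) for value in raw_times]

# Column-wise parsing: a single pandas.to_datetime call on the whole column.
column_wise = pd.to_datetime(pd.Series(raw_times), utc=True)
```

Both approaches yield the same timestamps; the column-wise call avoids the per-call overhead that dominates on large result sets.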
It looks like this conversion of the _time column, represented as a uint64 number of nanoseconds, to a string in ISO datetime format in InfluxDB, and then parsing that string back into a uint64 number of nanoseconds in influxdb_client, is totally unnecessary, if I understand the dataflow correctly.
How do I get rid of these unnecessary conversions for the _time column, please?
Is there a way to receive the _time column as a uint64 number of nanoseconds, which I could cast to datetime64[ns] with one call to pandas.to_datetime on the entire column?
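For reference, the single-call conversion I have in mind would look roughly like the sketch below. It assumes the _time column has already arrived as integer nanoseconds since the Unix epoch; whether the client can deliver it that way is exactly the open question. The frame df and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: assumes _time is delivered as nanoseconds since the Unix epoch.
df = pd.DataFrame({
    "_time": np.array([1609459200000000000, 1609459201000000000], dtype="uint64"),
    "_value": [1.0, 2.0],
})

# One vectorized call converts the entire column; no string parsing is involved.
# The cast to int64 is a precaution, since pandas stores datetimes as int64 internally.
df["_time"] = pd.to_datetime(df["_time"].astype("int64"), unit="ns", utc=True)
```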
Specifications:
- Client Version: 1.13.0
- InfluxDB Version: 2.0.3
- Platform: Ubuntu 18.04.5 LTS, amd64