Description
I would like to report a use case in which specific design choices in quantities become a major computational bottleneck.
Creating a Quantity carries a substantial fixed overhead, and in situations where large numbers of Quantities have to be created
or manipulated, this overhead can become the dominant cost of the operation. In our case, we record hundreds of thousands of neurons with sparse firing rates and store them in Neo SpikeTrainLists, which in turn represent each spike train as a Quantity. In this situation the per-object overhead becomes the dominant factor, for example when loading such SpikeTrainLists from pickled storage.
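To illustrate the overhead, here is a minimal benchmark along these lines (the array sizes and counts are arbitrary, chosen only to mimic many short, sparse spike trains; absolute timings will of course vary):

```python
import timeit

import numpy as np
import quantities as pq

# Mimic many short, sparse spike trains: 100,000 small float arrays.
N_TRAINS = 100_000
spike_times = [np.random.uniform(0, 10, size=5) for _ in range(N_TRAINS)]

def make_plain_arrays():
    # Baseline: wrap the data in plain ndarrays (essentially free).
    return [np.asarray(st) for st in spike_times]

def make_quantities():
    # Wrap each spike train in a Quantity with units of seconds.
    return [pq.Quantity(st, units='s') for st in spike_times]

print("plain ndarray:", timeit.timeit(make_plain_arrays, number=1))
print("pq.Quantity  :", timeit.timeit(make_quantities, number=1))
```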
My preliminary profiling indicates that the overhead ultimately comes down to the function quantity.validate_dimensionality, which
calls isinstance. Apparently, isinstance is a notoriously expensive built-in Python operation. There also appears to be an unnecessary duplication of the isinstance check in quantity.validate_unit_quantity when it is called from quantity.validate_dimensionality.
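For reference, the profiling that pointed me to these functions looked roughly like this (a sketch of the approach, not the exact script I ran):

```python
import cProfile
import pstats

import numpy as np
import quantities as pq

def create_many_quantities(n=100_000):
    # Create many small Quantities, as Neo does when building SpikeTrainLists.
    return [pq.Quantity(np.random.uniform(0, 10, size=5), units='s')
            for _ in range(n)]

cProfile.run('create_many_quantities()', 'quantity_creation.prof')
stats = pstats.Stats('quantity_creation.prof')
# Sorting by cumulative time, validate_dimensionality / validate_unit_quantity
# and the built-in isinstance calls appear near the top of the listing.
stats.sort_stats('cumulative').print_stats(20)
```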
I believe that avoiding the unnecessary duplicated isinstance call would already improve performance substantially, but the
ultimate solution requires re-evaluating whether these expensive built-in operations really have to run during the creation of
every single Quantity. Perhaps an API for creating a large number of Quantities together, performing these validation calls
only once for all of them, would give upstream packages a way to optimize such specific use cases.
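To make the suggestion concrete, a batch-creation helper could look roughly like the sketch below. `batch_quantities` is hypothetical (not part of the quantities API), and it assumes that the dimensionality of a `Quantity` can be attached via the internal `_dimensionality` attribute after viewing a plain ndarray as a `Quantity`; if that assumption does not hold, an equivalent constructor inside the library itself would be needed.

```python
import numpy as np
import quantities as pq
from quantities.quantity import validate_dimensionality

def batch_quantities(arrays, units):
    """Hypothetical helper: wrap many arrays that share the same units,
    validating the dimensionality only once instead of once per object."""
    # Resolve and validate the unit specification a single time.
    dim = validate_dimensionality(units)
    out = []
    for arr in arrays:
        # Bypass Quantity.__new__ (and its per-object validation) by viewing
        # the ndarray as a Quantity, then attach the shared, pre-validated
        # dimensionality.  NOTE: _dimensionality is an internal attribute,
        # so this may break across versions of quantities.
        q = np.asarray(arr, dtype=float).view(pq.Quantity)
        q._dimensionality = dim
        out.append(q)
    return out

# Example: 100,000 short spike trains sharing the same units.
trains = [np.random.uniform(0, 10, size=5) for _ in range(100_000)]
spike_train_quantities = batch_quantities(trains, 's')
```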