Improved concatenation in mdf.return_pandas_dataframe()
Python version
3.9.19
Numpy version
1.26.1
mdfreader version
4.1
Description
A single call to the following line repeatedly prints the pandas message that the DataFrame is highly fragmented:
mdf.return_pandas_dataframe(master)
Resolution
I changed the function in mdfreader.py to the following:
channel_dict = {key: None for key in self.masterChannelList[master_channel_name]}
for key in channel_dict:
    data = self.get_channel_data(key)
    if data.dtype.byteorder not in ['=', '|']:
        data = data.byteswap().newbyteorder()
    if data.ndim == 1 and data.shape[0] == temporary_dataframe.shape[0] \
            and not data.dtype.char == 'V':
        channel_dict[key] = data  # store in the dict; rebinding the loop variable would discard it
        # temporary_dataframe[channel] = data  # original line
# drop channels that were skipped above so they do not become empty columns
channel_dict = {key: data for key, data in channel_dict.items() if data is not None}
temporary_dataframe = pd.DataFrame(data=channel_dict, index=temporary_dataframe.index)  # added line
return temporary_dataframe
This way, the DataFrame is no longer expanded once per channel; it is built a single time at the end.
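To illustrate the difference outside of mdfreader, here is a minimal sketch. The channel names, array sizes, and the two helper functions are made up for the example and are not part of mdfreader. It compares inserting columns one at a time with building the DataFrame from a dict in a single call, and counts any PerformanceWarning that pandas raises along the way (whether and how often it appears depends on the pandas version).

```python
import warnings

import numpy as np
import pandas as pd


def build_column_by_column(columns, index):
    """Insert each array into an existing DataFrame, one column at a time."""
    frame = pd.DataFrame(index=index)
    for name, data in columns.items():
        frame[name] = data  # each insert adds another internal block
    return frame


def build_at_once(columns, index):
    """Hand all arrays to the DataFrame constructor in a single call."""
    return pd.DataFrame(data=columns, index=index)


# Hypothetical stand-in for the channels of one master: 200 arrays of equal length.
index = np.arange(1000)
columns = {f"channel_{i}": np.full(1000, float(i)) for i in range(200)}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    slow = build_column_by_column(columns, index)
    fast = build_at_once(columns, index)

fragmentation_warnings = [
    w for w in caught if issubclass(w.category, pd.errors.PerformanceWarning)
]
print(f"PerformanceWarning count: {len(fragmentation_warnings)}")
print(f"Same values: {np.array_equal(slow.to_numpy(), fast.to_numpy())}")
```

Both approaches should produce the same data; only the way the DataFrame is assembled differs.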
Thanks @LaurentBlanc73 for your interest. I am not sure I can actually visualise the change. Could you reformat it or, if you are already confident in your proposal, submit a Pull Request for this ticket?
@ratal sorry for the poor formatting, I have just improved it (I also changed the code again slightly by pre-allocating storage).
I updated the changes in my fork and will create a pull request as soon as the previous one in #212 has passed :)