Improved concatenation in mdf.return_pandas_dataframe()

Open LaurentBlanc73 opened this issue 1 year ago • 2 comments

Python version

3.9.19

Numpy version

1.26.1

mdfreader version

4.1

Description

I get the "highly fragmented DataFrame" message many times from a single call to the following line: mdf.return_pandas_dataframe(master)

Resolution

I changed the function in mdfreader.py to the following:

            # collect the valid channels in a dict first, then build the dataframe once
            channel_dict = {}
            for key in self.masterChannelList[master_channel_name]:
                data = self.get_channel_data(key)
                # convert non-native byte order to native
                if data.dtype.byteorder not in ['=', '|']:
                    data = data.byteswap().newbyteorder()
                # keep only 1-D, non-void channels that match the index length
                if data.ndim == 1 and data.shape[0] == temporary_dataframe.shape[0] \
                        and not data.dtype.char == 'V':
                    channel_dict[key] = data  # assigning to the loop variable would leave the dict unchanged
                    # temporary_dataframe[channel] = data  # original line
            temporary_dataframe = pd.DataFrame(data=channel_dict, index=temporary_dataframe.index)  # added line
            return temporary_dataframe

This way, the dataframe is not expanded once per channel but is built only once, at the end.
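To make the idea easier to try outside of mdfreader, here is a minimal standalone sketch of the same pattern. The function name `build_frame_at_once`, the `channels` dict and the sample data are hypothetical and only stand in for `self.get_channel_data(...)` and the master channel list; this is not mdfreader's API. The byte-order conversion uses `astype` with a native-order dtype, which is my substitution for the `byteswap().newbyteorder()` call above.

```python
import numpy as np
import pandas as pd


def build_frame_at_once(channels, index):
    """Collect valid channels in a dict, then create the DataFrame in one step.

    `channels` is a hypothetical {name: ndarray} mapping used for illustration.
    """
    columns = {}
    for name, data in channels.items():
        # Convert non-native byte order to native byte order.
        if data.dtype.byteorder not in ("=", "|"):
            data = data.astype(data.dtype.newbyteorder("="))
        # Keep only 1-D, non-void arrays whose length matches the index.
        if data.ndim == 1 and data.shape[0] == len(index) and data.dtype.char != "V":
            columns[name] = data
    # Single construction instead of one column insertion per channel.
    return pd.DataFrame(data=columns, index=index)


# Usage with made-up data: only "speed" passes the shape checks.
index = pd.Index(np.linspace(0.0, 1.0, 5), name="time")
channels = {
    "speed": np.arange(5, dtype=">f8"),  # explicit big-endian byte order
    "matrix": np.zeros((5, 2)),          # skipped: not 1-D
    "short": np.zeros(3),                # skipped: wrong length
}
frame = build_frame_at_once(channels, index)
print(list(frame.columns))
```

Channels that fail the checks are simply left out of the dict, so they do not end up as empty columns.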

May 23 '24 19:05 LaurentBlanc73

Thanks @LaurentBlanc73 for your interest. I am not sure I can actually visualise the change. Could you reformat it or, if you are already confident in your proposal, submit a Pull Request for this ticket?

May 25 '24 08:05 ratal

@ratal sorry for the poor formatting, I just improved it (I also changed the code again slightly, by pre-allocating storage).

I updated the changes in my fork and will create a pull request as soon as the previous one in #212 has passed :)

May 26 '24 09:05 LaurentBlanc73