Potential performance issue: slow read_csv() with pandas 1.3.3
Issue Description:
Hello.
I have found a performance degradation in the read_csv function of pandas 1.3.3. Parts of this repository depend on pandas 1.3.3 (see dadmatools/requirements.txt), and some other dependencies require pandas below 1.4. I am not sure whether this pandas problem affects this repository. There are related discussions on the pandas GitHub, including #44158 and #44610.
I also found that dadmatools/pipeline/informal2formal/utils.py and dadmatools/pipeline/informal2formal/VerbHandler.py use the affected API. Other files may use it as well.
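In case it helps to check which files actually call read_csv, here is a minimal sketch that walks a source tree and reports the line numbers of read_csv calls. The helper names find_read_csv_calls and scan_tree are my own and are not part of dadmatools or pandas. It only matches on the function name, so it cannot tell whether a given call takes the slow code path.

```python
import ast
from pathlib import Path
from typing import Dict, List


def find_read_csv_calls(source: str) -> List[int]:
    """Return the line numbers of calls whose function name is read_csv."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # pd.read_csv(...) is an Attribute; a bare read_csv(...) is a Name.
            if isinstance(func, ast.Attribute):
                name = func.attr
            else:
                name = getattr(func, "id", None)
            if name == "read_csv":
                hits.append(node.lineno)
    return sorted(hits)


def scan_tree(root: str) -> Dict[str, List[int]]:
    """Map each .py file under root to the lines where read_csv is called."""
    results = {}
    for path in Path(root).rglob("*.py"):
        try:
            found = find_read_csv_calls(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed as Python source
        if found:
            results[str(path)] = found
    return results
```

For example, scan_tree("dadmatools") would return a dictionary from file path to line numbers, which could then be reviewed by hand.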
Suggestion
I would recommend upgrading to pandas >= 1.4, or exploring other ways to improve read_csv performance.
Any other workarounds or solutions would be greatly appreciated.
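To see whether an upgrade makes a measurable difference here, a rough timing sketch like the one below could be run once under pandas 1.3.3 and once under a newer version. The helpers make_csv, time_reader and benchmark_pandas are hypothetical names of mine, and the synthetic numeric data is an assumption: it may not trigger the slowdown discussed in the linked issues, so timing the real files this repository loads would be more reliable.

```python
import io
import time
from typing import Callable, List


def make_csv(n_rows: int, n_cols: int = 5) -> str:
    """Build a synthetic CSV string: one header line plus n_rows numeric rows."""
    header = ",".join(f"c{i}" for i in range(n_cols))
    rows = (
        ",".join(str(r * n_cols + c) for c in range(n_cols))
        for r in range(n_rows)
    )
    return header + "\n" + "\n".join(rows) + "\n"


def time_reader(reader: Callable[[io.StringIO], object], text: str,
                repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs."""
    timings: List[float] = []
    for _ in range(repeats):
        buf = io.StringIO(text)  # fresh buffer so every run reads from the start
        start = time.perf_counter()
        reader(buf)
        timings.append(time.perf_counter() - start)
    return min(timings)


def benchmark_pandas(n_rows: int = 200_000) -> None:
    """Time pandas.read_csv on synthetic data and print the result."""
    import pandas as pd  # imported lazily so the helpers above work without pandas

    best = time_reader(pd.read_csv, make_csv(n_rows))
    print(f"pandas {pd.__version__}: best of 5 runs = {best:.3f}s for {n_rows} rows")
```

Calling benchmark_pandas() in each environment and comparing the printed times gives a first impression. Taking the minimum over several runs reduces noise from other processes.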
Thank you!
Thank you for your comment; I will try to update the pandas version. However, I'm not sure whether our "informal2formal" function uses the affected API.
On another note, do you happen to know the Persian language?
No, I don't know the Persian language. I ran into this problem in other repositories, so I wanted to let other repositories know about this potential problem.
Thank you for your suggestion.