Segmentation fault under filebench workloads
Environment: Splitfs, Optane DC PM, Ubuntu 18.04 LTS, glibc 2.27, gcc 7.5.0
When I run the filebench workloads (varmail, fileserver, webserver, webproxy) using scripts/filebench/run_fs.sh, filebench always ends with "Segmentation fault (core dumped)". Varmail, fileserver and webproxy still complete the tests and show results, but the results are doubtful because the performance is significantly lower than that of ext4-DAX. Webserver generally crashes immediately. The crash does not seem to be related to the workload data size.
I set up SplitFS exactly following the steps. The only change I made was increasing NVP_NUM_LOCKS from 32 to 144, because my machine has 72 logical CPUs.
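The arithmetic behind that change can be sketched as follows. This is only an illustration: the factor of two locks per logical CPU is an assumption inferred from the numbers above (72 CPUs, 144 locks), and the helper names are hypothetical, not part of SplitFS.

```c
#include <assert.h>
#include <unistd.h>

/* Assumption: two locks per logical CPU, inferred from 72 CPUs -> 144 locks.
 * The SplitFS source may size NVP_NUM_LOCKS differently. */
#define LOCKS_PER_CPU 2

/* Number of logical CPUs currently online, or -1 if it cannot be determined. */
long online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Hypothetical helper: lock count that would cover `ncpus` logical CPUs. */
long locks_for_cpus(long ncpus)
{
    assert(ncpus > 0);
    return ncpus * LOCKS_PER_CPU;
}
```

Under this assumption, locks_for_cpus(72) gives 144 and locks_for_cpus(16) gives the default value of 32.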
Hello,
Thanks for trying out SplitFS. I am not sure about what is causing the problem at your end. I have modified the filebench_run.sh script file to ensure that the correct environment variables are being set, and also modified the workload files to match them with the filebench repository's workload files (except for webserver and webproxy).
Could you pull and check again? Also, please make sure you do the following things:
- Run the filebench directory present in SplitFS repository
- Do not modify the filebench workload files (you can modify them later, but please do not modify them right now for debugging purposes)
- Run filebench using ./run_filebench.sh present in the scripts directory
- Do not make any changes to the common.mk Makefile in the splitfs/ source directory
If it runs correctly, you should see around 2-3x improvement in SplitFS as compared to ext4 DAX in varmail and fileserver. I am seeing the same on my end, when I tested the performance on Fedora 30 with the same workload files, with 96 logical CPUs.
Thanks for your reply. The problem still exists but I found something new:
- SplitFS did outperform ext4-DAX by 2-3x on Linux 4.13.
- The performance of ext4-DAX on Linux 5.1 is much better than on Linux 4.13 (especially with #threads >= 8).
- Filebench crashes immediately when running on NOVA and PMFS. So the problem is probably caused by glibc and the kernel. Have you tried porting SplitFS to a newer Linux kernel version (5.x)?
Sorry, it seems that the filebench core dump problem is caused by modifying NVP_NUM_LOCKS in nvp_lock.h. Thus, SplitFS can currently only run on CPUs 0-15.
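If the default lock count does limit SplitFS to CPUs 0-15, one way to keep a run inside that range is to restrict the CPU affinity before starting filebench. Below is a minimal sketch using the Linux sched_setaffinity call; the helper names are hypothetical and not part of SplitFS or filebench.

```c
#define _GNU_SOURCE   /* needed for cpu_set_t and the CPU_* macros in glibc */
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Fill `set` with CPUs first..last (inclusive) and return how many were added.
 * Indices outside 0..CPU_SETSIZE-1 are skipped. */
int build_cpu_range(cpu_set_t *set, int first, int last)
{
    assert(set != NULL);
    CPU_ZERO(set);
    for (int cpu = first; cpu <= last && cpu < CPU_SETSIZE; cpu++) {
        if (cpu >= 0)
            CPU_SET(cpu, set);
    }
    return CPU_COUNT(set);
}

/* Restrict the calling process to CPUs first..last.
 * Returns 0 on success, -1 if the range is empty or sched_setaffinity fails. */
int pin_to_cpu_range(int first, int last)
{
    cpu_set_t set;
    if (build_cpu_range(&set, first, last) == 0)
        return -1;  /* empty range: nothing to pin to */
    return sched_setaffinity(0, sizeof(set), &set);  /* pid 0 = calling process */
}
```

Calling pin_to_cpu_range(0, 15) before exec'ing the benchmark would confine it to those cores, since the affinity mask is preserved across execve and inherited by child processes. From a shell, prefixing the command with taskset -c 0-15 has the same effect.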
Thank you for this insight. I will look into why the core dump problem is coming up if we run with more cores on 5.1. The problem does not seem to be fundamental to SplitFS. Just to confirm, is the performance of SplitFS better than ext4-DAX on Linux 5.1 when you run with cores 0-15?
I am not sure why Filebench crashes immediately when running on NOVA and PMFS. We don't modify NOVA and PMFS, so this might be an issue with your kernel.
I failed to port SplitFS to Linux 5.1 because the ext4 code has changed in Linux 5.1. For varmail, webserver and fileserver, ext4-DAX-5.1 outperforms ext4-DAX-4.13 by about 10-20% with threads >= 8.