DAOS-16452 dfs: add a shmem cache for dentries

Open mchaarawi opened this issue 9 months ago • 5 comments

Steps for the author:

  • [ ] Commit message follows the guidelines.
  • [ ] Appropriate Features or Test-tag pragmas were used.
  • [ ] Appropriate Functional Test Stages were run.
  • [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).

mchaarawi avatar Apr 16 '25 19:04 mchaarawi

Ticket title is 'DFS user space cooperative cache for fast lookups, stats, readdir, and small file reads.' Status is 'Open'. https://daosio.atlassian.net/browse/DAOS-16452

github-actions[bot] avatar Apr 16 '25 19:04 github-actions[bot]

Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/23/execution/node/1640/log

daosbuild3 avatar Oct 01 '25 16:10 daosbuild3

@mchaarawi @knard38 Could you please review? Thank you!

wiliamhuang avatar Oct 15 '25 16:10 wiliamhuang

I checked the changes for using an LRU instead of the hash table and they look OK to me. I cannot approve my own PR though :-)

mchaarawi avatar Oct 20 '25 20:10 mchaarawi

@knard38 Could you please review? Thank you very much!

wiliamhuang avatar Oct 21 '25 02:10 wiliamhuang

@mchaarawi Could you please merge "feature/dfs_dcache" with master branch to fix an issue in "Build on Leap 15.5" in CI? Thank you!

wiliamhuang avatar Oct 23 '25 12:10 wiliamhuang

There are some points that need to be clarified. For code readability, it could help to have dedicated functions for each type of cache, instead of conditional statements for each type of cache in one big function: see my comment on the cache_find_insert() function.

@knard38 struct shm and dh inside struct dfs_dcache have different data structures. They would need the same data layout to completely avoid conditional statements and rely only on function pointers to dedicated functions for each type of cache.
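
For readers following the design question, here is a minimal sketch of the "dedicated function per cache type" idea, written for illustration only. It is not code from this PR: every name in it (toy_cache, toy_dram, toy_shm, toy_cache_find_insert, and so on) is hypothetical, and the two toy back ends merely stand in for caches with different layouts. The assumption is that each cache instance picks a table of function pointers once at init time, so the shared entry point needs no per-type conditional. It does not settle whether that is practical for the real struct dfs_dcache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; none of this is taken from the DAOS tree. */
struct toy_cache;

/* Per-type operations, selected once when the cache is initialized. */
struct toy_cache_ops {
	/* Returns 1 on hit (*value is filled), 0 on insert, -1 on error. */
	int (*find_insert)(struct toy_cache *c, const char *name, int *value);
};

/* Back end A: a tiny fixed array, standing in for an in-process cache. */
struct toy_dram {
	char names[4][32];
	int  values[4];
	int  count;
};

/* Back end B: a single slot, standing in for a cache with another layout. */
struct toy_shm {
	char name[32];
	int  value;
	int  used;
};

struct toy_cache {
	const struct toy_cache_ops *ops;
	union {
		struct toy_dram dram;
		struct toy_shm  shm;
	} u;
};

static int dram_find_insert(struct toy_cache *c, const char *name, int *value)
{
	struct toy_dram *d = &c->u.dram;

	for (int i = 0; i < d->count; i++) {
		if (strcmp(d->names[i], name) == 0) {
			*value = d->values[i];
			return 1;
		}
	}
	if (d->count == 4 || strlen(name) >= sizeof(d->names[0]))
		return -1;
	strcpy(d->names[d->count], name);
	d->values[d->count] = *value;
	d->count++;
	return 0;
}

static int shm_find_insert(struct toy_cache *c, const char *name, int *value)
{
	struct toy_shm *s = &c->u.shm;

	if (s->used && strcmp(s->name, name) == 0) {
		*value = s->value;
		return 1;
	}
	if (strlen(name) >= sizeof(s->name))
		return -1;
	strcpy(s->name, name); /* the single slot is simply overwritten */
	s->value = *value;
	s->used = 1;
	return 0;
}

static const struct toy_cache_ops dram_ops = { .find_insert = dram_find_insert };
static const struct toy_cache_ops shm_ops  = { .find_insert = shm_find_insert };

/* The only place that knows which back end is in use. */
static void toy_cache_init(struct toy_cache *c, int use_shm)
{
	memset(c, 0, sizeof(*c));
	c->ops = use_shm ? &shm_ops : &dram_ops;
}

/* Shared entry point: dispatches through the ops table, no type check. */
static int toy_cache_find_insert(struct toy_cache *c, const char *name, int *value)
{
	assert(c != NULL && c->ops != NULL);
	return c->ops->find_insert(c, name, value);
}
```

In this sketch the per-type knowledge lives in toy_cache_init() and in the two back-end functions; callers only ever use toy_cache_find_insert(). The trade-off is that every operation the caches share needs an entry in the ops table.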

wiliamhuang avatar Oct 23 '25 13:10 wiliamhuang

There are some points that need to be clarified. For code readability, it could help to have dedicated functions for each type of cache, instead of conditional statements for each type of cache in one big function: see my comment on the cache_find_insert() function.

@knard38 struct shm and dh inside struct dfs_dcache have different data structures. They would need the same data layout to completely avoid conditional statements and rely only on function pointers to dedicated functions for each type of cache.

TBH the second step is to remove the DRAM cache. I feel it is no longer needed at the DFS level once the shmem cache is done.

mchaarawi avatar Oct 23 '25 13:10 mchaarawi

@mchaarawi Could you please merge "feature/dfs_dcache" with master branch to fix an issue in "Build on Leap 15.5" in CI? Thank you!

done

mchaarawi avatar Oct 23 '25 13:10 mchaarawi

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16273/30/testReport/

daosbuild3 avatar Oct 24 '25 02:10 daosbuild3

Test stage Functional on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16273/30/testReport/

daosbuild3 avatar Oct 24 '25 12:10 daosbuild3

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16273/32/testReport/

daosbuild3 avatar Oct 24 '25 15:10 daosbuild3

Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/32/execution/node/1342/log

daosbuild3 avatar Oct 26 '25 05:10 daosbuild3

Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/32/execution/node/1478/log

daosbuild3 avatar Oct 26 '25 18:10 daosbuild3

@knard38 Could you please review? Thank you very much!

wiliamhuang avatar Oct 27 '25 05:10 wiliamhuang

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/33/execution/node/261/log

daosbuild3 avatar Oct 27 '25 12:10 daosbuild3

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/34/execution/node/262/log

daosbuild3 avatar Oct 27 '25 13:10 daosbuild3

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/35/execution/node/282/log

daosbuild3 avatar Oct 27 '25 14:10 daosbuild3

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/36/execution/node/282/log

daosbuild3 avatar Oct 27 '25 14:10 daosbuild3

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/37/execution/node/282/log

daosbuild3 avatar Oct 27 '25 14:10 daosbuild3

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/38/execution/node/304/log

daosbuild3 avatar Oct 27 '25 16:10 daosbuild3

Test stage Functional on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/40/execution/node/1076/log

daosbuild3 avatar Oct 28 '25 07:10 daosbuild3

Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/40/execution/node/1361/log

daosbuild3 avatar Oct 28 '25 08:10 daosbuild3

Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16273/41/execution/node/734/log

daosbuild3 avatar Oct 28 '25 23:10 daosbuild3

@knard38 Could you please review recent changes? Thank you!

wiliamhuang avatar Oct 29 '25 04:10 wiliamhuang

@knard38 Could you please review recent changes? Thank you!

done

knard38 avatar Oct 29 '25 07:10 knard38

The failed test in CI is an existing issue. https://daosio.atlassian.net/browse/DAOS-17751

wiliamhuang avatar Oct 29 '25 12:10 wiliamhuang

@mchaarawi Can we land this PR? Thank you!

wiliamhuang avatar Oct 31 '25 13:10 wiliamhuang