Gradle build failure at cache due to file lock
We are using actions/cache for Gradle builds. Below is the step as defined in our workflow YAML.
- name: Setup Gradle Dependencies Cache
  uses: actions/cache@v2
  with:
    path: ~/.gradle/caches
    key: ${{ runner.os }}-gradle-caches-${{ hashFiles('/*.gradle', '/*.gradle.kts') }}
When two commits arrive close together, the Gradle build step of the first workflow can still be running when the second workflow reaches its build stage. The second build then fails with the file lock error below (it seems the .gradle directory is shared between the two workflows).
```
Error: Process completed with exit code 1.
##[debug]Finishing: 🧱 Build with gradle wrapper
FAILURE: Build failed with an exception.
* What went wrong:
Gradle could not start your build.
> Could not create service of type ResourceSnapshotterCacheService using GradleUserHomeServices.createResourceSnapshotterCacheService().
> Timeout waiting to lock file hash cache (/root/.gradle/caches/7.0/fileHashes). It is currently in use by another Gradle instance.
Owner PID: 665
Our PID: 666
Owner Operation:
Our operation:
Lock file: /root/.gradle/caches/7.0/fileHashes/fileHashes.lock
```
This happens only when builds run concurrently. Is there anything you can advise to overcome this problem, other than removing the lock file?
Instead of using actions/cache, I recommend that you use https://github.com/marketplace/actions/gradle-build-action which provides the same functionality with a number of other benefits. https://github.com/gradle/gradle-build-action#why-use-the-gradle-build-action
@bigdaz Thanks for the input. We did not want to use a build-specific action, because the common parts vary from action to action here. Our focus was to build individual private build actions that also publish data about the build artifact; once that is done, we want a common action that supports caching for different build types. Even if we use the gradle-build-action, we will have similar issues when concurrent builds are executed, because the .gradle directory is a common workspace mounted from shared or persistent storage. The challenge in our case is more about the file lock.
@1633605 Are you running multiple runners on the same machine? Otherwise I am not sure how two workflows can use the same file.
If you really want to run two workflows concurrently, I would suggest using runners on different machines or using GitHub-hosted runners. If you are okay with not running two workflows at the same time, you can use `concurrency` and put caching and building in the same group so that they do not run simultaneously.
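As a rough illustration of the concurrency suggestion, a job can declare a `concurrency` group so that jobs in the same group do not run at the same time. This is only a sketch: the job id `build`, the `self-hosted` label, the group name `gradle-shared-cache`, the checkout step, and the `./gradlew build` command are assumptions, not details from this thread. The cache step is the one from the original post, unchanged.

```yaml
jobs:
  build:                             # hypothetical job id
    runs-on: self-hosted             # assumed runner label
    # Jobs that share this group name are serialized, so only one of them
    # touches the shared ~/.gradle directory at a time.
    concurrency:
      group: gradle-shared-cache     # hypothetical group name
      cancel-in-progress: false      # let the running build finish
    steps:
      - uses: actions/checkout@v2
      - name: Setup Gradle Dependencies Cache
        uses: actions/cache@v2
        with:
          path: ~/.gradle/caches
          key: ${{ runner.os }}-gradle-caches-${{ hashFiles('/*.gradle', '/*.gradle.kts') }}
      - name: Build with gradle wrapper
        run: ./gradlew build         # assumed build command
```

Keep in mind that queued jobs in the same group wait in a pending state, so this trades throughput for safety. Concurrency groups also apply within a single repository, so they would likely not protect a Gradle home that is shared by runners serving several repositories.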
@bishal-pdMSFT Yes, you are right, we are running multiple runners on the same machine. Multiple runners always lead to file lock issues, because the disk is shared among the workflows. Even if we use concurrency, the challenge is that jobs end up in a pending state, since we operate at large scale. Do you have any insight into how this would behave with pods where the cache directory is mounted via EFS as a PV? Would we have similar challenges with that approach? Any feedback would be appreciated.
Persistent storage will have the same locking problem if multiple runners try to share it. To avoid it, you may want to keep the cache directory inside GITHUB_WORKSPACE, which is specific to each runner.
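One possible way to apply this suggestion is to point Gradle's user home at a directory under the workspace through the `GRADLE_USER_HOME` environment variable, and to cache that directory instead of `~/.gradle/caches`. The sketch below makes several assumptions: the directory name `.gradle-user-home` is hypothetical (chosen so it does not collide with the project-level `.gradle` directory), and the checkout step and the `./gradlew build` command are placeholders. The cache key is the one from the original post, unchanged.

```yaml
steps:
  - uses: actions/checkout@v2
  # Export GRADLE_USER_HOME for all later steps. $GITHUB_WORKSPACE is
  # resolved at run time, so each runner gets its own directory.
  - name: Use a workspace-local Gradle user home
    run: echo "GRADLE_USER_HOME=$GITHUB_WORKSPACE/.gradle-user-home" >> "$GITHUB_ENV"
  - name: Setup Gradle Dependencies Cache
    uses: actions/cache@v2
    with:
      # Relative path, resolved against the workspace directory.
      path: .gradle-user-home/caches
      key: ${{ runner.os }}-gradle-caches-${{ hashFiles('/*.gradle', '/*.gradle.kts') }}
  - name: Build with gradle wrapper
    run: ./gradlew build             # assumed build command
```

With this layout each runner locks files only in its own workspace, so two runners on the same machine should no longer contend for `fileHashes.lock`. The cache step is placed after checkout so that a clean checkout does not remove the restored directory.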
On a side note, I am not sure why you need a cache when using EFS, as that can serve a similar purpose. And even with EFS, you would probably need one EFS per runner to avoid locking, wouldn't you?
@bishal-pdMSFT We are using a DinD-style setup where everything starts from a clean state after each workflow execution; however, we persist the workspace to a persistent store. Based on your suggestion, we will try keeping the cache directory in the workspace folder, which is confined to a single runner instance.
Closing old issue. Please re-open if the workaround does not help.