
REQUEST: Repository maintenance on opentelemetry-network-build-tools

Open yonch opened this issue 1 year ago • 3 comments

Affected Repository

https://github.com/open-telemetry/opentelemetry-network-build-tools

Requested changes

Enable Large GitHub Runners on the org, and configure a 2-core, 8 GB RAM, 75 GB SSD runner on the repo (size chart).

Relevant docs:

Purpose

The build environment is built relatively infrequently (e.g., 2-3 times per month), but requires ~35 GB of storage to build. This renders the standard GH Runners too small.

Here is an error message from a run with the standard runner (No space left on device):

System.IO.IOException: No space left on device : '/home/runner/runners/2.320.0/_diag/Worker_20241015-190654-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
System.IO.IOException: No space left on device : '/home/runner/runners/2.320.0/_diag/Worker_20241015-190654-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Common.Tracing.Error(Exception exception)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
Unhandled exception. System.IO.IOException: No space left on device : '/home/runner/runners/2.320.0/_diag/Worker_20241015-190654-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at System.Diagnostics.TraceSource.Flush()
   at GitHub.Runner.Common.TraceManager.Dispose(Boolean disposing)
   at GitHub.Runner.Common.TraceManager.Dispose()
   at GitHub.Runner.Common.HostContext.Dispose(Boolean disposing)
   at GitHub.Runner.Common.HostContext.Dispose()
   at GitHub.Runner.Worker.Program.Main(String[] args)
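
A failure like this only shows up once the disk is already full, deep into the build. One possible mitigation is a preflight check that fails fast when the runner does not have enough free space. The following is a minimal sketch, not part of the build tools repo: `require_free_space` is a hypothetical helper name, and sizes are measured in GiB (1024^3 bytes), which is an assumption about how the ~35 GB figure should be read.

```python
import shutil

GIB = 1024 ** 3  # assumption: treat the quoted "GB" figures as GiB


def require_free_space(path: str, needed_gib: float) -> float:
    """Return the free space at `path` in GiB, or raise if it is below `needed_gib`."""
    free_gib = shutil.disk_usage(path).free / GIB
    if free_gib < needed_gib:
        raise RuntimeError(
            f"only {free_gib:.1f} GiB free at {path}, need {needed_gib} GiB"
        )
    return free_gib
```

Calling `require_free_space("/", 35)` as the first step of a build would stop the job immediately on a runner that is too small, instead of letting it run until the runner's own log writes start failing.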

Expected Duration

Permanent. We expect to build on average 2-3 times a month.

Repository Maintainers

  • @open-telemetry/network-maintainers

EDIT: added a link to the erroneous run.

yonch, Oct 15 '24 20:10

I've created a runner named otel-linux-latest-2-cores and granted access to the opentelemetry-network-build-tools repo. Give it a try and let me know if it's working out.

trask, Oct 15 '24 21:10

Great! I see it, testing now...

yonch, Oct 15 '24 22:10

This requires merging a PR so might take a bit.

yonch, Oct 15 '24 22:10

@trask it seems the 2 core / 75 GB SSD runner is too small; it still runs out of disk space. The runner might have a larger image or have swap space configured, because the build worked on a 50 GB GCP machine...

Is there an option to use a 4 core / 150 GB SSD machine?
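
One way to check how much of the advertised 75 GB is actually available once the runner image is in place would be to print disk usage at the start of the job. This is a rough sketch only: `disk_report` is a hypothetical helper name, it was not run as part of this issue, and it reports GiB (1024^3 bytes) rather than the GB figures on the size chart.

```python
import shutil

GIB = 1024 ** 3  # assumption: report in GiB, not the decimal GB used on the size chart


def disk_report(path: str = "/") -> dict:
    """Return total, used, and free space in GiB for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return {
        "total_gib": usage.total / GIB,
        "used_gib": usage.used / GIB,
        "free_gib": usage.free / GIB,
    }
```

Comparing `used_gib` on a fresh runner against the same number on the 50 GB GCP machine would show whether the preinstalled image accounts for the difference.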

yonch, Oct 22 '24 14:10

yes, I just created otel-linux-latest-4-cores and it should have the same permissions

trask, Oct 22 '24 14:10

@yonch let us know if everything is good, and if it's ok to close this issue, thanks!

trask, Oct 29 '24 20:10

Hi @trask ! I haven't been able to test this yet, but can see the runner.

Let's close this issue, and I'll reopen if there is a problem. Thank you for the help!

yonch, Oct 29 '24 21:10

Just an update that this worked! Thank you @trask. No further action required.

yonch, Nov 12 '24 15:11

Circling back to update that the build works and pushes to docker hub.

Thank you @trask and @tigrannajaryan !

yonch, Nov 26 '24 19:11