TEZ-4183: Failed event AM propagation logic for Tez Unordered fetcher

Open pgaref opened this issue 5 years ago • 0 comments

New event propagation logic for the Tez unordered fetcher. The AM is informed:

- Immediately, if maxTimeToWaitForReportMillis is 0
- When the elapsed time exceeds SHUFFLE_BATCH_WAIT ms (batch events)
- When more than THRESHOLD readErrors have occurred for a particular task_attempt, where THRESHOLD is maxFetchFailuresBeforeReporting (5) (batch events)

InputReadErrorEvent and HostFetchResult are extended to track diskRead errors.

The ReporterCallable thread now sends events to the AM either when it is signaled or when the time exceeds SHUFFLE_BATCH_WAIT.
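The three reporting conditions in the description can be illustrated with a small, standalone sketch. This is not the code from the patch: the class `ReportPolicy`, its method names, and the choice to reset the per-attempt counters on every report are assumptions made for illustration. Only the parameter names maxTimeToWaitForReportMillis, SHUFFLE_BATCH_WAIT, and maxFetchFailuresBeforeReporting come from the description above, and how they interact in the real ReporterCallable may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the "when to inform the AM" decision; not the actual Tez code. */
final class ReportPolicy {
  private final long maxTimeToWaitForReportMillis;   // 0 means: report immediately
  private final long shuffleBatchWaitMillis;         // stands in for SHUFFLE_BATCH_WAIT
  private final int maxFetchFailuresBeforeReporting; // THRESHOLD (5 in the description)
  private final Map<String, Integer> readErrorsPerAttempt = new HashMap<>();
  private long lastReportMillis;

  ReportPolicy(long maxTimeToWaitForReportMillis, long shuffleBatchWaitMillis,
               int maxFetchFailuresBeforeReporting, long nowMillis) {
    this.maxTimeToWaitForReportMillis = maxTimeToWaitForReportMillis;
    this.shuffleBatchWaitMillis = shuffleBatchWaitMillis;
    this.maxFetchFailuresBeforeReporting = maxFetchFailuresBeforeReporting;
    this.lastReportMillis = nowMillis;
  }

  /** Records one read error; returns true if the pending batch should be sent to the AM now. */
  boolean onReadError(String taskAttemptId, long nowMillis) {
    int errors = readErrorsPerAttempt.merge(taskAttemptId, 1, Integer::sum);
    if (maxTimeToWaitForReportMillis == 0) {
      return flush(nowMillis); // condition 1: no batching configured
    }
    if (nowMillis - lastReportMillis > shuffleBatchWaitMillis) {
      return flush(nowMillis); // condition 2: batch window exceeded
    }
    if (errors > maxFetchFailuresBeforeReporting) {
      return flush(nowMillis); // condition 3: too many errors for one task attempt
    }
    return false;              // keep batching
  }

  /** Assumption: sending a batch clears the counters and restarts the batch window. */
  private boolean flush(long nowMillis) {
    readErrorsPerAttempt.clear();
    lastReportMillis = nowMillis;
    return true;
  }
}
```

Passing the clock value in as `nowMillis` keeps the sketch deterministic. A real reporter thread would more likely block on a condition variable with a timeout, waking either on a signal or when the batch window expires.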

pgaref, May 18 '20 13:05