varnish-cache icon indicating copy to clipboard operation
varnish-cache copied to clipboard

v1d: Log client write error

Open dridi opened this issue 2 years ago • 8 comments

Some scenarios like hitting send_timeout because of a dripping write leaves no clue that an error occurred from the client request logs alone.

This introduces a client counterpart for:

FetchError backend write error: %d (%s) (%s)

Refs 49e44e390488c57c03d8111fcd43e245933bc151

dridi avatar Jan 18 '24 21:01 dridi

Bugwash:

  • add SLT_DeliveryError
  • use it for "backend write error"
  • introduce "client write error"
  • collect feedback from production

dridi avatar Jan 22 '24 14:01 dridi

bugwash:

  • The error is in the session VSL, but due to the impracticality of -g session, it is not particularly useful
  • SLT_Error is not a particularly good tag for this, because it is currently used for internal errors which can usually be addressed with VCL or changing parameters
  • we have FetchError, should add a DeliveryError tag and then use the former consistently for VFPs, and the latter for VDPs

nigoroll avatar Jan 22 '24 14:01 nigoroll

SLT_Error is not a particularly good tag for this, because it is currently used for internal errors which can usually be addressed with VCL or changing parameters

nitpick: this can be addressed with send_timeout for that case, but point taken, there is nothing we can do when the client disconnects.

dridi avatar Jan 22 '24 15:01 dridi

I'd like to cover this during bugwash today, before we send this to production.

dridi avatar Jan 29 '24 12:01 dridi

bionic failure,

cache/cache_expire.c: In function 'exp_inbox':
cache/cache_expire.c:282:61: error: format '%ld' expects argument of type 'long int', but argument 6 has type 'VCL_INT {aka long long int}' [-Werror=format=]
   VSLb(&ep->vsl, SLT_ExpKill, "EXP_Removed x=%ju t=%.0f h=%ld",
                                                           ~~^
                                                           %lld
cache/cache_expire.c:284:7:
       oc->hits);
       ~~~~~~~~                                               
cache/cache_expire.c: In function 'exp_expire':
cache/cache_expire.c:360:61: error: format '%ld' expects argument of type 'long int', but argument 6 has type 'VCL_INT {aka long long int}' [-Werror=format=]
   VSLb(&ep->vsl, SLT_ExpKill, "EXP_Expired x=%ju t=%.0f h=%ld",
                                                           ~~^
                                                           %lld
cache/cache_expire.c:362:7:
       oc->hits);
       ~~~~~~~~                                               
  CC       cache/varnishd-cache_fetch_proc.o
cache/cache_expire.c: At top level:
cc1: error: unrecognized command line option '-Wno-nullability-completeness' [-Werror]
cc1: error: unrecognized command line option '-Wno-missing-variable-declarations' [-Werror]
cc1: error: unrecognized command line option '-Wno-system-headers-Wno-thread-safety' [-Werror]
cc1: all warnings being treated as errors

daghf avatar Feb 02 '24 12:02 daghf

goes after march-15 release

bsdphk avatar Feb 19 '24 14:02 bsdphk

during bugwash, @bsdphk brought up the question if this could affect vsl clients, and while it does not look like it, git grep suggests that we should change some other places where FetchError is referred.

nigoroll avatar Feb 19 '24 14:02 nigoroll

after march15 release

bsdphk avatar Feb 19 '24 14:02 bsdphk