Vincent Garonne

Results: 26 comments by Vincent Garonne

We have another observation with two RemoteTransferManagers enabled on one instance at BNL. We observed some 5-10% failures in transfers. In the billing logs, the errors are related to post...

> A few questions:
> Which version of dCache is the WebDAV door running?

7.2.21 is deployed uniformly (doors, pools, central nodes).

> Do these transfer failures stop (i.e., all...

Hi,
```
l-AAXndfG1HlA--AAXndfG5Z_g A 0 0 2 LocationMgrTunnel Connected to webdav-dcdndoor04_httpsDomain
l-AAXndfG1HlA--AAXndfG5a-A A 0 0 2 LocationMgrTunnel Connected to webdav-dcdndoor03_httpsDomain
l-AAXndfG1HlA--AAXndfG5ZBA A 0 0 2 LocationMgrTunnel Connected to webdav2-dcdndoor04_httpsDomain
l-AAXndfG1HlA--AAXndfG4-pg...
```

Logs from the RemoteTransferManagers for one failing file: 0000D9D6E3BAF0F04C9C9DF3F9301DC42A0C
```
[root@dcdncore01 ~]# grep 0000D9D6E3BAF0F04C9C9DF3F9301DC42A0C /var/log/dcache/dcdncore01Domain.log
PNFS-ID: 0000D9D6E3BAF0F04C9C9DF3F9301DC42A0C
30 Aug 2022 10:35:28 (RemoteTransferManager) [door:WebDAV2-dcdndoor04@webdav2-dcdndoor04_httpsDomain:AAXndkqmlrA WebDAV2-dcdndoor04 RemoteTransferManager] PoolMgrSelectPoolMsg: PnfsId=0000D9D6E3BAF0F04C9C9DF3F9301DC42A0C;StorageInfo=size=0;new=true;stored=false;sClass=dunepro:DISKDATA;cClass=-;hsm=osm;accessLatency=NEARLINE;retentionPolicy=CUSTODIAL;path=/pnfs/sdcc.bnl.gov/data/dune/RSE/ftstest/dc4-vd-coldbox-bottom/1c/8f/dc4_np02bde_307100901_np02_bde_coldbox_run012352_0039_20211215T232400.hdf5;uid=-1;writeToken=6;gid=-1;StoreName=dunepro;xattr.xdg.origin.url=https://eospublic.cern.ch//eos/experiment/neutplatform/protodune/dune/dc4-vd-coldbox-bottom/1c/8f/dc4_np02bde_307100901_np02_bde_coldbox_run012352_0039_20211215T232400.hdf5;links=0000C336567E6BD741839C5CD1D8F7C336E6 dc4_np02bde_307100901_np02_bde_coldbox_run012352_0039_20211215T232400.hdf5;store=dunepro;group=DISKDATA;bfid=;;
30 Aug 2022...
```

> Could you check the domain log file (for domain webdav2-dcdndoor04_httpsDomain) to see whether the WebDAV door logged anything about this transfer ... or anything else (out of the ordinary)...

Yes, according to the billing DBs and timestamps, other successful transfers are ongoing on the same RTMs at the same time. Another observation: with enable.db=true for the RTMs, there are SQL errors...

> A more specific question: did any transfer (that used this WebDAV door) start or end while this transfer was ongoing; i.e., between 10:35:28 and 10:38:02

See plot for transfer...

+1M Finally a proper alternative to updating `updated_at` :)

With this recommended gplazma configuration from ATLAS:
```
[${host.name}Domain/gplazma]
gplazma.oidc.provider!atlas = https://atlas-auth.web.cern.ch/ -profile=wlcg -prefix=/pnfs/usatlas.bnl.gov/ -authz-id="uid:XXXX gid:XXXX username:XXXXX"
gplazma.oidc.audience-targets = https://wlcg.cern.ch/jwt/v1/any https://dcgftp.usatlas.bnl.gov https://dcgftp.usatlas.bnl.gov:2881 roots://dcgftp.usatlas.bnl.gov:1094 roots://dcgftp.usatlas.bnl.gov:1096 dcgftp.usatlas.bnl.gov https://dcdoor-tape.usatlas.bnl.gov dcdoor-tape.usatlas.bnl.gov
```
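When debugging a configuration like this, it can help to check which `aud` values a token actually carries and whether any of them appears in the audience-targets list. The following is only a rough sketch for offline inspection, not how gplazma itself evaluates tokens: it assumes exact string matching, it does not verify the signature, and the helper names (`jwt_claims`, `audience_matches`, `make_unsigned_token`) are made up for illustration.

```python
import base64
import json

# Audience targets copied from the gplazma.oidc.audience-targets line above.
AUDIENCE_TARGETS = {
    "https://wlcg.cern.ch/jwt/v1/any",
    "https://dcgftp.usatlas.bnl.gov",
    "https://dcgftp.usatlas.bnl.gov:2881",
    "roots://dcgftp.usatlas.bnl.gov:1094",
    "roots://dcgftp.usatlas.bnl.gov:1096",
    "dcgftp.usatlas.bnl.gov",
    "https://dcdoor-tape.usatlas.bnl.gov",
    "dcdoor-tape.usatlas.bnl.gov",
}


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def audience_matches(claims: dict, targets=AUDIENCE_TARGETS) -> bool:
    """Return True if any 'aud' value is listed in targets (exact match assumed)."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):  # 'aud' may be a single string or a list of strings
        aud = [aud]
    return any(value in targets for value in aud)


def make_unsigned_token(claims: dict) -> str:
    """Hypothetical helper: build an unsigned demo token for local testing only."""
    def encode(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{encode({'alg': 'none'})}.{encode(claims)}."


if __name__ == "__main__":
    demo = make_unsigned_token({"aud": "https://wlcg.cern.ch/jwt/v1/any"})
    print(audience_matches(jwt_claims(demo)))
```

With a real token, pass the token string to `jwt_claims` and inspect the returned `aud` value directly. How a token without any `aud` claim is treated is not covered here.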

Hi @kofemann, We'll inform you after the upgrade to version 9.2.* on the 22nd of January