janusgraph icon indicating copy to clipboard operation
janusgraph copied to clipboard

Using ConfiguredGraphFactory.drop failed when writingLogSetting

Open liu-chao-simon opened this issue 3 years ago • 3 comments

Discussed in https://github.com/JanusGraph/janusgraph/discussions/3105

Originally posted by liu-chao-simon June 26, 2022 I am using dynamic graphs features of Janusgraph 0.6.2, i can create graph successfully. But from the janusgraph server side, it always show exception log when using ConfiguredGraphFactory.drop to drop a graph.

I'm using CQL(a single node Cassandra) as storage backend, and here is the server configurations gremlin-server.yaml

host: 0.0.0.0
port: 8182
evaluationTimeout: 0
channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
graphManager: org.janusgraph.graphdb.management.JanusGraphManager
graphs:
  ConfigurationManagementGraph: conf\janusgraph-cql.properties
scriptEngines:
  gremlin-groovy:
    plugins:
      org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {}
      org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {}
      org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {}
      org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]}
      org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: []}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  # Older serialization versions for backwards compatibility:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1 }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { serializeResultToString: true }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics:
  consoleReporter: {enabled: true, interval: 180000}
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv}
  jmxReporter: {enabled: true}
  slf4jReporter: {enabled: true, interval: 180000}
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST}
  graphiteReporter: {enabled: false, interval: 180000}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536000
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
threadPoolWorker: 4
gremlinPool: 4

janusgraph-cql.properties

gremlin.graph=org.janusgraph.core.ConfiguredGraphFactory
graph.graphname=ConfigurationManagementGraph

storage.backend=cql
storage.hostname=192.168.1.100:9042

I can create graph successfully using scripts

map = new HashMap();
map.put("storage.backend", "cql");
map.put("storage.hostname", "192.168.1.100:9042");
ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));

g1 = ConfiguredGraphFactory.create("graph1");

But, exception always happen on server side when drop graph by ConfiguredGraphFactory.drop("graph1")

1354978 [pool-10-thread-1] INFO  org.janusgraph.diskstorage.log.kcvs.KCVSLog  - Loaded unidentified ReadMarker start time 2022-06-26T06:36:07.607Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@26790a9f
1357285 [pool-33-thread-1] INFO  org.janusgraph.core.ConfiguredGraphFactory$DropGraphOnEvictionTrigger  - Graph a has been removed from the graph cache on every JanusGraph node in the cluster.
1357285 [pool-33-thread-1] WARN  org.janusgraph.core.ConfiguredGraphFactory$DropGraphOnEvictionTrigger  - Attempting to drop the graph a.
**1358411 [pool-33-thread-1] ERROR org.janusgraph.diskstorage.log.kcvs.KCVSLog  - Reader thread pool for KCVSLog systemlog did not shut down in time - could not clean up or set read markers**
1358412 [pool-33-thread-1] INFO  org.janusgraph.diskstorage.util.BackendOperation  - Temporary exception during backend operation [writingLogSetting]. Attempting backoff retry.
org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
	at io.vavr.API$Match$Case0.apply(API.java:5135)
	at io.vavr.API$Match.of(API.java:5092)
	at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.lambda$static$0(CQLKeyColumnValueStore.java:120)
	at org.janusgraph.diskstorage.cql.function.mutate.AbstractCQLMutateManyUnloggedFunction.mutateMany(AbstractCQLMutateManyUnloggedFunction.java:64)
	at org.janusgraph.diskstorage.cql.CQLStoreManager.mutateMany(CQLStoreManager.java:302)
	at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.mutate(CQLKeyColumnValueStore.java:430)
	at org.janusgraph.diskstorage.log.kcvs.KCVSLog$3.call(KCVSLog.java:915)
	at org.janusgraph.diskstorage.log.kcvs.KCVSLog$3.call(KCVSLog.java:912)
	at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:106)
	at org.janusgraph.diskstorage.util.BackendOperation$1.call(BackendOperation.java:120)
	at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:66)
	at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:52)
	at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:117)
	at org.janusgraph.diskstorage.log.kcvs.KCVSLog.writeSetting(KCVSLog.java:912)
	at org.janusgraph.diskstorage.log.kcvs.KCVSLog.lambda$close$0(KCVSLog.java:346)
	at org.janusgraph.util.system.ExecuteUtil.executeWithCatching(ExecuteUtil.java:32)
	at org.janusgraph.diskstorage.log.kcvs.KCVSLog.close(KCVSLog.java:344)
	at org.janusgraph.util.system.ExecuteUtil.executeWithCatching(ExecuteUtil.java:32)
	at org.janusgraph.diskstorage.log.kcvs.KCVSLogManager.close(KCVSLogManager.java:251)
	at org.janusgraph.util.system.ExecuteUtil.executeWithCatching(ExecuteUtil.java:32)
	at org.janusgraph.diskstorage.Backend.close(Backend.java:594)
	at org.janusgraph.util.system.IOUtils.closeQuietly(IOUtils.java:63)
	at org.janusgraph.graphdb.database.StandardJanusGraph.closeInternal(StandardJanusGraph.java:286)
	at org.janusgraph.graphdb.database.StandardJanusGraph.close(StandardJanusGraph.java:244)
	at org.janusgraph.core.JanusGraphFactory.drop(JanusGraphFactory.java:224)
	at org.janusgraph.core.ConfiguredGraphFactory$DropGraphOnEvictionTrigger.call(ConfiguredGraphFactory.java:189)
	at org.janusgraph.core.ConfiguredGraphFactory$DropGraphOnEvictionTrigger.call(ConfiguredGraphFactory.java:169)
	at org.janusgraph.graphdb.database.management.ManagementLogger$EvictionTrigger.runTriggers(ManagementLogger.java:191)
	at org.janusgraph.graphdb.database.management.ManagementLogger$EvictionTrigger.receivedAcknowledgement(ManagementLogger.java:183)
	at org.janusgraph.graphdb.database.management.ManagementLogger.read(ManagementLogger.java:114)
	at org.janusgraph.diskstorage.log.util.ProcessMessageJob.run(ProcessMessageJob.java:46)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
	at java.util.concurrent.FutureTask.run(FutureTask.java)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.lang.InterruptedException
	at io.vavr.concurrent.FutureImpl$1.block(FutureImpl.java:238)
	at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
	at io.vavr.concurrent.FutureImpl._await(FutureImpl.java:202)
	at io.vavr.concurrent.FutureImpl.await(FutureImpl.java:168)
	at org.janusgraph.diskstorage.cql.function.mutate.CQLExecutorServiceMutateManyUnloggedFunction.mutate(CQLExecutorServiceMutateManyUnloggedFunction.java:59)
	at org.janusgraph.diskstorage.cql.function.mutate.AbstractCQLMutateManyUnloggedFunction.mutateMany(AbstractCQLMutateManyUnloggedFunction.java:61)
	... 35 more
Caused by: java.lang.InterruptedException
	... 41 more

Please note that ERROR log: 1358411 [pool-33-thread-1] ERROR org.janusgraph.diskstorage.log.kcvs.KCVSLog - Reader thread pool for KCVSLog systemlog did not shut down in time - could not clean up or set read markers I have traced this error into method KCVSLog.close(), here are the key lines

if (readExecutor!=null) {
            try {
                readExecutor.awaitTermination(1,TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                log.error("Could not terminate reader thread pool for KCVSLog {} due to interruption", name, e);
            }
            if (!readExecutor.isTerminated()) {
                readExecutor.shutdownNow();
                log.error("Reader thread pool for KCVSLog {} did not shut down in time - could not clean up or set read markers", name);
            } else {
                for (MessagePuller puller : msgPullers) {
                    puller.close();
                }
            }
        }

ExceptionWrapper exceptionWrapper = new ExceptionWrapper();
        ExecuteUtil.executeWithCatching(() -> {
            try{
                writeSetting(manager.senderId, MESSAGE_COUNTER_COLUMN, numMsgCounter.get());
            } catch (Throwable e){
                log.error("Could not persist message counter [{}] ; message counter [{}]", manager.senderId, numMsgCounter.get(), e);
                throw e;
            }
        }, exceptionWrapper);

I found that readExecutor.shutdownNow() is trying to terminate itself, this throws an InterruptionException which will cause the writeSetting failure. Seems this is a inevitable failure during dropping graph, or am i using the wrong configurations?

Hoping your answers~

liu-chao-simon avatar Jun 26 '22 07:06 liu-chao-simon

Not sure if this is related, but have you tried using a higher value (e.g. 30000) for evaluationTimeout? Right now you are using 0 which does not make much sense to me.

li-boxuan avatar Jul 08 '22 20:07 li-boxuan

@li-boxuan Per the Gremlin docs (CTFL+F for "evaluationTimeout")

The amount of time in milliseconds before a request evaluation and iteration of result times out. This feature can be turned off by setting the value to 0.

So @liu-chao-simon is disabling the timeout by setting it to 0. He needs to, however, also run :remote config timeout none in his gremlin console in order to disable the timeout on the client end as well. Need to be separately disabled on both.

baughmann avatar Nov 15 '22 21:11 baughmann