
Removing a node produces exceptions in the logs

Open activeeon-bot opened this issue 11 years ago • 0 comments

Original issue created by Youri Bonnaffe on 22 Aug 2014 at 09:29 AM - SCHEDULING-2102


To reproduce:

  • Add a local node source
  • Remove it

After the removal, a stack trace is printed in the Scheduler logs (see below). It should not be: this scenario is simple, and we should be able to shut down the node and its connections cleanly.

The spurious stack trace complicates troubleshooting and debugging.

[2014-08-22 09:24:42,190 INFO ] Adding a new node source TestL to the data base
[2014-08-22 09:24:42,195 INFO ] Creating a node source : TestL
[2014-08-22 09:24:42,216 INFO ] [TestL] Activating the policy Static Policy user access type [ALL], provider access type [ME]
[2014-08-22 09:24:42,217 INFO ] [TestL] Acquiring all nodes
[2014-08-22 09:24:42,222 INFO ] Node source TestL has been successfully created by "admin"
[2014-08-22 09:24:44,415 INFO ] rm is trying to connect
[2014-08-22 09:24:44,416 INFO ] User rm logged successfully
[2014-08-22 09:24:44,425 INFO ] "rm" connected from HalfBody_pa.stub.org.ow2.proactive.resourcemanager.nodesource.dataspace._StubDataSpaceNodeConfigurationAgent#configureNode_21103
[2014-08-22 09:24:45,034 INFO ] Looking up the node pnp://jily.activeeon.com:52632/local-TestL-6 with 30000 ms timeout
[2014-08-22 09:24:45,061 INFO ] The node pnp://jily.activeeon.com:52632/local-TestL-6 has been successfully looked up
[2014-08-22 09:24:45,061 INFO ] [TestL] new node available : pnp://jily.activeeon.com:52632/local-TestL-6
[2014-08-22 09:24:52,771 INFO ] admin is trying to connect
[2014-08-22 09:24:52,771 INFO ] User admin logged successfully
[2014-08-22 09:24:55,595 INFO ] "admin" requested removal of the TestL node source
[2014-08-22 09:24:55,600 INFO ] Removing node pnp://jily.activeeon.com:52632/local-TestL-6
[2014-08-22 09:24:55,601 INFO ] [TestL] removing node : pnp://jily.activeeon.com:52632/local-TestL-6
[2014-08-22 09:24:55,602 INFO ] Process associated to node local-TestL-6 destroyed
[2014-08-22 09:24:55,605 INFO ] [TestL] is shutting down by "admin"
[2014-08-22 09:24:55,605 INFO ] [TestL] Shutdown finalization
[2014-08-22 09:24:55,605 INFO ] Removing the node source TestL from the data base
[2014-08-22 09:24:55,618 INFO ] "admin" disconnected from ActiveObject_org.ow2.proactive.resourcemanager.nodesource.NodeSource_29370
[2014-08-22 09:24:55,618 INFO ] Node Source removed : TestL
[2014-08-22 09:24:55,631 INFO ] "admin" disconnected from ActiveObject_org.ow2.proactive.resourcemanager.nodesource.policy.StaticPolicy_18118
[2014-08-22 09:25:11,363 INFO ] cleaning session started, 2 existing session(s) 
[2014-08-22 09:25:11,363 INFO ] cleaning session ended, 0 session(s) removed
[2014-08-22 09:25:16,109 WARN ] main : unable to contact remote object [pnp://jily.activeeon.com:52632/HalfBody_pa.stub.org.ow2.proactive.resourcemanager.nodesource.dataspace._StubDataSpaceNodeConfigurationAgent%23configureNode_-5a6c45cb-147fc9a6513--7ffe--a072e474ce7b7a89--5a6c45cb-147fc9a6513--8000] when calling method receiveHeartbeat
org.objectweb.proactive.core.exceptions.IOException6: Failed to send PNP message to pnp://jily.activeeon.com:52632/HalfBody_pa.stub.org.ow2.proactive.resourcemanager.nodesource.dataspace._StubDataSpaceNodeConfigurationAgent%23configureNode_-5a6c45cb-147fc9a6513--7ffe--a072e474ce7b7a89--5a6c45cb-147fc9a6513--8000
        at org.objectweb.proactive.extensions.pnp.PNPROMessage.send(PNPROMessage.java:117)
        at org.objectweb.proactive.extensions.pnp.PNPRemoteObject.receiveMessage(PNPRemoteObject.java:82)
        at org.objectweb.proactive.core.remoteobject.RemoteObjectSet.receiveMessage(RemoteObjectSet.java:205)
        at org.objectweb.proactive.core.remoteobject.RemoteObjectAdapter.receiveMessage(RemoteObjectAdapter.java:151)
        at org.objectweb.proactive.core.remoteobject.SynchronousProxy.reify(SynchronousProxy.java:78)
        at pa.stub.org.objectweb.proactive.core.body._StubUniversalBody.receiveHeartbeat(_StubUniversalBody.java)
        at org.objectweb.proactive.core.body.UniversalBodyRemoteObjectAdapter.receiveHeartbeat(UniversalBodyRemoteObjectAdapter.java:132)
        at org.objectweb.proactive.api.PAActiveObject.pingActiveObject(PAActiveObject.java:1152)
        at org.ow2.proactive.resourcemanager.authentication.Client.isAlive(Client.java:200)
        at org.ow2.proactive.resourcemanager.utils.ClientPinger.ping(ClientPinger.java:106)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.objectweb.proactive.core.mop.MethodCall.execute(MethodCall.java:353)
        at org.objectweb.proactive.core.body.request.RequestImpl.serveInternal(RequestImpl.java:214)
        at org.objectweb.proactive.core.body.request.RequestImpl.serve(RequestImpl.java:160)
        at org.objectweb.proactive.core.body.BodyImpl$ActiveLocalBodyStrategy.serveInternal(BodyImpl.java:549)
        at org.objectweb.proactive.core.body.BodyImpl$ActiveLocalBodyStrategy.serve(BodyImpl.java:482)
        at org.objectweb.proactive.core.body.AbstractBody.serve(AbstractBody.java:426)
        at org.objectweb.proactive.Service.serve(Service.java:125)
        at org.ow2.proactive.resourcemanager.utils.ClientPinger.runActivity(ClientPinger.java:132)
        at org.objectweb.proactive.core.body.ActiveBody.run(ActiveBody.java:164)
        at java.lang.Thread.run(Unknown Source)
Caused by: org.objectweb.proactive.extensions.pnp.exception.PNPIOException: Failed to connect to jily.activeeon.com/192.168.1.19:52632
        at org.objectweb.proactive.extensions.pnp.PNPAgent$PNPClientChannel.<init>(PNPAgent.java:377)
        at org.objectweb.proactive.extensions.pnp.PNPAgent$PNPClientChannelCache.getChannel(PNPAgent.java:317)
        at org.objectweb.proactive.extensions.pnp.PNPAgent$PNPClientChannelCache.getChannel(PNPAgent.java:305)
        at org.objectweb.proactive.extensions.pnp.PNPAgent.sendMsg(PNPAgent.java:187)
        at org.objectweb.proactive.extensions.pnp.PNPAgent.sendMsg(PNPAgent.java:175)
        at org.objectweb.proactive.extensions.pnp.PNPROMessage.send(PNPROMessage.java:115)
        ... 23 more
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
        at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:400)
        at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:362)
        at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:284)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:102)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        ... 1 more
[2014-08-22 09:25:16,111 WARN ] Client "rm" is down.
[2014-08-22 09:25:16,118 INFO ] "rm" disconnected from HalfBody_pa.stub.org.ow2.proactive.resourcemanager.nodesource.dataspace._StubDataSpaceNodeConfigurationAgent#configureNode_21103
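
Reading the log above: the node process is destroyed at 09:24:55, the "rm" client still appears to be registered afterwards, and about 20 seconds later ClientPinger.ping fails with "Connection refused" before the client is declared down. The ordering can be sketched in plain Java as below. This is a hypothetical model, not ProActive code: `PingerSketch`, `register`, `deregister` and `pingAll` are invented names, and deregistering the client during removal is only one possible approach, not a confirmed fix.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical model of a client pinger; none of these names come from ProActive. */
public class PingerSketch {
    // Registered clients, keyed by name, each with a liveness probe.
    private final Map<String, BooleanSupplier> clients = new LinkedHashMap<>();
    // Warnings the pinger would write to the log.
    private final List<String> warnings = new ArrayList<>();

    public void register(String name, BooleanSupplier probe) {
        clients.put(name, probe);
    }

    // Assumed remedy: drop the client as part of node removal,
    // before its process is destroyed, so it is never pinged again.
    public void deregister(String name) {
        clients.remove(name);
    }

    // One pinger pass: probe every registered client and drop the dead ones.
    public void pingAll() {
        clients.entrySet().removeIf(entry -> {
            boolean alive = entry.getValue().getAsBoolean();
            if (!alive) {
                warnings.add("Client \"" + entry.getKey() + "\" is down.");
            }
            return !alive;
        });
    }

    public List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        // Ordering seen in the log: process destroyed, client still registered.
        PingerSketch observed = new PingerSketch();
        observed.register("rm", () -> false); // probe fails: connection refused
        observed.pingAll();
        System.out.println("without deregistration: " + observed.warnings());

        // Assumed alternative: deregister during removal, then ping.
        PingerSketch cleaned = new PingerSketch();
        cleaned.register("rm", () -> false);
        cleaned.deregister("rm");
        cleaned.pingAll();
        System.out.println("with deregistration: " + cleaned.warnings());
    }
}
```

In this model the warning only appears when the pinger reaches a client whose process is already gone. Whether the real Resource Manager can deregister the "rm" client at that point is not established by the log.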

activeeon-bot, Aug 22 '14 07:08