receiveQueueMessage freezes
Hello, I made an agent using PHP, that is pumping the messages from a queue, but I get the receiveQueueMessage() freezing after some time (maybe days). It's a PHP script that is running in loop, with the receiveQueueMessage() in peekLock and a timeout of 5 seconds, the script sleeps for another 10 seconds each loop run.
I've tried to call the ServicesBuilder::getInstance()->createServiceBusService(...) in 3 different ways: -one single call at the start of the script; -one call each 3 minutes; -one call before each call to receiveQueueMessage().
In all the 3 ways I got my script freezing in the receiveQueueMessage() call. The loop is clean, all the vars are set to null each loop run, the memory does not grow, so it does not look to be a memory leak...
Any thought about that?
Thanks
Below is the code that deals with the ServiceBus, it runs as a daemon in Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-29-generic x86_64)
<?php
require_once('vendor/autoload.php');
use WindowsAzure\Common\ServicesBuilder;
use WindowsAzure\Common\ServiceException;
use WindowsAzure\ServiceBus\Models\ReceiveMessageOptions;
use WindowsAzure\ServiceBus\Models\ListQueuesOptions;
use WindowsAzure\ServiceBus\Models\ListQueuesResult;
use WindowsAzure\Queue\Models;
use Guzzle\Http\Client;
$connectionString = '.............';
$queueName = "myqueue";
$pathLog = './log/';
const printInfo = false;
const printErrors = false;
const mainLoopSleepSecs = 10;
const receiveQueueMessageTimeout = 5;
$isRunning = true;
$serviceBusRestProxy = null;
logMessage( 'START' );
function logMessage( $message )
{
global $pathLog;
try
{
$doLog = true;
if ( ! file_exists( $pathLog ) )
{
$doLog = mkdir( $pathLog, 0700 );
}
if ( $doLog )
{
file_put_contents( $pathLog.'test_'.date("Y-m-d").'.log', date("[ d/m/Y - H:i:s ] ").$message."\n", FILE_APPEND );
}
}
catch (Exception $e)
{
if ( printError ) echo 'Error writing the log file!!! - '.$e->getMessage()."\n";
}
}
// object vars, for cleanup
$options = null;
$message = null;
$bodyArray = null;
$body = null;
// MAIN LOOP
while( $isRunning )
{
$done = false;
// OBS.: in early stages it was done only a single time; in an another test we reconnected each time interval of 3 minutes, but now it reconnects for each receiveQueueMessage() call
// ### reconnect to ServiceBus proxy ###
if ( ! $serviceBusRestProxy )
{
$numTries = 0;
while ( ! $serviceBusRestProxy )
{
try
{
//if ( printInfo ) echo date("[ d/m/Y - H:i:s ] ")."connecting to service bus...";
//logMessage( 'connecting to service bus' );
$serviceBusRestProxy = ServicesBuilder::getInstance()->createServiceBusService($connectionString);
//if ( printInfo ) echo " - connection done!\n";
//logMessage( 'connection to service bus DONE!' );
}
catch (Exception $ex)
{
$numTries++;
if ( printErrors ) echo "createServiceBusService Catch: ".$ex->getMessage()."\n";
logMessage( "createServiceBusService Catch: ".$ex->getMessage() );
if ( $numTries > 10 )
{
exit( 1 );
}
sleep( 10 );
}
}
}
// ### reads the message queue and process it ###
$shortDelay = false;
// deal with the message, short delay will be set to true, so the delay at the end of the loop will be only 500ms
try
{
if ( printInfo ) echo date("[ d/m/Y - H:i:s ] ")."I am alive...";
// ### reads the message queue ###
$options = new ReceiveMessageOptions();
$options->setPeekLock();
$options->setTimeout( receiveQueueMessageTimeout );
logMessage( 'receiveQueueMessage Starting' ); // <<<<<<<<<<< when FREEZE, the program writes this line...
$message = $serviceBusRestProxy->receiveQueueMessage($queueName, $options);
logMessage( 'receiveQueueMessage Finished' ); // <<<<<<<<<<< ... but does not write this one.
$sendMessage = false;
if ( ! $message )
{
if ( printInfo ) echo " - Got Null Message\n";
}
else
{
if ( printInfo ) echo "\n";
$body = $message->getBody();
$sendMessage = true;
}
// ### process the message ###
if ( $sendMessage )
{
if ( !$body )
{
throw new Exception( 'Got an invalid body from the Queue!' );
}
if ( printInfo ) print_r( $body )."\n";
// the body is a json
$bodyArray = json_decode( $body );
if ( json_last_error()!==JSON_ERROR_NONE )
{
throw new Exception( 'Body is in an invalid json format!' );
}
if ( printInfo ) echo "Received from Queue: ".$body."\n";
// code to process the message
{
..............
$done = true;
}
}
}
catch (Exception $ex)
{
if ( printErrors ) echo $ex->getMessage()."\n";
logMessage( "Main Catch: ".$ex->getMessage() );
}
try
{
if ( $message )
{
if ( $done )
{
$serviceBusRestProxy->deleteMessage($message);
$shortDelay = true;
}
else
{
$serviceBusRestProxy->unlockMessage($message);
}
}
}
catch (Exception $ex)
{
if ( printErrors ) echo $ex->getMessage()."\n";
logMessage( "Releasing Message Catch: ".$ex->getMessage() );
exit( 2 );
}
// clean up
$options = null;
$message = null;
$bodyArray = null;
$body = null;
$serviceBusRestProxy = null; // it will force the loop to get again the serviceBusRestProxy
if ( printInfo ) echo "End of cicle....\n";
if ( $isRunning )
{
if ( $shortDelay )
{
// 500 milliseconds = 500000 microseconds
usleep( 500000 );
}
else
{
// X seconds
sleep( mainLoopSleepSecs );
}
}
}
logMessage( 'STOP' );
I am on vacation and will come back in the second week of August. In the meantime, you can add log statements to the source code for receiveQueueMessage() and find out exactly what it is doing when freezing.
Ok Yaqi Yang, thank you for the feedback. I did some changes to make sure that the deleteMessage/unlockMessage will be called and its running now (about 2 days). If I get any freeze I'll put the logs. Regards!
I am closing this since nothing more has been reported recently. Please re-open it if you see more problems.
Yaqi
Hi Yaqi, I've left the program running since my last post, but it froze this weenkend, about 9 days running :( Usually the time to freeze take some days, I'm not sure if it's due to any network instability... Can you check the code above and test it?
Thanks, Gilberto
PS.: On my tests I push blocks of 5k messages (that are a small json with a GUID, a number and another small string), wait it to process all (the "processing" is a curl post to another process on the same machine - 127.0.0.1 - that just check if a file exists and return an error code).
Can you add some message logging in $serviceBusRestProxy->receiveMessage()? That will give you more information on where the freezing occurred.
Specifically, the code is in ServiceBus\ServiceBusRestProxy.php, function receiveMessage($path, $receiveMessageOptions = null) (Line 201)
Sure, I'll look for that tomorrow and let the process running, the result may take a few days. There is any specific point that you want to check with the logs?
Regards, Gilberto
If it is freezes again, I want to know exactly where it got frozen in the SDK source code.
Hi yaqi, I've put 4 log lines at these points in the receiveMessage():
...
logMessage('receiveMessage > setup http OK');
$response = $this->sendContext($httpCallContext);
logMessage('receiveMessage > sendContext OK');
if ($response->getStatus() === Resources::STATUS_NO_CONTENT) {
$brokeredMessage = null;
} else {
$responseHeaders = $response->getHeader();
$brokerProperties = new BrokerProperties();
if (array_key_exists('brokerproperties', $responseHeaders)) {
$brokerProperties = BrokerProperties::create(
$responseHeaders['brokerproperties']
);
}
if (array_key_exists('location', $responseHeaders)) {
$brokerProperties->setLockLocation($responseHeaders['location']);
}
$brokeredMessage = new BrokeredMessage();
$brokeredMessage->setBrokerProperties($brokerProperties);
if (array_key_exists(Resources::CONTENT_TYPE, $responseHeaders)) {
$brokeredMessage->setContentType(
$responseHeaders[Resources::CONTENT_TYPE]
);
}
if (array_key_exists('Date', $responseHeaders)) {
$brokeredMessage->setDate($responseHeaders['Date']);
}
$brokeredMessage->setBody($response->getBody());
logMessage('receiveMessage > before foreach loop');
foreach (array_keys($responseHeaders) as $headerKey) {
$value = $responseHeaders[$headerKey];
$decodedValue = json_decode($value);
if (is_scalar($decodedValue)) {
$brokeredMessage->setProperty(
$headerKey,
$decodedValue
);
}
}
logMessage('receiveMessage > foreach loop OK');
...
I've not added more log entries because it will create huge log files. Let's wait for the results...
Cheers, Gilberto
The problem may be here (around line 212),
$timeout = $receiveMessageOptions->getTimeout();
if (!is_null($timeout)) {
$httpCallContext->addQueryParameter('timeout', $timeout);
}
If freezing happens again, log the value of $timeout. If it is null, the call might hang in some cases.
Ok, I've added a log entry with the $timeout value. Let's wait for the results.
Hi Yaqi,
I got the freeze early now. It's for sure in the sendContext() function.
Here is my last log lines:
[ 16/08/2016 - 18:10:28 ] receiveQueueMessage Starting
[ 16/08/2016 - 18:10:28 ] receiveMessage > timeout value = 5
[ 16/08/2016 - 18:10:28 ] receiveMessage > setup http OK
And that's the modification done to the ServiceBusRestProxy.php:
logMessage('receiveMessage > setup http OK');
$response = $this->sendContext($httpCallContext);
logMessage('receiveMessage > sendContext OK'); // <<<<<< we have not reached this line
There is any other value to log?
sendContext() is where the request is sent to the server. You can remove other message logging, but add logging in sendContext(). I will try to reproduce it also at the same time using your code.
I got something strange... I've added a log to vendor/microsoft/windowsazure/WindowsAzure/Common/Internal/RestProxy.php around the line 142, on the "protected function sendContext($context)":
logMessage( 'RestProxy->sendContext > setup OK' );
$channel->send($this->_filters, $url);
logMessage( 'RestProxy->sendContext > send done!' );
But I got this result with the log:
[ 17/08/2016 - 18:27:55 ] RestProxy->sendContext > setup OK
[ 17/08/2016 - 18:27:55 ] RestProxy->sendContext > send done!
[ 17/08/2016 - 18:28:00 ] RestProxy->sendContext > send done!
I don't understand why I got the two "send done!" with the last one 5 seconds later...
The log function was added to the final of this same file:
function logMessage( $message )
{
$pathLog='/home/test/actionjob/log/';
try
{
$doLog = true;
if ( ! file_exists( $pathLog ) )
{
$doLog = mkdir( $pathLog, 0700 );
}
if ( $doLog )
{
file_put_contents( $pathLog.'test_'.date("Y-m-d").'.log', date("[ d/m/Y - H:i:s ] ").$message."\n", FILE_APPEND );
}
}
catch (Exception $e)
{
echo 'Error writing the log file!!! - '.$e->getMessage()."\n";
}
}
it could be a problem by file_put_contents(). Since it opens and closes a file each time you write to it, it may lose some log entries. You can try to have the file open all the time and keep writing to it.
Sorry, I got the problem... It was a threaded 2 calls to sendContext that blind me. I didn't see it at start because I'd another log entry for the URI value at ServiceRestProxy::sendContext():
Here is the full log:
[ 17/08/2016 - 18:42:42 ] ServiceRestProxy->sendContext > used URI: https://nnnnn.servicebus.windows.net/
[ 17/08/2016 - 18:42:42 ] RestProxy->sendContext > setup OK
[ 17/08/2016 - 18:42:42 ] ServiceRestProxy->sendContext > used URI: https://nnnnn-sb.accesscontrol.windows.net/WRAPv0.9
[ 17/08/2016 - 18:42:42 ] RestProxy->sendContext > setup OK
[ 17/08/2016 - 18:42:42 ] RestProxy->sendContext > send done!
[ 17/08/2016 - 18:42:47 ] RestProxy->sendContext > send done!
Well, I'll let the program running with this log entries to filter if the problem is before or inside the $channel->send($this->_filters, $url); call.
I'll return with the results :)
Hello Yaqi,
Looks that the problem really is with the $channel->send($this->_filters, $url) call. There is any parameter that you wish me to add to the log, before this call to the send?
This is the final output where the "[ 18/08/2016 - 05:44:53 ] RestProxy->sendContext > https://nnnnn.servicebus.windows.net/myqueue/messages/head?timeout=5 > setup OK" have not returned its "> send done!":
[ 18/08/2016 - 05:44:53 ] receiveQueueMessage Starting
[ 18/08/2016 - 05:44:53 ] ServiceRestProxy->sendContext > used URI: https://nnnnn.servicebus.windows.net/
[ 18/08/2016 - 05:44:53 ] RestProxy->sendContext > https://nnnnn.servicebus.windows.net/myqueue/messages/head?timeout=5 > setup OK
[ 18/08/2016 - 05:44:53 ] ServiceRestProxy->sendContext > used URI: https://nnnnn-sb.accesscontrol.windows.net/WRAPv0.9
[ 18/08/2016 - 05:44:53 ] RestProxy->sendContext > https://nnnnn-sb.accesscontrol.windows.net/WRAPv0.9 > setup OK
[ 18/08/2016 - 05:44:53 ] RestProxy->sendContext > https://nnnnn-sb.accesscontrol.windows.net/WRAPv0.9 > send done!
Then you should continue to track down the issue. $channel->send() will be in Common\Internal\Http\HttpClient.php, line 240. You can add several log entries there, and then line 264 , $this->_response = $this->_request->send(); will go to vendor\pear\http_request2\HTTP\Request2.php line 915.
The key is to locate the code where the freezing occurs. Then we can try to fix it.
But what about the parameters that we set in the sendContext function, for the $channel object? There is any that matter to check for this freeze? I believe that it occurs due to some network instability... can you test it there?
it is not supposed to freeze, no matter what. $channel is just an object of Common\Internal\Http\HttpClient.php. You can dump the value of interesting variables you want to the log.
I am running your code now and have not experienced any problems yet. My OS is Windows 10 and PHP is 5.5.33.
I'm running it on an Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-29-generic x86_64), PHP 5.5.9-1ubuntu4.16 (cli). Try to simulate network failures. I believe that it could be something like that, due to the non constant time for the freeze (sometimes it take 9 days, sometimes 2 days...).
My process is still running all right. You can also run this process on a different machine ( a different OS) and see it freezes.
I've tried it before on a Windows Server 2012 R2 - PHP 5.4.34 (cli) (built: Oct 15 2014 21:58:00), but got the same freezes :( The freeze time can vary from 1 day to more than a week. Let it running...
I have been running it for several days and no freezing yet. But it did have an exception of "Malformed Reponse" once. The process was stopped and but not frozen.
I've got some "Malformed Reponse" as well, usually it brings an empty response from the queue. When I get a freeze I'll let you know. Have you tried to simulate a network instability? I'm unable to do that with my production machine test because its access is only thru SSH and it's not an virtual machine, so if I break the network it'll be a pain to recover it...
BTW, what do you mean by "The process was stopped and but not frozen." ? Do the loop asking for messages stills running?
The whole PHP process was kicked out
Strange... the script should catch it. Do you have any log entry in the log file for this error?
It was caught by this block. That is why the process exited
catch (Exception $ex)
{
if ( printErrors ) echo $ex->getMessage()."\n";
logMessage( "Releasing Message Catch: ".$ex->getMessage() );
exit( 2 );
}