Monday, December 5, 2011

How to use curl to find pending and blocking jobs in the replication queue in CQ5.4 / WEM

Use case: You want to write a monitoring script that finds all pending and blocking jobs in the replication queue and takes some action.

Solution: Use the following curl commands (change the server name, port, user ID, and password to match your instance).

To find all pending Jobs
curl -s -u admin:admin "http://localhost:4504/bin/querybuilder.json?path=/var/eventing/jobs&type=slingevent:Job&p.limit=-1&fulltext=/com/day/cq/replication/job&fulltext.relPath=@slingevent:topic&property.and=true&property=slingevent:finished&property.operation=not&orderby=slingevent:created&orderby.sort=asc" | tr ",[" "\n" | grep path | awk -F \" '{print $4 "\n"}'

To find all Blocking Jobs
curl -s -u admin:admin "http://localhost:4504/bin/querybuilder.json?path=/var/eventing/jobs/anon&type=slingevent:Job&rangeproperty.property=event.job.retrycount&rangeproperty.lowerBound=1" | tr ",[" "\n" | grep path | awk -F \" '{print $4 "\n"}'
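
For a monitoring script, the one-liner above can be wrapped in a few small functions. The following is only a minimal sketch, not a tested production script: the variable names CQ_HOST and CQ_CRED, the function names, and the --check flag are all made up for illustration, and the parsing assumes the QueryBuilder JSON lists each hit with a "path" key as the commands above expect.

```shell
#!/bin/sh
# Hypothetical settings; change them for your environment.
CQ_HOST="${CQ_HOST:-http://localhost:4504}"
CQ_CRED="${CQ_CRED:-admin:admin}"

# Turn QueryBuilder JSON on stdin into one job path per line.
# Same tr/grep/awk idea as the commands above, without the extra blank lines.
extract_paths() {
    tr ',[' '\n\n' | grep '"path"' | awk -F '"' '{print $4}'
}

# Count the non-empty lines on stdin (prints 0 for empty input).
count_lines() {
    awk 'NF { n++ } END { print n + 0 }'
}

# Run the blocking-jobs query from above and print the job paths.
blocking_jobs() {
    curl -s -u "$CQ_CRED" "$CQ_HOST/bin/querybuilder.json?path=/var/eventing/jobs/anon&type=slingevent:Job&rangeproperty.property=event.job.retrycount&rangeproperty.lowerBound=1" | extract_paths
}

# Only contact the server when asked to, e.g. "sh monitor.sh --check".
if [ "${1:-}" = "--check" ]; then
    n=$(blocking_jobs | count_lines)
    if [ "$n" -gt 0 ]; then
        echo "WARNING: $n blocking replication job(s)"
        exit 1
    fi
    echo "OK: no blocking replication jobs"
fi
```

The non-zero exit code on a warning is there so that cron or a monitoring tool can react to it; what counts as "too many" jobs is up to you.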

Once you have identified the blocking jobs, you can go to CRX and remove the blocking entry (first make sure that the entry is actually causing the problem).

You can also use the replication clean-up script (a custom script I wrote to remove one entry) to remove a single entry from the queue and then, if necessary, activate the affected content again.
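
If you would rather not open CRX by hand, removing an entry could also be done over HTTP. This is a hedged sketch, not the clean-up script itself: it assumes the default Sling POST servlet accepts ":operation=delete" for the job node and that your user may delete it. The names CQ_HOST, CQ_CRED, DRY_RUN and remove_job are hypothetical. It defaults to a dry run, because deleting the wrong node can lose replication events.

```shell
#!/bin/sh
# Hypothetical settings; change them for your environment.
CQ_HOST="${CQ_HOST:-http://localhost:4504}"
CQ_CRED="${CQ_CRED:-admin:admin}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print what would be deleted

# Delete a single job node, given its repository path.
remove_job() {
    job_path="$1"
    case "$job_path" in
        *..*)
            echo "refusing path containing '..': $job_path" >&2
            return 1 ;;
        /var/eventing/jobs/*)
            ;;  # looks like a job node, continue
        *)
            echo "refusing to delete outside /var/eventing/jobs: $job_path" >&2
            return 1 ;;
    esac
    if [ "$DRY_RUN" = "1" ]; then
        echo "would delete $CQ_HOST$job_path"
        return 0
    fi
    # Assumption: the Sling POST servlet handles the delete operation here.
    # Prints the HTTP status code of the delete request.
    curl -s -o /dev/null -w '%{http_code}\n' -u "$CQ_CRED" \
        -F ':operation=delete' "$CQ_HOST$job_path"
}
```

Run it first with the default DRY_RUN=1 and check the printed path against what you see in CRX; set DRY_RUN=0 only once you are sure the entry is the one causing the problem.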

A replication queue can become blocked because:

1) There is a problem with Sling eventing and the queue is not being processed (in that case, restart the Sling Event Support bundle from the Felix console).

2) There is a blocking job in the queue (find the blocking entry using the curl command above and remove it).

3) There is a problem with the publish server (a 503, an OOM, etc.; in that case, restarting the publish server should resolve the issue).
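
For the third cause, a quick way to rule the publish server in or out is to look at the HTTP status it returns. Here is a small sketch under stated assumptions: PUBLISH_URL (including port 4503), the function names, and the --check flag are hypothetical, and the mapping of status codes to verdicts is just one reasonable choice.

```shell
#!/bin/sh
# Hypothetical publish URL; change it for your environment.
PUBLISH_URL="${PUBLISH_URL:-http://localhost:4503/}"

# Map an HTTP status code to a short verdict.
classify_status() {
    case "$1" in
        2??|3??) echo "up" ;;
        503)     echo "unavailable" ;;
        000)     echo "unreachable" ;;   # curl reports 000 when it gets no response
        *)       echo "error" ;;
    esac
}

# Ask the publish server for a page and print only the status code.
publish_status() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$PUBLISH_URL"
}

# Only contact the server when asked to, e.g. "sh publish.sh --check".
if [ "${1:-}" = "--check" ]; then
    code=$(publish_status)
    echo "publish: $code ($(classify_status "$code"))"
fi
```

An "unavailable" or "unreachable" verdict points at the publish server rather than at a single bad job in the queue.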

11 comments:

  1. How often do you run these commands for monitoring? every minute, 15 seconds, once in a while... only when there is a problem?

    Replies
    1. You can run this command when there is a problem. Also, in CQ 5.5 you can use the JMX console to find this information.

    2. Hi Yogesh,

      Can you provide details on how to analyse blocking jobs in JMX, or point to any supporting documentation?

  2. Hi

    Can you please tell me how exactly we look for blocking jobs? I mean, what does the query actually do? I am facing a similar issue, but this query isn't returning anything, and we are sure that there is a blocking event causing issues in our replication queues.

    Thanks
    Dipti

    Replies
    1. Dipti,

      You should be able to see that in the replication queue. Go to Tools -> Replication Agents for that. If that does not help, modify the query to look under /var/eventing/jobs rather than /var/eventing/jobs/anon.

      Yogesh

  3. Where can I get the script to clean one entry from a replication queue? Also, is there any feature for priority, so that a priority item goes first? If I get one more high-priority page that I want to activate, I would like to inject it at the front of the queue, so that it goes first and the other items in the queue are processed afterwards. I am not sure whether that is possible. If it is, please let me know how I can do it.

    Replies
    1. Harry,

      Replication jobs are always processed in FIFO order, for obvious reasons. So if you have one bad replication job, it will block all the others. Please check http://www.wemblog.com/2012/07/how-to-clear-replication-queue-in-cq.html for an example of how to clear the replication queue.

      Yogesh

  4. Hi Yogesh,
    Mahesh here. I would like a script that reports pending items in the replication queue and also lets me know if replication is disabled. The version is AEM CQ5.6.1.

    Replies
    1. You can simply curl /etc/replication/agents.author/publish/jcr%3Acontent.json | jsawk -a 'return this.enabled'

      More info about jsawk

      https://github.com/micha/jsawk

  5. Is there any way I can use curl to get the number of queued jobs shown on the /system/console/slingevent page?

    Replies
    1. For that you can curl the page and use sed or any other tool to parse the info. I would suggest using the query builder as mentioned above.
