Friday, May 10, 2013

How to Perform System Clean Up in Adobe CQ / AEM (CQ5.5)

Use Case:

A CQ system grows over time as data is added, modified and removed. CQ follows an append-only model for the datastore, so data is never deleted from the datastore even when it is deleted from the console. Over time we also accumulate a lot of unnecessary packages from deployments and migrations. On top of that, adding many DAM assets creates a lot of workflow data that is no longer required.

As a result, disk usage increases. If you plan to run many instances on the same hardware (especially dev instances), it makes sense to reduce the size of each instance from time to time.

Solution: 

You can use the following script to clean up your data from time to time.

Prerequisite:

Get workflow purge script from here

Step 1:

Create a file with information about your instances (in this example it is named host_list.txt)

# File is used to feed the clean up package script
#FORMAT HOST:PORT
<YOUR SERVER>:<PORT>
#END
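
The script below reads this file with cat and grep -v '#'. As a slightly stricter alternative, here is a minimal sketch of a reader that also skips blank lines, strips stray whitespace and carriage returns, and drops entries that do not look like HOST:PORT. The function name read_hosts is hypothetical and not part of the original script.

```shell
# Print one HOST:PORT entry per line from a host list file.
# Comment lines (starting with '#'), blank lines and entries that
# do not match HOST:PORT with a numeric port are silently dropped.
read_hosts() {
  # $1: path to the host list file
  tr -d '\r' < "$1" \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | grep -v -e '^#' -e '^$' \
    | grep -E '^[^:[:space:]]+:[0-9]+$'
}
```

It could replace the command substitution in the loops, for example: for MY_HOST in $(read_hosts "$MY_HOST_LIST"); do ... done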

Step 2:

Actual Script

#!/bin/bash
#
# Description:
#      Clean Master author Only
#      Clean Old Packages
#      Clean DataStore GC


PURGE_WORK_FLOWS_FILE="purge-workflows-2.zip"
CURL_USER='admin:my_super_secret'
IS_PURGE_PAK_FOUND=NO
MY_HOST_LIST=host_list.txt
# Name of package group that you want to clear
PACKAGE_GROUP="<MY PACKAGE GROUP>"


if [ ! -f "${MY_HOST_LIST}" ]; then
  echo "Error cannot find host list file: ${MY_HOST_LIST}"
  echo "Exiting ..."
  exit 1;
fi

function run_purge_job()
{
MY_HOST="<YOUR HOST NAME>"
IS_PURGE_PAK_FOUND=$(curl -su "${CURL_USER}" "http://${MY_HOST}:4502/crx/packmgr/service.jsp?cmd=ls" | grep "name" | grep "purge-workflows-2" | tr -d ' \t\n\r\f')

if [ -z "${IS_PURGE_PAK_FOUND}" ]; then
  IS_PURGE_PAK_FOUND=NO
else
  IS_PURGE_PAK_FOUND=YES
fi

if [ "$IS_PURGE_PAK_FOUND" = "NO" ] && [ -f "$PURGE_WORK_FLOWS_FILE" ]; then
   # Purge package is not installed yet: upload and install it
   MY_PAK_NAME=$(basename "$PURGE_WORK_FLOWS_FILE" .zip)
   MY_STATUS=$(curl -su "${CURL_USER}" -f -F "install=true" -F "name=$MY_PAK_NAME" -F "file=@$PURGE_WORK_FLOWS_FILE" "http://${MY_HOST}:4502/crx/packmgr/service.jsp" | grep 'code="200"' | tr -d ' \t\n\r\f')

   if [ -z "${MY_STATUS}" ]; then
     echo "Error uploading $PURGE_WORK_FLOWS_FILE exiting..."
     exit 1
   fi
fi

if [ "${IS_PURGE_PAK_FOUND}" = "YES" ]; then
   curl -su "${CURL_USER}"  -X POST --data "status=COMPLETED&runpurge=1&Start=Run"  http://${MY_HOST}:4502/apps/workflow-purge/purge.html > /dev/null 2>&1
    sleep 10
   curl -su "${CURL_USER}"  -X POST --data "status=ABORTED&runpurge=1&Start=Run"  http://${MY_HOST}:4502/apps/workflow-purge/purge.html > /dev/null 2>&1
fi
}

function clean_old()
{
for MY_HOST in $(cat $MY_HOST_LIST|grep -v '#')
do
# Skip this host if it is unreachable or has no package in the group
IS_INSTANCE_UP=$(curl --connect-timeout 20 -su "${CURL_USER}" -X POST "http://${MY_HOST}/crx/packmgr/service.jsp?cmd=ls" | grep "name" | grep -i "${PACKAGE_GROUP}" | tr -d ' \t\n\r\f')

if [ -z "${IS_INSTANCE_UP}" ]; then
   continue
fi

# You can have multiple package here
# Or you can use Commands from here
echo "deleting package group"
curl -su "${CURL_USER}" -F ":operation=delete" "http://${MY_HOST}/etc/packages/${PACKAGE_GROUP}" > /dev/null 2>&1
sleep 10
done
}

function clean_datastore_gc()
{
for MY_HOST in $(cat $MY_HOST_LIST|grep -v '#')
do


IS_INSTANCE_UP=$(curl --connect-timeout 20 -su "${CURL_USER}" -Is "http://${MY_HOST}/crx/packmgr/index.jsp"  | grep HTTP | cut -d ' ' -f2)

# Skip hosts that do not answer with HTTP 200
if [ "${IS_INSTANCE_UP}" != "200" ]; then
   continue
fi
echo "running datastore gc"
   curl -su  "${CURL_USER}" -X POST --data "delete=true&delay=2" http://${MY_HOST}/system/console/jmx/com.adobe.granite%3Atype%3DRepository/op/runDataStoreGarbageCollection/java.lang.Boolean > /dev/null 2>&1
done
}

case "$1" in
  'purge')
   run_purge_job
;;
  'clean_paks')
   clean_old
;;
  'clean_ds')
   clean_datastore_gc
;;
*)
  echo "Usage: $0 {purge|clean_paks|clean_ds}"
  exit 1
  ;;
esac
exit 0
#
#end
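
The datastore GC function decides whether a host is reachable by pulling the status code out of the response headers with grep and cut. A minimal sketch of the same check as a reusable function follows, using awk instead of grep and cut. The names http_status_from_headers and is_instance_up are hypothetical, and the sketch assumes the same CURL_USER variable and the same /crx/packmgr/index.jsp URL as the script above.

```shell
# Read HTTP response headers on stdin and print the status code
# from the first status line (for example "HTTP/1.1 200 OK" gives 200).
http_status_from_headers() {
  awk '/^HTTP\// { print $2; exit }'
}

# Succeed only when the instance answers with HTTP 200.
# $1: HOST:PORT entry, as listed in host_list.txt
is_instance_up() {
  status=$(curl --connect-timeout 20 -su "${CURL_USER}" -I \
    "http://$1/crx/packmgr/index.jsp" | http_status_from_headers)
  [ "${status}" = "200" ]
}
```

Inside a loop over hosts this could be used as: is_instance_up "${MY_HOST}" || continue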


Manual Cleaning:

CQ5.5 and before:
1) Download the workflow purge script from here
2) Install the purge script using Package Manager
3) Log in as admin or as a user with administrative access
4) Go to http://${MY_HOST}:4502/apps/workflow-purge/purge.html
5) Select "completed" from the drop-down and run the workflow purge.
6) You might have to run it multiple times to make sure that everything is deleted.
7) Using CRXDE Lite or CRX Explorer with an admin session, go to /etc/packages/<Your package group>
8) Delete the packages you want to remove
9) After deleting, click Save All
10) To run datastore GC, follow http://www.wemblog.com/2012/03/how-to-run-online-backup-using-curl-in.html or http://www.cqtutorial.com/courses/cq-admin/cq-admin-lessons/cq-maintenance/cq-datastore-gc
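
Step 6 above notes that one purge pass may not remove everything. Here is a minimal sketch of repeating the purge from the command line, assuming the purge page accepts the same POST parameters that the script sends. The function names purge_status and purge_repeatedly, the DRY_RUN switch and the default of 3 passes are all hypothetical.

```shell
# Send (or, with DRY_RUN=1, only print) the purge request for one status.
# $1: HOST:PORT, $2: workflow status (COMPLETED or ABORTED)
purge_status() {
  url="http://$1/apps/workflow-purge/purge.html"
  data="status=$2&runpurge=1&Start=Run"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'POST %s %s\n' "$url" "$data"
  else
    curl -su "${CURL_USER}" -X POST --data "$data" "$url" > /dev/null 2>&1
  fi
}

# Run the purge for both statuses several times in a row.
# $1: HOST:PORT, $2: number of passes (assumed default: 3)
purge_repeatedly() {
  passes="${2:-3}"
  i=1
  while [ "$i" -le "$passes" ]; do
    purge_status "$1" COMPLETED
    purge_status "$1" ABORTED
    i=$((i + 1))
    # Give the instance time to work between passes (skipped in dry runs)
    [ "${DRY_RUN:-0}" = "1" ] || sleep 10
  done
}
```

Setting DRY_RUN=1 before calling purge_repeatedly prints the requests without sending them, which is a cheap way to check the host and parameters first.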


In CQ 5.6 you can configure audit and workflow purge out of the box (OOTB) using the instructions here: http://helpx.adobe.com/cq/kb/howtopurgewf.html


Special Thanks to Rexwell Minnis for organizing this script.

Note: Please test this before use. I did not have enough time to test it completely.

18 comments:

  1. thx for the script

    I think there is a small problem with your script.
    Shouldn't it be status=ABORTED rather than status=OBORTED?

    Regards
    jan

  2. Once more me:
    for the purge of workflows and audits there is also a solution from adobe.

    see http://helpx.adobe.com/cq/kb/howtopurgewf.html

    Replies
    1. Thanks a lot for your suggestions, Jan. I fixed the spelling mistake, and yes, the solution above will work as well. I just wanted to show how it can be scripted so that a system admin can easily run it either through a cron job or on its own.

  3. We are seeing the warning message below in the tar optimization log. Even though we disabled tar optimization (to improve our performance) and restarted the server, it still runs and gives the following warning message:

    *WARN* [Tar PM Optimization] com.day.crx.persistence.tar.ReentrantLockWithInfo Lock on tarJournal still held by Thread[pool-6-thread-1,5,main]: 1



    And when we generate a thread dump we see the following:

    Thread 2694: (state = BLOCKED)
    - com.day.crx.persistence.tar.TarPersistenceManager.loadBundle(org.apache.jackrabbit.core.id.NodeId) @bci=0, line=330 (Compiled frame)
    - org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.getBundleCacheMiss(org.apache.jackrabbit.core.id.NodeId) @bci=17, line=766 (Compiled frame)
    - org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.getBundle(org.apache.jackrabbit.core.id.NodeId) @bci=37, line=749 (Compiled frame)


    Any help will be much appreciated.

    Replies
    1. Hello,

      Are you using a clustered instance? If yes, then you will see this message when tar optimization is running while writes are going on at the same time. That is why it is recommended to run system clean up activities when system load is low.

      Yogesh

  4. Can you list how you perform these tasks manually?

    Replies
    1. Manual instructions added. Thank you for your feedback.

      Yogesh

  5. Hi Yogesh,

    Can you please let me know how to integrate Ruby on Rails with CQ5.6?

    I am an Adobe CQ5 admin. I need it from an admin perspective.

    I have installed CQ5.6 author and publish instances.
    I have also installed Ruby on Rails, Easy Eclipse for Ruby on Rails and the RadRails IDE (I thought that I could integrate any one of the ones I mentioned).

    My devs are going to write code in the RadRails IDE or any of the IDEs specified above.

    Later they need to integrate that with CQ5.6.

    Note: I have installed Maven as well.

    Please let me know the process.

    Thanks in Advance
    Mahesh

    Replies
    1. Hello Mahesh,

      Ruby on Rails is not supported in the CQ web container. Maybe this will help: https://www.ruby-forum.com/topic/209564 .

      Yogesh

  6. Hi Yogesh,

    Thanks for your reply.

    How can we integrate JRuby in CQ5?
    How can we move the JRuby code from CQ5 to RoR (Ruby on Rails)?

    If it is possible, let me know the procedure.

    Or

    Do we need to involve the Adobe team to work on this integration part?

    Thanks

    Mahesh

    Replies
    1. Hi Yogesh,

      Can you reply plz?

      Thanks,
      Mahesh

    2. Hello Mahi,

      You have to involve Adobe Professional Services for this request. See some information here: http://maniagnosis.crsr.net/2009/07/running-jruby-in-osgi-container.html

      Yogesh

    3. Hi Yogesh,

      Thanks for your info.

      Rgds,
      Mahesh

  7. Hi Yogesh,

    Could you please let me know how to set up Wily Introscope with CQ5.6.1?

  8. Has anyone been able to setup CA Introscope to monitor CQ5?
