Wednesday, December 12, 2012

How to remove non-referenced nodes from DAM in CQ 5.5

Use Case: Clean your DAM by removing nodes that nothing references.

Solution: You can use the following script to find all non-referenced assets in the DAM and, once the delete line is uncommented, remove them.

#!/bin/bash
# Author: upadhyay.yogesh@gmail.com
# The host and port of the source server
SOURCE="localhost:4502"
# The user credentials on the source server (username:password)
SOURCE_CRED="admin:admin"
# Filter path: only assets under this DAM folder are checked
ROOT_PATH="/content/dam/<Your PATH>"
# List every dam:Asset under ROOT_PATH; spaces are encoded as %20 so each path stays a single word in the loop
ALL_PATHS=$(curl -s -u "$SOURCE_CRED" "$SOURCE/bin/querybuilder.json?path=$ROOT_PATH&type=dam:Asset&p.limit=-1" | tr ",[" "\n" | sed 's/ /%20/g' | grep path | awk -F \" '{print $4 "\n"}')
echo "$ALL_PATHS"
for SINGLE_PATH in $ALL_PATHS
do
  # Paths that reference this asset; empty when nothing references it
  REFERENCES=$(curl -s -u "$SOURCE_CRED" "$SOURCE/bin/wcm/references.json?path=$SINGLE_PATH" | tr ",[" "\n" | sed 's/ /%20/g' | grep path | awk -F \" '{print $4 "\n"}')
  if [ -z "$REFERENCES" ] ; then
    echo "Removing $SINGLE_PATH"
    # Dry run by default: uncomment the next line to actually delete the asset
    #curl -u "$SOURCE_CRED" -F":operation=delete" "$SOURCE$SINGLE_PATH"
  fi
done
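
To see what the tr/sed/grep/awk pipeline in the script does, you can run it against a canned response. This is only a sketch: the sample JSON is an assumption about the general shape of a querybuilder.json result (a "hits" array of objects with a "path" field), not output captured from a real server, and extract_paths is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: the same parsing pipeline as in the script,
# reading a JSON response on stdin and printing one path per line.
extract_paths() {
  tr ",[" "\n" | sed 's/ /%20/g' | grep path | awk -F \" '{print $4}'
}

# Assumed sample response (illustrative only, not real server output).
SAMPLE='{"success":true,"results":2,"hits":[{"path":"/content/dam/site/a.jpg"},{"path":"/content/dam/site/my photo.png"}]}'

PATHS=$(printf '%s' "$SAMPLE" | extract_paths)
echo "$PATHS"
```

For this sample it prints /content/dam/site/a.jpg and /content/dam/site/my%20photo.png, one per line. Note that grep matches any line containing the word "path", so a title or property value that contains it would also be picked up.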

Post-cleaning steps:

1) Run data store garbage collection (see http://www.wemblog.com/2012/03/how-to-run-online-backup-using-curl-in.html or http://helpx.adobe.com/crx/kb/DataStoreGarbageCollection.html for details).
2) Remove these nodes from publish as well if they are not already deactivated. Deleting an activated node usually deletes it from publish too, but please check.
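
One way to do the check on publish is to request each removed path from the publish instance and look at the HTTP status. The following is a minimal sketch, assuming the publish instance is reachable directly at localhost:4503 with admin:admin and that it serves the default .json rendering of a node; PUBLISH, PUBLISH_CRED, classify_status and check_on_publish are hypothetical names, not part of the original script.

```shell
#!/bin/bash
# Assumed publish host and credentials (hypothetical values, adjust to your setup).
PUBLISH="localhost:4503"
PUBLISH_CRED="admin:admin"

# Map an HTTP status code to a short verdict for one asset path.
classify_status() {
  case "$1" in
    404) echo "gone" ;;
    200) echo "still present" ;;
    *)   echo "unknown (HTTP $1)" ;;
  esac
}

# Ask the publish instance for the node's JSON rendering and report the verdict.
check_on_publish() {
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' -u "$PUBLISH_CRED" "$PUBLISH$1.json")
  echo "$1: $(classify_status "$code")"
}

# Usage: check_on_publish "/content/dam/<Your PATH>/image.jpg"
```

Any status other than 200 or 404 (for example a redirect, an authentication error, or a connection failure) is reported as unknown and should be checked by hand.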

Caution: As always, please test this first. Also, this might not cover images referenced through CSS or your own code. It only covers images referenced through the Image component or image-related components.

Note: You can also use JSAWK (https://github.com/micha/jsawk) for better parsing of the response.