Monday, December 9, 2013

How to fix com.day.crx.persistence.tar.ClusterTarSet Could not open java.io.IOException: Bad file descriptor Issue in CQ

Issue: The server does not start and you see the following error in the logs:


*WARN* [FelixStartLevel] com.day.crx.persistence.tar.ClusterTarSet Could not open java.io.IOException: Bad file descriptor
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:486)
at com.day.crx.persistence.tar.file.TarFile.write(TarFile.java:742)
at com.day.crx.persistence.tar.file.TarFile.writeData(TarFile.java:635)
at com.day.crx.persistence.tar.file.TarFile.appendMetaData(TarFile.java:709)
at com.day.crx.persistence.tar.file.TarFile.append(TarFile.java:591)
at com.day.crx.persistence.tar.TarSet.switchDataFile(TarSet.java:445)
at com.day.crx.persistence.tar.TarSet.open(TarSet.java:227)
at com.day.crx.persistence.tar.ClusterTarSet.reopenCopy(ClusterTarSet.java:1455)
at com.day.crx.persistence.tar.ClusterTarSet.open(ClusterTarSet.java:860)
at com.day.crx.persistence.tar.TarPersistenceManager.openTarSet(TarPersistenceManager.java:980)
at com.day.crx.persistence.tar.TarPersistenceManager.init(TarPersistenceManager.java:500)
at com.day.crx.core.CRXRepositoryImpl.createVersionManager(CRXRepositoryImpl.java:869)
at org.apache.jackrabbit.core.RepositoryImpl.<init>(RepositoryImpl.java:311)
at com.day.crx.core.CRXRepositoryImpl.<init>(CRXRepositoryImpl.java:307)
at com.day.crx.core.CRXRepositoryImpl.create(CRXRepositoryImpl.java:262)
at com.day.crx.core.CRXRepositoryImpl.create(CRXRepositoryImpl.java:245)
at com.day.crx.sling.server.impl.jmx.ManagedRepository.activate(ManagedRepository.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.felix.scr.impl.helper.BaseMethod.invokeMethod(BaseMethod.java:236)
at org.apache.felix.scr.impl.helper.BaseMethod.access$500(BaseMethod.java:37)
at org.apache.felix.scr.impl.helper.BaseMethod$Resolved.invoke(BaseMethod.java:613)
at org.apache.felix.scr.impl.helper.BaseMethod.invoke(BaseMethod.java:496)
at org.apache.felix.scr.impl.helper.ActivateMethod.invoke(ActivateMethod.java:149)

Solution: 

1) Find the line just above this error in the log. It will look something like this:

09.12.2013 13:42:32.030 *INFO* [FelixStartLevel] com.day.crx.persistence.tar.TarSet scanning index <some path>/crx-quickstart/repository/<either version or workspace>/data_<some number>.tar 

2) Go to the directory given in that path.
3) STOP YOUR INSTANCE, then remove all index files using rm -rf <path from above>/index*tar
4) Change the permissions of the data tar files using chmod 644 <path from above>/data*tar
5) Start the instance.
6) In some cases the data tar files cannot be recovered. Please check my other post on fixing non-recoverable data tar files.

Caution: If there are a lot of data tar files, index creation may take some time. Please create a Daycare ticket to find the root cause of this issue.
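
If the log is long, finding the right directory in step 1 can be scripted. The following is a minimal Python sketch, not part of any CQ tooling: it assumes the "scanning index" line has the format shown above, and the path /opt/cq is a hypothetical install location used only for the sample.

```python
import re
from pathlib import PurePosixPath

# Matches the "scanning index" INFO line that precedes the error.
SCAN_LINE = re.compile(
    r"com\.day\.crx\.persistence\.tar\.TarSet scanning index (\S+\.tar)"
)

def tar_directories(log_lines):
    """Return the distinct directories of tar files named in 'scanning index' lines."""
    seen = []
    for line in log_lines:
        match = SCAN_LINE.search(line)
        if match:
            directory = str(PurePosixPath(match.group(1)).parent)
            if directory not in seen:
                seen.append(directory)
    return seen

# Hypothetical sample log lines (the path /opt/cq is an assumption).
sample = [
    "09.12.2013 13:42:32.030 *INFO* [FelixStartLevel] com.day.crx.persistence.tar.TarSet "
    "scanning index /opt/cq/crx-quickstart/repository/version/data_00001.tar",
    "09.12.2013 13:42:32.100 *WARN* [FelixStartLevel] com.day.crx.persistence.tar.ClusterTarSet "
    "Could not open java.io.IOException: Bad file descriptor",
]

for directory in tar_directories(sample):
    print(directory)
```

The script only reports directories. Removing index files and changing permissions is still best done by hand after the instance is stopped.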

Wednesday, July 31, 2013

How to host Adobe Dynamic Tag Management System files in CQ dispatcher

Use Case: For marketing and analytics purposes, you often have to rely on the dev team to push tracking code, tag management code, or a third-party client-side library. Satellite Search and Discovery provides a good way to decouple client-side tracking and tagging changes for marketing and analytics from development.
Documentation on Tag Management can be found here

Challenges: 

1) Updates to CQ pages can now happen outside the dev cycle. Make sure that Satellite changes are completely tested before using them in production.
2) Since Satellite is similar to a SaaS service, changes in Satellite could also cause some modules not to work as expected. Keep track of changes on the Satellite side and test the affected modules.
3) Sometimes Satellite loads slowly when it is served from their hosted service. In that case you can host the Satellite code on the dispatcher and use a script to update it every time there is a change.

Include satellite script to CQ:

You can simply use,

<script type="text/javascript" src="SOME-PATH.js" ></script>
You can also use run-mode-specific logic to include the dev or prod script on your site.
Set<String> runModes = sling.getService(SlingSettingsService.class).getRunModes();
if (!runModes.contains("author")) {
    if (runModes.contains("prod")) {
        // Include prod satellite code
    } else {
        // Include dev satellite code
    }
}

Host satellite Script on Dispatcher:

For performance you can host Satellite on the dispatcher itself and then include it in your page. For this, Satellite provides a deploy hook feature. The deploy hook URL is called every time a configuration changes or rules are published.



If you want to host the Satellite files on the dispatcher, set the deploy hook URL to the path on your production server that should be called on every change. For example, I want to call a servlet or script on each change that updates the dispatcher and all other publish and author instances.

Here I am using a Python script to make this update. The process is like this:
  • Changes are made in Satellite
  • Satellite calls a dispatcher URL
  • The dispatcher URL invokes the Python script (you need a rewrite rule to do that)
  • The Python script checks whether this is a staging or production server
  • Based on that, it gets the corresponding Satellite files, which are in zip format
  • The script unzips the file, removes existing files from the dispatcher if present, and puts the new files in a certain location on the dispatcher
  • Then it calls the other dispatchers to update their files as well
  • Then it issues an upload request to upload the changed files to author
  • After that it issues a tree activation request to update these files on all publish instances. (This step is required in case someone clears the dispatcher cache.)
  • To avoid an infinite loop among the dispatchers, one dispatcher calls the others with a URL parameter telling them not to call any other dispatcher.
  • For UID:PWD you can use a non-admin user that only has access to the Satellite files. Make sure that this user has activation rights as well.

You can use a script like the following.

Note: I am using an old version of Python. You can reduce the code with a newer version.


#!/usr/bin/python
import urllib2
import shutil
import urlparse
import os
import sys
import zipfile
import cgitb; cgitb.enable()
import cgi
import socket
import urllib
#import pwd
#import grp
#Global Var
#Read URL from path
#Staging IP List contain list of IP for Stage
staging_IP_list = ["X.X.X.X","X.X.X.X"]
#Production IP list
production_IP_list = ["Y.Y.Y.Y","Y.Y.Y.Y","Y.Y.Y.Y","Y.Y.Y.Y"]
#This is required to avoid circular loop
form = cgi.FieldStorage()
checked = form.getvalue("checked")
addr = socket.gethostbyname(socket.gethostname())
#This folder path is required to avoid permission issue.
rootFolder = "../some/folder/in/dispatcher"
#Need to create rewrite mapping for this to work.
pingUrl = "/some/path/for/dispatcher?checked=true"
staging_satellite_url="THIS IS THE URL WHERE DEV SATELLITE FILE IS HOSTED"
production_satellite_url="THIS IS THE URL WHERE PROD SATELLITE FILE IS HOSTED"
url=staging_satellite_url
author_content_folder="FOLDER NAME WHERE YOU WANT THIS FILE TO GO"
author_content_subfolder="SUBFOLDER NAME GIVEN BY SATELLITE /"
satellite_js_file_name="SATELLITE FILE NAME GIVEN BY SATELLITE"
staging_author_server="YOUR-STAGING-AUTHOR-SERVER"
production_author_server="YOUR-PRODUCTION-AUTHOR-SERVER"
curl_ping_url=staging_author_server
script_file_list=[]
#Destination Path
print "Content-type: text/html; charset=iso-8859-1\n\n"
print '''<HTML>'''
print '''<TITLE>Satellite Ping Check</TITLE><body>'''
#print '''<br>url I got host as''',addr

#This class overrides the request method so that urlopen sends just a HEAD request
class HeadRequest(urllib2.Request):
    def get_method(self):
        return "HEAD"

#Method to ping a URL on another server
def pingURL(customURL):
    try:
        response = urllib2.urlopen(HeadRequest(customURL))
        response.close()
    except Exception:
        print '''<br>We failed to reach a server.'''

#Method that will ping the other servers of the same environment based on IP address
def pingOtherServer():
    for eachIp in staging_IP_list:
        if eachIp==addr:
            for eachIp2 in staging_IP_list:
                if eachIp2!=addr:
                    pingURL("http://"+eachIp2+pingUrl)
            break
    for eachIp in production_IP_list:
        if eachIp==addr:
            for eachIp2 in production_IP_list:
                if eachIp2!=addr:
                    pingURL("http://"+eachIp2+pingUrl)
            break


#This is required to keep those files on author
def pingauthorServer():
    filepath = rootFolder+"/"+author_content_subfolder+satellite_js_file_name
    #Curl command to upload satellite file
    os.system('curl -u UID:PWD -F@TypeHint="nt:file" -Ftype="file" --upload-file '+filepath+' '+curl_ping_url+author_content_folder+author_content_subfolder)
    filepath=rootFolder+"/"+"selector.js"
    #Curl command to upload selector.js file
    os.system('curl -u UID:PWD -F@TypeHint="nt:file" -Ftype="file" --upload-file '+filepath+' '+curl_ping_url+author_content_folder)
    for script_file in script_file_list:
        filepath=rootFolder+"/"+author_content_subfolder+"scripts/"+script_file
        os.system('curl -u UID:PWD -F@TypeHint="nt:file" -Ftype="file" --upload-file '+filepath+' '+curl_ping_url+author_content_folder+author_content_subfolder+"scripts/")
    #Curl command to activate files to publish instance
    os.system('curl -u UID:PWD -Fcmd=activate -Fignoredeactivated=true -Fonlymodified=false -Fpath='+author_content_folder+' '+curl_ping_url+'/etc/replication/treeactivation.html')

#Method to delete existing folder before extracting new one
def deleteFileOrFolder(directory):
    if os.path.exists(directory):
        try:
            if os.path.isdir(directory):
                print '''<br>removing folder<b>''',directory
                shutil.rmtree(directory)
                print '''<br>Creating''',directory
                os.makedirs(directory)
            else:
                print '''<br>removing file<b>''',directory
                os.remove(directory)
        except:
            print '''<br>Exception''',str(sys.exc_info())
    else:
        print '''<br>not found''',directory
        print '''<br>Creating''',directory
        os.makedirs(directory)

#Method to set satellite url based on IP address. If this is a production server then set URL as production
def seturl():
    global url
    global satellite_js_file_name
    global curl_ping_url
    for eachIp in production_IP_list:
        if eachIp==addr:
            url=production_satellite_url
            satellite_js_file_name="YOUR-SATELLITE-FILE-NAME.js"
            curl_ping_url=production_author_server
            break


#Method to extract the downloaded zip file under rootFolder
def extract():
    zip_file = zipfile.ZipFile(fileName, 'r')
    for files in zip_file.namelist():
        print '''<br>files in zip''',files
        myfile_path=rootFolder+"/"+files
        if myfile_path.endswith("/"):
            #Entry is a folder, create it if it does not exist
            if not os.path.exists(myfile_path):
                os.makedirs(myfile_path)
        else:
            #Remember script files so that they can be uploaded to author later
            if files.find("/scripts/") != -1:
                script_file_list.append(files.split('/')[-1])
            data = zip_file.read(files)
            #Write in binary mode so that file content is not altered
            myfile = open(myfile_path, "wb")
            myfile.write(data)
            myfile.close()
    zip_file.close()

#Setting URL to production if this is production server. By default it is always staging server
seturl()
#print '''<br>url I got is''',url
fileName = url.split('/')[-1].split('#')[0].split('?')[0]
print '''<br>filename I got is''',fileName
#Delete all file and folder before creating them
#deleteFileOrFolder(rootFolder+"/"+fileName)
deleteFileOrFolder(rootFolder)
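r = urllib2.urlopen(urllib2.Request(url))
try:
    fileName = rootFolder+"/"+fileName
    #Copy the response that is already open into the zip file and close the file handle
    f = open(fileName, 'wb')
    try:
        shutil.copyfileobj(r, f)
    finally:
        f.close()
finally:
    r.close()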
#zfile = zipfile.ZipFile(fileName)
extract()
#zfile.extractall(rootFolder)
#os.system('jar -xvf '+fileName)
#Do it only from one server
if checked is None:
    pingOtherServer()
    pingauthorServer()

print '''</body>'''
print '''</HTML>'''
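
The loop-avoidance step in the process above is easy to get wrong, so here is a minimal standalone sketch of just that decision. It is written for Python 3 (unlike the script above), the function name peers_to_notify is made up for illustration, and the IP addresses are hypothetical.

```python
def peers_to_notify(own_ip, ip_groups, checked):
    """Return the dispatcher IPs that this server should ping.

    own_ip    -- address of the server handling the deploy hook
    ip_groups -- list of IP lists, one per environment (staging, production)
    checked   -- value of the "checked" URL parameter, or None if absent
    """
    if checked is not None:
        # Another dispatcher already fanned out the update; stop here.
        return []
    for group in ip_groups:
        if own_ip in group:
            # Ping every other server of the same environment.
            return [ip for ip in group if ip != own_ip]
    return []

# Hypothetical addresses, for illustration only.
staging = ["10.0.0.1", "10.0.0.2"]
production = ["10.1.0.1", "10.1.0.2", "10.1.0.3"]

print(peers_to_notify("10.1.0.2", [staging, production], None))
print(peers_to_notify("10.1.0.2", [staging, production], "true"))
```

The first call returns the two other production servers. The second call returns an empty list, because the request carries the checked parameter and must not be forwarded again.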

Happy tagging and tracking. Let me know if you have any questions.

AEM 6 provides this feature OOTB. Go to http://HOST:PORT/miscadmin#/etc/cloudservices/dynamictagmanagement and enter your DTM info.

Note: There could be other tools capable of doing similar things, and you can use a similar approach with them as well. This post does not mean to say that you should use Satellite Search and Discovery for this use case.

Tuesday, July 23, 2013

How to Create Custom Adapters in Adobe CQ / AEM

Prerequisite: http://sling.apache.org/documentation/the-sling-engine/adapters.html

Use Case: You often want to call adaptTo on an existing object to get a custom object, or to make a custom object adaptable to an existing object type.

Solution: There are two main ways to use adaptTo.

Case 1: You want an existing object to be adaptable to a custom object. For example, you have a specific kind of node and you want Node or Resource to be adaptable to this object.

CustomObject myCustomObject = resource.adaptTo(CustomObject.class);
Or
CustomObject myCustomObject = node.adaptTo(CustomObject.class);
Or
CustomObject myCustomObject = <ANY Adaptable OBJECT>.adaptTo(CustomObject.class);

Case 2: You want a custom object to be adaptable to an existing object. For example, you have a specific kind of resource and you want it to be adaptable to an existing resource.

Node node = customObject.adaptTo(Node.class);
Or
Resource resource = customObject.adaptTo(Resource.class);
Or
<Any OOTB Adaptable> myObject = customObject.adaptTo(<Any OOTB Adaptable>.class);

Case 1: Example


Here is what your CustomAdapter will look like


Case 2: Example




In pom.xml you need the following dependencies. You can always find dependencies at HOST:PORT/system/console/depfinder

<dependency>
    <groupId>org.apache.sling</groupId>
    <artifactId>org.apache.sling.api</artifactId>
    <version>2.2.4</version>
    <scope>provided</scope>
</dependency>

<dependency>
    <groupId>org.apache.sling</groupId>
    <artifactId>org.apache.sling.adapter</artifactId>
    <version>2.0.10</version>
    <scope>provided</scope>
</dependency>

Let me know if you have any questions.

Saturday, June 22, 2013

How to implement robots.txt / sitemap.xml / crossdomain.xml in Adobe CQ / AEM

Use Case: 

  • Sometimes you want to implement robots.txt or another web-related configuration file in CQ.
  • Sometimes you need a different robots configuration for each environment.

Solution: 

You can create web-related configuration files directly in CQ. To do that, follow these steps:

1) Go to CRXDE or CRXDE Lite, or put the files directly in your version control under the jcr_root folder. You can create different versions of robots.txt based on environment and domain name.






Then you can configure the Sling resource resolver (org.apache.sling.jcr.resource.internal.JcrResourceResolverFactoryImpl.xml) under /apps/sling/config to map requests to the correct robots.txt or any other site-specific configuration. More information about configurations can be found here http://www.wemblog.com/2012/10/how-to-work-with-configurations-in-cq.html

For prod, use something like resource.resolver.virtual="[/robots.txt:/robots-prod.txt]" and for all other environments

 resource.resolver.virtual="[/robots.txt:/robots-qa.txt]"


For site specific configuration you can use something like

/apps/map.publish/www-robots.txt
     jcr:primaryType = "sling:Mapping" (that's the type when you create a new node)
     sling:internalRedirect = "/content/robots-prod.txt"
     sling:match = "http/www.SITE.com/robots.txt"


You might have to do some tuning on the dispatcher to make this work. Feel free to ask any questions.

Friday, June 14, 2013

How to avoid / flush caching of Static files / clientlibs on client side in Adobe CQ / AEM

Use Case: Static files often change during a deployment, and if the old versions are cached in the user's browser, styles are broken for some time.

Solution:

Option 1: Use Expires Modules:

You can use the mod_expires module from Apache http://httpd.apache.org/docs/current/mod/mod_expires.html for this. Settings like these avoid permanent caching:

# enable expirations
ExpiresActive On
# Image - 1 Month; JS,CSS - 1 Hour; font - 1 week
ExpiresByType image/gif "access plus 1 month"
ExpiresByType application/javascript "access plus 1 hour"
ExpiresByType application/x-javascript "access plus 1 hour"
ExpiresByType text/css "access plus 1 hour"
ExpiresByType image/png "access plus 1 month"

ExpiresByType application/octet-stream "access plus 1 week"

Advantage:

  • Simple
  • No custom development required.

Disadvantage:

  • Not foolproof. Static files will still be cached until they expire.

Option 2: Use custom html rewriter

You can use a custom Sling rewriter to append a version number to your static file paths. For example, if your static path is HOST:PORT/etc/designs/clientlibs/wemblog.js, then on each production release you can change it to HOST:PORT/etc/designs/clientlibs/wemblog.<Release Number>.js
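
Before looking at the Sling side, the path rewrite itself can be sketched in a few lines of Python. This is only an illustration of the string manipulation, under the assumption that the version goes right before the file extension as in the example path above. The function name append_version and the version "42" are made up for this sketch.

```python
def append_version(value, version, extension, path_prefixes):
    """Insert ".<version>" before the file extension of a matching path.

    Returns the value unchanged if no version is set, no prefix matches,
    or the extension does not occur in the value.
    """
    if not version or not value:
        return value
    for prefix in path_prefixes:
        if prefix and prefix in value:
            index = value.rfind(extension)
            if index != -1:
                # Everything up to the extension, then ".<version>", then the extension.
                return value[:index] + "." + version + extension
    return value

prefixes = ["/etc/designs/clientlibs"]
print(append_version("/etc/designs/clientlibs/wemblog.js", "42", ".js", prefixes))
# prints /etc/designs/clientlibs/wemblog.42.js
print(append_version("/libs/other/lib.js", "42", ".js", prefixes))
# prints /libs/other/lib.js
```

Paths outside the configured prefixes are left alone, which keeps third-party or OOTB libraries from being rewritten by accident.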

An example of how to create a custom rewriter can be found here http://www.wemblog.com/2011/08/how-to-remove-html-extension-from-url.html

Here is one example:


import java.io.IOException;
import java.util.Map;

import org.apache.commons.lang3.StringUtils;
import org.apache.felix.scr.annotations.Activate;
import org.apache.felix.scr.annotations.Component;
import org.apache.felix.scr.annotations.Properties;
import org.apache.felix.scr.annotations.Property;
import org.apache.felix.scr.annotations.Service;
import org.apache.sling.rewriter.ProcessingComponentConfiguration;
import org.apache.sling.rewriter.ProcessingContext;
import org.apache.sling.rewriter.Transformer;
import org.apache.sling.rewriter.TransformerFactory;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

@Component(metatype = true, immediate = true, label = "Link Transformer", description = "Appends version number to the js and css files under specified paths")
@Service
@Properties({ @Property(name = "pipeline.type", value = "append-version", propertyPrivate = true) })
public class ClientlibLinkTransformerFactory implements TransformerFactory {

    @Property(label = "JS and CSS file path", cardinality = Integer.MAX_VALUE, description = "Path to the JS and CSS files", value = "[/etc/designs/SOMEPATH/clientlibs]")
    private static final String PATH = "path";

    @Property(label = "Version", description = "Version Number to be appended.")
    private static final String VERSION = "version";

    private static final String HTML_TAG_SCRIPT = "script";
    private static final String HTML_TAG_LINK = "link";
    private static final String HTML_ATTRIBUTE_SRC = "src";
    private static final String HTML_ATTRIBUTE_HREF = "href";
    private static final String JS_EXTENTION = ".js";
    private static final String CSS_EXTENTION = ".css";
    private static final String SELECTOR_SEPARATOR = ".";

    private static final Logger log = LoggerFactory.getLogger(ClientlibLinkTransformerFactory.class);

    private String version = "";
    private String[] pathArray;

    @Activate
    protected final void activate(final Map<String, Object> config) {
        this.version = (String) config.get(VERSION);
        this.pathArray = (String[]) config.get(PATH);
    }

    public Transformer createTransformer() {
        return new ClientlibLinkTransformer();
    }

    // Rewrite only when both a version and at least one path are configured
    private boolean shouldAppendVersion() {
        return StringUtils.isNotEmpty(this.version) && this.pathArray != null && this.pathArray.length > 0;
    }

    // Returns a copy of the attributes in which the first matching src/href
    // value gets ".<version>" inserted before the file extension
    private Attributes rewriteLink(Attributes atts, String attrNameToLookFor, String fileExtension) {
        boolean rewriteComplete = false;
        AttributesImpl newAttrs = new AttributesImpl(atts);
        int length = newAttrs.getLength();
        for (int i = 0; i < length; i++) {
            String attributeName = newAttrs.getLocalName(i);
            if (attrNameToLookFor.equalsIgnoreCase(attributeName)) {
                String originalValue = newAttrs.getValue(i);
                if (StringUtils.isNotEmpty(originalValue)) {
                    for (String pathPrefix : pathArray) {
                        if (StringUtils.isNotEmpty(pathPrefix) && originalValue.indexOf(pathPrefix) != -1) {
                            int index = originalValue.lastIndexOf(fileExtension);
                            if (index != -1) {
                                newAttrs.setValue(i, originalValue.substring(0, index) + SELECTOR_SEPARATOR
                                        + this.version + fileExtension);
                                rewriteComplete = true;
                                break;
                            }
                        }
                    }
                    if (rewriteComplete) {
                        break;
                    }
                }
            }
        }
        return newAttrs;
    }

    // Passes all SAX events through unchanged, except for script and link start tags
    private class ClientlibLinkTransformer implements Transformer {
        private ContentHandler contentHandler;

        public void characters(char[] ch, int start, int length) throws SAXException {
            contentHandler.characters(ch, start, length);
        }

        public void dispose() {
            // Nothing to clean up
        }

        public void endDocument() throws SAXException {
            contentHandler.endDocument();
        }

        public void endElement(String uri, String localName, String qName) throws SAXException {
            contentHandler.endElement(uri, localName, qName);
        }

        public void endPrefixMapping(String prefix) throws SAXException {
            contentHandler.endPrefixMapping(prefix);
        }

        public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
            contentHandler.ignorableWhitespace(ch, start, length);
        }

        public void init(ProcessingContext context, ProcessingComponentConfiguration config) throws IOException {
            // Nothing to initialize
        }

        public void processingInstruction(String target, String data) throws SAXException {
            contentHandler.processingInstruction(target, data);
        }

        public void setContentHandler(ContentHandler handler) {
            this.contentHandler = handler;
        }

        public void setDocumentLocator(Locator locator) {
            contentHandler.setDocumentLocator(locator);
        }

        public void skippedEntity(String name) throws SAXException {
            contentHandler.skippedEntity(name);
        }

        public void startDocument() throws SAXException {
            contentHandler.startDocument();
        }

        public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
            if (shouldAppendVersion() && HTML_TAG_SCRIPT.equalsIgnoreCase(localName)) {
                contentHandler.startElement(uri, localName, qName, rewriteLink(atts, HTML_ATTRIBUTE_SRC, JS_EXTENTION));
            } else if (shouldAppendVersion() && HTML_TAG_LINK.equalsIgnoreCase(localName)) {
                contentHandler.startElement(uri, localName, qName,
                        rewriteLink(atts, HTML_ATTRIBUTE_HREF, CSS_EXTENTION));
            } else {
                contentHandler.startElement(uri, localName, qName, atts);
            }
        }

        public void startPrefixMapping(String prefix, String uri) throws SAXException {
            contentHandler.startPrefixMapping(prefix, uri);
        }

    }

}

Then you have to add this transformer to the rewriter pipeline.

The configuration node will look like this: /apps/<Your Custom Folder>/config [sling:folder]/rewriter [sling:folder]/append-version [sling:folder]/.content.xml

<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0"
    jcr:primaryType="sling:Folder"
    contentTypes="[text/html]"
    enabled="{Boolean}true"
    generatorType="htmlparser"
    order="{Long}1"
    serializerType="htmlwriter"
    transformerTypes="[linkchecker,append-version]"/>

You can dynamically update the version number to get fresh static files.

Note: Please test before use. You might have to write additional rules to avoid rewriting custom static files.

Special Thanks to Appaji Bandaru for code.

More options:


Also check https://github.com/Adobe-Consulting-Services/acs-aem-commons/blob/master/bundle/src/main/java/com/adobe/acs/commons/rewriter/impl/VersionedClientlibsTransformerFactory.java which creates the version based on the last modified date.

Wednesday, June 12, 2013

How to Disable Replication Agent using CURL in Adobe CQ / AEM

Use Case:

  • You want to disable replication agents without going to the console.
  • You are doing a production deployment and want to disable replication agents.
  • You want author not to replicate.
  • You want the dispatcher not to get flushed.

Solution:

#!/bin/bash
# Author: upadhyay.yogesh@gmail.com
# The host and port of the source server
SOURCE="localhost:4502"
# The user credentials on the source server (username:password)
SOURCE_CRED="admin:admin"

#Root path, You can change this path to target only author or publish agent
ROOT_PATH="/etc/replication"

ALL_PATHS=`curl -s -u $SOURCE_CRED "$SOURCE/bin/querybuilder.json?path=$ROOT_PATH&type=nt:unstructured&1_property=cq:template&1_property.value=/libs/cq/replication/templates/%&1_property.operation=like&2_property=enabled&2_property.value=true&p.limit=-1" | tr ",[" "\n" | sed 's/ /%20/g' | grep path | awk -F \" '{print $4 "\n"}'`
echo "$ALL_PATHS"
for SINGLE_PATH in $ALL_PATHS
do
  curl -s -u "$SOURCE_CRED" -F"enabled=false" "$SOURCE$SINGLE_PATH"
done

If you already know the replication agent names, you can also do the following:

for agent in flush flush1 {Other agent name}
do
  echo Disabling /etc/replication/agents.author/${agent}
  curl -D- -o /dev/null -XPOST -F./enabled=false http://admin:admin@HOST:PORT/etc/replication/agents.author/${agent}/jcr:content 2>/dev/null
done

Please test it before use.
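
The tr/sed/awk pipeline in the first script is fragile, so here is a minimal Python sketch of the same path extraction using a JSON parser. It assumes the QueryBuilder response contains a "hits" array whose entries each carry a "path" field. The sample response and agent names are made up, and the sketch only builds the target URLs without sending any request.

```python
import json
from urllib.parse import quote

def agent_paths(querybuilder_json):
    """Return the 'path' of every hit in a QueryBuilder JSON response."""
    result = json.loads(querybuilder_json)
    return [hit["path"] for hit in result.get("hits", [])]

def disable_urls(source, paths):
    """Build the URLs to POST 'enabled=false' to, escaping spaces in paths."""
    return ["http://" + source + quote(path, safe="/:") for path in paths]

# Hypothetical response body, trimmed to the fields used here.
sample = """{"success": true, "results": 2, "total": 2, "offset": 0, "hits": [
  {"path": "/etc/replication/agents.author/publish/jcr:content"},
  {"path": "/etc/replication/agents.author/my agent/jcr:content"}]}"""

for url in disable_urls("localhost:4502", agent_paths(sample)):
    print(url)
```

Each printed URL can then be passed to the same curl command as in the loop above.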

Friday, May 10, 2013

How to Perform System Clean Up in Adobe CQ / AEM (CQ5.5)

Use Case:

A CQ system grows over time as data is modified, removed and added. CQ follows an append-only model for the datastore, so data is never deleted from the datastore even if it is deleted from the console. Over time we also end up with a lot of unnecessary packages from deployments and migrations. On top of that, adding a lot of DAM assets creates a lot of workflow data that is not required.

As a result, disk usage increases, and if you are planning to have many instances sharing the same hardware (especially dev), it makes sense to reduce the size of the instances from time to time.

Solution: 

You can use the following script to clean your data from time to time.

Prerequisite:

Get workflow purge script from here

Step 1:

Create a file with information about your instances (in this example the file name is host_list.txt)

#File is used to feed the clean up package script
#FORMAT HOST:PORT
<YOUR SERVER>:<PORT>
#END
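
To check a host list before running the cleanup, the file format can be parsed with a few lines of Python. This is a minimal sketch that mirrors the grep -v '#' filter used by the script in step 2. The host names are hypothetical.

```python
def read_hosts(lines):
    """Return HOST:PORT entries, skipping blank lines and lines containing '#'."""
    hosts = []
    for line in lines:
        entry = line.strip()
        if entry and "#" not in entry:
            hosts.append(entry)
    return hosts

# Hypothetical file content, for illustration only.
sample = """#File is used to feed the clean up package script
#FORMAT HOST:PORT
author1.example.com:4502
publish1.example.com:4503
#END
""".splitlines()

print(read_hosts(sample))
```

For a real file, pass open("host_list.txt") instead of the sample lines.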

Step 2:

Actual Script

#!/bin/bash
#
# Description:
#      Clean Master author Only
#      Clean Old Packages
#      Clean DataStore GC


PURGE_WORK_FLOWS_FILE="purge-workflows-2.zip"
CURL_USER='admin:my_super_secret'
IS_PURGE_PAK_FOUND=NO
MY_HOST_LIST=host_list.txt
# Name of package group that you want to clear
PACKAGE_GROUP="<MY PACKAGE GROUP>"


if [ ! -f "${MY_HOST_LIST}" ]; then
  echo "Error cannot find host list file: ${MY_HOST_LIST}"
  echo "Exiting ..."
  exit 1;
fi

function run_purge_job()
{
  # Replace with the host name of your master author instance
  MY_HOST="<YOUR HOST NAME>"
  IS_PURGE_PAK_FOUND=$(curl -su "${CURL_USER}" "http://${MY_HOST}:4502/crx/packmgr/service.jsp?cmd=ls" | grep "name" | grep "purge-workflows-2" | tr -d ' \t\n\r\f')

  if [ -z "${IS_PURGE_PAK_FOUND}" ]; then
    IS_PURGE_PAK_FOUND=NO
  else
    IS_PURGE_PAK_FOUND=YES
  fi

  # Upload and install the purge package if it is not on the instance yet
  if [ "$IS_PURGE_PAK_FOUND" = "NO" ] && [ -f "$PURGE_WORK_FLOWS_FILE" ]; then
    MY_PAK_NAME=$(basename "$PURGE_WORK_FLOWS_FILE" .zip)
    MY_STATUS=$(curl -su "${CURL_USER}" -f -F"install=true" -F name="$MY_PAK_NAME" -F file=@"$PURGE_WORK_FLOWS_FILE" "http://${MY_HOST}:4502/crx/packmgr/service.jsp" | grep code=\"200\" | tr -d ' \t\n\r\f')

    if [ -z "${MY_STATUS}" ]; then
      echo "Error uploading $PURGE_WORK_FLOWS_FILE exiting..."
      exit 1
    fi
    # Package is installed now, so the purge below can run
    IS_PURGE_PAK_FOUND=YES
  fi

  # Purge completed workflows first, then aborted ones
  if [ "${IS_PURGE_PAK_FOUND}" = "YES" ]; then
    curl -su "${CURL_USER}" -X POST --data "status=COMPLETED&runpurge=1&Start=Run" "http://${MY_HOST}:4502/apps/workflow-purge/purge.html" > /dev/null 2>&1
    sleep 10
    curl -su "${CURL_USER}" -X POST --data "status=ABORTED&runpurge=1&Start=Run" "http://${MY_HOST}:4502/apps/workflow-purge/purge.html" > /dev/null 2>&1
  fi
}

function clean_old()
{
  for MY_HOST in $(grep -v '#' "$MY_HOST_LIST")
  do
    # Empty if the instance is down or has no package in this group
    IS_INSTANCE_UP=$(curl --connect-timeout 20 -su "${CURL_USER}" -X POST "http://${MY_HOST}/crx/packmgr/service.jsp?cmd=ls" | grep "name" | grep -i "${PACKAGE_GROUP}" | tr -d ' \t\n\r\f')

    if [ -z "${IS_INSTANCE_UP}" ]; then
      continue
    fi

    # You can have multiple packages here
    # Or you can use Commands from here
    echo "deleting package group"
    curl -su "${CURL_USER}" -F":operation=delete" "http://${MY_HOST}/etc/packages/${PACKAGE_GROUP}" > /dev/null 2>&1
    sleep 10
  done
}

function clean_datastore_gc()
{
  for MY_HOST in $(grep -v '#' "$MY_HOST_LIST")
  do
    # HTTP status code of the package manager page, empty if the host is unreachable
    IS_INSTANCE_UP=$(curl --connect-timeout 20 -su "${CURL_USER}" -Is "http://${MY_HOST}/crx/packmgr/index.jsp" | grep HTTP | cut -d ' ' -f2)

    # Skip hosts that are not up
    if [ "${IS_INSTANCE_UP}" != "200" ]; then
      continue
    fi
    echo "running datastore gc"
    curl -su "${CURL_USER}" -X POST --data "delete=true&delay=2" "http://${MY_HOST}/system/console/jmx/com.adobe.granite%3Atype%3DRepository/op/runDataStoreGarbageCollection/java.lang.Boolean" > /dev/null 2>&1
  done
}

case "$1" in
  'purge')
   run_purge_job
;;
  'clean_paks')
   clean_old
;;
  'clean_ds')
   clean_datastore_gc
;;
*)
  echo $"Usage: $0 {purge|clean_paks|clean_ds}"
  exit 1
  ;;
esac
exit 0
#
#end


Manual Cleaning:

CQ5.5 and before:
1) Download the workflow purge script from here
2) Install the purge script using package manager
3) Log in as admin or as a user with administrative access
4) Go to http://${MY_HOST}:4502/apps/workflow-purge/purge.html
5) Select completed from the drop-down and run purge workflow.
6) You might have to run it multiple times to make sure that everything is deleted.
7) Using CRXDE Lite or CRX Explorer with an admin session, go to /etc/packages/<Your package group>
8) Delete the packages you want to delete
9) After deleting, click save all
10) To run datastore GC please follow http://www.wemblog.com/2012/03/how-to-run-online-backup-using-curl-in.html Or http://www.cqtutorial.com/courses/cq-admin/cq-admin-lessons/cq-maintenance/cq-datastore-gc


In CQ 5.6 you can configure audit and workflow purge OOTB using the instructions here http://helpx.adobe.com/cq/kb/howtopurgewf.html


Special thanks to Rexwell Minnis for organizing this script.

Note: Please test this before use. I did not get enough time to test it completely.

Monday, April 22, 2013

How to create custom query predicate in CQ

Use Case: Your search has a custom requirement that cannot be covered by the existing predicates

Prerequisite:
Example: 

Suppose you want to create a custom predicate to compare and sort on a property in a case-sensitive way.
Your predicate expression will look like

-----Signature -----

path=<Path under which search would be performed>
type=<node type>
customcase.property=<property name for which case sensitive search needs to be performed>
customcase.fulltext=<Search Text>
customcase.case=upper/lower/UPPER/LOWER/no_case/NO_CASE
orderby=customcase

You can test these examples after deploying the code at HOST:PORT/libs/cq/search/content/querydebug.html

---- Example 1 (Find all nodes with subtitle "MAIL" and do an upper case compare) -----
path=/content/geometrixx/en
type=cq:Page
customcase.property=jcr:content/subtitle
customcase.fulltext=MAIL
customcase.case=upper
orderby=customcase

---- Example 2 (Find all nodes and sort by the subtitle property) -----
path=/content/geometrixx/en
type=cq:Page
customcase.property=jcr:content/subtitle
customcase.case=no_case
orderby=customcase
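
Besides the query debugger, predicates like these can also be sent to the /bin/querybuilder.json servlet as URL parameters. The following minimal Python sketch turns a block of key=value lines into such a URL. The helper name querybuilder_url and the host localhost:4502 are assumptions for illustration, and the custom predicate still has to be deployed for the query to return anything useful.

```python
from urllib.parse import urlencode

def querybuilder_url(host, predicates_text):
    """Turn 'key=value' lines into a querybuilder.json URL."""
    pairs = []
    for line in predicates_text.splitlines():
        line = line.strip()
        if line and "=" in line:
            # Split on the first '=' only, so values may contain '='.
            key, value = line.split("=", 1)
            pairs.append((key, value))
    return "http://" + host + "/bin/querybuilder.json?" + urlencode(pairs)

example_1 = """
path=/content/geometrixx/en
type=cq:Page
customcase.property=jcr:content/subtitle
customcase.fulltext=MAIL
customcase.case=upper
orderby=customcase
"""

print(querybuilder_url("localhost:4502", example_1))
```

urlencode takes care of escaping characters such as "/" and ":" in the values, so the predicate lines can be pasted in as they are.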



Note: As you can see, the code is not optimal. It is just an example of how you can create your own predicate. The example also assumes that you know how to import the dependencies for this code. Let me know if you have any questions.

Wednesday, March 13, 2013

How to manage multi site using dispatcher in CQ / WEM

Use Case:

1) You have a multi-language site and you want a different dispatcher configuration for each language
2) Activating pages under the English site should not flush pages under the French site, for example
3) You have a different URL for each site (for example en.mysite.com and fr.mysite.com)

Solution:

At Publish Instance

First of all, in order to manage a multi-domain, multi-language site, you should have proper mapping rules within CQ. Please see http://dev.day.com/content/kb/home/cq5/CQ5SystemAdministration/HowToMapDomains.html for that.

These mappings make sure that links are rewritten properly when you open a page under each domain.

Sample /etc/map:

map
  +-- http
    +-- any_geometrixx.de (nodetype: sling:Mapping)
        property: sling:internalRedirect: /content/geometrixx/de.html
        property: sling:match: .*.*geometrixx.de.(4503|80)+/$
      +-- libs (nodetype: sling:Mapping)
          property: sling:internalRedirect: /libs
      +-- etc (nodetype: sling:Mapping)
        +-- designs (nodetype: sling:Mapping)
            property: sling:internalRedirect: /etc/designs
    +-- any_geometrixx.en (nodetype: sling:Mapping)
        property: sling:internalRedirect: /content/geometrixx/en.html
        property: sling:match: .*.*geometrixx.en.(4503|80)+/$
      +-- libs (nodetype: sling:Mapping)
          property: sling:internalRedirect: /libs
      +-- etc (nodetype: sling:Mapping)
        +-- designs (nodetype: sling:Mapping)
            property: sling:internalRedirect: /etc/designs


At Dispatcher

First of all, you have to associate a dispatcher handler with each incoming domain. You can use name-based virtual hosts (NameVirtualHost) for this.

It is better to create a separate farm for each domain:

# Each farm configures a set of load balanced renders (i.e. remote servers)
/farms
  {
       $include farm*.any
  }

Suppose you have two farms,

farm_site1.any
farm_site2.any

Then farm_site1.any will have these settings:

/virtualhosts
  {
  "*site1*"
  }

/renders
  {
  $include "renders.any"
  }

/cache
  {
  /docroot "<Global Doc root>/site1"
  }

And farm_site2.any will have:

/virtualhosts
  {
  "*site2*"
  }

/renders
  {
  $include "renders.any"
  }

/cache
  {
  /docroot "<Global Doc root>/site2"
  }

You can use the same or different renders for each site.

Now, in your Apache virtual host settings, configure a different document root for each site:

NameVirtualHost *:80

<VirtualHost *:80>
    ServerName site1.com
    #This is to handle dev environments
    ServerAlias *.site1.com
    DocumentRoot <Global Doc root>/site1
    Include <Configurations specific to site1>
 
</VirtualHost>

<VirtualHost *:80>
    ServerName site2.com
    #This is to handle dev environments
    ServerAlias *.site2.com
    DocumentRoot <Global Doc root>/site2
    Include <Configurations specific to site2>
 
</VirtualHost>
DocumentRoot <Global Doc root>

Now, to handle dispatcher flush requests for a specific site path, you can have rules like:


SetEnvIfNoCase CQ-Path ^/content/site1 hostnameforfarm=site1.com
SetEnvIfNoCase CQ-Path ^/content/site2 hostnameforfarm=site2.com
RequestHeader set Host %{hostnameforfarm}e env=hostnameforfarm

The rules above set the host name seen by the dispatcher based on the CQ-Path header of the flush request. That means if something under site1 gets activated, the site2 cache will NOT be flushed.
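
To see what these three directives do to a flush request, here is a small plain-Java model of them. It is only a simulation for reasoning about the rules, not dispatcher or Apache code; the class FlushHostModel and the host name dispatcher.local used in the usage note are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical model of the SetEnvIfNoCase / RequestHeader rules; not Apache code.
final class FlushHostModel {

    // Each rule: { case-insensitive regex tested against CQ-Path, resulting host name }
    private static final String[][] RULES = {
        { "^/content/site1", "site1.com" },
        { "^/content/site2", "site2.com" },
    };

    /**
     * Returns the Host header the flush request ends up with.
     * If no rule matches, the original host is kept, because the
     * RequestHeader directive only fires when the variable is set.
     */
    static String hostFor(String cqPath, String originalHost) {
        String host = originalHost;
        if (cqPath == null) {
            return host;
        }
        for (String[] rule : RULES) {
            Pattern p = Pattern.compile(rule[0], Pattern.CASE_INSENSITIVE);
            // Like SetEnvIf, every rule is evaluated in order; a later match overrides an earlier one
            if (p.matcher(cqPath).find()) {
                host = rule[1];
            }
        }
        return host;
    }
}
```

For example, hostFor("/content/site1/en/home", "dispatcher.local") yields site1.com, so only the farm whose /virtualhosts matches site1 handles the flush. Note that ^/content/site1 is a prefix match and would also catch a path such as /content/site10/...; if your site names share a prefix, end the pattern with a slash.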

I know there could be many variations of this; I just gave a basic example. Let me know if you have more questions.

How to cache Error page in CQ

Use Case:

1) You want to serve your error page from dispatcher
2) Error pages are creating a lot of load on publish instance

Prerequisite:

  • http://sling.apache.org/site/errorhandling.html
  • http://dev.day.com/docs/en/cq/current/developing/customizing_error_handler_pages.html
  • http://dev.day.com/docs/en/cq/current/deploying/dispatcher/disp_config.html
Assumption: You are using Apache as the web server

Solution:

1) Create your custom error handler using the prerequisite docs
2) Set the status code in your error handler
For example, for 404, in /apps/sling/servlet/errorhandler/404.jsp use response.setStatus(404) and do not do any redirect. In fact, the file can contain just this one line (on publish).
3) Then in your dispatcher configuration set DispatcherPassError to 1

<IfModule disp_apache2.c>
       # All your existing configuration  
       DispatcherPassError 1
</IfModule>

4) Then configure your error document in httpd.conf (or your custom configuration file), something like:

<LocationMatch \.html$>
    ErrorDocument 404 /404.html
    #ErrorDocument <ANY OTHER ERROR CODE>  <PATH IN YOUR DOCROOT>
</LocationMatch>

The rule above means: for any .html request that gets a 404 response, show the /404.html page. All other extensions will get the plain 404 response passed through by the dispatcher. This is a good approach if you don't want to load your full 404 page (with all its CSS and JS) for all other extensions.

You can have customized error files as well, something like:

<LocationMatch "/india">
    ErrorDocument 404 /404india.html
    #ErrorDocument <ANY OTHER ERROR CODE>  <PATH IN YOUR DOCROOT>
</LocationMatch>









With the above configuration, if someone requests a page that leads to one of the configured error codes, the error page will be served by the dispatcher and not by the publish instance. This avoids unnecessary load on the publish instance.

Special Thanks to Andrew Khoury from Adobe for sharing this.

Sunday, March 3, 2013

How to Create Custom Authentication Handler in CQ

Use Case:
  • You want to use a custom authentication handler instead of the OOTB one for authentication.
  • Custom User Registration
Pre requisite:
Available Authentication Handler in CQ:



How to create your Own:


1) Create a custom class that implements Sling's AuthenticationHandler and override the available methods


2) Create a form (your custom login form), which will look something like this:

<% String action = currentPage.getPath() + "/j_mycustom_security_check"; %>

<form method="POST" action="<%= xssAPI.getValidHref(action) %>">
Enter User Name: <input name="j_username" type="text" />
Enter Password: <input name="j_password" type="password" />
<input type="submit" value="Click Here to login" />
</form>

You can also use an Ajax POST and check whether the response is 200 (which means a successful login).

3) Then, in the Apache Sling POST servlet configuration, make sure that you allow the parameters you are posting. In this case j_*


4) Add your custom authentication prefix to the Sling authenticator service


5) Once your bundle is deployed, you should see your additional authentication handler.
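
The core decision the handler from step 1 has to make for the form in step 2 can be sketched in plain Java. In a real bundle this check would live in the extractCredentials method of your AuthenticationHandler and return Sling's AuthenticationInfo; here a String array stands in for it so the sketch runs without Sling dependencies. The class name CustomLoginRequest is hypothetical.

```java
import java.util.Map;

// Hypothetical, dependency-free sketch of the check a custom handler would perform.
final class CustomLoginRequest {

    // Must match the suffix used in the form action
    static final String LOGIN_SUFFIX = "/j_mycustom_security_check";

    /**
     * Returns { user, password } when the request is a login POST carrying
     * both form fields; returns null so that other handlers get a chance otherwise.
     */
    static String[] extractCredentials(String method, String requestUri, Map<String, String> params) {
        if (!"POST".equalsIgnoreCase(method)) {
            return null;
        }
        if (requestUri == null || !requestUri.endsWith(LOGIN_SUFFIX)) {
            return null;
        }
        if (params == null) {
            return null;
        }
        String user = params.get("j_username");
        String password = params.get("j_password");
        if (user == null || user.isEmpty() || password == null) {
            return null;
        }
        return new String[] { user, password };
    }
}
```

Returning null for anything that is not a login POST keeps the handler from interfering with normal page requests; validating the credentials themselves is left to the login module.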




Integrate it with Custom Pluggable Login Module (AEM 6)

Step 1: Create a pluggable login module


Step 2: Plug it into your custom auth handler



Example: https://svn.apache.org/repos/asf/sling/trunk/bundles/auth/form/src/main/java/org/apache/sling/auth/form/impl/

Example of Open Source Extended Authentication Handler:



CQ OOTB Extended authentication Handler

  • Day CRX Sling - Token Authentication (com.day.crx.sling.crx-auth-token)
  • Adobe Granite SSO Authentication Handler (com.adobe.granite.auth.sso)
  • Day Communique 5 PIN Authentication Handler (com.day.cq.cq-pinauthhandler)

Creating a custom user registration process involves several pieces. You may or may not need the following:

1) Custom login module (http://www.wemblog.com/2012/06/how-to-add-custom-login-module-in-cq55.html) to sync users / groups into CQ from a third-party system
2) Custom authentication handler, as above
3) Reverse replication to sync users across instances (if user registration happens on publish)

Note: The above code is just pseudo code. Please test it; you might have to add your custom logic for this to work.

Let me know if you have any questions or comments.