Wednesday, December 10, 2014

How to Set Up Clustering In CQ/AEM 6 using MongoDB

Background:


With CQ / AEM 6 TarPM is not supported any more. AEM 6 ships with Oak which for now support TarMK and MongoMK Microkernal OOTB. More information about what is New Can be found from http://www.slideshare.net/AEMHub2014/oak-michael-marth . With this change Support from Clustering is moved to actual storage layer it self (Which make more sense, given supporting all issues for clustering in earlier version). TarMK does not have replication or sharding feature so it comes down to MongoDB which support replication and sharding and hence enable High Availability (HA through replication) and Scalability (Through Sharding, Though this is still a question ?? See note below) through clustering in CQ /AEM 6.

Here we will give step by step instruction of how to set up clustering using MongoDB in CQ

Pre requisite:


There are two cases for setting Up Replica Set:

Set up a new MongoDB Instance:

  • Set up additional MongoDB instance based on instruction above
  • Start any one of instance using ./mongod --port <Your Port> --dbpath <Your DB Path> --replSet <Replica Set Name could be any thing> &  
  • You can also use configuration file to do that. More instruction here http://docs.mongodb.org/manual/tutorial/deploy-replica-set/
  • Once Mongo DB is started you can add additional replica using following instruction 
  • Once Replica set is up, Now set Up AEM
  • Then You can go to each Mongo Instance and check of data is coming using Mongo Log
Convert Existing Mongo Instance:

  • Stop you AEM instance
  • Use Following instruction to convert Mongo to replica
  • Once this is set Change AEM start script to add mongo replica instance as given in approach one 
  • start your AEM instance
  • AEM should be part of replica set now

Backup and Restore

Please check https://docs.mongodb.org/v3.0/tutorial/backup-and-restore-tools/ for MongoDB instruction of backup and restore. 

Automated script can be found here: https://github.com/micahwedemeyer/automongobackup/blob/master/src/automongobackup.sh just put this script under /etc/cron.daily and you are set for backup.

Some Common Questions

Should I set up my AEM author instance on MongoDB

Unless you have clustering requirement, I would not suggest to set up your author instance with MongoDB. Mainly because of administrative overhead.

Should I set up my AEM publish instance on MongoDB

Same as above, Unless you have a requirement which requires shared content generation I would suggest not to use MongoDB. With AEM communities, now you have an option to add Mongo Persistence for community feature at any time. Here is more detail https://docs.adobe.com/docs/en/aem/6-1/administer/communities/srp/msrp.html and https://docs.adobe.com/docs/en/aem/6-1/administer/communities/srp/msrp/demo-mongo.html

Should I store Blobs in MongoDB as well in AEM

It is not recommended to store Blob data with MongoDB. There are other options like, Local Storage, NAS, AWS you can use in that case. More detail https://docs.adobe.com/content/docs/en/aem/6-1/deploy/platform/aem-with-mongodb.html#AEM Configuration and https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/data-store-config.html

How can I secure my MongoDB deployment with AEM

Notes:


1) Mongo Replication Only Provide High Availability (HA) it does not provide scalability. For scalability you need to use Sharding feature provided by Mongo. However I am not sure what would be best key to create shard on for Mongo. You can create Shard based on _id attribute. More information about sharding can be obtained here http://docs.mongodb.org/manual/sharding/  . If you are using Sharding I would suggest to use sharding with replication (Shard and then replicate shard instance) to provide both HA and scalability.

2)  There are many feature available in Mongo Replication where you can make certain replica instance read only (Data Center replica), you can use this to avoid high latency across Data Center here is all configuration you can do on Mongo http://docs.mongodb.org/manual/administration/replica-set-member-configuration/

3) MongoDB recently released MMS https://mms.mongodb.com/ to monitor and deploy Mongo Cluster easily. This will be useful if you are worried about administrative cost for Mongo 

4) If you don't want to store large documents in Mongo feel free to use custom Data Store using instruction here http://jackrabbit.apache.org/oak/docs/osgi_config.html

5) Mongo Recently launched another feature of pluggable datastore. You can use this for faster read and write based on your requirement (For example Primary with high Write Enabled Storage Like SSD or something and read with cheap storage). More info here https://www.mongosoup.de/blog-entry/A-closer-look-at-pluggable-storage.html (Official Doc yet to come)

6) Official AEM Documentation: https://docs.adobe.com/content/docs/en/aem/6-1/deploy/platform/aem-with-mongodb.html


Finally .... Some more Mongo Command ...



Special Thanks To Nelson Mei for Setting up POC for Mongo with AEM