Friday, September 16, 2011

How to use the vlt tool to copy data from one CRX to another

Use Case: Sometimes you want to migrate a repository from one instance to another (for example, because of non-repairable corruption in the current repository). If your repository is large, Package Manager is not a viable option, as you end up creating a lot of packages.

Assumption: You have basic knowledge of the CQ structure and CRX, and you are using CRX 2.2 or higher. The copy path could differ if you are using an older version of CRX.

Here are different options:

1) Use Package Manager
    • Pros: Very simple to create packages; does not require command-line knowledge.
    • Cons: Not suitable for big packages. Sometimes slow.
2) Use VLT (this article explains how)
    • Pros: Faster than Package Manager. Does not consume package space the way Package Manager does.
    • Cons: Slow I/O.
3) Use a tool based on VLT, such as Recap http://adamcin.net/net.adamcin.recap/
    • Pros: Provides a UI on top of vlt rcp, so it is easy to see status and administer. Does not require command-line knowledge.
    • Cons: Slow I/O, because it uses VLT under the hood.
4) Use a tool based on another transfer protocol, such as Grabbit https://github.com/TWCable/grabbit
    • Pros: Faster than VLT.
    • Cons: Not supported by Adobe. Needs more dependencies to set up.

Step 1: Make sure that vlt is set up properly. More detail can be found here. Once done, you can run the vlt --version command to check that vlt is installed.



Step 2: Use the following command to migrate data between CRX instances

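vlt rcp -r http://<login>:<password>@<source-host>:<port>/crx/-/jcr:root/<Source-Path> http://<login>:<password>@<destination-host>:<port>/crx/-/jcr:root/<Destination-Path>

<Destination-Path> is the node to create or update on the target, normally the same as <Source-Path>.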
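As a minimal sketch, the command can be assembled from shell variables so that the same host and credential values are reused for both URLs. The host names, port, credentials and path below are hypothetical placeholders, and the function names are made up for this illustration. The script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: build a `vlt rcp` command from variables (all values are hypothetical).
SRC_HOST='source.example.com:4502'
DST_HOST='target.example.com:4502'
CQ_USER='admin'
CQ_PASS='admin'           # single quotes keep the shell from expanding characters such as $
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command only, 0 = actually run vlt

# Print the repository URL for a host and a JCR path
# (uses the /crx/-/jcr:root layout described in this article).
repo_url() {
    printf 'http://%s:%s@%s/crx/-/jcr:root%s\n' "$CQ_USER" "$CQ_PASS" "$1" "$2"
}

# Copy one path recursively from the source to the same path on the destination.
copy_path() {
    src=$(repo_url "$SRC_HOST" "$1")
    dst=$(repo_url "$DST_HOST" "$1")
    if [ "$DRY_RUN" = "1" ]; then
        echo "vlt rcp -r $src $dst"
    else
        vlt rcp -r "$src" "$dst"
    fi
}

copy_path /content/geometrixx
```

Run it once with the default DRY_RUN=1 to review the generated command, then with DRY_RUN=0 to perform the copy.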

Note
1) While you are migrating data from one CQ instance to another, make sure that you stop the launchpad application from <host>:<port>/admin (this ensures that unnecessary workflows are not triggered during migration). If this is not possible, disable the DAM-related workflows by going to http://<host>:<port>/libs/cq/workflow/content/console.html and then to the launcher tab. Don't forget to re-enable them after the copy, and make sure that no DAM activities are going on during that time on the target instance.


2) Note that copying users and groups does not mean that all ACLs are also copied. ACLs are stored under the /content node. In general you need to migrate the following (plus anything custom you have):
-- /content/<Your-site>
-- /apps/<Your-application>
-- /var/dam/<your-asset>
-- /content/dam/<your-asset>
-- /etc/designs/<your-design>
-- /etc/tags/<your-tags>
-- /etc/workflow/<your-custom-stuff>
-- /etc/replication/<your-custom-replication>

Important Note: At any point in time (even after migration), you have to make sure that /content/dam and /var/dam are in sync. After migration, go to http://<host>:<port>/etc/dam/healthchecker.html and verify this by clicking "check binaries" (lists entries missing in /var/dam) and "check Asset" (lists entries missing in /content/dam). You might also run a small test before the big migration to make sure that all renditions are migrated correctly.

If you want to process all assets again, just migrate /var/dam in batches (as asset synchronization using workflows can be expensive) and enable launchpad or all workflows. To migrate assets in batches, you can use sleep between two copy commands. Make sure that you monitor the logs of the target system for any OOM errors or other issues.

You can put the above in a script that covers all the paths, so that you can reuse it in the future.
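One possible shape for such a script is sketched below. It copies a list of paths one after another and sleeps between two copies. The hosts, credentials, path names and pause length are hypothetical placeholders, and the script assumes that vlt exits with a non-zero status when a copy fails. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: copy several paths one after another with a pause between runs.
# Hosts, credentials, paths and the pause length are hypothetical placeholders.
SRC='http://admin:admin@source.example.com:4502/crx/-/jcr:root'
DST='http://admin:admin@target.example.com:4502/crx/-/jcr:root'
PAUSE_SECONDS="${PAUSE_SECONDS:-60}"  # pause between two copies
DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands only, 0 = run vlt

# One JCR path per line (paths must not contain spaces);
# adjust to your own site, application and assets.
PATHS='/content/mysite
/apps/myapp
/var/dam/mysite
/content/dam/mysite
/etc/tags/mysite'

migrate_all() {
    failed=0
    for p in $PATHS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "vlt rcp -r $SRC$p $DST$p"
        elif vlt rcp -r "$SRC$p" "$DST$p"; then
            sleep "$PAUSE_SECONDS"   # give the target instance time to settle
        else
            # Assumption: vlt signals failure through its exit status.
            echo "copy failed for $p" >&2
            failed=1
        fi
    done
    return "$failed"
}

migrate_all
```

Keeping the path list in one variable makes it easy to rerun the same migration later or to split /var/dam into smaller batches by listing its subfolders instead.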

If you do not have a login, you can download Recap from here.

19 comments:

  1. I was looking forward to trying this out since it seems like it would be a huge time saver, but when I tried it on our CRX (2.2.0.54) I got the following error:


    Copy http://admin:password@servername:4503/crx/server/-/jcr:root/content/forms to http://admin:password@servername2:4503/crx/server/-/jcr:root (recursive)
    Connecting via JCR remoting to http://admin:password@servername:4503/crx/server
    [ERROR] Error while retrieving src repository http://admin:password@servername:4503/crx/server/-/jcr:root/content/forms: javax.jcr.UnsupportedRepositoryOperationException: Missing implementation

    Replies
    1. @Dave,

      I have not tested it, but I guess you are missing the destination path crx/server/-/jcr:root/PATH. If that does not work, let me know and I will test. Maybe the source location has changed, that is, it no longer includes /crx.

    2. This is an example of how I'm calling the command:

      vlt rcp -r http://admin:pa\$\$word@localhost:7503/crx/-/jcr:root/content/geometrixx http://admin:pa\$\$word@localhost:4503/crx/-/jcr:root/

      and here is an example of the error:


      Copy http://admin:pa$$word@localhost:7503/crx/server/-/jcr:root/content/geometrixx to http://admin:pa$$word@localhost:4503/crx/server/-/jcr:root (recursive)
      Connecting via JCR remoting to http://admin:pa$$word@localhost:7503/crx/server
      [ERROR] Error while retrieving src repository http://admin:pa$$word@localhost:7503/crx/server/-/jcr:root/content/geometrixx: javax.jcr.UnsupportedRepositoryOperationException: Missing implementation

      Now I'm wondering if it has something to do with the dollar sign in the password.

    3. I'm fairly certain this was not working due to WebDAV being disabled on my publisher instance. When I try this locally with a fresh install it is working fine.

  3. Hey Yogesh

    I'm doing a VLT copy between a 5.4 and a 5.5 instance and am copying a big folder (/content/sitename) in one shot.
    After a while the whole process gets bottlenecked because of these errors:

    [ERROR] Error during intermediate save (3426); try again later: javax.jcr.ReferentialIntegrityException: Target node 6a2607ee-a2c8-4df4-9f27-a6bc8eb6680d of REFERENCE property does not exist

    Getting it on each save and there's at least 10 second delay. Do you know if there's a way to turn off some referential integrity checking when saving nodes?

    Thanks
    Boris

    Replies
    1. Boris,

      I am not sure if there is any way to disable it. I would try
      1) Using the latest VLT version
      2) Using Package Manager instead and then deleting those packages after import.

      Also open a DayCare ticket to see if it is a known issue.

      Yogesh

  4. Hi Yogesh,

    Do you know what might cause this error when using VLT? (replaced source server name)

    [ERROR] Error while logging in src repository http://admin:admin@SOURCE:4502/crx.default/jcr:root/content/stuff: javax.jcr.RepositoryException: OK

    I tried using the /crx/-/jcr:root/content/stuff , but that fails with an XML exception.

    Source server is running CQ5.3 with CRX2.2.70 and my Destination is OOTB 5.6.1. I'm also trying to run VLT from a 3rd machine that will just pass from Source to Destination.

    If you need more info, let me know. I can always log a DayCare ticket and copy you in on it if required.

    Thanks in advance.

  5. Hello Leigh,

    Note that in CQ5.3 the repository path might require /crx/server/-/jcr:root, but in CQ5.6 it would be just /crx/-/jcr:root.

    You might also want to see https://www.adobeaemcloud.com/content/marketplace/marketplaceProxy.html?packagePath=/content/companies/public/acquity/packages/contest/recap, which is a utility to copy content across repositories.

    Yogesh

    Replies
    1. Thanks again for the reply, I'm going to take a look at Recap.

  6. vlt rcp -r http://admin:admin@localhost:4516/crx/-/jcr:root/content -r http://admin:admin@localhost:4502/crx/-/jcr:root/content_copy

    This is the correct command

  7. While migrating users, I am getting the following error. Does anyone know about it?

    [ERROR] Error during copy: javax.jcr.nodetype.ConstraintViolationException: /home/users/t/testuser: mandatory property {internal}password does not exist

    Command used:

    C:\>vlt rcp -r http://admin:admin@localhost:4516/crx/-/jcr:root/home/users/t -r http://admin:admin@localhost:4502/crx/-/jcr:root/home/users/t

    Replies
    1. Hello,

      From which version to which version are you trying to copy? Can you please go to the node type admin of the destination instance and check whether the rep:password field of the rep:User node type is marked as mandatory?

      Yogesh

  8. This is from 5.5 to 5.5 only. On the source side, new users have been created that are not present at the destination, and hence I wanted to move them. Do I need to check the property at the /home/users level of the destination?

  9. Yogesh, any update for me? I have created one user under geometrixx and tried to move it to a different repository using VLT. I am on 5.5 and I am getting the following exception. I have checked many times for a way to set an internal password but I couldn't find one. Your help is appreciated!

    Command:

    C:\>vlt rcp -r http://admin:admin@localhost:4508/crx/-/jcr:root/home/users/geometrixx -r http://admin:admin@localhost:4502/crx/-/jcr:root/home/users/geometrixx

    Exception

    000143 - /home/users/geometrixx/zachary.w.mitchell@spambob.com/profile
    000144 A /home/users/geometrixx/testser
    [WARN ] Error while adding node /home/users/geometrixx/testser/rep:policy (ignored): javax.jcr.PathNotFoundException: rep:policy
    000145 A /home/users/geometrixx/testser/profile
    000146 A /home/users/geometrixx/testser/preferences
    Saving 120 nodes...
    [ERROR] Error during copy: javax.jcr.nodetype.ConstraintViolationException: /home/users/geometrixx/testser: mandatory property {internal}password does not exist

    Replies
    1. Hello,
      I am not sure why CQ thinks that the password is mandatory. One solution to this problem would be to create a dummy password, using the JCR API, for all users that do not have one, and then make the transfer using vlt. Hope it makes sense. You can use the following API to do that: http://jackrabbit.apache.org/api/2.0/org/apache/jackrabbit/api/security/user/UserManager.html

      Yogesh

  10. Can you help me export with vlt over https?

    My CRX URL uses https ("https://cex-cq.abc.com"). Can you please help me?

  11. I have found that VLT hangs for a very long time if the tree is large. After it starts, it's all good, which is kind of weird. I wrote a workaround that basically splits the paths up, but any ideas on this?

    Replies
    1. Thanks for your comment, Eric. I agree that VLT is slow for large data transfers. There are other tools, like Grabbit https://github.com/TWCable/grabbit, that you can use as well.
