Tuesday, November 8, 2011

How to fix the "File not found" issue in CRX

Caution: This works most of the time, but it is not guaranteed.

Assumption: You are using CRX and the latest hotfix (HF) is not installed. If it is installed, you need to figure out why you hit this issue in the first place.

Symptom: You get the following error in the log

Caused by: org.apache.jackrabbit.core.state.ItemStateException: Failed to read bundle: <SOME -- UUID> : java.io.IOException: File not found: 263
at com.day.crx.persistence.tar.TarPersistenceManager.getInputStream(TarPersistenceManager.java:1195)
at com.day.crx.persistence.tar.TarPersistenceManager.loadBundle(TarPersistenceManager.java:334)
at com.day.crx.persistence.tar.TarPersistenceManager.loadBundle(TarPersistenceManager.java:311)
at org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.getBundle(AbstractBundlePersistenceManager.java:654)
at org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.load(AbstractBundlePersistenceManager.java:400)
at org.apache.jackrabbit.core.state.SharedItemStateManager.loadItemState(SharedItemStateManager.java:1819)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getNonVirtualItemState(SharedItemStateManager.java:1739)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:261)
at org.apache.jackrabbit.core.state.LocalItemStateManager.getNodeState(LocalItemStateManager.java:107)
at org.apache.jackrabbit.core.state.LocalItemStateManager.getItemState(LocalItemStateManager.java:172)
at org.apache.jackrabbit.core.state.XAItemStateManager.getItemState(XAItemStateManager.java:260)
at org.apache.jackrabbit.core.state.SessionItemStateManager.getItemState(SessionItemStateManager.java:161)
at org.apache.jackrabbit.core.ItemManager.getItemData(ItemManager.java:370)
... 34 more
Caused by: java.io.IOException: File not found: 263
at com.day.crx.persistence.tar.TarSet.getInputStream(TarSet.java:731)
at com.day.crx.persistence.tar.TarSet.getInputStream(TarSet.java:724)
at com.day.crx.persistence.tar.ClusterTarSet.getInputStream(ClusterTarSet.java:502)
at com.day.crx.persistence.tar.TarPersistenceManager.getInputStream(TarPersistenceManager.java:1191)
... 46 more



Solution

1) Stop the instance
2) Make sure that it has stopped (you can run ps -ef | grep java to check)
3) Back up and then delete all index_*.tar files found in /crx-quickstart/repository/workspaces/crx.default/copy (if it is a CQ5.3 or upgraded instance)
4) Grep the error.log to find the missing tar file numbers:
grep "^java.io.IOException: File not found:" error.log* | awk '{ print $5 }' | sort -u
5) Recover the missing data tar files ("Data*.tar") from your nightly backups and place them in repository/workspaces/crx.default/
6) Make the tar files writable again, because the system sets all data tar files to read-only after a file goes missing: chmod 755 *.tar
7) Start CRX / CQ with the following system property (in the startup script):
java -Dcom.day.crx.persistence.tar.IndexMergeDelay=0
8) Make sure that you have the latest CRX hotfix installed
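
Steps 3 and 4 above can be sketched as two small shell helpers. This is only a sketch under assumptions: the function names missing_tar_numbers and backup_index_tars are hypothetical, and the backup directory is whatever you choose. The first helper also matches the "Caused by: java.io.IOException: File not found: N" form shown in the stack trace, which a pattern anchored at the start of the line would skip. The second moves the index files aside rather than deleting them, so they can be restored if needed.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of CRX.

# Step 4: print the unique tar file numbers reported as missing.
# Arguments: one or more log files.
# Matches "File not found: N" whether the line starts with
# "java.io.IOException:" or with "Caused by:".
missing_tar_numbers() {
    grep -h 'java\.io\.IOException: File not found: [0-9][0-9]*' "$@" \
        | sed 's/.*File not found: \([0-9][0-9]*\).*/\1/' \
        | sort -un
}

# Step 3: move index_*.tar files into a backup directory instead of deleting.
# $1 = directory holding the index files
# $2 = backup directory (created if it does not exist)
backup_index_tars() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/index_*.tar; do
        [ -e "$f" ] || continue   # the glob matched nothing
        mv -- "$f" "$dest"/ || return 1
    done
}

# Example calls (run only with the instance stopped; paths are assumptions):
#   missing_tar_numbers error.log*
#   backup_index_tars /crx-quickstart/repository/workspaces/crx.default/copy /tmp/index-backup
```

Moving the files keeps step 3 reversible, which matters given that the procedure is not guaranteed to work.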


If the above issue happens in a clustered environment, you can recover the instance from a working cluster instance. See How to recover from Out of sync node in cluster

Please do not attempt this directly on a production instance without help from Adobe Support.

15 comments:

  1. What if there is no backup of the Data*.tar file?

  2. Then you have a problem. If these data tar files are missing from index*.tar or from tarJournal, then deleting that file or folder should fix the issue. If they are missing on an upgraded CQ5.4 or CQ5.3 instance, you might have to check whether they exist under the /shared folder. But if they are missing from both the version folder and the workspace folder, there is no way to recover them.

    PS: This should not happen if you have the latest CRX hotfix installed. If it still does, it is a critical bug.

  3. This comment has been removed by the author.

  4. Hi Yogesh,
    I had the same problem on a QA environment. Since I didn't have a nightly backup to restore the tar file from, I did the following: using tar in listing mode and grep, I found the tar file containing the missing bundle. Then I copied it under the name of the missing file and restarted CQ. At least this error doesn't appear in the logs anymore.
    Did I do something meaningless, or is this a possible way to resolve it? Could you please comment?
    Thanks,
    Max.
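
The search described in this comment (tar in listing mode plus grep) could look roughly like the sketch below. It assumes that the data tar files are ordinary tar archives and that the bundle shows up as an entry whose name contains the UUID from the error message; the function name find_tar_with_bundle is hypothetical.

```shell
#!/bin/sh
# Sketch only: print every data_*.tar in a directory whose listing mentions
# the given bundle UUID. Assumes entry names contain the UUID.
# $1 = bundle UUID taken from the "Failed to read bundle" log line
# $2 = directory holding the data tar files
find_tar_with_bundle() {
    uuid=$1
    dir=$2
    for f in "$dir"/data_*.tar; do
        [ -f "$f" ] || continue   # the glob matched nothing
        # tar -t lists entry names without extracting anything
        if tar -tf "$f" 2>/dev/null | grep -qF -- "$uuid"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example call (UUID and path are placeholders):
#   find_tar_with_bundle "<SOME -- UUID>" /crx-quickstart/repository/workspaces/crx.default
```

This only locates a file that contains the bundle; it does not bring back the data that was in the missing tar file.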

    Replies
    1. Максим,

      That will resolve the error in the log, but your data is still missing. If that data points to something not useful, such as a completed workflow instance or an audit record, then functionality-wise you shouldn't have any problem. But if it points to a useful record, then in a nutshell that record no longer exists. You might want to recreate the index after doing this.

    2. Got it, thanks a lot for a quick reply!

    3. Максим,

      Did you look through all the tar files under /repository/workspace/crx.default? Let's assume you found a data_000xxx.tar file containing the bundle UUID. Did you just copy that data_000xxx.tar file, rename the copy to data_missing_file_name.tar, and restart CQ?

  5. One more thing I forgot to mention: in the worst case you can create a new instance and then use vlt rcp (http://www.wemblog.com/2011/09/how-to-use-vlt-tool-to-copy-data-from.html) to migrate data from this instance to the fresh instance. This will remove the data-not-found error, but unfortunately the data is still missing, and you cannot do anything about it.

  6. Which hotfix needs to be installed? We are facing this issue on CQ5.4 in our Dev environment. We never saw this issue on our QA environment, but we did get hotfixes on QA for some other issues. Will those hotfixes fix this as well?

    Replies
    1. Hello,

      You need the latest CRX hotfix for CRX2.2. Please open a daycare ticket to get one.

      However, the hotfix will not resolve your current issue; it will help you avoid this issue in the future. To resolve the issue, you might have to first install the hotfix and then follow the process I outlined above. If you are not sure about what you are doing, please open a daycare ticket and someone will help you.

  7. Hi Yogesh,

    It's a bit off topic, but related to the original post.
    I recently added this option to my server's startup script

    -Dorg.apache.jackrabbit.core.state.validatehierarchy=true

    and my error log is getting filled up with this message now:

    *INFO * SharedItemStateManager: Validating change-set hierarchy (SharedItemStateManager.java, line 732)

    Is there a way to keep that from being logged?

    Running CQ5.3 with CRX 2.2.0.70 on Windows 2008.
    Let me know if you require any additional info.

    Thanks in advance.

    Replies
    1. Leigh,

      Yes, that is expected. The above configuration makes sure that all nodes in the hierarchy exist before a save is made. This is to avoid the orphan node issue. Note that it might slow down repository write operations, especially if a node has a lot of child nodes. At the same time, it will help avoid corruption in your repository. You might need to log a daycare ticket to have this changed to a debug message. You can also move this log message to a different log using the log manager config. Here is an example: http://www.wemblog.com/2011/09/how-to-set-up-debug-mode-for.html. Note that once you use another log file for these messages, they will no longer appear in error.log.

      Yogesh

    2. Thanks for the quick reply Yogesh.

      -Leigh

  8. Hi Yogesh ,

    I am getting the Apache issue, and because of this some of the pages are not opening. I tried removing the CRX folder and starting fresh, but it shows the same error. I also removed the cache from the system. What could be the possible reason for it?

    Thanks
    Nupur

    Replies
    1. Hello Nupur,

      Are you not able to access your system through Apache? Or are you not able to start CQ because of some error? In the latter case, what error are you getting?

      Yogesh
