Wednesday, March 13, 2013

How to cache Error page in CQ

Use Case:

1) You want to serve your error page from dispatcher
2) Error pages are creating a lot of load on publish instance

Prerequisite:

  • http://sling.apache.org/site/errorhandling.html
  • http://dev.day.com/docs/en/cq/current/developing/customizing_error_handler_pages.html
  • http://dev.day.com/docs/en/cq/current/deploying/dispatcher/disp_config.html
Assumption: You are using Apache as your web server.

Solution:

1) Create your custom error handler as described in the prerequisite docs.
2) Set the status code in your error handler.
For example, for 404, call response.setStatus(404) in /apps/sling/servlet/errorhandler/404.jsp and do not redirect. In fact, that one line is all the handler needs (on publish).
3) Then, in your dispatcher configuration, set DispatcherPassError to 1:

<IfModule disp_apache2.c>
       # All your existing configuration  
       DispatcherPassError 1
</IfModule>
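
If only some error codes should be handed to Apache, the Dispatcher documentation also describes a code-range form of this directive. The following is a minimal sketch, assuming your Dispatcher version supports ranges; the specific ranges are illustrative and not part of the original setup.

```apache
<IfModule disp_apache2.c>
    # Assumption: code-range syntax is available in your Dispatcher version.
    # Responses with status 400-411 and 413-417 are passed to Apache, where
    # ErrorDocument can handle them; other error responses (for example 412)
    # are sent to the client as rendered by publish.
    DispatcherPassError 400-411,413-417
</IfModule>
```

With a plain value of 1, every response with a status of 400 or above is passed to Apache, which is the simpler choice when all error pages should come from the docroot.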

4) Then configure your error document in httpd.conf (or your custom configuration file), something like:

<LocationMatch \.html$>
    ErrorDocument 404 /404.html
    #ErrorDocument <ANY OTHER ERROR CODE>  <PATH IN YOUR DOCROOT>
</LocationMatch>

The rule above means that any .html request that results in a 404 response is answered with the /404.html page. All other extensions get the dispatcher's plain 404 response. This is a good approach if you don't want to load your full 404 page (with all its CSS and JS) for every other extension.

You can also have a customized error page per path, something like:

<LocationMatch "/india">
    ErrorDocument 404 /404india.html
    #ErrorDocument <ANY OTHER ERROR CODE>  <PATH IN YOUR DOCROOT>
</LocationMatch>
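
When both the generic and the path-specific rule are present, their order matters. Here is a sketch of how they might sit together, assuming Apache merges matching <LocationMatch> sections in the order they appear in the configuration, so that a later section overrides an earlier one. The anchored pattern ^/india/ is an assumption, added so that paths that merely contain "/india" somewhere else do not match.

```apache
# Generic rule first: any .html request that ends in a 404 gets /404.html.
<LocationMatch "\.html$">
    ErrorDocument 404 /404.html
</LocationMatch>

# Site-specific rule second: for .html requests under /india/ this
# ErrorDocument replaces the generic one, because it is merged later.
<LocationMatch "^/india/.*\.html$">
    ErrorDocument 404 /404india.html
</LocationMatch>
```

If the two sections were swapped, the generic rule would be merged last and /404.html would be shown for /india pages as well.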

With the above configuration, if someone requests a page that leads to any of the configured error codes, the error page is served by the CQ dispatcher and not by the publish instance. This avoids unnecessary load on the publish instance.

Special Thanks to Andrew Khoury from Adobe for sharing this.

13 comments:

  1. It sounds like a solid approach. Could you translate the steps into IIS? We use IIS, not httpd.

    Replies
    1. Hello,

      Thanks for your feedback. I have not tested this configuration in IIS, but you can look for the IIS counterparts of LocationMatch and ErrorDocument to set this up.

      Yogesh

    2. This comment has been removed by the author.

    3. Which IIS do you use?

      On IIS 7.5 you can add the following config.
      Prerequisite: the 404.jsp should return a 404.

      <?xml version="1.0" encoding="UTF-8"?>
      <configuration>
      <system.webServer>
      ...
      <httpErrors>
      <remove statusCode="403" subStatusCode="-1" />
      <remove statusCode="404" subStatusCode="-1" />
      <remove statusCode="500" subStatusCode="-1" />
      <error statusCode="403" path="<path to error site>" responseMode="ExecuteURL" />
      <error statusCode="404" path="<path to error site>" responseMode="ExecuteURL" />
      <error statusCode="500" path="<path to error site>" responseMode="ExecuteURL" />
      </httpErrors>
      </system.webServer>
      </configuration>

      <path to error site> --> e.g. /content/mysite/system/error.html

      The error page will be cached like a normal web page.

  2. Great to see your reply within a day. We currently use jsp:include to pull in the site 404. The publish instance takes a load when we run our security scan. On the positive side, we are able to retain the browser URL, which gives users the opportunity to correct his/her URL and re-request.
    Do you see any issues with multiple LocationMatch blocks, each mapped to one site? We have seven sites. Each site has its own 404 page set up to reflect a different I18N translation and link to its own site map. As I see it, the only way to get the dispatcher approach to work is for me to set up one block for each site.

    Replies
    1. If you are using the I18n approach then yes, you have to create a different localized 404 page for each site, or else you can use selectors. With the dispatcher, the URL does not change. So, for example, if you want to show the es 404, have something like 404.es.html and use i18n accordingly. You will end up with more dispatcher rules, though.

      Yogesh

  3. In Step 3, you mention setting the DispatcherPassError to 1. What is the purpose of that step?

    Replies
    1. If DispatcherPassError is set, all error response codes are handled by the web server (Apache, IIS, or whatever you are using) and not by the CQ servlet/Sling engine.

  4. Hi,

    I want to set the response code to 404 for the error.html page,
    so that this page does not show up in search results.

    Any inputs?

    Replies
    1. You can use response.setStatus(404) for your 404 error handler.

      Yogesh

  5. Sorry, my previous comment was NOT relevant to a reply of another comment. So I am re-posting correctly.

    Using your configuration seems to prevent the Replication Agent on Publish from invalidating the Dispatcher cache. When we set DispatcherPassError to 0, the replication flush agent started working.

    Any Ideas?

    Replies
    1. Hello,

      The replication flush agent should work even with the above configuration. Can you please check your dispatcher log with the log level set to 3 and make sure there is no other reason? Also make sure that you set your error handler within a LocationMatch. In the above example it matches anything with the extension .html.

      Yogesh

  6. Hi, I am getting an error: The requested URL /content/aemproject/parent1.html was not found on this server.
    Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

    I have /apps/sling/servlet/errorhandler/404.jsp containing only <% response.setStatus(404); %>, and I have ErrorDocument 404 /404.html inside the apache2.conf file. Do I need to place 404.html inside my cache folder?
