JBoss-Tomcat-Liferay Portal Clustering - What and How

Introduction #

Clustering allows us to run portal instances on several parallel servers (called cluster nodes). The load is distributed across the servers, and even if one of them fails, the portal remains accessible via the other cluster nodes. Clustering is crucial for a scalable enterprise portal, as you can improve performance simply by adding more nodes to the cluster.

For larger installations, you will likely need a clustered configuration in order to handle the traffic of a popular website. A cluster distributes the incoming traffic across several machines, allowing the website to handle more traffic, and to handle it faster, than would be possible with a single machine. The portal works well in a clustered environment.

A cluster is a set of nodes. Suppose there are two nodes, Node1 and Node2; we are going to use the Apache HTTP Server as the front end and MySQL as the database, as shown in the following diagram.

Abstracted from the book: Liferay Portal 6 Enterprise Intranets

Before starting, you need to set four environment variables: $JAVA_HOME, $TOMCAT_AS_DIR, $JBOSS_AS_DIR and $APACHE_HTTPD_DIR. $JAVA_HOME should point to the JDK installation directory, $TOMCAT_AS_DIR to the Tomcat installation directory, $JBOSS_AS_DIR to the JBoss installation directory, and $APACHE_HTTPD_DIR to the Apache HTTPD installation directory.

In other words, the Apache HTTP Server is assumed to be installed in the $APACHE_HTTPD_DIR directory.
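For example, on a Linux machine these variables could be set as follows. The paths below are placeholders for illustration only; use your actual installation directories.

# Example only - adjust the paths to your own installation
export JAVA_HOME=/usr/java/jdk1.6.0
export TOMCAT_AS_DIR=/opt/liferay/tomcat-6.0
export JBOSS_AS_DIR=/opt/liferay/jboss-5.1.0.GA
export APACHE_HTTPD_DIR=/opt/httpd-2.2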

The Apache HTTP Server machine has the following settings:

  • IP: 192.168.2.170
  • Apache HTTP Server 2.2 or above
  • Apache Tomcat Connector mod_jk 1.2 or above
  • Apache JServ Protocol (AJP) 1.3 or above
  • JDK 1.6 or above

Node1 has the following settings:

  • IP: 192.168.2.171
  • Portal bundled with Tomcat 6.x or JBoss 5.x.GA (or 4.2.3.GA)

Node2 has the following settings:

  • IP: 192.168.2.172
  • Portal bundled with Tomcat 6.x or JBoss 5.x.GA (or 4.2.3.GA)

The database server has the following settings:

  • IP: 192.168.2.173
  • MySQL 5.0 or above

HTTP Services #

HTTP session replication is used to replicate the state associated with the portal onto other nodes of a cluster, so that if one node crashes, another node in the cluster can take over. There are two ways of handling portal sessions in a Tomcat or JBoss cluster; here we use sticky sessions as the example.

  • Sticky Session: a user's requests always go to the same portal instance.
  • Session Replication: a user's requests can go to any Tomcat in the cluster, and the session is copied across the entire cluster.

Configure mod_jk #

First of all, we need to install mod_jk. mod_jk is the connector used to connect the Tomcat servlet/JSP container with web servers such as Apache. Simply download the latest version from http://apache.tradebit.com/pub/tomcat/tomcat-connectors/jk/binaries. Depending on the operating system and hardware of the Apache HTTP Server machine, choose the appropriate OS build and either the 32-bit or 64-bit binary. (The terms 32-bit and 64-bit refer to the way the machine's processor handles information.) After downloading, rename the binary to mod_jk.so and put it in the $APACHE_HTTPD_DIR/modules directory. With mod_jk installed, we must now configure Apache to load the module by editing $APACHE_HTTPD_DIR/conf/httpd.conf. This is a simple two-line step: add the following lines at the end of $APACHE_HTTPD_DIR/conf/httpd.conf.

#Load the mod_jk connector
LoadModule jk_module modules/mod_jk.so

Then we need to configure the worker properties. A Tomcat worker is a Tomcat instance that is waiting to execute servlets or any other content on behalf of some web server. For example, we can have a web server such as Apache forwarding servlet requests to a Tomcat process (the worker) running behind it. To do so, create a file named workers.properties in $APACHE_HTTPD_DIR/conf with the following settings.

# Define list of workers that will be used
# for mapping requests
worker.list=loadbalancer,status
# Define Node1
# modify the host as your host IP or DNS name.
worker.node1.port=8009
worker.node1.host=192.168.2.171
worker.node1.type=ajp13
worker.node1.lbfactor=1
worker.node1.socket_timeout=60
worker.node1.connection_pool_timeout=60
worker.node1.ping_mode=A
worker.node1.ping_timeout=20000
worker.node1.connect_timeout=20000

# Define Node2
# modify the host as your host IP or DNS name.
worker.node2.port=8009
worker.node2.host=192.168.2.172
worker.node2.type=ajp13
worker.node2.lbfactor=1
worker.node2.socket_timeout=60
worker.node2.connection_pool_timeout=60
worker.node2.ping_mode=A
worker.node2.ping_timeout=20000
worker.node2.connect_timeout=20000

# Load-balancing behaviour
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=node1,node2
worker.loadbalancer.sticky_session=1
# Status worker for managing load balancer
worker.status.type=status

As shown in the above code, mod_jk uses a file named workers.properties to define where Apache looks for the Tomcat instances. worker.list is a comma-separated list of worker names. Each worker needs to define the port on which its AJP connector is listening, e.g., 8009 for both Node1 and Node2.
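The worker port must match the AJP connector configured in each node's server.xml. A default Tomcat 6 server.xml already ships with such a connector; it looks roughly like the following, and normally does not need to be changed:

<!-- AJP 1.3 connector in $TOMCAT_AS_DIR/conf/server.xml -->
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />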

Finally, to get all of this working, we need to tell Apache where to find the workers.properties file and where to log mod_jk requests. We also need to specify the format of the log file and the options specific to mod_jk. To do so, simply add the following lines at the end of $APACHE_HTTPD_DIR/conf/httpd.conf.

JkWorkersFile conf/workers.properties
JkLogFile logs/mod_jk.log
JkLogLevel error
JkLogStampFormat "[%a %b %d %H:%M:%S %Y]"
JkMount / loadbalancer

The above lines tell Apache to use $APACHE_HTTPD_DIR/conf/workers.properties for the worker definitions and to log mod_jk messages to $APACHE_HTTPD_DIR/logs/mod_jk.log.
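Note that JkMount / loadbalancer maps only the root URL. Many configurations also map everything under it, and the status worker defined in workers.properties can be exposed on its own path. A small example, assuming /jkstatus as the path for the status page (restrict access to it in production):

JkMount /* loadbalancer
JkMount /jkstatus status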

In addition, it is better to serve all images and CSS directly from the Apache htdocs directory. One way to achieve this is the following.

  • Copy $PORTAL_ROOT_HOME/html to $APACHE_HTTPD_DIR/htdocs;
  • Copy $AS_WEB_APP_HOME/${plugin.name} to $APACHE_HTTPD_DIR/htdocs; where ${plugin.name} represents custom themes, portlets, webs, etc.
  • Add the following lines at the end of $APACHE_HTTPD_DIR/conf/httpd.conf.

JkUnMount /*.jpg loadbalancer
JkUnMount /*.gif loadbalancer
JkUnMount /*.png loadbalancer
JkUnMount /*.ico loadbalancer
JkUnMount /*.css loadbalancer

Configure Tomcat #

If the portal is bundled with Tomcat, open the $TOMCAT_AS_DIR/conf/server.xml file and find the line that reads:

<Engine name="Catalina" defaultHost="localhost">

Change it on each node so that it includes the appropriate worker name. For node1, it would look like the following line.

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">

For node2, it would look like the following line.

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node2">

Then enable Tomcat clustering by adding the following <Cluster> element inside the <Engine> (or <Host>) element in $TOMCAT_AS_DIR/conf/server.xml.

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="6">

  <Manager className="org.apache.catalina.ha.session.BackupManager"
           expireSessionsOnShutdown="false"
           notifyListenersOnReplication="true"
           mapSendOptions="6"/>

  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <Membership className="org.apache.catalina.tribes.membership.McastService"
                address="228.0.0.4"
                port="45564"
                frequency="500"
                dropTime="3000"/>
    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
              address="auto"
              port="5000"
              selectorTimeout="100"
              maxThreads="6"/>
    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
    </Sender>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
  </Channel>

  <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
         filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>

  <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
</Cluster>

In order to enable session replication, edit $TOMCAT_AS_DIR/conf/context.xml and change the <Context> element to <Context distributable="true">.
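For reference, a minimal context.xml on each node might then look like the following (the <WatchedResource> element is already present in the default Tomcat context.xml):

<!-- $TOMCAT_AS_DIR/conf/context.xml -->
<Context distributable="true">
  <WatchedResource>WEB-INF/web.xml</WatchedResource>
</Context>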

Configure JBoss #

In the $JBOSS_AS_DIR/server/default/deploy/jbossweb.sar/server.xml file (for JBoss 4.2.3.GA, the web folder is /jboss-web.deployer instead of /jbossweb.sar), find the line that reads:

<Engine name="jboss.web" defaultHost="localhost">

Change it on each node so that it includes the appropriate worker name. For node1, it would look like the following line.

<Engine name="jboss.web" defaultHost="localhost" jvmRoute="node1">

For node2, it would look like the following line.

<Engine name="jboss.web" defaultHost="localhost" jvmRoute="node2">
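As with Tomcat, make sure the AJP connector on port 8009 is enabled in the same server.xml. The default JBoss configuration already ships with a connector roughly like the following:

<!-- AJP 1.3 connector in the JBoss server.xml -->
<Connector protocol="AJP/1.3" port="8009" address="${jboss.bind.address}" redirectPort="8443" />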

To enable replication of your web application sessions, you need to tag the portal as distributable in the $PORTAL_ROOT_HOME/WEB-INF/web.xml descriptor. The following is an example.

<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/j2ee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd"
         version="2.4">
  <!-- ignore details -->
  <distributable/>
</web-app>

Configure Portal #

For each node, add the following lines at the end of portal-ext.properties:

net.sf.ehcache.configurationResourceName=/ehcache/hibernate-clustered.xml

ehcache.multi.vm.config.location=/ehcache/liferay-multi-vm-clustered.xml

For Liferay 6.0 or above, also add the following lines at the end of portal-ext.properties:

cluster.link.enabled=true

lucene.replicate.write=true
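In addition, both nodes must point to the same database (the MySQL server at 192.168.2.173 in this setup). A minimal sketch of the JDBC settings in portal-ext.properties, assuming the lportal database and the lportal/lportal account used later for the Jackrabbit journal:

jdbc.default.driverClassName=com.mysql.jdbc.Driver
jdbc.default.url=jdbc:mysql://192.168.2.173:3306/lportal?useUnicode=true&characterEncoding=UTF-8
jdbc.default.username=lportal
jdbc.default.password=lportal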

Database Replication and Clustering #

Repository Clustering #

Jackrabbit clustering works as follows: content is shared between all cluster nodes. That means all Jackrabbit cluster nodes need access to the same persistent storage (persistence manager and data store). The cluster nodes record the items they modify in a journal, which must also be globally available to all nodes in the cluster. This can be either a folder in the file system (a file journal) or a standalone database (a database journal).

First, here is a file-based journal implementation, where the journal files are created in a share exported via NFS, e.g., /nfs/server/journal.

 <Cluster id="node1" syncDelay="5">
<Journal class="org.apache.jackrabbit.core.journal.FileJournal"> <param name="revision" value="${rep.home}/revision.log" /> <param name="directory" value="/nfs/server/journal" /> </Journal> </Cluster> }}}

As shown in the above code, the file journal is configured for node1 through the following properties: revision - the location of the cluster node's revision file; directory - the location of the journal folder. Do the same on Node2 with id="node2".

Alternatively, the journal can be kept in a standalone database, e.g., the MySQL server at 192.168.2.173. The following is the corresponding database journal configuration for node1.

 <Cluster id="node1" syncDelay="5">
<Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal"> <param name="revision" value="${rep.home}/revision"/> <param name="driver" value="com.mysql.jdbc.Driver"/> <param name="url" value="jdbc:mysql://192.168.2.173:3306/lportal"/> <param name="user" value="lportal"/> <param name="password" value="lportal"/> <param name="schema" value="mysql"/> <param name="schemaObjectPrefix" value="J_C_"/> </Journal> </Cluster> }}}

As shown in the above code, the database journal is configured through the following properties: revision - the location of the cluster node's revision file; driver - the JDBC driver class name; url - the JDBC URL; user - the user name of the database account; password - the password of the database account.

We have now set up Jackrabbit clustering for Node1. Do the same on Node2 with id="node2".
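If the document library stays in Jackrabbit, each node's portal-ext.properties should also select the JCR hook. A minimal sketch (verify the property names against the portal.properties of your portal version):

dl.hook.impl=com.liferay.documentlibrary.util.JCRHook
jcr.initialize.on.startup=true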

By the way, if you have a Storage Area Network (SAN) or another shared folder, you can configure the portal to store documents and images there to take advantage of the extra redundancy. In this case, you can use the File System Hook or the Advanced File System Hook instead of the JCR Hook backed by the file system.

To configure the location where your documents and images are stored, set the following properties in portal-ext.properties.

dl.hook.impl=com.liferay.documentlibrary.util.AdvancedFileSystemHook
dl.hook.file.system.root.dir=//bookpub.com/liferay-portal/data/document_library
image.hook.impl=com.liferay.portal.image.FileSystemHook
image.hook.file.system.root.dir=//bookpub.com/liferay-portal/data/images

As shown in the above code, the Advanced File System Hook is used for the document library. In practice there is little difference between the File System Hook and the Advanced File System Hook if you are using exFAT (Extended File Allocation Table), since its volume size limits and files-per-directory limits are practically eliminated.

Note that when using the File System Hook or the Advanced File System Hook, you typically get better repository performance in a cluster than with the JCR Hook backed by the file system.

(To be continued)

Related Wiki Articles #

http://www.liferay.com/web/guest/community/wiki/-/wiki/Main/High+Availability+Guide

http://www.liferay.com/web/guest/community/wiki/-/wiki/Main/Clustering
