Pluggable Enterprise Search with Solr
Using a Solr Server for Liferay Search #
Since Solr is a standalone search engine, you will first need to download and install it according to the instructions on the Solr web site (http://lucene.apache.org/solr). Once you have Solr up and running, integrating it with Liferay is easy, but it will require a restart of your application server. This tutorial will help you set up a master and slave Solr application. We'll set up one instance of Solr in the beginning, and at the end we'll copy our instance to set up replication. This first instance will eventually become the master (writer) instance, and the copy will become a slave (reader).
Installing Solr #
The first thing you will need to define is the location of your search index. Assuming you are running a Linux server and you have mounted a file system for the index at /solr, create an environment variable that points to this folder. This environment variable needs to be called $SOLR_HOME. So for our example, we would define:
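On a Bourne-compatible shell, the variable could be set like this (the /solr path is the mount point assumed above; substitute your own):

```shell
# Define the location of the Solr search index. /solr is the mount
# point assumed in this tutorial; adjust it to your own file system.
SOLR_HOME=/solr
export SOLR_HOME
echo "SOLR_HOME=$SOLR_HOME"
```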
This environment variable can be defined anywhere you need it: in your operating system's startup sequence, in the environment of the logged-in user, or in the startup script for your application server. If you are going to use Tomcat to host Solr, you would modify catalina.sh or catalina.bat and add the environment variable there.
Once you have created the environment variable, you then can use it in your application server's startup configuration as a parameter to your JVM. This is configured differently per application server, but again, if you are using Tomcat, you would edit catalina.sh or catalina.bat and append the following to the $JAVA_OPTS variable:
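A minimal sketch of that catalina.sh addition, using the solr.solr.home system property that Solr reads at startup to locate its home directory:

```shell
# Append the Solr home system property to the JVM options.
# In catalina.sh this line would sit alongside the other JAVA_OPTS
# settings; SOLR_HOME must already be exported (defaulted here).
SOLR_HOME=${SOLR_HOME:-/solr}
JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=$SOLR_HOME"
export JAVA_OPTS
echo "$JAVA_OPTS"
```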
This takes care of telling Solr where to store its search index.
Installing the Solr Plugin #
Next, go to the Liferay web site (http://www.liferay.com) and download the plugin manually from the Customer Portal; this ensures you get the correct and most up-to-date version of the plugin. Drop the plugin into the Liferay deploy folder, located one level above your Tomcat directory.
Once you've deployed the plugin, navigate to tomcat/webapps/solr-web/WEB-INF/classes/META-INF, and find the solr-spring.xml file. We'll be making a few changes to set up the master and slave in here:
<!-- Solr search engine -->

<bean id="com.liferay.portal.search.solr.server.BasicAuthSolrServerReader" class="com.liferay.portal.search.solr.server.BasicAuthSolrServer">
    <constructor-arg type="java.lang.String" value="http://localhost:8984/solr" />
</bean>
<bean id="com.liferay.portal.search.solr.server.BasicAuthSolrServerWriter" class="com.liferay.portal.search.solr.server.BasicAuthSolrServer">
    <constructor-arg type="java.lang.String" value="http://localhost:8983/solr" />
</bean>
<bean id="com.liferay.portal.search.solr.SolrIndexSearcherImpl" class="com.liferay.portal.search.solr.SolrIndexSearcherImpl">
    <property name="solrServer" ref="com.liferay.portal.search.solr.server.BasicAuthSolrServerReader" />
    <property name="swallowException" value="true" />
</bean>
<bean id="com.liferay.portal.search.solr.SolrIndexWriterImpl" class="com.liferay.portal.search.solr.SolrIndexWriterImpl">
    <property name="commit" value="false" />
    <property name="solrServer" ref="com.liferay.portal.search.solr.server.BasicAuthSolrServerWriter" />
</bean>
<bean id="com.liferay.portal.search.solr.SolrSearchEngineImpl" class="com.liferay.portal.kernel.search.BaseSearchEngine">
    <property name="clusteredWrite" value="false" />
    <property name="indexSearcher" ref="com.liferay.portal.search.solr.SolrIndexSearcherImpl" />
    <property name="indexWriter" ref="com.liferay.portal.search.solr.SolrIndexWriterImpl" />
    <property name="luceneBased" value="true" />
    <property name="name" value="SYSTEM_ENGINE" />
    <property name="vendor" value="SOLR" />
</bean>

<!-- Configurator -->
Save those changes and navigate back up to WEB-INF. This time, go into the conf folder and find schema.xml. Copy this file over to your Solr instance's /example/solr/conf folder, replacing the schema.xml there. We're almost done!
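The copy might look like this when run from the Liferay root; SOLR_INSTANCE is a hypothetical placeholder for wherever you unpacked Solr:

```shell
# Copy the plugin's schema.xml over the Solr instance's default schema.
# SOLR_INSTANCE is an assumed location; point it at your Solr install.
SOLR_INSTANCE=${SOLR_INSTANCE:-/opt/solr}
SRC=tomcat/webapps/solr-web/WEB-INF/conf/schema.xml
if [ -f "$SRC" ]; then
    cp "$SRC" "$SOLR_INSTANCE/example/solr/conf/schema.xml"
    echo "schema.xml copied"
else
    echo "schema.xml not found at $SRC; adjust the paths to your layout"
fi
```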
Replicating Solr #
The first instance, which we'll call the master, is almost done. Copy the entire instance, paste the copy right next to the original, and rename it slave. Now we have two identical instances of Solr, so we need to configure a few more things before the slave can properly replicate the master. First, change the port number for the slave so that it matches the reader URL we configured earlier. In slave/example/etc, find jetty.xml and change the default port from 8983 to 8984:
<Call name="addConnector">
    <Arg>
        <New class="org.mortbay.jetty.bio.SocketConnector">
            <Set name="port"><SystemProperty name="jetty.port" default="8984"/></Set>
            <Set name="maxIdleTime">50000</Set>
            <Set name="lowResourceMaxIdleTime">1500</Set>
        </New>
    </Arg>
</Call>
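The copy step described above can be sketched as follows; the master and slave directory names are simply the ones this tutorial uses:

```shell
# Duplicate the configured instance to create the slave. Run this from
# the folder that contains the master directory; "master" and "slave"
# are the names used throughout this tutorial.
if [ -d master ]; then
    cp -r master slave
    echo "slave instance created"
else
    echo "master directory not found; run this from the folder containing it"
fi
```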
The last thing we need to change is the solrconfig.xml file for both master and slave. This file is located in /example/solr/conf for each instance. Add this replication handler to the master's solrconfig.xml:
<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="master">
        <str name="replicateAfter">commit</str>
        <str name="replicateAfter">startup</str>
        <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
</requestHandler>
And add this one to the slave's solrconfig.xml:

<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="slave">
        <str name="masterUrl">http://localhost:8983/solr/replication</str>
        <str name="pollInterval">00:00:60</str>
    </lst>
</requestHandler>
You can visit http://wiki.apache.org/solr/SolrReplication for more detailed information on Solr replication. Remember to modify the values so that they point to the correct servers. While you still have solrconfig.xml open for both master and slave, uncomment and modify the <autoCommit> section in both files:
<!--
    Perform a <commit/> automatically under certain conditions:

    maxDocs - number of updates since last commit is greater than this
    maxTime - oldest uncommitted update (in ms) is this long ago

    Instead of enabling autoCommit, consider using "commitWithin"
    when adding documents. http://wiki.apache.org/solr/UpdateXmlMessages
-->
<autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>10000</maxTime>
</autoCommit>
Batching commits this way keeps Solr from committing on every single update, which should help it run a little faster.
Once everything is configured, we need to start up Solr for both master and slave. Navigate to master/example and start Solr by typing "java -jar start.jar", then do the same for slave/example. If everything is working correctly, you should see the slave replicating from the master. Finally, restart Tomcat.
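A sketch of that startup sequence, assuming the master and slave directories from the previous step sit in the current folder:

```shell
# Start both instances; each one runs Jetty in the background.
# Start the master first so the slave can reach it on its first poll.
for instance in master slave; do
    if [ -d "$instance/example" ]; then
        (cd "$instance/example" && java -jar start.jar) &
    else
        echo "skipping $instance: $instance/example not found"
    fi
done
```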
Your Liferay search is automatically upgraded to use Solr. It is likely, however, that initial searches will return no results: the new Solr index is empty until you reindex everything.
Go to the Admin Portlet. Click the Server tab and then click the Execute button next to Reindex all search indexes. It may take a while, but Liferay will begin sending indexing requests to Solr for execution. When the process is complete, Solr will have a complete search index of your site, and will be running independently of all of your Liferay nodes.
Installing the plugin on your nodes overrides any calls to Lucene for searching, so all of Liferay's search boxes will now use Solr as the search index. This is ideal for a clustered environment: all of your nodes share one search server and one search index, and that search server runs independently of any single node.