Liferay 6.2 Lucene replication in cluster


Martin Dulak, modified 8 years ago.

Liferay 6.2 Lucene replication in cluster

New Member Posts: 5 Join Date: 02.07.14

Hi,

I'd welcome any help with a simple issue: I have a clustered environment and have enabled Lucene replication in the properties (lucene.replicate.write=true). Now all the tutorials tell me to reindex Lucene.

Should I run it on one node? On both? Simultaneously or sequentially?

This question has also been asked on Stack Overflow: http://stackoverflow.com/questions/35161320/liferay-6-2-lucene-replication-in-cluster

Thank you!

Amos Fong, modified 8 years ago.

RE: Liferay 6.2 Lucene replication in cluster

Liferay Legend Posts: 2047 Join Date: 07.10.08

Hi Martin,

You don't have to reindex if one of the nodes has the full index already. Just delete the index on the out-of-date node and restart it. It should sync the index during restart (you should be able to see it in the logs). If you have more than 2 nodes, you should shut down all out-of-date nodes first in case they try to sync from each other.

If the index is not current on any node, then run reindex on just one of the nodes. After it's done, it should sync it to all the other nodes.
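
As a rough sketch of the first case above (an out-of-date node resyncing from a current one), clearing the local index could be scripted as below. The function name clear_lucene_index and its path argument are hypothetical, and the sketch assumes the index lives under data/lucene in the Liferay home directory, the folder mentioned later in this thread. The node must be stopped before running it.

```shell
#!/bin/sh
# Sketch only: clear the local Lucene index on an out-of-date node so that
# it can sync from an up-to-date node on the next start.
# Assumption: the index is stored under <liferay home>/data/lucene.

clear_lucene_index() {
    liferay_home="${1:-}"
    index_dir="$liferay_home/data/lucene"

    # Refuse to run without a path or when the directory does not exist.
    if [ -z "$liferay_home" ] || [ ! -d "$index_dir" ]; then
        echo "No Lucene index directory at: $index_dir" >&2
        return 1
    fi

    # Delete the contents but keep the directory itself.
    find "$index_dir" -mindepth 1 -delete
}

# Hypothetical usage on the out-of-date node (paths are placeholders):
#   <tomcat>/bin/shutdown.sh
#   clear_lucene_index /opt/liferay
#   <tomcat>/bin/startup.sh
```

After the restart, the log on that node should show the sync taking place, as described above.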

Martin Dulak, modified 8 years ago.

RE: Liferay 6.2 Lucene replication in cluster

New Member Posts: 5 Join Date: 02.07.14

Amos, thank you for such a quick reply.
Reindexing on my project takes about 24 hours, so I will be back with the result or further questions then.

Stef Gold, modified 8 years ago.

RE: Liferay 6.2 Lucene replication in cluster

New Member Posts: 3 Join Date: 04.03.16

Hello Martin,

Did you succeed in fully reindexing both LF servers?
If yes, could you please explain how you did it?

Thanks,
Stéphane

Martin Dulak, modified 8 years ago.

RE: Liferay 6.2 Lucene replication in cluster

New Member Posts: 5 Join Date: 02.07.14

Stef Gold:

Hello Martin,

Did you succeed in fully reindexing both LF servers?
If yes, could you please explain how you did it?

Thanks,
Stéphane

Hey,

Basically, what I did at first was the following:
- cluster.link.enabled=true
- lucene.replicate.write=true
and the result was that replication was NOT WORKING.

What I tried next was to work around this issue and continue clustering the rest of the portal, which in the end helped Lucene as well. My steps were:
- deploy cluster activation keys
- deploy ehcache-cluster-web.war
- portal-ext.properties:
  a. cluster.link.enabled=true
  b. cluster.link.autodetect.address=<COMMONLY_ACCESSIBLE_IP_AND_PORT>
  c. lucene.commit.batch.size=1
  d. lucene.commit.time.interval=5000
  e. lucene.replicate.write=true
  f. ehcache.cluster.link.replication.enabled=true
  g. cluster.link.channel.properties.control=<PATH_TO_XML>
  h. cluster.link.channel.properties.transport.0=<PATH_TO_XML>
  i. portal.instance.protocol=http
  j. portal.instance.http.port=8080
- setenv.sh:
  a. -Djava.net.preferIPv4Stack=true
  b. -Djgroups.bind_addr=<IP_OF_THE_NODE>
- edit the clusterlink_control and clusterlink_transport files as described in the Liferay tutorials
- with the servers shut down, delete the contents of data/lucene, then run reindexing from the Control Panel on one node
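
For illustration, the setenv.sh step above might look like the fragment below. This is only a sketch: NODE_IP is a hypothetical variable standing in for <IP_OF_THE_NODE>, the 127.0.0.1 default is a placeholder, and appending to CATALINA_OPTS assumes a Tomcat bundle (your setenv.sh may use JAVA_OPTS instead).

```shell
# setenv.sh fragment (sketch). NODE_IP is a hypothetical variable; set it to
# the address this node should bind JGroups to. The default is a placeholder.
NODE_IP="${NODE_IP:-127.0.0.1}"

# Append the two JVM flags from the list above to Tomcat's options.
CATALINA_OPTS="${CATALINA_OPTS:-} -Djava.net.preferIPv4Stack=true -Djgroups.bind_addr=${NODE_IP}"
export CATALINA_OPTS
```

Each node needs its own address here, so the value differs per server.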

In the end, Lucene replication IS WORKING. I've colored red what I think could be significant. At first, the portal.properties explanation of the lucene.commit.* keys is rather hard to understand. By trial and error I found out that these two keys work in an AND relationship. I also found out about the portal.instance.* keys, which are used for multiple purposes in clustering and can matter if you have load balancers and/or Apache servers between the nodes and autodetection fails.

Let me know if any of that worked!

Olaf Kock, modified 8 years ago.

RE: Liferay 6.2 Lucene replication in cluster

Liferay Legend Posts: 6403 Join Date: 23.09.08

Martin Dulak:

- lucene.replicate.write=true
and the result was NOT WORKING replication.

Note that replication in this case first and foremost means replicating the indexing requests, not necessarily the full index (while the index can be transferred, I'd want more control over that anyway). When both servers are up and one updates its content, it will send the indexing request for that content to the other server as well. When only one server is up, the other server will miss this changed content and end up with an outdated index. That's why this solution forces either a full index replication or a full reindexing on restart.

Stef Gold, modified 8 years ago.

RE: Liferay 6.2 Lucene replication in cluster

New Member Posts: 3 Join Date: 04.03.16

Martin Dulak:

Let me know if any of that worked!

Hi Martin,

Thank you very much for this detailed explanation. Given Olaf's comment, it seems to me that it would be better to use SOLR with a large amount of data.
I'm going to try to set up a SOLR server, but I'm keeping all your valuable information at hand.

Again, many thanks to you and Olaf. The difficulty now is finding the SOLR plugin for LF 6.2 CE!

Best regards,
Stéphane.

Pidugu Sundeep, modified 3 years ago.

RE: Liferay 6.2 Lucene replication in cluster

New Member Posts: 4 Join Date: 02.07.20

Did you have any luck finding the SOLR plugin for LF 6.2 CE?

Olaf Kock, modified 8 years ago.

RE: Liferay 6.2 Lucene replication in cluster

Liferay Legend Posts: 6403 Join Date: 23.09.08

Hi Martin,

If reindexing takes 24h, replicating Lucene indexing requests might not be the solution for you, as an index can get out of date when a server misses indexing requests while it's down. You'll have to know which index is the current one in order to distribute it to all machines on startup. It might be wise to consider a client/server solution, e.g. SOLR, which is shared by all members of a cluster and handles all indexing requests. Yes, this is more investment upfront, but a 24h reindexing time is significant, so it might be worth it.

Olaf

Stef Gold, modified 8 years ago.

RE: Liferay 6.2 Lucene replication in cluster

New Member Posts: 3 Join Date: 04.03.16

Hi Olaf,

I set up a Lucene cluster. The Catalina log on each server confirms that clustering is OK, but when I run a full reindex on server A, nothing happens on server B (LF 6.2 CE GA2).

With both servers switched off, I delete all files in the lucene folder. I switch on A and, once it has started, I switch on B. When both are up, I restart LF on server A, then go to the Control Panel and run all the clear actions down to the full search reindex.
In the Catalina log I have:

Server A (10.**.**.21)
04 Mar 2016 08:34:38:838 INFO [BaseReceiver:64] Accepted view [ncys0333-11220|4] [ncys0333-11220]
04 Mar 2016 08:34:38:841 INFO [DebuggingClusterEventListenerImpl:57] Cluster event DEPART_Cluster node {clusterNodeId=cf430ddb-0eb7-4ad5-b6bc-e618c6c039c1, inetAddress=/10.**.**.45, port=8080} [Sanitized]

Server B (10.**.**.45)
04 Mar 2016 08:33:44:651 INFO [BaseReceiver:64] Accepted view [ncys0464-36100|4] [ncys0464-36100]
04 Mar 2016 08:33:44:654 INFO [DebuggingClusterEventListenerImpl:57] Cluster event DEPART_Cluster node {clusterNodeId=3b43462f-f773-4b2c-a5c9-abec2bebfc5e, inetAddress=/10.**.**.21, port=8080} [Sanitized]

I expected the full reindex on server A to be replicated to server B, but it wasn't.

Do you have an idea?

Best regards,
Stephane.

Martin Dulak, modified 8 years ago.

RE: Liferay 6.2 Lucene replication in cluster

New Member Posts: 5 Join Date: 02.07.14

Olaf Kock:

Hi Martin,

If reindexing takes 24h, replicating Lucene indexing requests might not be the solution for you, as an index can get out of date when a server misses indexing requests while it's down. You'll have to know which index is the current one in order to distribute it to all machines on startup. It might be wise to consider a client/server solution, e.g. SOLR, which is shared by all members of a cluster and handles all indexing requests. Yes, this is more investment upfront, but a 24h reindexing time is significant, so it might be worth it.

Olaf

Thanks for the input, Olaf. I agree with you.