Manage your imports, exports and reports using ETL process manager

Technical Blogs, November 3, 2014, by Sven Werlen

I'd like to take some time to present a portlet that we recently published on the Marketplace: ETL process manager. Not everyone knows what an ETL tool is and how powerful it can be when combined with Liferay.

ETL stands for "Extract, Transform and Load". ETL tools are extremely useful when you need to manipulate data: extract, import, export, clean, etc. 
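To make the three stages concrete, here is a minimal sketch in Python. It is neither Talend nor part of the portlet: the CSV columns, the sample data and the function names are all assumptions chosen for illustration, and a plain list stands in for the target system.

```python
import csv
import io

# Hypothetical input: a CSV export of users (column names are assumptions).
SAMPLE_CSV = """email,first_name,last_name
 Alice@Example.com ,alice,martin
bob@example.com,Bob,tremblay
"""

def extract(text):
    """Extract: read raw rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace, lowercase emails, capitalize names."""
    return [
        {
            "email": row["email"].strip().lower(),
            "first_name": row["first_name"].strip().capitalize(),
            "last_name": row["last_name"].strip().capitalize(),
        }
        for row in rows
    ]

def load(rows, target):
    """Load: append the cleaned rows to the target (a list standing in for a real system)."""
    target.extend(rows)
    return len(rows)

users = []
loaded = load(transform(extract(SAMPLE_CSV)), users)
print(loaded, users[0]["email"])
```

A real ETL tool adds connectors, scheduling and error handling around this same extract, transform, load shape.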

Here are the features our customers ask for most frequently:

  • Import users from an Excel file
  • Weekly report with the list of users with various information
  • On demand report with all organizations / sites and users with their roles

There are many ways to handle this kind of request. Imports can easily be managed using Liferay web services, but reading Excel sheets is a pain (CSV is generally a good compromise). Exports can be handled by a reporting tool like Jasper or BIRT, but designing a report takes time and the report engine generally requires access to the database.

We ended up using Talend Open Studio. Talend DI (Data Integration) is a very powerful open source ETL solution. It provides a lot of connectors and tools:

  • it can read and write almost any kind of format (CSV, Excel, txt, etc.)
  • it can retrieve data from anywhere (database, FTP, email, SSH, etc.)
  • it can manipulate data very efficiently (normalize, denormalize, sort, duplicate, ...)

The problem with an ETL tool is maintaining the processes and letting end users execute them by themselves, without having to connect to the server and run some obscure shell script. This is why we decided to develop an ETL portlet for managing processes.

ETL process manager

The portlet can be downloaded from the Marketplace: LINK

The portlet is useless without ETL processes. Fortunately, we provide a few examples that you can use to see how it works. Examples are available on this page.

1. Import your Talend process

The first step is to import an existing Talend process. You can easily download Talend Open Studio (DI), design your own process and then import it into Liferay. Here, I'm going to import a very simple process, HelloWorld (download here), to illustrate how it works.

  • Make sure you have installed "ETL process manager" from the Marketplace
  • Go to the Control Panel and click on "Server | ETL Processes"
  • Click on the Add button
  • Upload the HelloWorld_0.1.zip file

  • You should now have a new process available in the list

 

2. Execute process

Now, you can execute the process directly from Liferay.

  • Click on "Actions | Execute now" (notice the message at the top of the list)
  • Go to the History tab; you will see one entry in the list of executions

  • Click on the date to see the details of the execution

  • From the new screen, you can now download the Runtime logs (and Error logs if any)
  • Check that you see the message "Hello World!" in the logs

 

3. So what?

OK. It is certainly not so impressive so far. Anyone can easily print a "Hello World" message into a log. 

So let's take another, better example.

 

Report with all active organizations, sites and active users and their roles

This example is more impressive and shows some of the portlet's capabilities. It uses the process "ReportUsersRoles" (download here).

  • The process uses the Liferay API to retrieve the following information
    • List of all users with detailed information and the list of roles (per organization and site)
    • List of all organizations (sorted according to the hierarchy) with the list of members and their roles
    • List of all sites with the list of members and their roles
  • The process uses Talend connectors:
    • To generate an Excel sheet (not a CSV!) with one tab for each list
    • To send the final report by email to a provided address (parameter)

 

To test it, follow the instructions below:

  • Upload the process (download here) into Liferay
  • Edit the process and add the following context parameter: MailTo=YOUREMAIL
  • Execute the process
  • Check your emails
  • Download the attachment and enjoy!

Notice that you can also schedule the execution of the process in order to receive that report every week, for example.

If the example doesn't work:

  • make sure that you have configured your email server correctly for your portal
  • make sure that the MailTo parameter is properly configured with your email address
  • check the error logs

Liferay remote publishing - Troubleshooting

Technical Blogs, October 10, 2014, by Sven Werlen

During the last Liferay North America Symposium in Boston, I had the opportunity to attend Máté's interesting presentation (Best Practices for Using Staging in Liferay 6.2). I have always been fascinated by this complex feature of Liferay, and I have spent hours "struggling" with it over the past years while helping companies implement and use it.

Remote publishing has improved a lot since its first implementation and is very robust and reliable in Liferay 6.2. However, the feature is so complex that there are many situations in which the process will fail or not complete as you would expect.

I'd like to share some of my past experiences in order to help you understand how remote publishing works and how to debug and fix some common issues. Please also refer to the Liferay documentation for a basic understanding of Staging Page Publication and its configuration.

This applies only to Liferay 6.2+. Remote publishing has been re-implemented for this version and works differently than in previous versions.

Understanding the remote publishing process

The remote publishing feature is based on Liferay's export/import functionality. There are several important steps:

  • 1. Connection: the staging server establishes a connection with the remote server in order to check the configuration.
  • 2. Export: the staging server exports the desired site and its content as an archive (.lar) to local storage (temp).
  • 3. Data transfer: the staging server transfers the archive to the remote server.
  • 4. Checksum: the remote server validates the archive's integrity.
  • 5. Validation: the remote server checks for missing references or invalid content.
  • 6. Import: the remote server imports the content. If the import fails, it is rolled back entirely.
  • 7. Cleanup: both servers clean up their temporary files.

If you're lucky, everything will work as expected ;-)

This will most probably be the case when you test it for the first time with a small site and little content. You'll realize later that it becomes trickier with complex websites containing hundreds of pages and web contents.

Remote publishing fails with errors

If your publication fails, first check the error message. Sometimes it gives you good advice about the issue (most probably a configuration problem or a missing reference).

If the error message says "Unexpected error" with strange details (FileNotFoundException, InvalidCheckSum, ...), proceed by identifying which step is failing, then check the corresponding advice below.

Identify which step fails

In order to identify which step fails, you need access to both servers (staging and remote). The remote publishing process will not clearly indicate which step failed, but you can figure it out by checking the servers.

  • During the export (step 2), Liferay creates a temporary file for the archive on the application server of the staging environment. For instance, check the /temp folder in Tomcat to see whether a new file is being created. If so, you know that Liferay is proceeding with the export.
  • During the data transfer (step 3), Liferay sends the archive to the remote server by splitting it into 10MB files. The remote server receives them and stores them in the document library. If the size of the archive file in the temp folder (staging) is no longer increasing and you see the remote server starting to create new files in the document library (/data/document_library), the process is currently transferring data. Another way to identify the beginning of this step is to monitor the CPU of the staging server: export uses a lot of CPU (to compress data into a zip file), but data transfer doesn't.
  • Step 4 is extremely quick and you won't be able to distinguish it from the step before. Generally, you will get an obvious message (InvalidChecksumException) if the process fails during this step.
  • Validation (step 5) is also difficult to identify clearly, but you should get a detailed error message about missing references when this step fails.
  • When the process starts importing the data, you'll notice that CPU usage increases on the remote server. You should also notice that the individual 10MB files have been removed and merged into one valid package in the document library (/data/document_library). This step can take several minutes to complete.
  • The last step never fails: if the import succeeds, so will the cleanup.
Steps 3 (data transfer) and 6 (import) are the ones that fail most often. If you are not sure which step fails, it is probably one of these two.
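
The splitting and merging described for steps 3 and 6 can be pictured with a short Python sketch. It is an illustration only, not Liferay's implementation: the function names are hypothetical, and the 10MB figure is the only value taken from the text. The demo uses a tiny chunk size so that it runs instantly.

```python
CHUNK_SIZE = 10 * 1024 * 1024  # 10MB, the piece size mentioned above

def split_archive(data, chunk_size=CHUNK_SIZE):
    """Split the archive bytes into consecutive pieces of at most chunk_size."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def merge_chunks(chunks):
    """Reassemble the pieces, in order, into one archive."""
    return b"".join(chunks)

# A 25-byte stand-in for a .lar archive, split into pieces of 10 bytes:
# two full pieces and one remainder, just as a 25MB archive would give
# two 10MB files and one 5MB file.
archive = bytes(range(25))
chunks = split_archive(archive, chunk_size=10)
print([len(c) for c in chunks])
print(merge_chunks(chunks) == archive)
```

The merge only reproduces the original archive if every piece is present and in order, which is why a piece lost or misplaced during step 3 surfaces as a failure in a later step.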

Error during 1. Connection

If you get an error during the connection (after a few seconds), check the following:

  • Make sure that both servers have been properly configured (tunneling.servlet.shared.secret, axis.servlet.hosts.allowed, ...). Check Liferay Documentation.
  • If you're using a web proxy server (Apache HTTPD, BigIP F5, Netscaler, ...), make sure that it preserves the origin host in the proxy request, or the request will be rejected by the remote server due to an invalid IP address. For instance, use the "ProxyPreserveHost on" configuration in Apache.
  • If you cannot configure the proxy server (see previous point), consider changing the property "axis.servlet.hosts.allowed" on the remote server so that it matches the IP address of the proxy server (rather than the staging server). WARNING: by doing this, you're allowing anyone who goes through the proxy to access Liferay's remote API. This is insecure.
An easy way to check connectivity to the remote server is to access the /api/axis URL (with wget or a browser). If you're not authorized, you'll get a clear message with the IP address of the requester. This will help you understand what the remote server receives.
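
The same check can be scripted. Below is a minimal sketch using only the Python standard library; the host name in the usage comment is hypothetical, and the exact status code and message returned by the remote server are not assumed here, so the sketch simply returns whatever it receives.

```python
import urllib.error
import urllib.request

def axis_url(base_url):
    """Build the /api/axis URL for a given portal base URL."""
    return base_url.rstrip("/") + "/api/axis"

def check_axis(base_url, timeout=10):
    """Return (status, body) for the /api/axis endpoint, including error responses."""
    try:
        with urllib.request.urlopen(axis_url(base_url), timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Error responses still carry a body, which may name the rejected IP address.
        return err.code, err.read().decode("utf-8", errors="replace")

# Usage (hypothetical host; run it from the staging server so that the request
# follows the same network path as remote publishing):
#   status, body = check_axis("http://remote.example.com:8080")
#   print(status, body[:500])
```

Network-level failures (DNS, refused connection, timeout) are deliberately not caught, so they show up as exceptions rather than being mistaken for an authorization problem.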

Error during 2. Export

Export should not fail. If it really does, I recommend:

  • Check that the server has not run out of disk space ;-)
  • Try to export the site (from the Control Panel) in order to see if this is really the problem
  • Try installing the latest patches (Liferay EE). Remote publishing is frequently improved by Liferay.

Error during 3. Data Transfer

This is a tricky one because you'll get strange errors and it's very difficult to debug. Try to find a good system administrator in the company who can monitor connections on the network (Wireshark or similar). Common issues that I have faced:

  • Check timeouts and any other rules on proxies, servers, switches, application servers and anti-virus software. If the process always fails after X minutes, there is a good chance that something on your network, between the two servers, is cutting the connection.
  • Check the size of the archive in the temp folder of staging environment. If the file is bigger than 10MB, it will be sent to the remote server in multiple pieces. Check with less data in order to see if the problem is related to that.
  • Good luck!

Error during 4. Checksum

This step should technically never fail. If it does, consider one of these recommendations:

  • Try to publish again. Maybe one sent file got corrupted during the transfer.
  • If you're using a cluster, check to see if sent files (when archive > 10MB) are clustered. This should not happen but if it does, Liferay will end up with 50% of the files on one server and 50% on the other server (assuming your cluster has 2 servers). The checksum will then always fail.
  • Check the size of the archive in the temp folder of staging environment. If the file is bigger than 10MB, it will be sent to the remote server in multiple pieces. Check with less data in order to see if the problem is related to that.
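
To see why pieces spread over two cluster nodes always fail the checksum, consider the following Python sketch. It is illustrative only: SHA-256 is an assumption (the algorithm Liferay uses is not stated here), and the names are hypothetical.

```python
import hashlib

def checksum(data):
    """Hex digest of the data (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()

# Staging side: compute the checksum of the full archive, then split it into pieces.
archive = bytes(range(40))
expected = checksum(archive)
pieces = [archive[i:i + 10] for i in range(0, len(archive), 10)]

# Remote side, single node: all pieces are present, so the checksum matches.
single_node = b"".join(pieces)

# Remote side, two nodes: the pieces were spread over both, so this node
# only holds every other piece and cannot rebuild the archive.
node_a = b"".join(pieces[0::2])

print(checksum(single_node) == expected)
print(checksum(node_a) == expected)
```

Retrying does not help in the second case, because the node that validates the archive never holds all of the pieces.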

Error during 5. Validation

An error during validation will provide more information in the "remote publishing" interface (history):

  • Try to identify the missing reference and understand why Liferay is complaining about it.
  • If your site is using global references (structures, templates, categories, etc.), make sure to publish /global site first. Global site can be published from the control panel (Sites).
  • If your site doesn't have any external reference, try publishing the entire content. In the "remote publishing" options, choose "All Content".
If none of the above solutions works, you could export the site from the control panel and check the content of the archive (zip file). This might help you understand what is missing.

Error during 6. Import

It's very difficult to cover all possible situations during the import step because it strongly depends on the content in your site. Common issues are:

  • If you get errors about duplicated content, try to publish the entire site.
  • Disable "Version history" to reduce the size of your publication and see if that makes any difference.
  • Check logs on the remote server for more detailed information (stacktraces).
    • If you see OutOfMemoryError, increase the memory available to your application server (-Xmx)
    • If you see GenericJDBCException, check your JDBC connection pool. It might happen that all connections have been used and none released. You might need to restart your application server to free them. Adjust your JDBC settings according to your needs.
  • If you have the feeling that the import is really slow, monitor your infrastructure (CPU, IO access) and give more resources to your virtual machine or server. If the import process takes too long (typically 30 minutes or more), your staging server will get a timeout and will not clean up properly, although the import still continues and completes (see next point).

Error during 7. Cleanup

The cleanup step itself never fails, but remote publishing doesn't end properly if the process takes too long. The staging server keeps waiting for the remote server to finish the import. After some time, the staging server will get a timeout and its connection will be reset. When this happens, the staging server will clean up on its side and set the status of the process to "failed". However, the import process continues on the other end (remote server) and might succeed. You'll end up with a wrong status and a "last publication date" that is not set correctly. To avoid this situation:

  • Try to optimize your infrastructure (CPU, Memory, IO, ...)
  • Configure timeouts (proxy server, application server, switch, etc.) to suit your needs, but don't set the values too high!

If the publication technically succeeded but was reported as a failure because of a timeout, start another publication and set the date range to "Last 12 hours". This smaller publication should execute faster, succeed and update the "last publication date" correctly. Your environment will then be ready for further publications.

Remote publishing, best practices

  • Disable version history by default (journal.publish.version.history.by.default=false). On production, you won't need all the versions but only the latest approved ones.
  • When using asset publisher with staging, don't choose a scope other than "Current site" or "Global". Otherwise your asset publisher will no longer work after publication, because the group ids are different on the two environments. There are strategies to make it work but they are too complicated to be explained here :-)
  • Enable and experiment with remote publishing as soon as possible. Make sure to test the process with archives that are bigger than 10 MB before assuming that everything works perfectly.

Still doesn't work?

If remote publishing won't work in your environment, consider trying one of these:

  • If you're using a cluster, try to disable all but one node.
  • If a proxy server is installed in front of your application server, try publishing directly to the application server (by opening firewalls if any)
  • If you're using an SSL connection for the remote publishing process, try without it (http).
  • Try moving your remote server to the same network.
  • Try moving your remote server to the same machine.
These experiments are not permanent solutions, but they will help you identify the issue.

--

Share your experience with me!

Configure Liferay for Canada

General Blogs, December 24, 2012, by Sven Werlen

(pour la version francophone, cliquer ici)

A small blog entry to summarize how to prepare a Liferay installation for a Canadian portal.

Liferay natively supports a lot of languages, including French and English, the two official languages of Canada. However, the included translations are fr_FR (French from France) and en_US (English from the USA). In most cases this is perfectly fine, because it mostly affects editors and administrators who access the user interface to manage content. However, some organizations (especially in Québec) are particular about language ("courriel" rather than "email", "télécopieur" rather than "fax", etc.) and about some visual details (en_CA rather than en_US, the Canadian flag rather than the US one). This is why I have worked actively, with the help of Liferay, on making Liferay easier to install and configure in Canada.

You'll notice in the following instructions that it's very easy to enable fr_CA (French Canadian) and en_CA (English Canadian) languages.

1) Enable the languages fr_CA and en_CA

This consists of adding a few lines of configuration to portal-ext.properties and system-ext.properties. A reminder: those two files must be located at the root of your Liferay installation (liferay-portal...) or in the classpath.

system-ext.properties

user.country=CA
user.language=en
user.timezone=America/Montreal

portal-ext.properties

company.default.locale=en_CA
company.default.time.zone=America/Montreal
locales=en_CA,fr_CA

2) Install the translations hook

The configurations in the previous section enable and limit the languages to fr_CA and en_CA. However, no specific translation is included in Liferay by default (yet). The portal will then automatically switch to basic French and English translations.

  • Download this hook (Marketplace)
  • Install it by dropping it into the /deploy folder

The hook includes the latest French Canadian translations, flags and translated templates.

3) Add the languages to Liferay's web.xml file

In order to make the language portlet work with the newly introduced languages, you'll have to add the following entries to the web.xml file (ROOT/WEB-INF/web.xml, typically in TOMCAT_HOME/webapps).

<servlet-mapping>
  <servlet-name>I18n Servlet</servlet-name>
  <url-pattern>/fr_CA/*</url-pattern>
</servlet-mapping>
<servlet-mapping>
  <servlet-name>I18n Servlet</servlet-name>
  <url-pattern>/en_CA/*</url-pattern>
</servlet-mapping>
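
If you enable more locales later, the entries above follow a fixed pattern and can be generated. Here is a small Python sketch; the function name is hypothetical, and it only reproduces the two-element structure shown above for whichever locale codes you pass in.

```python
def servlet_mappings(locales, servlet_name="I18n Servlet"):
    """Return one <servlet-mapping> block per locale, in the form shown above."""
    blocks = []
    for locale in locales:
        blocks.append(
            "<servlet-mapping>\n"
            f"  <servlet-name>{servlet_name}</servlet-name>\n"
            f"  <url-pattern>/{locale}/*</url-pattern>\n"
            "</servlet-mapping>"
        )
    return "\n".join(blocks)

xml = servlet_mappings(["fr_CA", "en_CA"])
print(xml)
```

The output still has to be pasted into web.xml by hand, next to the existing I18n Servlet mappings.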

 

------------------------------------------------------------------------------------------------------------------------------

Un petit billet de blogue pour résumer les étapes nécessaires à la préparation d'une installation de Liferay pour un portail 100% canadien.

Nativement, Liferay supporte de nombreuses langues dont le français et l'anglais, les deux langues officielles au Canada. Cependant, la version francophone incluse est fr_FR (français de France) et en_US (anglais des États). Dans la plupart des cas, cela ne posera pas de problème puisque cela concerne surtout les administrateurs qui utilisent l'interface de gestion des contenus. Mais, certaines organisations sont plus pointilleuses sur des éléments linguistiques ("courriel" plutôt que "email", "télécopieur" plutôt que "fax", etc.) et sur les détails visuels (fr_FR plutôt que fr_CA, drapeau Français plutôt que Canadien). C'est la raison pour laquelle je travaille, avec l'aide de Liferay, à faciliter une installation et configuration 100% canadienne. 

Vous verrez dans les quelques instructions qui suivent qu'il est extrêmement simple d'activer les langues fr_CA (français Canadien) et en_CA (anglais Canadien)

1) Activer les langues fr_CA et en_CA

Il s'agit d'ajouter quelques configurations dans portal-ext.properties et system-ext.properties. Pour rappel, ces deux fichiers doivent être situés à la racine d'une installation (liferay-portal...) ou dans le classpath.

system-ext.properties

user.country=CA
user.language=fr
user.timezone=America/Montreal

portal-ext.properties

company.default.locale=fr_CA
company.default.time.zone=America/Montreal
locales=en_CA,fr_CA

2) Installer le hook des traductions

Les configurations de la section précédente ont pour effet d'activer et restreindre les langues au français et l'anglais du Canada. Cependant, aucune traduction spécifique n'est incluse dans Liferay par défaut. Le portail basculera donc automatiquement sur les traductions françaises et anglaises de base.

  • Téléchargez ce hook (Marketplace)
  • Installez-le en le déposant dans le répertoire de scrutation /deploy

Le hook inclut les dernières traductions francophones canadiennes, les drapeaux ainsi que des gabarits traduits (courriels).

 

3) Ajouter les langues dans le fichier web.xml de Liferay

Afin de rendre le portlet "Langues" fonctionnel avec les nouvelles langues, vous devrez effectuer un dernier changement dans le fichier web.xml (ROOT/WEB-INF/web.xml typiquement dans TOMCAT_HOME/webapps).

 

<servlet-mapping>
   <servlet-name>I18n Servlet</servlet-name>
   <url-pattern>/fr_CA/*</url-pattern>
</servlet-mapping>
<servlet-mapping>
   <servlet-name>I18n Servlet</servlet-name>
   <url-pattern>/en_CA/*</url-pattern>
</servlet-mapping>

 
