Jakub Liska
Document Library API - additional metadata extraction
February 27, 2011 2:50 PM

Hey,

I have a portlet where users upload documents (MS Office OLE2 documents, ODS documents, PDFs, etc.), and I have to persist them together with some extracted metadata (some Dublin Core properties, language detection, etc.).

I know how I would do that without Liferay: I'd probably use Apache Solr with Apache Tika (UpdateRichDocuments and ExtractingRequestHandler), or Apache Jackrabbit, which uses Apache Tika under the hood (org.apache.jackrabbit.extractor.*).

But there is no jackrabbit-text-extractors library in Liferay, so I suppose that if I wanted metadata to be extracted from PDF, DOC, and ODS documents, I would have a very hard time, mostly because I have never used Jackrabbit or the Document Library API.
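
To make it concrete what kind of extraction I mean, here is a minimal sketch for the ODS case only. It uses neither Tika nor any Liferay API, just the JDK (Java 9+ for readAllBytes), and relies on the fact that an OpenDocument file is a ZIP package whose meta.xml entry carries Dublin Core elements such as dc:title, dc:creator and dc:language. The class name OdfDublinCore and the method extract are made up for illustration. It does not cover OLE2 or PDF files, and it does no language detection.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Liferay, Jackrabbit or Tika.
public class OdfDublinCore {

    // Namespace URI of the Dublin Core element set used in ODF meta.xml.
    private static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    // Returns the Dublin Core elements found in meta.xml, keyed by local name
    // (e.g. "title", "creator"). Returns an empty map if there is no meta.xml.
    public static Map<String, String> extract(InputStream odf)
            throws IOException, ParserConfigurationException, SAXException {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(odf)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"meta.xml".equals(entry.getName())) {
                    continue;
                }
                // Reads only the current ZIP entry, not the whole archive.
                byte[] xml = zip.readAllBytes();

                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Uploaded files are untrusted, so turn on the parser's secure processing limits.
                factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
                Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

                // "*" matches every element in the Dublin Core namespace.
                NodeList nodes = doc.getElementsByTagNameNS(DC_NS, "*");
                for (int i = 0; i < nodes.getLength(); i++) {
                    result.put(nodes.item(i).getLocalName(),
                               nodes.item(i).getTextContent().trim());
                }
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java OdfDublinCore <file.ods>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            extract(in).forEach((name, value) -> System.out.println(name + " = " + value));
        }
    }
}
```

In the portlet, the stream passed to extract would be the uploaded file's input stream. What I am really after is the same thing for all the formats above, which is what Tika would give me.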

Could anybody please help me here?
Bijan Vakili
RE: Document Library API - additional metadata extraction
March 5, 2012 2:50 PM

Hey,

I haven't tried this myself, but in 6.1 the following API seems to do what you want:

http://docs.liferay.com/portal/6.1/javadocs/index.html?com/liferay/portlet/documentlibrary/model/package-summary.html

Thanks.
Bijan