Need better separation of instance data

Ben Starr, modified 15 years ago.

Regular Member Posts: 103 Join Date: 27/11/07
Liferay Portal supports multiple "instances" which have their own virtual hosts, each with their own users, communities, etc. For administrative purposes the separation of data between instances needs to be better. At the moment instance data is stored in a single database schema and the file system data is in a single location on the disk (e.g. Jackrabbit repository, Lucene index, etc.).

For various reasons I would like to be able to easily move instances between Liferay deployments. Examples include moving an instance from a development deployment to a production deployment, moving an instance between two production deployments (multiple server hosting without clustering), restoring an instance from a backup after a hardware failure, etc.

At the moment it is not easy to do this because of the way the instance data is stored together. It is possible that I could write a script to extract out all of the instance-specific data from the database but I'm not sure if there will be problems with referenced IDs in different databases not matching if I tried to do an import into another database. I'm pretty sure it would be impossible to separate out the file system data at the moment and I would rather not have to store all of my files in the database.
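
To make the ID concern concrete, here is a minimal sketch of what an import step would have to do: assign fresh primary keys in the target database and rewrite every reference that pointed at an old key. The row layout, the `parent_id` column and the `remap_ids` helper are all hypothetical and are not taken from the Liferay schema; the sketch also assumes the exported IDs are unique across the whole export.

```python
from itertools import count


def remap_ids(rows, fk_fields, next_id):
    """Assign fresh primary keys and rewrite foreign keys to match.

    rows: list of dicts, each with an "id" key (hypothetical layout).
    fk_fields: names of columns that reference another row's "id".
    next_id: iterator yielding IDs that are unused in the target database.
    """
    # First pass: choose the new ID for every exported row.
    id_map = {row["id"]: next(next_id) for row in rows}

    remapped = []
    for row in rows:
        new_row = dict(row)
        new_row["id"] = id_map[row["id"]]
        for field in fk_fields:
            old_ref = row.get(field)
            # References to rows outside the export (e.g. shared reference
            # data) are left untouched and would need separate handling.
            if old_ref in id_map:
                new_row[field] = id_map[old_ref]
        remapped.append(new_row)
    return remapped, id_map


exported = [
    {"id": 10, "parent_id": None, "name": "root folder"},
    {"id": 11, "parent_id": 10, "name": "child folder"},
]
# Pretend the target database has already used IDs up to 499.
imported, id_map = remap_ids(exported, ["parent_id"], count(500))
print(imported)
```

The awkward part is the branch that leaves unknown references alone: anything pointing at data shared between instances cannot be remapped without knowing the matching row in the target database.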

A simple way to provide this functionality would be to store each instance in a different database schema and have a different file system data directory for each instance. I don't really mind if there is no instance import/export function in the GUI and in the case of a hardware failure the export function would not be very useful anyway because you would have to do a full restore of the deployment in order to use the export function.

If this proposal were adopted then in theory it should be possible to move an instance to another Liferay deployment simply by moving the database schema and file system data across. It would require some modifications to the instance maintenance function in the admin portlet to allow you to specify the database schema and file system data location when adding a new instance, and you would need the ability to add an existing instance (i.e. "register" an instance that has been copied from elsewhere by specifying the standard instance information, database schema and file system data location). It would also be useful to be able to delete an instance (this could just be an "un-register" function; it doesn't have to delete the actual instance data).
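
As a rough illustration of the register/un-register idea, the sketch below keeps a small registry that maps each instance to its own schema and data directory. Nothing here is an existing Liferay API: `InstanceRecord`, `InstanceRegistry` and all field names are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceRecord:
    """Hypothetical description of one portal instance."""
    web_id: str
    virtual_host: str
    db_schema: str
    data_dir: str


class InstanceRegistry:
    """Tracks which instances a deployment serves; never touches their data."""

    def __init__(self):
        self._instances = {}

    def register(self, record):
        # Works the same for a brand-new instance and for one copied
        # from another deployment: only the pointers are stored.
        if record.web_id in self._instances:
            raise ValueError(f"instance already registered: {record.web_id}")
        self._instances[record.web_id] = record

    def unregister(self, web_id):
        # Forgets the instance but leaves schema and files in place.
        return self._instances.pop(web_id)

    def lookup_by_host(self, host):
        for record in self._instances.values():
            if record.virtual_host == host:
                return record
        return None


registry = InstanceRegistry()
registry.register(
    InstanceRecord("abc", "www.abc.example", "lportal_abc", "/data/liferay/abc")
)
print(registry.lookup_by_host("www.abc.example"))
```

Moving an instance would then amount to copying the schema and data directory, calling `unregister` on the old deployment and `register` on the new one.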

The addition of this functionality would make the creation of new instances slightly more complex but for the benefits it would be worth it. At the moment the Liferay instance functionality does not offer the full flexibility of true virtual hosting and the alternative of having separate Liferay deployments for each instance is not viable given the amount of server resources a single deployment requires.

A slightly separate improvement that would offer more flexible instance virtual hosting would be the ability to configure all of the properties in the portal.properties file on a per-instance basis (except for the ones that are obviously deployment-specific rather than instance-specific).
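
A minimal sketch of how per-instance properties could be layered over the deployment-wide ones, assuming a simple "instance value wins, otherwise fall back to the global value" rule. The `instance_properties` helper and the `DEPLOYMENT_ONLY` list are hypothetical, and the property names are used purely for illustration.

```python
from collections import ChainMap

# Keys that describe the deployment itself and must not vary per instance.
# These names are illustrative, not an authoritative list.
DEPLOYMENT_ONLY = {"jdbc.default.url", "jdbc.default.username"}


def instance_properties(global_props, instance_overrides):
    """Return a read-through view: overrides first, then global defaults."""
    blocked = DEPLOYMENT_ONLY.intersection(instance_overrides)
    if blocked:
        raise ValueError(
            f"deployment-specific keys cannot be overridden: {sorted(blocked)}"
        )
    # ChainMap searches its maps left to right, so overrides take priority.
    return ChainMap(dict(instance_overrides), global_props)


global_props = {
    "company.default.locale": "en_US",
    "jdbc.default.url": "jdbc:mysql://localhost/lportal",
}
props = instance_properties(global_props, {"company.default.locale": "pt_BR"})
print(props["company.default.locale"], props["jdbc.default.url"])
```

Keeping the deployment-only keys in an explicit list is one way to express the "except for the ones that are obviously deployment-specific" rule; where that line is drawn would be a design decision for the core developers.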
Ben Starr, modified 15 years ago.

RE: Need better separation of instance data

Regular Member Posts: 103 Join Date: 27/11/07
Would anyone else who is using multiple instances find this useful? The instance data isn't separated well enough at the moment to provide the full flexibility of virtual hosting.
Jeffrey Handa, modified 15 years ago.

RE: Need better separation of instance data

Liferay Master Posts: 541 Join Date: 01/12/08
Hi Ben,

I don't work with multiple instances directly, but the feature you are describing sounds very useful. Would something like what Alex describes in this blog about sharding be a possibility?

http://www.liferay.com/web/achow/blog/-/blogs/sharding-databases-an-unorthodoxed-design-1?_33_redirect=%2Fweb%2Fachow%2Fblog
Mika Koivisto, modified 15 years ago.

RE: Need better separation of instance data

Liferay Legend Posts: 1519 Join Date: 07/08/06
Sharding is the answer to this "problem". And yes it is something that I find useful.
Josh Asbury, modified 15 years ago.

RE: Need better separation of instance data

Expert Posts: 498 Join Date: 08/09/06
Ben,

I think that this is being addressed in Sharding (see Alex's blog post here: http://www.liferay.com/web/achow/blog/-/blogs/sharding-databases-an-unorthodoxed-design-1). I think that this will be very useful, though I haven't touched it yet...

Josh
Ben Starr, modified 14 years ago.

RE: Need better separation of instance data

Regular Member Posts: 103 Join Date: 27/11/07
Sharding certainly looks promising but I'm a bit unsure about a few things:

1. Although it looks like you can use it to split data between databases (e.g. on company ID), I'm not sure that it will allow you to move data between Liferay Portal instances. What I want to be able to do is move all of the data for one company from one Liferay Portal instance to another. If I were using sharding on both Liferay Portal instances then in theory I could remove the database for a given company from the shards of one Liferay Portal instance and add it to the shards of another instance of Liferay Portal. The problems I see here are a) the primary keys are probably specific to one Liferay Portal instance's shards and b) I am assuming there is going to be reference data not specifically related to companies that might also have different IDs in a different Liferay Portal instance's shards. Where does the shared reference data get stored in the shard?

2. It does not address the issue of splitting disk storage between instances. At the moment uploaded files get stored on disk in the Jackrabbit repository. I don't really want to store these in the database as it will bloat it significantly. The problem at the moment is that the files for all companies are stored together in a single Jackrabbit repository, so it does not seem possible to move the files for one company from one Liferay Portal instance to another.

Sorry for the delay in replying - I was away on leave and have been busy since I returned.
Ben Starr, modified 14 years ago.

RE: Need better separation of instance data

Regular Member Posts: 103 Join Date: 27/11/07
Can anyone advise whether sharding will solve the problem of being able to move instances between Liferay Portal deployments with respect to the issues in my previous message (i.e. different primary keys in different databases and shared reference data)? I can see that sharding could be used to split information per company but I'm not sure whether it will allow moving of instances between Liferay Portal deployments. For me this would be a really useful and important feature.

Thanks.
Lisa Simpson, modified 14 years ago.

RE: Need better separation of instance data

Liferay Legend Posts: 2034 Join Date: 05/03/09
That sounds really useful in getting Liferay into virtual hosting environments. I'm all for that...
N J, modified 14 years ago.

RE: Need better separation of instance data

New Member Posts: 12 Join Date: 22/06/09
Ben Starr:
Can anyone advise whether sharding will solve the problem of being able to move instances between Liferay Portal deployments with respect to the issues in my previous message (i.e. different primary keys in different databases and shared reference data)? I can see that sharding could be used to split information per company but I'm not sure whether it will allow moving of instances between Liferay Portal deployments. For me this would be a really useful and important feature.


It won't solve that problem. It's probably a step in the right direction, but even with sharding, you would need to do some work to move the data and to make sure that referential integrity is maintained.

Additionally, as you brought up, Jackrabbit data isn't sharded. This is a big problem for us. An alternative to sharding the Jackrabbit data that would be acceptable (though maybe not as good) would be the ability to compartmentalize the Jackrabbit data by source. What I mean by this is that one could theoretically use a different Jackrabbit repository for each portlet that uses Jackrabbit.

Does anybody have any ideas about how we could do this (compartmentalize Jackrabbit data into different databases)? Or any other ideas about making the Jackrabbit storage scalable?
Lisa Simpson, modified 14 years ago.

RE: Need better separation of instance data

Liferay Legend Posts: 2034 Join Date: 05/03/09
Just as a suggestion for the future, wouldn't it be better to separate out what Liferay as a server needs from each virtual host? Put the Liferay stuff in lportal and then put each virtual in its own database so you can just toss them around as you need to... the same way we do virtual machines on physical hardware. It would help in balancing loads on VMs, for sure, if they were easier to move around.
Ben Starr, modified 14 years ago.

RE: Need better separation of instance data

Regular Member Posts: 103 Join Date: 27/11/07
It is good to see that there are other people who would also find this functionality useful. I think there needs to be better data separation in a number of ways:
  • Between Liferay core and instances
  • Between Liferay core and instances at the database level
  • Between Liferay core and instances at the file system level (e.g. Jackrabbit)

The ideal scenario would be the ability to move an instance to another Liferay deployment simply by "registering" the instance with the other Liferay deployment and pointing it to the database and Jackrabbit repository of that instance (assuming the same Liferay version is used on both deployments). Of course there would be other configuration required, such as moving CNAMEs, virtual host configuration, etc., but at the data level this should be all that is required.

Is this something that the Liferay core developers see as important? I think it would make the system a lot more attractive to hosting providers and make backup and recovery a lot more straightforward. At the moment the intermingling of data between instances is a bit scary and I am not relishing having to move an instance or recover one in the case of failure.

Ben
Lisa Simpson, modified 14 years ago.

RE: Need better separation of instance data

Liferay Legend Posts: 2034 Join Date: 05/03/09
Not just that but consider load balancing...

ABC and XYZ both start off with a new web site. They both purchase a basic hosting package and start building their community. XYZ takes off and becomes wildly popular. So popular in fact, that it starts crashing the server just from sheer traffic numbers. If you're the hosting provider, you might need to move XYZ to better hardware but since ABC, JLK, MNO, etc. aren't wildly popular, you don't want to have to move them too just to keep XYZ from crashing your other customers.

That's a pretty common scenario with a hosting company.

I can also see where having virtuals for "vanity sites", special events, etc. might be desirable even for a "regular" Liferay instance.
Lisa Simpson, modified 14 years ago.

RE: Need better separation of instance data

Liferay Legend Posts: 2034 Join Date: 05/03/09