PloneRSS Design Document
Notes for Plone 3
The MySQL back end made PloneRSS really scalable and very attractive to large-scale
hosters; however, it also made the product difficult to install and put off the
majority of potential small-scale users. As I was rewriting the product pretty much from
scratch (!) I figured I'd correct the design .. flaw. If anyone wants the
SQL stuff back, it can be done if someone wants it badly enough to pay me.
In the meantime, the newer version is looking much smoother, and the portlet
handling is great .. now I've figured out how to make it work .. after 2 days
of reverse engineering broken examples .. :(
- Gareth.
Design Goals
- To run without the need for ZEO
- To satisfy the design requirements of CMFFeed
- To make the system more efficient in a multi-server / virtual hosting environment
- To make installation / upgrades easier
- To get rid of ZClasses and make the product pure Plone
A bit of history
In January 2005 we released CMFFeed, an early attempt to efficiently pipe external RSS feeds into a Plone instance. The idea was that RSS news items would be stored inside the ZODB so they could be searched at a later date and also contribute useful content. The initial system required ZEO (which wasn't too popular) and, as it turned out, had some problems running on Windows, and indeed on releases of Plone after 2.0.
Use of an SQL Database
Instead of using ZEO, which everyone hated, I've now switched to using MySQL (or another SQL database), which I'm guessing everyone is going to hate. The logic goes like this:
Pulling down RSS feeds can be expensive. You might, for example, have 5 servers with 10 instances on each server, with each instance supporting say 100 users. If your average user signs up for 5 feeds, you end up with 5 x 10 x 100 x 5 = 25,000 feed URLs, which might be updating every hour. (This 'could' be quite a lot!)
Now we have the added bonus that if a number of those users subscribe (for example) to Slashdot, the hosting IP is going to get banned for pulling the feed down too frequently from one address.
So, what we do is consolidate all feeds into a central SQL database (that all servers can connect to) and run the interval timer from there. If 10 users share the same feed, we draw it down once, store it in the SQL db, then distribute it to the appropriate Plone instances as required.
Going one better, within the Plone instance, if more than one user subscribes to a feed, that feed is only imported once and the feed links to the RSS URL via the catalogue, so as to prevent massive duplication.
Although it might be possible to store information common to (x) servers in ONE of the Zope instances at root level, there's not really an easy way to provide global access like this to other servers, at least nothing as elegant as using a common SQL connector.
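To make the consolidation idea above concrete, here is a rough sketch rather than the actual PloneRSS schema. It uses Python's built-in sqlite3 module as a stand-in for the central MySQL server, and the table names, column names and helper functions (subscribe, due_feeds, mark_fetched, instances_for) are all made up for illustration. The point it shows is that a feed URL is stored once however many instances subscribe, so the interval timer only has to fetch it once.

```python
import sqlite3

# Hypothetical schema: one row per distinct feed URL, plus a link table
# recording which Plone instances want that feed.
SCHEMA = """
CREATE TABLE feeds (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL UNIQUE,
    last_fetched INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE subscriptions (
    feed_id INTEGER NOT NULL REFERENCES feeds(id),
    instance TEXT NOT NULL,
    UNIQUE (feed_id, instance)
);
"""


def open_db():
    """Open an in-memory database (stand-in for the shared SQL server)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def subscribe(conn, instance, url):
    """Register interest in a feed; the URL itself is only stored once."""
    conn.execute("INSERT OR IGNORE INTO feeds (url) VALUES (?)", (url,))
    feed_id = conn.execute(
        "SELECT id FROM feeds WHERE url = ?", (url,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR IGNORE INTO subscriptions (feed_id, instance) VALUES (?, ?)",
        (feed_id, instance),
    )
    return feed_id


def due_feeds(conn, now, interval=3600):
    """Return each distinct URL whose refresh interval (seconds) has elapsed."""
    rows = conn.execute(
        "SELECT url FROM feeds WHERE last_fetched + ? <= ? ORDER BY url",
        (interval, now),
    )
    return [row[0] for row in rows]


def mark_fetched(conn, url, now):
    """Record that the feed was drawn down at time `now`."""
    conn.execute("UPDATE feeds SET last_fetched = ? WHERE url = ?", (now, url))


def instances_for(conn, url):
    """List the instances the fetched items should be distributed to."""
    rows = conn.execute(
        "SELECT s.instance FROM subscriptions s "
        "JOIN feeds f ON f.id = s.feed_id "
        "WHERE f.url = ? ORDER BY s.instance",
        (url,),
    )
    return [row[0] for row in rows]
```

With this layout, ten instances subscribing to the same URL produce a single row in feeds, so the remote site sees one request per interval instead of ten.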
Caveat:: it requires a "little" savvy on the part of users. When they come to add new feeds, they need to see from a drop down list whether the feed is already on the system and select it, rather than making a new feed import.
Catchall:: Turn on the option in rss_manager that requires feeds to be published; that way the sysadmin can check that users aren't being too dumb!
Where's the Control Panel?
- Don't need one
- Documentation on writing control panel applets leaves a little to be desired, so I wasn't driven to find out how.
Features
- Once the portlet is installed in a side panel, it will recursively look up the acquisition tree looking for valid rss_instances. Once it finds a folder containing a valid instance, it will display information based on all instances within that folder. (i.e. when you create an instance, it will turn on the RSS display for that folder and all sub-folders)
- Adding an instance with no feeds will remove the portlet from a folder; again, this applies downwards to all sub-folders.
- The portlet will generate a portlet box "per instance" you create in a given folder
- When you create an instance you can specify as many feeds as you like to be merged / displayed within the portlet. Ordering is done by publication date/time.
- "More news" uses the standard catalog search to display all items relating to the feeds specified in the instance.
- The system was designed using ArchGenXML for ease of maintenance
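The portlet behaviour listed above can be sketched in plain Python. This is only my reading of the feature list, not the real portlet code: the Folder class is a hypothetical stand-in for a Plone folder (it does not use Plone's acquisition machinery), an instance is modelled as a list of feeds, and a feed as a list of item dictionaries with a "published" key. It shows the upward search for the nearest folder holding instances, one box per instance, an instance with no feeds producing no box, and items merged by publication date (newest first is an assumption).

```python
from datetime import datetime


class Folder:
    """Hypothetical stand-in for a Plone folder; not the real Plone API."""

    def __init__(self, name, parent=None, instances=None):
        self.name = name
        self.parent = parent
        # Each instance is a list of feeds; each feed is a list of items.
        self.instances = instances or []


def find_instances(folder):
    """Walk up from `folder` to the nearest folder that holds instances."""
    while folder is not None:
        if folder.instances:
            return folder.instances
        folder = folder.parent
    return []


def merge_feeds(feeds):
    """Merge the items of several feeds, ordered by publication date."""
    items = [item for feed in feeds for item in feed]
    return sorted(items, key=lambda item: item["published"], reverse=True)


def portlet_boxes(folder):
    """Build one box per instance; an instance with no feeds shows nothing."""
    boxes = []
    for feeds in find_instances(folder):
        if feeds:
            boxes.append(merge_feeds(feeds))
    return boxes
```

In this model a sub-folder with no instances of its own inherits the boxes of the nearest parent that has some, while a folder holding a single empty instance switches the display off for itself and everything beneath it.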