Venus and a single cache directory?
Sam Ruby
rubys at intertwingly.net
Wed Oct 4 10:07:51 EST 2006
David L. Sifry wrote:
> Hey, love Venus (and Planet!) I've been playing around a lot with it. I
> was wondering if there were any plans to allow for a single cache
> directory that a number of Venus (or Planet) installations could share
> - It sure would be a waste of bandwidth to have a few installations that
> shared the same set of base RSS feeds, for example, but had a few that
> were unique.
>
> Any plans? Or could someone point me to the right place in the code that
> would need to be extended to allow for this behavior? I can go brush off
> my python skills and take a few hacks at it. :-)
Venus allows you to invoke 'spider' and 'splice' operations separately.
http://intertwingly.net/code/venus/docs/venus.svg
So... if you had (for example) three config.inis which specified three
(possibly overlapping) sets of feeds, but specified the same
cache_directory, and you defined a fourth config.ini which contained the
union of feeds and specified the same cache_directory, you could
"spider" the latter (fetching the data), and then serially "splice"
using the original definitions (producing the output).
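
As a rough illustration of building that fourth, "union" configuration,
here is a minimal Python sketch. It is not part of Venus. The function
names (feed_sections, union_config) are hypothetical, and it assumes the
global section is named [Planet] and that every other section in a
config.ini is a feed subscription.

```python
import configparser

PLANET = "Planet"  # assumed name of the global (non-feed) section


def feed_sections(cfg):
    """Return the section names treated as feed subscriptions."""
    return [name for name in cfg.sections() if name != PLANET]


def union_config(config_texts, cache_directory):
    """Build one config holding every feed found in the given configs.

    config_texts is an iterable of config.ini contents (strings).
    A feed listed in several configs ends up as a single section.
    """
    # interpolation=None so '%' characters in URLs are left alone
    merged = configparser.ConfigParser(interpolation=None)
    merged.add_section(PLANET)
    merged.set(PLANET, "cache_directory", cache_directory)

    for text in config_texts:
        cfg = configparser.ConfigParser(interpolation=None)
        cfg.read_string(text)
        for feed in feed_sections(cfg):
            if not merged.has_section(feed):
                merged.add_section(feed)
            for key, value in cfg.items(feed):
                merged.set(feed, key, value)
    return merged
```

The merged result would be the configuration you "spider"; the original
three stay as they are and are the ones you "splice".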
- - -
To make things a bit easier to manage, one thing you can do today is to
split out the subscriptions into a separate file, and subscribe to that
file, specifying a content_type. Content types supported today are
"opml" and "foaf". A working example can be found here:
http://intertwingly.net/code/venus/examples/opml-top100.ini
And some of the outputs produced:
http://planet.intertwingly.net/top100/
http://planet.intertwingly.net/top100/mobile.html
The point being that it might be easier to maintain if the subscriptions
lists are kept separate from the rest of the configuration. Each OPML
or FOAF file could be referenced twice, once from a "splice"
configuration, and once from the common "spider" configuration.
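
To make the "referenced twice" idea concrete, here is a small Python
sketch that writes out two configurations pointing at the same
subscription list. It is only an illustration: the helper names
(reading_list_config, render), the file paths, and the [Planet] section
name are assumptions. The content_type values are the ones mentioned
above.

```python
import configparser
import io


def reading_list_config(name, cache_directory, reading_lists):
    """Build a config whose subscriptions are external list files.

    reading_lists is a sequence of (location, content_type) pairs,
    e.g. ("lists/top100.opml", "opml"). Locations here are hypothetical.
    """
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.add_section("Planet")  # assumed name of the global section
    cfg.set("Planet", "name", name)
    cfg.set("Planet", "cache_directory", cache_directory)
    for location, content_type in reading_lists:
        cfg.add_section(location)
        cfg.set(location, "content_type", content_type)
    return cfg


def render(cfg):
    """Return the config as config.ini text."""
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# One "splice" configuration and the common "spider" configuration,
# both referencing the same OPML file and the same cache directory.
splice = reading_list_config("Top 100", "cache",
                             [("lists/top100.opml", "opml")])
spider = reading_list_config("Spider", "cache",
                             [("lists/top100.opml", "opml"),
                              ("lists/friends.rdf", "foaf")])
```

Calling render(splice) or render(spider) gives text you could save as a
config.ini; the subscription list itself lives in only one place.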
Note: if there were interest, additional formats (like XOXO) could easily
be supported, leading to...
- - -
One format that would be trivial to support would be the same config.ini
format that the rest of planet uses. This means that the "spider"
configuration would reduce to a series of sections that merely listed
the other configurations as input. If this is of interest, let me know.
Notes:
* While subscriptions referenced by config.ini files are normally URLs,
they can also simply be relative or absolute file paths.
* If you reference multiple subscription lists, it is OK for
individual subscriptions to appear multiple times; what you will
get is a proper union (i.e., the feed will only be fetched one time).
* What you put in the various config.inis is actually up to you. I've
described a use case where you put subscriptions in there, but you
actually could factor out the common [planet] definitions (like the
cache_directory, for example) into a separate config.ini.
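
The "proper union" of several subscription lists can be pictured with a
short Python sketch. This is not how Venus itself is implemented; the
function names are hypothetical, and it assumes OPML lists in which each
feed is an outline element carrying an xmlUrl attribute.

```python
import xml.etree.ElementTree as ET


def opml_feed_urls(opml_text):
    """Yield the xmlUrl of every outline element that has one."""
    root = ET.fromstring(opml_text)
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            yield url


def union_of_lists(opml_texts):
    """Return each feed URL once, in first-seen order."""
    seen = set()
    ordered = []
    for text in opml_texts:
        for url in opml_feed_urls(text):
            if url not in seen:
                seen.add(url)
                ordered.append(url)
    return ordered
```

A feed that shows up in two lists appears once in the result, so it
would only need to be fetched one time.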
What I like about all this is that you can start simple (like with
"classic" planet) and put everything in one file, but as your needs
grow, you can re-arrange to your heart's content.
- Sam Ruby