Venus and a single cache directory?

David L. Sifry dsifry at technorati.com
Wed Oct 4 10:22:26 EST 2006


Sam,

Thanks for the response.  I've got a few questions inline:

Sam Ruby wrote:
> David L. Sifry wrote:
>> Hey, love Venus (and Planet!). I've been playing around a lot with 
>> it.  I was wondering if there were any plans to allow for a single 
>> cache directory that a number of Venus (or Planet) installations 
>> could share.  It sure would be a waste of bandwidth to have a few 
>> installations that shared the same set of base RSS feeds, for 
>> example, but had a few that were unique.
>>
>> Any plans? Or could someone point me to the right place in the code 
>> that would need to be extended to allow for this behavior? I can go 
>> brush off my Python skills and take a few hacks at it. :-)
>
> Venus allows you to invoke 'spider' and 'splice' operations separately.
>
>     http://intertwingly.net/code/venus/docs/venus.svg
>
> So... if you had (for example) three config.inis which specified three 
> (possibly overlapping) sets of feeds, but specified the same 
> cache_directory, and you defined a fourth config.ini which contained 
> the union of feeds and specified the same cache_directory, you could 
> "spider" the latter (fetching the data), and then serially "splice" 
> using the original definitions (producing the output).
>
OK, I am already using the OPML capabilities for each config.ini, but if 
I use the same cache directory, the results look very strange, showing 
feeds in one planet that aren't listed in the OPML of its config.ini.  
Is there any additional information that is kept in the cache directory 
to preserve state? I'm using the latest code from 
http://intertwingly.net/code/venus/.
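
For concreteness, here is a minimal sketch of how the "union" config.ini 
described above could be generated from the individual ones.  It is not 
part of Venus: the helper names feed_sections and union_config are 
hypothetical, it uses Python 3's configparser, and it assumes that any 
section whose name starts with http:// or https:// is a feed.

```python
import configparser


def feed_sections(ini_text):
    """Return the section names in ini_text that look like feed URLs.

    Assumption: a feed is any section whose name starts with http(s)://.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {s for s in parser.sections()
            if s.startswith(("http://", "https://"))}


def union_config(ini_texts, cache_directory):
    """Build a config listing every feed exactly once.

    The result carries only a [planet] section with the shared
    cache_directory plus one empty section per distinct feed URL.
    """
    merged = configparser.ConfigParser(interpolation=None)
    merged["planet"] = {"cache_directory": cache_directory}
    for text in ini_texts:
        for url in sorted(feed_sections(text)):
            if not merged.has_section(url):
                merged.add_section(url)
    return merged
```

The result can be written out with merged.write(open_file) and used only 
for the "spider" step, while the original config.ini files are used for 
the "splice" step.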

>  - - -
>
> To make things a bit easier to manage, one thing you can do today is 
> to split out the subscriptions into a separate file, and subscribe to 
> that file, specifying a content_type.  Content types supported today 
> are "opml" and "foaf".  A working example can be found here:
>
>     http://intertwingly.net/code/venus/examples/opml-top100.ini
>
> And some of the outputs produced:
>
>     http://planet.intertwingly.net/top100/
>     http://planet.intertwingly.net/top100/mobile.html
>
> The point being that it might be easier to maintain if the 
> subscriptions lists are kept separate from the rest of the 
> configuration.  Each OPML or FOAF file could be referenced twice, once 
> from a "splice" configuration, and once from the common "spider" 
> configuration.
Could you show me what a "spider" configuration would look like, and 
what a "splice" configuration would look like? Are these different 
config options in the config.ini files?
>
> Note: if there was interest, additional formats (like XOXO) could 
> easily be supported, leading to...
>
>  - - -
>
> One format that would be trivial to support would be the same 
> config.ini format that the rest of planet uses.  This means that the 
> "spider" configuration would reduce to a series of sections that 
> merely listed the other configurations as input.  If this is of 
> interest, let me know.
>
> Notes:
>
>  * While subscriptions referenced by config.ini files are normally URLs,
>    they can also simply be relative or absolute file paths.
>
>  * If you reference multiple subscription lists, it is OK for
>    individual subscriptions to appear multiple times; what you will
>    get is a proper union (i.e., the feed will only be fetched one time).
>
>  * What you put in the various config.inis is actually up to you.  I've
>    described a use case where you put subscriptions in there, but you
>    actually could factor out the common [planet] definitions (like the
>    cache_directory, for example) into a separate config.ini.
>
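
The "proper union" in the second note can be illustrated with a short 
sketch.  This is not Venus's own code: opml_feeds and union_of_lists are 
hypothetical names, and the sketch assumes each OPML outline carries its 
feed address in an xmlUrl attribute.

```python
import xml.etree.ElementTree as ET


def opml_feeds(opml_text):
    """Return the feed URLs (xmlUrl attributes) found in one OPML document."""
    root = ET.fromstring(opml_text)
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]


def union_of_lists(opml_texts):
    """Merge several subscription lists, keeping each feed URL once.

    dict.fromkeys preserves first-seen order while dropping duplicates.
    """
    urls = []
    for text in opml_texts:
        urls.extend(opml_feeds(text))
    return list(dict.fromkeys(urls))
```

A feed that appears in two lists shows up once in the result, so it 
would only need to be fetched once.
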
> What I like about all this is that you can start simple (like with 
> "classic" planet) and put everything in one file, but as your needs 
> grow, you can re-arrange to your heart's content.
>
I love this philosophy, but maybe I just need a bit more handholding - 
or if you could show me where in the code these options are defined, 
I'll go experiment...
> - Sam Ruby
Thanks again!

Dave

-- 
David L. Sifry
Founder and CEO, Technorati, Inc.
dsifry at technorati.com
415 846-0232 (Mobile)
