Problem loading images from Blogger in post. (fetch_filter)

Gabriel Labelle glabelle at optimized.org
Tue Oct 24 07:05:07 EST 2006


On Mon, 2006-10-23 at 07:30 -0400, Sam Ruby wrote:

> 
> I'm sorry that I don't know how to make the Coral CDN faster or more 
> reliable, but if the problem is that it takes longer the first time an 
> image is ever referenced, perhaps it would make sense for the filter to 
> fetch the image?  This would make the planet run take longer, but that's 
> generally a background process.
> 
> Thoughts?

I just tried something, but I'm unable to code it correctly. I took the
current coral_cdn_filter and tried to create a fetch_filter from it in
order to do the following:

- Remap a blog engine's image URLs to my local cache location, e.g.

http://photos1.blogger.com/blogger2/454/1217/200/erable.png

to

http://www.quebecois.eu/fetch_cache/photos1.blogger.com/blogger2/454/1217/200/erable.png

*****
"""
Remap all images to take advantage of a local cache after fetching the
images.
"""

import sys, urlparse, xml.dom.minidom

entry = xml.dom.minidom.parse(sys.stdin).documentElement

for node in entry.getElementsByTagName('img'):
    if node.hasAttribute('src'):
        component = list(urlparse.urlparse(node.getAttribute('src')))
        if component[0]=='http' and component[1].find(':')<0:
            component[0] += 'www.quebecois.eu/fetch_cache/'
            node.setAttribute('src', urlparse.urlunparse(component))

print entry.toxml('utf-8')
*****

But the closest I got is this:

httpwww.quebecois.eu/fetch_cache/://photos1.blogger.com/blogger2/454/1217/200/erable.png

Which is obviously wrong. If someone could help me get the correct URL
layout out of the code, the only thing left would be to make the filter
run a wget command before modifying the image's URL:

wget -P /var/www/quebecois.eu/fetch_cache/ -x URL-TO-IMAGE

The -P option would be used to specify the local cache directory and the
-x option would force the creation of a hierarchy of directories, even
if one would not have been created otherwise.
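
As a rough sketch only, the wget step could be driven from inside the
filter along the following lines. The helper names (wget_command,
local_path, fetch) are made up for illustration, the added -q flag just
keeps wget quiet, and the local path assumes the URL has no query
string. It is written for Python 3 (urllib.parse); in Python 2 the same
urlparse function lives in the urlparse module.

```python
import os
import subprocess
from urllib.parse import urlparse

# Assumption: the cache directory from the wget example above.
CACHE_DIR = '/var/www/quebecois.eu/fetch_cache/'

def wget_command(url, cache_dir=CACHE_DIR):
    # -q: quiet, -x: force the host/path directory hierarchy,
    # -P: directory prefix under which that hierarchy is created
    return ['wget', '-q', '-x', '-P', cache_dir, url]

def local_path(url, cache_dir=CACHE_DIR):
    # Where "wget -x -P cache_dir url" is expected to store the file
    # (host name first, then the URL path), for URLs without a query.
    parts = urlparse(url)
    return os.path.join(cache_dir, parts.netloc, parts.path.lstrip('/'))

def fetch(url, cache_dir=CACHE_DIR):
    # Skip the download when the image is already in the cache.
    if os.path.exists(local_path(url, cache_dir)):
        return True
    # wget exits with status 0 on success.
    return subprocess.call(wget_command(url, cache_dir)) == 0
```

The filter would call fetch() on each image URL and only rewrite the src
attribute when it returns True, so a failed download leaves the original
URL in place.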

This last option would simplify the URL remap and prevent two images
with the same filename from overwriting each other.
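
For the remap itself, the malformed URL quoted above comes from
appending the cache prefix to component[0], which is the scheme
("http"), not the host. A possible fix, sketched here under the
assumption that the cache is served from www.quebecois.eu under
/fetch_cache/, is to move the original host into the path and replace
the host with the cache host. The function name remap and the two
constants are illustrative. This version uses Python 3's urllib.parse;
in Python 2, urlparse.urlparse and urlparse.urlunparse behave the same
way.

```python
from urllib.parse import urlparse, urlunparse

# Assumptions: host and path prefix of the local cache.
CACHE_HOST = 'www.quebecois.eu'
CACHE_PREFIX = '/fetch_cache/'

def remap(src):
    # parts = [scheme, netloc, path, params, query, fragment]
    parts = list(urlparse(src))
    # Same test as the original filter (plain http, no explicit port),
    # plus a check so already-remapped URLs are left alone.
    if parts[0] == 'http' and ':' not in parts[1] and parts[1] != CACHE_HOST:
        # Original host becomes the first directory under the prefix,
        # mirroring the hierarchy that "wget -x" creates.
        parts[2] = CACHE_PREFIX + parts[1] + parts[2]
        parts[1] = CACHE_HOST
    return urlunparse(parts)
```

In the filter loop this would replace the two lines that modify
component, i.e. node.setAttribute('src', remap(node.getAttribute('src'))).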

I think it would be great, but I just suck at the coding part.

-- 
Gabriel Labelle <glabelle at optimized.org>