A Data Management Fantasy [en]

[fr] My dream: a system that would cache, in a set amount of space on my SSD (say 50GB), the most recently opened files from my external hard drive. That way, I'd have all my current files close at hand and on a fast drive. Do you know of a solution that does this?

I’m now running a happy MacBook with a 120GB SSD (too big or too small depending on how you look at it, but I was in a hurry and dependent on what was in stock in the shop). I have an external 500GB HDD to store all my junk on.

And here’s my dream. Wouldn’t it be nice if I could devote a certain amount of space on the SSD to my files, say 50GB, and have that space occupied by cached copies of the files from the external drive that I used most recently? When I modify the files, the cached copies and those on the HDD would sync. And if I haven’t touched a file for long enough, it would be removed from the cache to free up space.
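To make the eviction half of this idea concrete, here is a rough Python sketch. It is a toy in-memory model, not a real tool: it assumes a plain least-recently-used policy bounded by a byte budget (a stand-in for "haven't touched it for long enough"), and the class and method names are made up for illustration. Copying files and syncing changes back to the HDD are left out.

```python
from collections import OrderedDict


class RecentFileCache:
    """Toy model of a size-limited cache of recently used files (hypothetical)."""

    def __init__(self, budget_bytes):
        self.budget_bytes = budget_bytes
        self.entries = OrderedDict()  # path -> size, least recently used first

    def used_bytes(self):
        return sum(self.entries.values())

    def touch(self, path, size_bytes):
        """Record that a file was opened or modified; return the evicted paths."""
        self.entries.pop(path, None)  # re-inserting moves it to the newest end
        if size_bytes > self.budget_bytes:
            return []  # too big to cache at all; it stays on the external drive
        self.entries[path] = size_bytes
        evicted = []
        # Drop the least recently used files until the cache fits the budget.
        while self.used_bytes() > self.budget_bytes:
            old_path, _ = self.entries.popitem(last=False)
            evicted.append(old_path)
        return evicted
```

With a 50GB budget, opening a file would call `touch()` with its path and size, and whatever paths come back are the cached copies that can be deleted from the SSD.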

That way, my “current” files would be on the super-fast SSD and close at hand when I’m on the road.

I’m sure a solution to do this already exists — heard of anything?

MagpieRSS Caching Problem [en]

I have a caching problem using the PHP MagpieRSS library to parse feeds. Any help welcome.

[fr] I have a caching problem using the PHP MagpieRSS library. Any help welcome!

I’ve been stuck on a problem with MagpieRSS for weeks. This is a desperate call for help.

At the top of my sidebar, I have two lists of links which are generated by parsing RSS feeds: Delicious Linkball and Recently Playing. They don’t update.

If I delete the cache files, the script creates them all right. If I keep an eye on the cache files, I see their timestamp is updated every hour, but not the contents. I’ve uploaded the PHP code which parses the feeds.

Any suggestions welcome. I’m not far from giving up and setting cron jobs to regularly delete the cache files. Thanks in advance.

Update 13:00: The Recently Playing list updates once an hour (when the cache is “force-refreshed”), it seems — but not the Delicious Links one.

14:00: Some progress: http://del.icio.us/rss/steph/ doesn’t seem to update unless I clear the cache on my machine. (Huh?) http://ws.audioscrobbler.com/rdf/history/Steph-Tara, on the other hand, does update. But why does the cache update only once an hour, and not each time the feed is modified?

15:00: crschmidt just pointed out that the last-modified date on my del.icio.us RSS feed was horribly wrong. Might be something that was done at the time when my caching problems were causing me to nastily abuse the poor del.icio.us server. I’ve sent a mail to Joshua to see if indeed this could be the problem.

15:50: Still thanks to the excellent crschmidt, I’ve finally understood how this caching is supposed to work. (Yes, I know, we’re starting to have lots of edits on this post.) There is a setting which determines how old the cache must be to become “stale”. As long as the cache is not stale, any requests made will use the cache directly, without pulling the feed in question. If the cache is stale, a request is sent to the server hosting the feed to check if it has changed since it was last accessed. If it has changed (i.e., if Last-Modified is more recent than the cache), it gets a fresh version of the feed. Otherwise, nothing happens (the cache age is just “reset”).
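The logic described above can be modelled in a few lines of Python. This is only a sketch of my understanding, not MagpieRSS's actual code: the function name, the dictionary keys and the `conditional_fetch` callback (which stands in for the "has it changed since this date?" request to the server) are all made up for illustration.

```python
def get_feed(entry, max_age_seconds, conditional_fetch, now):
    """Return (body, source) following the stale-cache logic described above.

    entry: dict with "body", "last_modified" and "checked_at" (hypothetical keys).
    conditional_fetch(last_modified): returns None if the server reports the
    feed as unchanged, else a (body, last_modified) tuple with fresh content.
    """
    age = now - entry["checked_at"]
    if age < max_age_seconds:
        # Cache is not stale yet: use it directly, no request is made.
        return entry["body"], "cache"

    # Cache is stale: ask the server whether the feed has changed.
    result = conditional_fetch(entry["last_modified"])
    entry["checked_at"] = now  # either way, the cache age is reset
    if result is None:
        return entry["body"], "not modified"

    entry["body"], entry["last_modified"] = result
    return entry["body"], "refetched"
```

Seen this way, a wrong Last-Modified date on the server side would be enough to make the "has it changed?" check keep answering no, which matches what I was seeing with the del.icio.us feed.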

Now, for a LinkLog service like del.icio.us, setting the cache age to a couple of hours is more than enough as far as I’m concerned. However, for a list of recently played songs, every few minutes should be better. MagpieRSS seems to allow this to be set on a per-call basis by defining MAGPIE_CACHE_AGE, but it doesn’t seem to be working for me. Another variable is set on a per-installation basis: var $MAX_AGE = 1800; — but changing that won’t really help, as I want different values for Recently Playing and Delicious Links. Suggestions on this secondary problem welcome too!

16:40: After exchanging a few e-mails with Joshua, it seems that there was indeed a problem with the Last-Modified date on my feed. Not quite sure how it came about (somebody requesting the feed when I hadn’t posted in some time?), but it should be fixed now. I’ve cleared my cache files to see if my 30-minute “stale time” is working or not.

17:30: (See how I’m updating every 50 minutes? Freaky.) So, the not-so-nice thing about PHP constants is that they are constant and (?) local to the function in which they are defined. (Not sure I got that bit right, but.) The important thing to note here is that MAGPIE_CACHE_AGE can’t be used to set different “stale cache” ages for different feeds. The stale cache age needs to be set at the bottom of rss_fetch.inc (the only place I hadn’t touched), so my cache is now refreshing every half-hour. (Which is a bit too often for del.icio.us, and not often enough for Audioscrobbler.) oqp says he can write a wrapper to get around this limitation, and I’m waiting impatiently for him to do it!
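For what it's worth, the kind of wrapper I have in mind would look up the stale age per feed instead of relying on one constant. Here is the idea as a small Python sketch; it is not oqp's wrapper and not MagpieRSS code. The two ages are my guesses at "a couple of hours" and "every few minutes", and the function names are hypothetical.

```python
DEFAULT_MAX_AGE = 1800  # seconds; the half-hour value currently in use

# Per-feed stale ages in seconds (assumed values, for illustration only).
MAX_AGE_BY_FEED = {
    "http://del.icio.us/rss/steph/": 2 * 60 * 60,
    "http://ws.audioscrobbler.com/rdf/history/Steph-Tara": 5 * 60,
}


def max_age_for(url):
    """Return the stale age for a feed, falling back to the default."""
    return MAX_AGE_BY_FEED.get(url, DEFAULT_MAX_AGE)


def is_stale(url, cached_at, now):
    """True when the cached copy of this feed is old enough to re-check."""
    return now - cached_at >= max_age_for(url)
```

The point is simply that the age travels with each call as an ordinary value, so two feeds on the same page can have different refresh rates.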