Monday, March 13, 2006

DMoz vs RDF Repository (Sesame)

What can you do - there are some days that just should not happen. I think one of them was the day I decided to integrate the DMoz ontology into JeromeDL and FOAFRealm. It all looked so harmless - especially when I got a (small!) part of the ontology from Andreas. I built my whole mental model around that, and finally even set up the JOnto project to deliver a unified API for handling taxonomies. And ...
I decided to download the DMoz RDF, or whatever it is that they claim to be RDF :( It took me some time to realize what was wrong. Eventually I got some help from Hee Chul and Krystian, who provided nice conversion scripts. I thought that was the end of the problems - I had a sample RDF (a true one) that worked, and a real RDF version of the full DMoz ontology. But it was not the end of the problems :(
I decided to upload the 800MB RDF-DMoz file to Sesame. But after a couple of hours of waiting, with 100% CPU usage and my laptop's CPU temperature at almost 80C, I gave up.


Daniel suggested I should just point to the RDF file and create a memory repository. Well - it went quickly that way - straight to an "out of heap memory" error :( Later I took the "divide and conquer" approach and cut the 800MB file into 10 smaller ones. The first one got uploaded very quickly (relatively). Encouraged by that, I started uploading the remaining 9. Each one took much, much longer to upload than the one before, until the 10th, which must have made Sesame hang - there was no progress for the whole night (I went to sleep, btw).
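The "divide and conquer" step could look roughly like the sketch below. It is only an illustration, not the script actually used: it assumes the converted DMoz data is available in a line-based serialization such as N-Triples (one triple per line), so that a file can be cut at any line boundary without breaking the RDF. The names chunk_lines and split_file and the ".partNN" suffix are made up for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunk_lines(lines: Iterable[str], lines_per_chunk: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `lines_per_chunk` lines."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    it = iter(lines)
    while True:
        chunk = list(islice(it, lines_per_chunk))
        if not chunk:
            return
        yield chunk


def split_file(path: str, lines_per_chunk: int) -> List[str]:
    """Write each chunk to `<path>.partNN` (hypothetical naming) and return the names."""
    names = []
    # Reading the file line by line keeps memory use flat, whatever the file size.
    with open(path, encoding="utf-8") as src:
        for i, chunk in enumerate(chunk_lines(src, lines_per_chunk), start=1):
            name = f"{path}.part{i:02d}"
            with open(name, "w", encoding="utf-8") as dst:
                dst.writelines(chunk)
            names.append(name)
    return names
```

Splitting by line count only makes sense for a line-based format; an RDF/XML file would have to be cut on element boundaries instead, with the header repeated in every part.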

I cleaned the repository and uploaded only the first chunk again. But trying to use it - with browsing or SeRQL querying - was way too sluggish. Finally I came to my senses and "slimmed" the DMoz RDF, removing (with a modified version of Krystian's script) all information that was not defining dmoz:Topic or using dc:title and dmoz:narrow{12}. Luckily, I got a 200MB RDF file that went smoothly into Sesame.
And now JOnto-DMoz is finally kicking ass :)
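The "slimming" filter can be sketched as follows. This is not Krystian's script, just a minimal illustration under stated assumptions: the data is in N-Triples form, and the dmoz: and dc: prefixes expand to the namespace URIs below (they should be adjusted to whatever the converted file really uses). The function names keep_triple and slim are hypothetical.

```python
from typing import Iterable, Iterator

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
# Assumed namespace URIs; check them against the converted file.
DMOZ_NS = "http://dmoz.org/rdf/"
DC_TITLE = "<http://purl.org/dc/elements/1.0/title>"
DMOZ_TOPIC = f"<{DMOZ_NS}Topic>"
# Prefix match covers dmoz:narrow, dmoz:narrow1 and dmoz:narrow2.
NARROW_PREFIX = f"<{DMOZ_NS}narrow"


def keep_triple(line: str) -> bool:
    """Return True for triples that type a dmoz:Topic or use dc:title / dmoz:narrow*."""
    if line.lstrip().startswith("#"):
        return False  # N-Triples comment line
    parts = line.strip().split(None, 2)
    if len(parts) < 3:
        return False  # blank or malformed line
    _subject, predicate, rest = parts
    if predicate == RDF_TYPE:
        # `rest` is the object followed by the terminating " ."
        return rest.rstrip(" .") == DMOZ_TOPIC
    if predicate == DC_TITLE:
        return True
    return predicate.startswith(NARROW_PREFIX)


def slim(lines: Iterable[str]) -> Iterator[str]:
    """Lazily yield only the lines worth keeping."""
    return (line for line in lines if keep_triple(line))
```

Because slim is lazy, it can be run over a file object and written straight to a new file without loading the whole dump into memory.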

[I will upload the scripts and the final RDF file to jonto.sf.net soon]

Wednesday, March 01, 2006

Carnival 2006


pict8974
Originally uploaded by skruk.

And so we made it. Everyone said it would be a shame to be in Rio during the Carnival and miss the Samba Schools Parade.
Luckily I even managed to convince myself to take my precious camera with me, shoot almost 600 photos, and bring them all home safe.
The funny thing is that whatever people say about Brazil, and although you can easily get paranoid about your safety, Cariocas are really nice people. During the parade the only hostile people were two Englishmen trying to quarrel with everyone (including me) who tried to get too close to the fence (they claimed to be the only ones who should occupy that spot and shoot photos). Weird, isn't it?
Anyway - the presentation was magnificent - we were only sorry we could not make it to the end - after 5 schools we decided to go home (it was 6am, BTW), as I could not see a thing (600 photos in 6h can make your eyes hurt - believe me) and Ewelina was tired as well.