Sunday, February 10, 2008

Why do I prefer N-Triples?

A thought experiment (actually, I had to do exactly this a minute ago): you have a number of publications backed up from JeromeDL. Each publication sits in its own folder, named after the publication's ID. Inside you will find a Dublin Core file (XML), a couple of binary files (PDFs and such), and an RDF description of the resource.

The task: map a title to each resource using only the tools you can get on Mac OS X or Linux.

Solution: JeromeDL exports the RDF description in the N-Triples format, which means one statement per line. The solution is therefore a very simple workflow:

  1. find the RDF files

  2. prepare a grep command for each file

  3. execute the commands


On any UNIX system this translates into:
find . -name "rdf.abstract.ntriples" | awk '{print "grep \"xontology#hasTitle\"",$0}' | sh -
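
The same three steps can also be sketched without generating commands through awk, which helps when folder names contain spaces and when you want each title prefixed with the file it came from. This is only a variant under assumptions, not what I actually ran: the function name list_titles is made up for illustration, and the -H flag (print the file name with every match) is not in POSIX, although both GNU grep and the BSD grep shipped with Mac OS X accept it.

```shell
# Hypothetical helper (the name is an assumption, not part of JeromeDL).
# Walks the given directory (default: current one), finds every
# rdf.abstract.ntriples file and prints the lines holding the title
# statement, each prefixed with the path of the file it was found in.
list_titles() {
  find "${1:-.}" -name 'rdf.abstract.ntriples' \
    -exec grep -H 'xontology#hasTitle' {} +
}
```

Calling list_titles . from the backup folder then prints one line per title statement, in the form path:statement. Since each publication folder is named after its ID, the path already tells you which resource the title belongs to.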

Teaser: try to do the same, in as little time as it took me, with an RDF/XML serialization and/or on Windows. Good luck.
