RDF tools as workhorse from Geoff Chappell on 2005-09-14 (semantic-web@w3.org from September 2005)
Hi Mark,
-----Original Message-----
From: semantic-web-request@w3.org [mailto:semantic-web-request@w3.org] On Behalf Of Mailing Lists
Sent: Tuesday, September 13, 2005 4:47 PM
To: semantic-web@w3.org
Subject: RDF tools as workhorse
Hi all,
Does anyone on the list have some real-world stories to share about using RDF and its tools as a backend technology? The company I work for maintains a database of metadata. I'd like to explore using RDF instead of our current schemas.
For example: I have a lot of data about books. I'd like to translate the data into RDF/XML and dump it into an RDF database. Then, taking a particular book, I'd like to query the database to extract related information like: other books by the same author, other books with the same subject code, etc.
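As a rough sketch of the kind of lookup described above, the plain-Python snippet below models triples as tuples and matches them against a pattern in which None acts as a wildcard. The `ex:` identifiers, the sample books, and the `match` and `related_books` helpers are all made up for illustration; a real RDF store would expose this through its own query language rather than these functions.

```python
# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:book1", "ex:author", "ex:author1"),
    ("ex:book1", "ex:subject", "ex:FIC"),
    ("ex:book2", "ex:author", "ex:author1"),
    ("ex:book2", "ex:subject", "ex:HIS"),
    ("ex:book3", "ex:author", "ex:author2"),
    ("ex:book3", "ex:subject", "ex:FIC"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None is a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def related_books(triples, book, predicate):
    """Find other books that share a value of `predicate` with `book`."""
    related = set()
    # Step 1: look up the values this book has for the predicate.
    for _, _, value in match(triples, s=book, p=predicate):
        # Step 2: find every subject with the same value, excluding the book itself.
        for other, _, _ in match(triples, p=predicate, o=value):
            if other != book:
                related.add(other)
    return sorted(related)


# Other books by the same author, then other books with the same subject code.
print(related_books(TRIPLES, "ex:book1", "ex:author"))
print(related_books(TRIPLES, "ex:book1", "ex:subject"))
```

Both lookups use the same two-step pattern match, so adding a new kind of relationship only means passing a different predicate.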
My concerns relate to:
- Performance -- Right now we query the database using SQL. Sometimes it is very slow. That's mainly because the data is distributed across tables and there are a lot of joins going on. It seems like using RDF would allow us to use simple queries.
- Scalability -- Our triplestore would be HUGE. I'd estimate 10-20 million triples. Is that small or large in RDF circles?
As a real-world example of performance and scalability, you might be interested in some work we did with the UniProt protein database (262 million triples) and RDF Gateway. See:
http://labs.intellidimension.com/uniprot/default.rsp
for a description of the effort and some live example queries (including a link to an experimental SPARQL query interface).
- Productivity -- It's usually easier for me to envision creating RDF from our source data than massaging the data to fit into our database schema. The same goes for when I'm extracting data - it seems like it would be much easier to express my query as a triple using wildcards for the data I want.
One of the big benefits I find in working with RDF is the ability to evolve and adapt your data as your project changes. A relational schema is something you're really forced to get right the first time, because it's typically pretty inflexible once an app is built around it. An RDF schema is much more fluid and seems to allow for a more iterative development model (particularly if your store supports inference that lets you easily map values as the schema changes).
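To make the inference point concrete, here is a minimal sketch, assuming a schema in which a property was renamed from `ex:writer` to `ex:author` partway through a project. It applies the RDFS-style subproperty rule (if `p` is a subproperty of `q` and `s p o` holds, then `s q o` holds) in plain Python. The `ex:` names, the sample data, and the `infer_subproperties` helper are hypothetical, and this is not a description of how RDF Gateway or any particular store implements inference.

```python
SUBPROPERTY_OF = "rdfs:subPropertyOf"

# Hypothetical data: one book recorded under the old property name,
# one under the new name, plus a schema statement linking the two.
TRIPLES = {
    ("ex:book1", "ex:writer", "ex:author1"),      # older data
    ("ex:book2", "ex:author", "ex:author1"),      # newer data
    ("ex:writer", SUBPROPERTY_OF, "ex:author"),   # schema statement added later
}


def infer_subproperties(triples):
    """Apply (s p o), (p subPropertyOf q) => (s q o) until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Collect the (subproperty, superproperty) pairs currently known.
        sub_props = {(s, o) for s, p, o in inferred if p == SUBPROPERTY_OF}
        for s, p, o in list(inferred):
            for sub, sup in sub_props:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


closure = infer_subproperties(TRIPLES)

# A query written against the new property also finds the older data.
books = sorted(s for s, p, o in closure if p == "ex:author" and o == "ex:author1")
print(books)
```

The older triples are never rewritten here; the single schema statement is what makes them visible to queries that use the new property name.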
Any information will be helpful. I'm interested in learning from other people's experiences.
Thanks, Mark
..oO Mark Donoghue
..oO e: mark@ThirdStation.com
..oO doi: http://dx.doi.org/10.1570/m.donoghue
Best,
Geoff Chappell
Received on Wednesday, 14 September 2005 11:19:19 UTC