On 8 May 2008, at 12:03, Scott Wilson wrote:
> We adopted RSS-style, as there were no appreciable performance or
> synchronization problems associated with an RSS-style approach even
> at the scale of syndicating 750,000 courses.
That's a useful data point.
> It's actually entirely feasible for the aggregator to process the
> whole state and implement diffs rather than to do selective
> harvesting with updates and deletions; and it's far, far easier on
> the provider, which doesn't then have to observe and propagate
> transaction states as updates/deletes; it just has to dump the
> catalogue state as-is.
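For anyone who hasn't tried it, the aggregator-side diff is only a few lines. This is a rough sketch, not anyone's actual implementation: it assumes each dump has already been parsed into a dict keyed by a stable record identifier, and the function name and record layout are made up for illustration.

```python
def diff_snapshots(previous, current):
    """Compare two full catalogue dumps (dicts mapping record id -> record).

    Returns (added, updated, deleted). Assumes identifiers are stable
    between dumps; that is an assumption of this sketch, not something
    RSS itself guarantees.
    """
    prev_ids = set(previous)
    curr_ids = set(current)

    # Records present now but not before.
    added = {rid: current[rid] for rid in curr_ids - prev_ids}
    # Records present before but missing now.
    deleted = sorted(prev_ids - curr_ids)
    # Records present in both dumps whose content has changed.
    updated = {
        rid: current[rid]
        for rid in curr_ids & prev_ids
        if current[rid] != previous[rid]
    }
    return added, updated, deleted


# Hypothetical data: two successive dumps of a tiny course catalogue.
yesterday = {"c1": {"title": "Algebra"}, "c2": {"title": "Botany"}}
today = {"c2": {"title": "Botany II"}, "c3": {"title": "Chemistry"}}

added, updated, deleted = diff_snapshots(yesterday, today)
```

The provider never has to record what changed; the aggregator works it out from two successive dumps, at the cost of holding the previous one.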
It's entirely possible that the OAI-PMH harvesting model was a
response to arXiv's historic paranoia about unrestrained web crawlers.
You can still see evidence of it today in
<http://arxiv.org/RobotsBeware.html>. Nowadays the injunction to beware
of "gigabytes" of data seems a little old-fashioned.
> This approach reduces the barrier to entry significantly. You can do
> an RSS feed, even with extra SWAP/PRISM/whatever metadata, for a
> departmental no-frills publications website; you can't do an OAI-PMH
> for it without investing in the whole repository shebang
...unless you use the available Perl, Java or PHP modules for your
websites and services...
> and having some friendly geek get the hood up to play with the OAI-
> PMH code until it (sort of) (almost) works (ish).
A development methodology from which all other forms of network
protocol are doubtless immune.
> Overall OAI-PMH comes across as violating the YAGNI principle; this
> is a very, very common issue with "anticipatory" standards (I've
> made the same mistakes myself in IMS work). So, good effort back in
> the day, but time to move on!
I'm not sure that OAI was based on much anticipation - it was more an
answer to the question "how can we existing archives share our
holdings openly?".
But as you say, things have moved on. We have REST and RSS interfaces,
and I'd be very interested to hear from anyone who has built wide-
scale services using them.
--
Les