But when you access a paper it’s for reading it, correct?
It is worrying if places that are “libraries” of knowledge aren’t taking the opportunity to keep searchable/parseable data, but it’s no worse than a library of books.
That's not my complaint in the first place. The problem is that while devices have moved well beyond books, even in something as basic as the viewport, we seemingly can't move past the letter-sized paged format. The format may be a bit better than books, what with being easy to distribute and occasionally having copyable text, but not by enough.
I'm not even touching the topic of info extraction here, since it's pretty hard on its own, even though it too is easier with HTML.
Yeah, it's better with HTML than with PDF, but it's still pretty terrible... Use an actually structured data format like XML (XHTML would be good), because you don't want to include a complete browser just to search for text.
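To make the "no browser needed" point concrete, here's a rough sketch in Python using only the standard library's XML parser. The sample document and the `find_paragraphs` helper are made up for illustration, and it assumes the XHTML is well-formed and has no named entities like `&nbsp;`, which this parser would reject without the DTD.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace; ElementTree spells it in braces.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical stand-in for a paper published as XHTML.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample paper</title></head>
  <body>
    <h1>Introduction</h1>
    <p>PDF keeps the <em>paged</em> layout of print.</p>
    <h1>Method</h1>
    <p>Structured markup can be searched with a plain XML parser.</p>
  </body>
</html>"""


def find_paragraphs(xhtml_text, needle):
    """Return the text of every <p> that contains `needle` (case-insensitive)."""
    root = ET.fromstring(xhtml_text)
    needle = needle.lower()
    hits = []
    for p in root.iter(XHTML_NS + "p"):
        # itertext() joins text across inline tags such as <em>,
        # so a phrase split by markup is still found.
        text = "".join(p.itertext())
        if needle in text.lower():
            hits.append(" ".join(text.split()))  # normalise whitespace
    return hits


matches = find_paragraphs(SAMPLE, "paged layout")
```

The search phrase straddles an `<em>` tag and is still matched, which is the kind of thing that tends to go wrong with text scraped out of a PDF. No rendering engine is involved, just a parser.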