Tuesday, March 20, 2007

Machine Analysis of Scientific Papers

There's a lot of exciting work going on in NLP right now, and it's hard to keep up with it all, and even harder to maintain a blog about it! Larry pointed me to an article from last year describing an automated tool for analyzing and comparing experimental reports.

This project sounds like some sort of XML markup scheme for outlining scientific papers, similar to the ontologies powering the semantic web initiative. It probably involves too much overhead to be widely adopted, and therefore to be useful, since it likely requires authors to spend an awful lot of additional time structuring their papers so that the EXPO system can parse them. A much more elegant method would be for the system to analyze and mark up the text automatically; however, that would require NLP technology beyond what's currently available.

As you might imagine, a similar hurdle exists for the adoption of the semantic web in general. However, the analysis and synthesis of peer-reviewed journals presents us with yet another "killer app" for NLP. Far too much information is being generated on any given topic for a human being to digest, even an expert in that particular field, let alone a renaissance man or polymath trying to synthesize across diverse fields. Only a machine with advanced NLP capabilities would be able to make sense of it all and create new knowledge from what's already available.
