So, I have to say that (despite the weather) the first annual Computer Science Research Day at UD was a success. The keynote speaker from Stanford provoked quite a reaction, at least from me (more on that soon). During the panel discussion on privacy and security in digitizing medical records, Emily made a good point: much of the controversy could be avoided by giving individuals the ability to choose exactly which of their data would be made available.
I presented a poster about ICICLE (my first as a grad student) and ran demos afterward for anybody who was interested. In addition to ICICLE, I also demonstrated my Answer Machine and summary program. I'm hoping to have a CGI version of the summarizer online soon for anyone to try out.
By the way, I'm thinking of calling it 'gist,' which could stand for either "Greenbacker's Instant Summary Tool" or "gist Is a Summary Tool," depending on how narcissistic I'm feeling at the moment...
Showing posts with label answer machine. Show all posts
Friday, February 22, 2008
Sunday, January 29, 2006
Moving Forward
Last night I did some work on the Answer Machine, and today I'm working on adding topic detection to AutoSummary.
Check out the entry in the Answer Machine project log for details about its new functionality. As for AutoSummary, I came up with a good idea for implementing topic detection within the program's current framework. I was reading a post at the Search Science blog and thought, "I can do that."
Here's the plan: after determining the likely sense of a given word, I'll build a list of all the possible topics that word is connected to (using WordNet's domain relations). From there, I can find the topic of a sentence by taking the best intersection of all its anchor words (ignoring stop words like "the" and "and").
You can probably see how it will scale from there: building intersections of sentences into paragraphs, and paragraphs into entire texts. So very shortly I just might be able to take a full body of text and determine just what the heck it's all about. That could be an interesting step forward in search relevance and data mining...
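The sentence-level step above can be sketched in a few lines of Python. This is just a toy, not the actual AutoSummary code: the `DOMAINS` table here is a hypothetical stand-in for WordNet's domain relations (a real version would look up domains after sense disambiguation), and the "best intersection" is approximated by counting which domain the most anchor words share.

```python
from collections import Counter

# Hypothetical stand-in for WordNet domain lookup: maps a word (in its
# most likely sense) to the set of topic domains it belongs to.
DOMAINS = {
    "court": {"law", "sport"},
    "ruling": {"law"},
    "judge": {"law", "sport"},
    "appeal": {"law"},
}

STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on"}

def sentence_topic(sentence):
    """Return the domain(s) shared by the most anchor words in the sentence."""
    counts = Counter()
    for word in sentence.lower().split():
        word = word.strip(".,!?")
        if word in STOP_WORDS:
            continue
        for domain in DOMAINS.get(word, ()):
            counts[domain] += 1
    if not counts:
        return None
    best = counts.most_common(1)[0][1]
    return {d for d, c in counts.items() if c == best}

print(sentence_topic("The judge issued a ruling on the appeal."))  # {'law'}
```

Scaling up would then mean running the same intersection over the sentence-level topic sets of a paragraph, and over the paragraph-level sets of a document.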
Labels:
answer machine,
autosummary,
data mining,
nlp,
programming,
research,
search
Wednesday, August 10, 2005
Wild about Wildcards
The C|Net News Google Blog entry explaining the use of wildcards in Google searches gave me a great idea for another small project.
"Fill in the blank" questions can easily be converted into wildcard searches. A carefully constructed intermediary process could tap the vast resources of Google's index to provide real instant answers to straightforward questions. Once I release the initial version of my new semantic analysis engine, I'll code up a PHP page that takes user input in the form of a simple English question ("What is the capital of Delaware?"), converts it into an appropriate Google wildcard query ("the capital of Delaware is *"), captures the output from Google, and reformats it as an answer to the original question. The answers won't be 100% accurate, but this should be a bit more realistic than other attempts at natural language question answering.
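The question-to-query-to-answer pipeline could be sketched like this (in Python rather than PHP, and without the actual Google call). The pattern table and the snippet are my own illustrative stand-ins; a real system would need many more templates plus the semantic analysis step to build them.

```python
import re

# Hypothetical question-to-wildcard-query templates. A real system would
# derive these from semantic analysis rather than hand-written patterns.
PATTERNS = [
    (re.compile(r"^what is (the .+?)\?$", re.IGNORECASE), r"\1 is *"),
    (re.compile(r"^who wrote (.+?)\?$", re.IGNORECASE), r"\1 was written by *"),
]

def to_wildcard_query(question):
    """Rewrite a simple English question as a fill-in-the-blank search query."""
    question = question.strip()
    for pattern, template in PATTERNS:
        if pattern.match(question):
            return pattern.sub(template, question)
    return None

def extract_answer(query, snippet):
    """Match a result snippet against the query, binding * to the answer."""
    answer_re = re.escape(query).replace(r"\*", r"(.+?)") + r"[.,]"
    match = re.search(answer_re, snippet, re.IGNORECASE)
    return match.group(1) if match else None

query = to_wildcard_query("What is the capital of Delaware?")
print(query)  # the capital of Delaware is *

# Stand-in for a snippet returned by the search engine.
snippet = "As every schoolchild knows, the capital of Delaware is Dover."
print(extract_answer(query, snippet))  # Dover
```

The fragile part is exactly where you'd expect: extracting the answer from free-text snippets, which is why the results won't be 100% accurate.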
Labels:
answer machine,
google,
internet,
question answering