Thursday, July 14, 2005

It Knows...

That's what I said to myself while working on my project tonight.

Work was really slow today, so I had some time to scrutinize the probability algorithm for my part-of-speech disambiguator. Basically, the tool I'm working on now is meant to correctly identify the part of speech of each word in a given phrase. When I got home and tinkered with the code a bit, I was astonished at how well it worked. The program now analyzes all possible combinations of parts of speech and reports the one it determines is most likely to occur, based on general usage statistics.
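To give a rough idea of what "analyzing all possible combinations" means, here is a minimal sketch in Python. It is not the actual project code: the probability tables, the tag names, the fallback value, and the function names score and most_likely_tags are all made up for illustration, and the scoring formula (word-tag likelihood times tag-to-tag likelihood) is only one plausible way to use usage statistics.

```python
from itertools import product

# Hypothetical usage statistics, invented for this example: P(tag | word).
WORD_TAG_PROB = {
    "time": {"NOUN": 0.90, "VERB": 0.10},
    "flies": {"NOUN": 0.30, "VERB": 0.70},
    "fast": {"ADJ": 0.40, "ADV": 0.50, "NOUN": 0.05, "VERB": 0.05},
}

# Hypothetical statistics for how often one tag follows another.
TRANSITION_PROB = {
    ("NOUN", "VERB"): 0.40,
    ("VERB", "ADV"): 0.30,
    ("VERB", "NOUN"): 0.20,
    ("NOUN", "NOUN"): 0.15,
    ("VERB", "ADJ"): 0.10,
}
DEFAULT_TRANSITION = 0.05  # assumed fallback for tag pairs not in the table


def score(words, tags):
    """Multiply the word-tag and tag-to-tag probabilities for one tagging."""
    probability = 1.0
    previous = None
    for word, tag in zip(words, tags):
        probability *= WORD_TAG_PROB[word][tag]
        if previous is not None:
            probability *= TRANSITION_PROB.get((previous, tag), DEFAULT_TRANSITION)
        previous = tag
    return probability


def most_likely_tags(words):
    """Try every combination of candidate tags and keep the best-scoring one."""
    candidates = [list(WORD_TAG_PROB[word]) for word in words]
    best = max(product(*candidates), key=lambda tags: score(words, tags))
    return list(best)


print(most_likely_tags(["time", "flies", "fast"]))
# prints ['NOUN', 'VERB', 'ADV']
```

With these invented numbers, the three-word phrase already produces 2 x 2 x 4 = 16 combinations to score, and the count multiplies with every word added.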

When I realized it was doing what I hoped it would do, and that the statistical formula was working properly, I was pleasantly surprised. Later on, I will combine this with other methods of semantic analysis in an attempt to determine the specific meaning of each word in a sentence. But for now, I'm cleaning up the code a bit and improving performance by eliminating unnecessary steps. The algorithm is working... but it's slow, with a running time of O(4^n)! Once I speed it up and get it into a presentable format, I will make it available to download and try out.
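To put that growth rate in perspective, here is a quick back-of-the-envelope calculation. It assumes the 4 comes from each word having up to four candidate parts of speech, which is a guess on my part for the sake of the example; combinations_to_check is just an illustrative helper name.

```python
def combinations_to_check(n_words, tags_per_word=4):
    """Number of taggings a brute-force search scores for an n-word phrase.

    tags_per_word=4 is an assumption made for this illustration.
    """
    return tags_per_word ** n_words


for n in (5, 10, 20):
    print(n, combinations_to_check(n))
# prints:
# 5 1024
# 10 1048576
# 20 1099511627776
```

Every extra word multiplies the work by four, so a twenty-word sentence is already over a trillion combinations.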

Not exactly what you would call a 'Eureka!' moment, but I was pleased nonetheless.
