Monday, February 13, 2006

Frustrating Recursion

I thought I made a mistake once, but I was wrong. For some reason, this happens to me all the time when programming.

I was working on AutoSummary this weekend, adding a contextual framework using hypernym (superordinate) information for individual senses of a given word. In order to do this I needed to create a b-tree data structure. I thought I had set everything up properly, except the tree wouldn't populate past the second level. Weird crashes, etc. I tried everything, checked all of the functions and methods, testing everything I could think of. Nothing.

Tonight, I started checking everything over again, trying some different approaches. I did some digging and determined something was generating a null pointer exception. I checked everything all over again, and again... nothing. After a period of insufferable aggravation, I discovered by trial and error that the exception was caused by the fact that I had forgotten to initialize the data container (an ArrayList). I was so worried about getting all the "hard stuff" figured out that I had overlooked a beginner's error.

The moral of the story? The simplest answers are often the hardest to find.

Sunday, January 29, 2006

Moving Forward

Last night I did some work on the Answer Machine, and today I'm working on adding topic detection to AutoSummary.

Check out the entry in the Answer Machine project log for details about its new functionality. As for AutoSummary, I came up with a good idea about how to implement topic detection within the current framework of the program. I was checking out this post at the Search Science blog that I read, and thought "I can do that."

Here's the plan: after determining the likely sense of a given word, I'll build a list of all of the possible topics that word is connected to (using the WordNet domain function). From that I'll be able to find the topic of a sentence by taking the best intersection across all the anchor words (not "the" or "and", etc.).
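
As a rough illustration of the intersection idea, here is a minimal Python sketch. It is not the AutoSummary code: the stop list, the word-to-domain table, and the function name `sentence_topics` are all made up for the example, and a real version would look the domains up in WordNet after picking each word's sense. "Best intersection" is read here as "the domain shared by the most anchor words," which is only one possible interpretation.

```python
from collections import Counter

# Hypothetical stop list and word-to-domain table (assumptions for this
# sketch). A real run would get domains from WordNet per word sense.
STOP_WORDS = {"the", "and", "a", "an", "of", "to", "in", "at", "on"}
WORD_DOMAINS = {
    "pitcher": {"baseball", "kitchen"},
    "bat": {"baseball", "zoology"},
    "inning": {"baseball"},
    "cave": {"geology", "zoology"},
}


def sentence_topics(sentence, word_domains=WORD_DOMAINS):
    """Return the domain(s) shared by the most anchor words in a sentence."""
    # Normalize words and drop stop words; each anchor word votes once.
    words = {w.strip(".,;:!?\"'") for w in sentence.lower().split()}
    anchors = words - STOP_WORDS

    votes = Counter()
    for word in anchors:
        for domain in word_domains.get(word, ()):
            votes[domain] += 1

    if not votes:
        return set()  # no anchor word had a known domain
    best = max(votes.values())
    # Keep every domain tied for the top count.
    return {domain for domain, count in votes.items() if count == best}
```

Calling `sentence_topics("The pitcher swung the bat in the ninth inning.")` on this toy table picks out "baseball", since three anchor words share it while "kitchen" and "zoology" get one vote each.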

You can probably see how it will scale from there, building intersections of sentences into paragraphs, and paragraphs into entire texts. So very shortly I just might be able to take a full body of text and determine just what the heck it is all about. Could be an interesting step forward in search relevance and data mining...

Saturday, January 21, 2006

U.S. Losing Edge on UAVs?

Recent developments suggest other nations might be catching up to the U.S. military in UAV technology.

Last week, reports surfaced that the Dept of Defense will terminate the J-UCAS program (parent of the X-45) as part of a Quadrennial Defense Review plan to modernize the USAF bomber fleet. While the Air Force and Navy will continue to develop their own independent unmanned aircraft programs, this move could be a death blow to a program showing enormous promise. The new bomber could incorporate some of the J-UCAS technology, and officials have not ruled out the possibility of building an unmanned bomber.

Meanwhile, the British just unveiled a stealth drone of their own. Dubbed the "Corax," the unmanned aircraft features a tailless, stealthy airframe and will be used as a platform to develop new command and control systems.

At the same time, the South Koreans have announced plans to develop sophisticated military robots, including "eight-legged autonomous combat vehicles."

So while other militaries forge ahead with unmanned weapons systems, the U.S. cuts its most advanced unclassified UAV program, even though the DOD recently said cancelling the program would erode its unmanned aircraft advantage. Fortunately, South Korea and the UK are friendly nations, but how long will it be until a rival demonstrates the willingness to challenge American air superiority with unmanned fighters of its own? That might be the only development likely to cause unmanned systems to replace fighter pilots in their air-to-air combat role.

Latest News: The Air Force Times is reporting that officials are planning a new career field for UAV operators, entirely distinct from traditional aircraft pilots. So instead of forcing traditional pilots to retrain involuntarily as UAV operators, this move aims to train pilots to specialize in unmanned systems from the get-go. Perhaps a new breed of leadership grown from the ranks of these UAV specialists will not retain the nostalgic attachment to the 'feel and experience' of traditional aviation, and will enable this technology to progress without prejudice.

Monday, August 29, 2005

U.S. Army to Deploy Combat Robots in Iraq

The L.A. Times is reporting that the Army will deploy unmanned robotic ground vehicles into combat in Iraq later this year.

The robots, known as 'Tactical Amphibious Ground Support' and built by Northrop Grumman, will be the first autonomous ground vehicles used in a combat zone. The Army intends to use them for surveillance and border security missions, and the machines can navigate entirely on their own (although a remote operator can take control in the event of an emergency). Robots like the bomb-dismantling PackBot by iRobot have been serving in Iraq for quite some time, but this will mark the first use of completely autonomous vehicles in ground combat.

AI: Now and In the Future

Red Herring has a couple of excellent articles online regarding the present state and future possibilities of Artificial Intelligence.

The first article covers existing AI systems such as data mining and intelligence search, while the second chronicles Palm Pilot inventor Jeff Hawkins' quest to revolutionize the way we look at intelligence.

PHP: Longest Common Substring

While working on my answer machine program, I came across a need to find recurring patterns in strings. What I wanted to do was take two strings and return the longest substring contained by both. After about 30 minutes' worth of searching Google for a quick solution, I realized that this was not a trivial problem. It seems that finding the longest common substring is a classical problem in computer science. Helpful examples of LCS algorithms were very hard to come by, and there was virtually nothing available for PHP. Maybe when I earn a PhD I will try to impress people with useless formulas and diagrams, but for now I figured I would just write some relatively straightforward code that gets the job done.

The solution that seemed the simplest to me was to compare the strings character by character, looking for matches that would signify the possible beginning of a common substring. I really don't consider one-character substrings to be significant, so my solution only initiates a match when it finds two identical substrings two characters in length. It continues down each string char-by-char as long as they match, and then tosses the completed substring into an array. Every common substring is obtained this way, and then we simply return the longest of these as the result. If more than one common substring is longest, the one that appears first in the first source string is returned.
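
The approach described above can be sketched roughly as follows. This is an illustration in Python, not the original PHP source, and the function name and the `min_len` parameter are assumptions made for the example. It is a plain brute-force scan, so it makes no attempt at the efficiency of the textbook dynamic-programming or suffix-tree solutions.

```python
def longest_common_substring(a, b, min_len=2):
    """Brute-force longest common substring, ignoring matches under min_len."""
    matches = []  # common substrings, in order of their start position in a

    for i in range(len(a)):
        seed = a[i:i + min_len]
        if len(seed) < min_len:
            break  # too close to the end of a to start a new match

        for j in range(len(b)):
            # Only start a match when min_len characters agree.
            if b[j:j + min_len] != seed:
                continue

            # Walk down both strings for as long as the characters match.
            k = min_len
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            matches.append(a[i:i + k])

    if not matches:
        return ""
    # max() keeps the first longest item, i.e. the one that starts
    # earliest in the first string.
    return max(matches, key=len)
```

For example, `longest_common_substring("abXcd", "cdYab")` returns "ab": both "ab" and "cd" are two characters long, and "ab" comes first in the first string.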

See it in action and view the source.

After all that, I'm starting to think it might be more appropriate for my application to tokenize the strings and then capture the longest common sequence of tokens. In other words, break the string down into words and return the longest common phrase. It won't be that much more difficult to implement both methods in my answer machine, and compare their performance to see which one is better.
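
The token-based variant might look something like the sketch below, again in Python and again only an illustration. Splitting on whitespace is an assumption; a real tokenizer would probably also handle case and punctuation. The function name `longest_common_phrase` is made up for the example.

```python
def longest_common_phrase(a, b):
    """Return the longest run of consecutive words found in both strings."""
    words_a, words_b = a.split(), b.split()
    best = []  # longest shared run of words seen so far

    for i in range(len(words_a)):
        for j in range(len(words_b)):
            # Count how many consecutive words match from (i, j) onward.
            k = 0
            while (i + k < len(words_a) and j + k < len(words_b)
                   and words_a[i + k] == words_b[j + k]):
                k += 1
            # Strictly longer only, so ties go to the earliest phrase in a.
            if k > len(best):
                best = words_a[i:i + k]

    return " ".join(best)
```

Because whole words are compared, partial overlaps such as "quick" versus "quickly" do not count as matches, which is the main behavioral difference from the character-level version.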

The World's First Robotic Domestic Assistant

David is 11 years old. He weighs 60 pounds. He is 4 feet, 6 inches tall. He has brown hair. His love is real. But he is not.

A child-sized robot designed for house-sitting and caring for the elderly will go on sale in Japan next month. The robot, named "Wakamaru," has a vocabulary of 10,000 words and can be programmed to watch for burglars or gently wake you up in the morning. Peace of mind and synthetic companionship can now be yours, thanks to Mitsubishi Heavy Industries Ltd.