Monday, August 29, 2005

U.S. Army to Deploy Combat Robots in Iraq

The L.A. Times is reporting that the Army will deploy unmanned robotic ground vehicles into combat in Iraq later this year.

The robots, known as 'Tactical Amphibious Ground Support' and built by Northrop Grumman, will be the first autonomous ground vehicles used in a combat zone. The Army intends to use them for surveillance and border security missions, and the machines can navigate entirely on their own (although a remote operator can take control in the event of an emergency). Robots like the bomb-dismantling PackBot by iRobot have been serving in Iraq for quite some time, but this will mark the first use of completely autonomous vehicles in ground combat.

AI: Now and In the Future

Red Herring has a couple of excellent articles online regarding the present state and future possibilities of Artificial Intelligence.

The first article covers existing AI systems such as data mining and intelligent search, while the second chronicles Palm Pilot inventor Jeff Hawkins' quest to revolutionize the way we look at intelligence.

PHP: Longest Common Substring

While working on my answer machine program, I came across a need to find recurring patterns in strings. What I wanted to do was take two strings and return the longest substring contained by both. After about 30 minutes' worth of searching Google for a quick solution, I realized that this was not a trivial problem. It seems that finding the longest common substring is a classical problem in computer science. Helpful examples of LCS algorithms were very hard to come by, and there was virtually nothing available for PHP. Maybe when I earn a PhD I will try to impress people with useless formulas and diagrams, but for now I figured I would just write some relatively straightforward code that gets the job done.

The solution that seemed the simplest to me was to compare the strings character by character, looking for matches that would signify the possible beginning of a common substring. I really don't consider one-character substrings to be significant, so my solution only initiates a match when it finds two identical substrings two characters in length. It continues down each string char-by-char as long as they match, and then tosses the completed substring into an array. Every common substring is obtained this way, and then we simply return the longest of these as the result. If more than one common substring is longest, the one that appears first in the first source string is returned.
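The original implementation was in PHP, but the approach translates easily. Here's a minimal Python sketch of the algorithm described above (the function name and details are mine, not the original source):

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest substring shared by a and b, following the
    approach described above: only start a match on two identical
    characters in a row, extend it as far as it goes, collect every
    common substring, and return the longest. On a tie, the substring
    appearing first in the first string wins (max() keeps the first
    maximal element in iteration order).
    """
    substrings = []
    for i in range(len(a) - 1):
        for j in range(len(b) - 1):
            # Single-character matches are not considered significant.
            if a[i:i + 2] == b[j:j + 2]:
                k = 2
                while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                    k += 1
                substrings.append(a[i:i + k])
    if not substrings:
        return ''
    return max(substrings, key=len)
```

For example, `longest_common_substring("ababc", "abcba")` returns `"abc"`. Note this brute-force scan is O(n·m) match starts in the worst case; fine for short strings, but the classical dynamic-programming solution scales better.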

See it in action and view the source.

After all that, I'm starting to think it might be more appropriate for my application to tokenize the strings and then capture the longest common sequence of tokens. In other words, break the string down into words and return the longest common phrase. It won't be that much more difficult to implement both methods in my answer machine, and compare their performance to see which one is better.
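The token-level variant mentioned above is only a few lines different: split on whitespace and match runs of words instead of characters. This is a hypothetical sketch, not the answer machine's code:

```python
def longest_common_phrase(a: str, b: str) -> str:
    """Return the longest run of consecutive words common to both
    strings, joined back into a phrase. Same idea as the character
    version, but operating on word tokens."""
    wa, wb = a.split(), b.split()
    best = []
    for i in range(len(wa)):
        for j in range(len(wb)):
            k = 0
            # Extend the match one word at a time while tokens agree.
            while i + k < len(wa) and j + k < len(wb) and wa[i + k] == wb[j + k]:
                k += 1
            if k > len(best):
                best = wa[i:i + k]
    return ' '.join(best)
```

So `longest_common_phrase("the capital of Delaware is Dover", "I think the capital of Delaware is a city")` yields `"the capital of Delaware is"` — arguably more meaningful output for a question-answering application than an arbitrary character run.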

The World's First Robotic Domestic Assistant

David is 11 years old. He weighs 60 pounds. He is 4 feet, 6 inches tall. He has brown hair. His love is real. But he is not.

A child-sized robot designed for house-sitting and caring for the elderly will go on sale in Japan next month. The robot, named "Wakamaru," has a vocabulary of 10,000 words and can be programmed to watch for burglars or gently wake you up in the morning. Peace of mind and synthetic companionship can now be yours, thanks to Mitsubishi Heavy Industries Ltd.

More Details about the UAV COS

Larry pointed out some good info to me about the UAV Common Operating System on the DARPA J-UCAS website.

There is a COS FAQ (which unfortunately seems to be more focused on answering Boeing & Northrop's questions about protecting intellectual property than explaining what COS is!) and a Common Systems & Technology presentation which includes a very interesting section on the COS.

Some items of note from the 15.2MB PPT slideshow:

- The COS is more of a POSIX-compliant middleware than an 'operating system,' as it resides on top of an embedded OS.

- Integrates sensors, weapons, communications & control systems to generate decisions and provide a real-time understanding of the operating environment

- Allows autonomous operations, eliminating need for human operators

- Enables accelerated software development... 'best of breed' algorithms shared between platforms (survival of the fittest?)

- Automatic determination of flight path based on mission parameters and threat environment

- Open architecture, non-proprietary source code can be freely used, copied, modified and redistributed

- Initial development timeframe begins Dec 05 and ends Dec 08, with 4 builds planned using a spiral approach

The most significant COS component, in my opinion, would have to be the Discovery/Mediation functionality...

The discovery component provides a set of services that enables the formulation and execution of search activities to locate data assets (files, databases, directories, web pages, streams) by exploiting metadata descriptions stored in and/or generated by Information Technology (IT) repositories (directories, registries, catalogs, etc.). This also includes the ability to search for metadata semantics to support mediation.

This mediation component provides a set of services that enable transformation processing (adaptation, aggregation, transformation, and orchestration), situational awareness support (correlation and fusion), negotiation (brokering, trading, and auctioning services), and publishing.

What this sounds like to me is a method for storage and retrieval of shared memory resources (by way of metadata [xml?] and semantic analysis), and using those memories to enable adaptation to dynamic situations. That bears a striking resemblance to the way a conscious being uses personal and communal experiences to deal with real-world environments and react to change in real-time. The more I think about it, the more I see the possibility for something truly intelligent to emerge from this architecture. I'm definitely going to have to keep tabs on this project, and if it ends up being open source (as the presentation claims), taking a look under the hood. Now do you understand why I want to work for these people?

Thursday, August 25, 2005

Inside the Mind of a UAV

DARPA is currently developing a common operating system that will serve as the 'brains' for the military's next generation of autonomous systems.

This common platform will allow UAVs to share information and work together more efficiently, and port new advances from platform to platform. Additionally, it will merge inputs from all of the sensor systems to create a single integrated picture of the environment, and interface with the on-board weapons array as well. Could this internal model of the outside world eventually be the seat of consciousness for an intelligent machine? DARPA's own report is not surprisingly vague...

Tuesday, August 23, 2005

Google Tops in Machine Translation

Google scored highest in a recent language translation competition run by the U.S. Government.

Google beat out competitors such as IBM and the University of Southern California on the Arabic-to-English and Chinese-to-English translation tests. Interestingly enough, although Google offers an Arabic homepage, Google language tools does not currently offer Arabic translation. Looks like we've stumbled across some more top-secret Google software...

Sunday, August 21, 2005

Smart Cars of the Future

There is an interesting article in Popular Science looking at the future of intelligent systems in automobiles, and how these systems will make transportation by car safer and more efficient.

Highlights from the UAS Roadmap

I've finally had the chance to review the entire Unmanned Aerial System (UAS) Roadmap, and picked out some highlights of interest:

Future UAS will evolve from being remotely operated to fully autonomous systems capable of 'self-actualization.' This will require human-level intelligence, specifically pattern recognition skills. The roadmap foresees development of enabling technologies in the 2015-2030 timeframe. (page 52)

Advances in flight autonomy and cognitive processes will allow UAS to move away from remote control by skilled operators towards full autonomy. Several stages of these technologies are reviewed, from remotely piloted vehicles in Vietnam, to high-endurance hand-flown reconnaissance systems in the 1970s, to pre-programmed autonomous UAS in the 80s, to Global Hawk today and J-UCAS in years to come. (page D-9)

Several advantages of UAS over manned systems in terms of safety are listed on page F-10:

- Many aircraft mishaps are the direct result of poor decisions by human operators. Robotic aircraft are not programmed to take chances.

- Mishaps from failed life support systems are not an issue.

- Smoke from damaged non-vital systems does not affect UAS in the way that smoke in the cockpit of a manned aircraft does.

- Automated take-offs and landings reduce the need for local training missions, leading to fewer opportunities for mishaps at home stations.

An historical look at UAS in combat is provided on page K-1:

- TDR-1 assault drones flown by pilots via television were used to drop bombs on Japanese positions in 1944.

- AQM-34 remotely piloted vehicles were flown on reconnaissance missions in Vietnam from 1964-1969.

- 18 RQ-2 Pioneer UAS were lost in combat in Desert Storm.

- 26 UAS of various types were lost in combat during the war in Kosovo.

- In the current conflicts in Afghanistan & Iraq, combat losses of UAS have been reduced to an average of 2 aircraft per year.

Saturday, August 20, 2005

The Age of Intelligent Machines: The Film

I just came across an old video called The Age of Intelligent Machines (Real Media).

It was made back in 1987, so some of the specific technologies are dated, but many of the topics are still relevant. One of the more interesting quotes came from Yale Professor Roger Schank, who said "The real problem is not can machines think, but can people think well enough about how people think to be able to describe how they think to machines."

The tech behind Able Danger

All politics aside, an NPR program covering 'Able Danger' and pre-9/11 intelligence (Real Audio & Windows Media) includes a look into the technology enabling the sophisticated data-mining software that allegedly identified four of the Al Qaeda hijackers in the U.S. well before the 2001 attacks.

The Lieutenant Colonel who oversaw the program explains how the smart algorithms sifted through 2.5 terabytes of open source intelligence looking for patterns in the unstructured data that would lead to terrorist links.

Thursday, August 18, 2005

Automating the Art of War

The Pentagon has just released the latest Unmanned Aerial System (UAS) Roadmap, laying out the development strategy for the next 25 years.

The document is well over 200 pages long, and has plenty of cool pictures. I'm just starting to pore over everything, but here are some of the highlights:

- Name changed from UAV (vehicle) to UAS (system)

- Number one goal is developing unmanned fighters

- 2010: Electronic suppression of enemy air defenses (EA-6B)

- 2015: Penetrating deep-strike missions (F-117)

- 2020: Mid-air refueling tankers (KC-10 & KC-135)

- 2025: Air-to-air combat (F-15)

- 2030: Airlift (C-17 & C-130)

- 2030: Air superiority (F-22)

- Roadmap includes detailed specs on current & near future UAVs

I'll post more when I get a chance to dig into this thing further. Additional coverage is available at

Now if only the DoD took a fraction of the money spent on the F-22 Raptor and F-35 JSF programs and put it towards UAVs...

When the flaw in the software is the human element...

If you read between the lines of the most recent controversy concerning the intelligence failures and the 9/11 commission you'll find an interesting story concerning artificial intelligence and human error.

The NY Post has a piece covering the "Able Danger" data-mining software (registration required) used to identify and track suspected terrorists. It seems as though Able Danger located several of the 9/11 hijackers (including mastermind Mohammed Atta) in the United States well before September 2001. This information, however, was not disseminated or acted upon by intelligence personnel (for whatever reason), and the terrorists were allowed to continue planning their attack. So what we have is software designed to protect American citizens executing its mission successfully, but poor judgment (granted, in hindsight) on the part of the human actors possibly led to the deaths of thousands. Like I've said before, until we become comfortable with fully automated systems making life-and-death decisions on our behalf, society will insist on keeping humans in the loop. This will only change after the human element is repeatedly identified as the single point of failure in the decision chain, and I fear this example will unfortunately be the first of many yet to come.

Wednesday, August 17, 2005

AutoSummary Alpha Release

It pleases me to announce that I have just released the first alpha testing version of the AutoSummary Semantic Analysis Engine under the BSD License.

When completed, AutoSummary will generate contextually relevant summaries of plain text documents using various statistical and rule-based methods of Natural Language Processing. First, the part-of-speech and specific word-sense (meaning) are determined for each word. Next, each sentence is deconstructed and the subject/predicate/object is identified. A map of relationships between words is then created. From this, specific themes are identified and memes (general ideas) are generated. Finally, the memes are used to create summaries of the original document of varying length and detail.

Currently, only the part-of-speech tagger has been implemented. A JPhrase object is created which contains semantic information about a given phrase of words, including part-of-speech scores and possible word-sense combinations. The part-of-speech tagging method takes the JPhrase and returns a marked-up string with each word in the phrase associated with a tag corresponding to the part-of-speech (noun, verb, adjective, adverb) determined to be most likely used for this particular instance. To make this determination, usage statistics from the WordNet semantic concordance texts are used.
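As a rough illustration of how this kind of frequency-based tagging works, here's a toy Python sketch. The counts table below is invented for the example; the real tagger draws its statistics from the WordNet semantic concordance texts via JWords:

```python
# Illustrative stand-in for WordNet semantic concordance usage counts.
# These numbers are made up for the sketch.
POS_COUNTS = {
    'dog':  {'noun': 42, 'verb': 2},
    'runs': {'noun': 3, 'verb': 30},
    'fast': {'adjective': 10, 'adverb': 25, 'noun': 1, 'verb': 2},
}

def tag_phrase(phrase: str) -> str:
    """Tag each word with its statistically most likely part of speech,
    mirroring the marked-up string the part-of-speech tagger returns."""
    tagged = []
    for word in phrase.lower().split():
        counts = POS_COUNTS.get(word)
        if counts is None:
            tagged.append(f'{word}/unknown')
        else:
            # Pick the part of speech with the highest usage count.
            pos = max(counts, key=counts.get)
            tagged.append(f'{word}/{pos}')
    return ' '.join(tagged)
```

For instance, `tag_phrase("dog runs fast")` gives `"dog/noun runs/verb fast/adverb"`. Picking the most frequent tag per word in isolation is a crude baseline; context-sensitive methods do better, which is presumably where the word-sense combination scoring comes in.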

AutoSummary uses the WordNet lexical reference system via the JWords Java Interface for WordNet as its source of lexicographical information. In order to run AutoSummary, you must first install WordNet 2.0 and edit the JWords configuration file.

This alpha test release is EXTREMELY limited. Support for articles (a, an, the, etc), plurals, tenses, pronouns, and the verb 'to be' has not yet been implemented. If these types of words are entered into the current version, the program will likely crash. The simple demo included in the release is merely intended to show how the part-of-speech tagging system will determine the part-of-speech for each word in a given phrase.

I've been pushing pretty hard over the last week or so to get this initial release out. Once I take a much-deserved rest, I will get a publicly available prototype up and running for my wildcard-based general question answering script, go back and add WordNet 2.1 support to JWords, and also take a serious look at setting up a web version of the AutoSummary demo program.

AutoSummary is available for download at

Javadoc documentation for AutoSummary is available here.

Wednesday, August 10, 2005

Wild about Wildcards

The C|Net News Google Blog entry explaining the use of wildcards in Google searches gave me a great idea for another small project.

'Fill in the blank' type questions can easily be converted into a wildcard search. A carefully constructed intermediary process could be used to tap the vast resources of the Google databases to provide real instant answers to straightforward questions. Once I release the initial version of my new semantic analysis engine, I will code up a PHP page that takes user input in the form of a simple English question ("What is the capital of Delaware?"), converts it into an appropriate Google wildcard query (the capital of Delaware is *), captures the output from Google, and reformats that output in the form of an answer to the original question. The answers won't be 100% accurate, but it should be a bit more realistic than other attempts at natural language question answering.
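The conversion step could look something like the following minimal sketch. The plan above calls for PHP; this Python version handles only the single "What is X?" template, and fetching/parsing Google results is omitted entirely:

```python
import re

def question_to_wildcard(question: str):
    """Convert a simple 'What is X?' question into a Google wildcard
    query of the form 'X is *'. Returns None when the question does
    not fit the template; a real system would need many more patterns."""
    m = re.match(r'^\s*what\s+is\s+(.+?)\s*\?\s*$', question, re.IGNORECASE)
    if m:
        return f'{m.group(1)} is *'
    return None
```

So `question_to_wildcard("What is the capital of Delaware?")` produces `"the capital of Delaware is *"`, ready to be submitted as a quoted wildcard search.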

Just Around the Corner

The ability to transfer your consciousness to a machine, and therefore live forever, is in our very near future according to futurist and AI advocate Ray Kurzweil.

Rapid advances in artificial intelligence and nanotechnology will even lead to "experience beamers," which will enable one person to receive the total sensory input of another through full-immersion virtual reality. Soon we may all know what it's like to be John Malkovich.

Tuesday, August 9, 2005

JWords Update Release Notice

I am happy to announce that the latest version of the JWords Java Interface for WordNet has been released. JWords 0.2.0 Alpha includes new methods for calculating the relative likeliness that a word will appear as a particular part of speech, which may be used for statistical semantic analysis (like POS tagging), as well as updated documentation. These features were specifically designed for a forthcoming semantic analysis package... which will be released shortly.

JWords is available for download at

Javadoc documentation for JWords is available here.

Giving Away the Farm

IBM has announced that it will release its new Unstructured Information Management Architecture (UIMA) technology under an open source license for software developers to build upon.

UIMA uses natural language processing and other techniques to analyze and annotate vast amounts of unstructured information, producing relevant knowledge for users to deal with. This will enable the creation of intelligent search engines that can understand the underlying conceptual relationships between documents, as opposed to current systems that focus simply on keywords and hyperlinks. The IBM website for this project seems rather dull at first, but if they can deliver on the capabilities they're promising, we certainly have a lot to look forward to.

Too Bad it's Remote-Controlled

The C|Net News Science Blog picked up a story about a new unmanned ground vehicle currently under development for the US Marine Corps.

Carnegie Mellon University gave its first public demonstration of the Gladiator Tactical Unmanned Ground Vehicle (TUGV) on 4 August. This remote-controlled robot is designed for reconnaissance and search & discovery missions in hostile environments, but the military is already looking to expand its capabilities by adding machine guns and other weapons. The latest prototype even includes hand grenades which can be used to clear obstacles (and personnel?). Perhaps the developers could work with CMU's DARPA Grand Challenge team to integrate some autonomous control systems into the Gladiator...

Monday, August 8, 2005

Better Off Without the Butterfly

Many people assume the coming revolution in Artificial Intelligence will be led by Microsoft, despite the company's questionable track record of innovation.

Although MS has devoted significant resources to this cause, my utter lack of faith was recently amplified by this banner ad for MSN Search. In an attempt to showcase the new 'intelligent' instant answers provided by their search engine, MS has demonstrated new levels of artificial stupidity. A definition of champagne (the bubbly alcoholic beverage) is not a valid answer for a question seeking to ascertain the location of the Champagne region of France. MS is completely missing the point, and I suspect this trend will continue. I'm betting on another couple of grad students defeating the billion-dollar war chest of Microsoft in this battle as well.

Have you ever seen the sun rise ... twice in a row ... three times...?

Paul Graham recently had a very interesting article about what businesses can learn from open source.

The entire piece was very well written, and the author makes a lot of good points, but what captured my imagination the most was the part about the optimal working environment for a programmer. Paul says that people working in the comfortable surroundings of their own home are often exponentially more productive than corporate rats in a stuffy cubicle farm. Based on my own personal experience, I would certainly agree. I find I do my best work in marathon sessions, often involving massive amounts of caffeine and highly irregular sleep patterns. I understand many programmers share similar experiences. Once I get 'in the zone,' as they say, I will work solidly for hours on end paying little attention to time or the world around me. Were I restricted to a structured work environment constantly struggling against interruptions and distractions like co-workers, meetings and email, I fear my programming would suffer.

I am currently working hard to finish up the next release of JWords, as well as the initial release of another semantic analysis tool. I've been trying to complete this stuff for weeks now, but I find the only way I can get anything done is if I rush straight home from work and devote the entire evening to programming. If I decide to go workout, play basketball, see a movie or go out to dinner, and try to pick up where I left off and program a little bit at a time, I never get anything done. By the time I sit down and get going, it's time to sign off for the night. My old friend the all-nighter is no longer an option with my responsibilities at work. Usually weekends offer a real chance to get down to business, but I'm afraid my willpower is just barely too weak to resist the temptation of living it up at the beach. But on those rare occasions when I can devote a significant amount of time to this endeavor, and everything is firing on all cylinders, nothing else compares.

Tuesday, August 2, 2005

"Your mama was a snowblower"

I just got back from seeing Stealth, the newly released motion picture about an Unmanned Combat Aerial Vehicle that goes haywire and turns on its allies, and I have to say that it was much better than I thought it would be.

I was anticipating a horrendous bag of shite, but the movie was surprisingly tolerable. The special effects were fun, the story was somewhat believable, and the AI themes were adequate. The story explored our resistance to recognizing intelligent behavior by a robot, our reluctance to trust an artificial entity, and our fear of losing control of or being replaced by a sentient machine. In my opinion, the biggest shortcoming of the film was the sophomoric reliance on a lightning strike to instigate consciousness in the agent. Sound familiar?

Hopefully, with enough R&D money and brilliant engineers we will see the deployment of advanced intelligent systems like these in the near future.