A Computer Program Learns About 'Everything' Through Images

A computer program is lending new credence to the idiom "a picture is worth a thousand words," by learning from contextual phrase association with pictures on the internet.

Computer programmers from the University of Washington (UW) and the Allen Institute for Artificial Intelligence (AIAI) have created a fully automated computer program that they claim can "learn everything about anything."

This boastful claim also serves as the name for the program, commonly called LEVAN, for short.

Using pre-designed image recognition, the computer program can comb through millions upon millions of images found on the internet that are even slightly associated with a word or phrase. It then thins down these selections, breaking them into categories and associating them with new compound words.

"It is all about discovering associations between textual and visual data," Ali Farhadi, a UW assistant professor of computer science and engineering, said in a statement. "The program learns to tightly couple rich sets of phrases with pixels in images. This means that it can recognize instances of specific concepts when it sees them."

The program currently has 161 main concepts listed on the LEVAN website, with over 65,000 subcategories and nearly 50 million images processed.

And this is just the start. The program learns from searches made, associating concepts with new categories and even formulating new concepts as it "learns," according to a paper on LEVAN that its designers will present later this month at the Computer Vision and Pattern Recognition annual conference in Columbus.

However, LEVAN does not really "learn everything" as its name claims. A video introducing LEVAN to the world briefly shows how it works, using the word "dance" as an example. Like Google images - and no doubt even pulling pictures from Google images - LEVAN found a large number of various types of dance, including "waltz" (a form of dance) and "last dance" (a familiar concept in many cultures).

This is where things can get tricky. Analyzing photos associated with the term "waltz," the program quickly recognized a similar pattern and form among the millions of images, allowing it to conclude that waltz is a sub-category of dance. However, the more obscure concept of a "last dance" shows a significantly more varied number of images. LEVAN will filter out this concept as there is no general consensus of what it means - at least visually.

Still, to simply say "last dance" isn't a real concept just because it cannot be visually represented may seem like an excuse to some critics.

You can check out LEVAN for yourself at levan.cs.uw.edu to see how much it really has learned.

Tags Research