If a picture is worth a thousand words, how valuable is the ability to find the perfect image of an object from the entire Web? According to a paper delivered by two Google researchers at the International World Wide Web Conference in Beijing last weekend, the search-engine giant may be one step closer to answering that question.
Information scientists Shumeet Baluja and Yushi Jing announced the development of an algorithm, called VisualRank, that generates significantly more relevant image-search results than current methods, which rely on text-based clues (captions and other words associated with each image).
The ultimate goal is to train computers to move beyond text into the efficient identification of "rich content" -- the shapes, colors and context of images that humans recognize with little effort.
Quantifying Relevance
VisualRank, Baluja and Jing reported, is designed to incorporate ongoing advances in computer recognition into Web search technology. The complicated process blends image-recognition advances with Google's sophisticated tools for assigning rank and weight to search results.
The net effect, they said, is that within a relatively narrow universe of search results, the algorithm was able to reduce the number of irrelevant results by more than 80 percent.
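The article does not spell out how the blending works, so the following is only a minimal sketch of one plausible reading: a PageRank-style iteration run over a graph of visual similarities, so that images resembling many other results rise to the top. The function name, the damping value and the toy similarity scores are all assumptions made for illustration, not details from the researchers' paper.

```python
def rank_by_similarity(similarity, damping=0.85, iterations=50):
    """Score images by power iteration over a similarity matrix.

    similarity[i][j] is a hypothetical visual-similarity score between
    image i and image j (symmetric, zero on the diagonal).
    """
    n = len(similarity)
    # Each column is normalized so that an image splits its "vote"
    # among the images it resembles.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            received = sum(
                similarity[i][j] / col_sums[j] * scores[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            # The damping term keeps every image from dropping to zero.
            new_scores.append((1 - damping) / n + damping * received)
        scores = new_scores
    return scores


# Made-up data: images 0-2 look alike, image 3 is an off-topic outlier.
toy_similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
toy_scores = rank_by_similarity(toy_similarity)
ranking = sorted(range(len(toy_scores)), key=lambda i: toy_scores[i], reverse=True)
print(ranking)
```

In this toy set the outlier ends up last because few other images "vote" for it, which is the intuition behind ranking by visual agreement rather than by surrounding text.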
But as Baluja and Jing freely concede, it is highly impractical to try to identify similar images among the billions currently stored on the Web. To test its system, the Google team created data sets of images of the 2,000 products most commonly searched for on Google. Team members then assigned a relevance score to images produced by Google's normal image-search tool and by VisualRank.
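To make the comparison concrete, here is a hedged sketch of how a reduction in irrelevant results could be computed from hand-assigned relevance labels. The labels below are invented for illustration and are not the team's data; the function names are likewise hypothetical, and the article does not say exactly how the researchers scored relevance.

```python
def irrelevant_count(labels):
    """Count results that a human judge marked as not relevant."""
    return sum(1 for is_relevant in labels if not is_relevant)


def relative_reduction(baseline_labels, reranked_labels):
    """Fraction by which irrelevant results dropped versus the baseline."""
    baseline = irrelevant_count(baseline_labels)
    if baseline == 0:
        return 0.0  # nothing to reduce
    return (baseline - irrelevant_count(reranked_labels)) / baseline


# Invented judgments for the top 10 results of one product query.
# True means the judge considered the image relevant.
text_based_top10 = [True, False, True, True, False, True, False, True, False, True]
visual_top10 = [True, True, True, True, True, True, False, True, True, True]

reduction = relative_reduction(text_based_top10, visual_top10)
print(f"Irrelevant results reduced by {reduction:.0%}")
```

Averaging such a figure over many queries would give a single summary number of the kind the researchers reported.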
Practical Possibilities?
One question is whether VisualRank has practical market possibilities or is merely a challenging intellectual exercise. As industry observers have pointed out, the Web site Like.com also offers surfers the ability to locate images of similar products by searching for a particular element in each image. But the Web site, launched in 2006, also focuses its visual search on smaller subsets of images and makes no attempt to categorize every online image.
"Riya is a new kind of visual search engine," Like.com proudly proclaims. "We look inside the image, not only at the text around it." The site says searchers can find similar faces and objects in many online images, and then narrow the search results by color, shape and make.
Similarly, the site Blinkx.com offers a tool for searching for specific video content. According to the company, its search tool is based on a "unique combination of patented conceptual search, speech recognition and video analysis software to efficiently, automatically and accurately find and qualify online video."
A potential customer for improved image search is law enforcement. Increasingly, federal and state investigators have shown interest in software that enables them to more quickly and effectively determine if a suspect hard drive contains possible child pornography. Although there is no indication that Google intends to market its VisualRank algorithm to law enforcement or computer forensics firms, that may be one of the more logical applications for Google's new tool.