August 31, 2005

Not-So-Smart Answers at AskJeeves

The latest additions to AskJeeves' "Smart Answers" were reported by Gary Price in his August 22 Search Engine Watch blog. Although they expand the existing Jeeves collection of sources pre-selected to handle many common factual queries, they remain, like so many other "answer" capabilities of the major search engines, pretty mundane. Search Portfolio's research team tested many popular topics (e.g. marijuana, botox) to see if Smart Answers would deliver pre-selected content, and it didn't. Surely those sample searches are as common as one for burkina faso, for which AJ's Smart Answers turns up the CIA World Factbook. Nothing new here: all the major search engines would turn up the CIA WF in the first 10 hits for the same query.

AJ's Smart Answers harkens back to the early days of AskJeeves, when it differentiated itself from other search engines by matching simple queries for common questions against a set of pre-determined web sites whose content could answer the "question". Eventually, the site morphed into a meta-search engine, and then, with the integration of Teoma, became a more conventional search-engine-with-benefits. There is clearly little new here, and not enough of real value to recommend this over other answer engines.

August 26, 2005

More Silly Search Engine Size Stories

Since Yahoo disclosed the jump of its index size to just over 19 billion (!) documents, I've been following a series of interesting posts at the Technologie du Langage blog from Jean Véronis, professor of Information and Technology at the University of Provence. In great detail (and in English), Véronis recounts, with good link references, the index-size story starting with Yahoo's announcement. He then systematically and persuasively refutes both the allegations of database size and the research methodology of a US study comparing database sizes of Google and Yahoo.

On the US study, Véronis concludes, "I find it amazing how quickly such a flawed study could be quoted with so much excitement all over the blogosphere and even make its way to the respectable New York Times." Those of us who are used to the republication as "news" of unverified company press releases are, sadly, not so surprised.

Although most of Véronis's posts at Technologie du Langage are in French, the blog is an outstanding (and rare) source of competent criticism of search engines, and deserves to be in the RSS feeds of serious web-watchers.

August 22, 2005

Google Tests "Commercial" Listings in "Organic" Search Results

From August 19's ClickZ, an article on Google's testing of commercial listings in the 6th-8th position in "pure" search results. I replicated the test of the keywords on demand and the results are clearly visible. Interestingly, what isn't at all obvious is that the results are commercial/sponsored/paid in nature. The only apparent difference is the line above and below the commercial results, and the absence of either CACHED or SIMILAR PAGES links that usually accompany Google results.

I hate to say I told you so, but I predicted that Google would cross the line into paid-into-pure integration as early as 2003, when rumours began to surface about a possible public offering of Google. This appears to be the first public indication that the company is seriously testing the waters.

August 17, 2005

Data Mining Primer

From the US government, Data Mining: An Overview is a short primer for those wanting to understand what data mining is all about. It was written by Jeffrey Seifert, an information analyst at the US Congressional Research Service, and is available in PDF format.

August 04, 2005

Interesting Tool: Copyscape

Plagiarist alert! There's an interesting online tool to help those of you who want to track those who have lifted content off your website. Copyscape uses Google API technology to identify distinctive sentences and phrases from your site and then sniffs around for other sites that use the same or similar phrases. Although Copyscape tends to sniff out a lot of blogs (probably because bloggers copy or paraphrase stuff a lot from other sources), this is a great way to track unapproved uses of your web site content. The folks who make Copyscape also produce Google Alert.
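
For the curious, here is a rough idea of how matching "the same or similar phrases" can work in principle. This is only a minimal sketch of one generic technique (comparing overlapping runs of words, sometimes called shingles), not Copyscape's actual method, which isn't documented here. The function names, the five-word window, and the sample sentences are all assumptions made up for illustration.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word runs (shingles) in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(original, candidate, n=5):
    """Fraction of the original's shingles that also appear in candidate."""
    source = shingles(original, n)
    if not source:
        return 0.0
    return len(source & shingles(candidate, n)) / len(source)


# Hypothetical sample texts, invented for this illustration.
mine = "Copyscape sniffs around for other sites that use the same or similar phrases."
copied = "We found that it sniffs around for other sites that use the same phrases."
unrelated = "Data mining is a short primer topic from the Congressional Research Service."
```

A page that borrows a run of your wording scores somewhere between 0 and 1 with `overlap(mine, copied)`, while `overlap(mine, unrelated)` scores 0.0. A real service would also need a way to find candidate pages in the first place, which is presumably where the search API comes in.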

August 02, 2005

An Info Pro's View on Yahoo Search Subscriptions

In "Searching More of the Opaque Web" Mary Ellen Bates provides an excellent overview of the relative merits of Yahoo! Search Subscriptions, Yahoo's new (and still fairly modest) service selling low-cost journal and news articles from a small group of sources such as Consumer Reports, New England Journal of Medicine, Wall Street Journal, Lexis-Nexis and Factiva.

Bates reminds us that the service doesn't allow for comprehensive coverage: it only allows searching of a subset of each journal/service's full text content, and the focus is on recent items.

Description
SiteLines is written by Rita Vine, a professional librarian, web search trainer, and lead site evaluator of the Search Portfolio web search product.

Together with other members of the Search Portfolio selection team, Rita monitors over 50 key alerting services related to web search tools, site announcements, and the business of web search. SiteLines is intended to present a distillation of the most important trends, news, and new web search tools and directories.

SiteLines is sponsored by the Search Portfolio, a licensed web desktop of the 100 top peer-reviewed web sites for searching.
