February 28, 2005

How Google's Blogspot Spreads Spyware

Interesting article today by Harvard economics student and spyware sleuth Ben Edelman. "How Google's Blogspot Helps Spread Unwanted Software" reports that many blogs hosted by Google-owned Blogspot contain JavaScript that tries to trick visitors into installing software supposedly needed to power the blog -- software that in fact installs spyware on the user's machine.

February 23, 2005

Thanks Pandia! SiteLines is One to Look Out For

Thank you Pandia Search for citing SiteLines in its 2004 roundup of best weblogs on searching. (The winner was Gary Price's Resourceshelf.com -- kudos to Gary!)

Google Uncle Sam -- What It Hits, and Misses

Interested in best practices for using search engines to find US federal government information on the web? Then make haste to Peggy Garvin's excellent assessment of government web search tools in this month's LLRX.com. This is a great guide to help researchers wade through the forest of publicly accessible government information.

Although Garvin starts off with a discussion of the merits and limitations of Google Uncle Sam, the article also covers the FirstGov search appliance and suggests Vivisimo's FirstGov cluster search enhancement and the Department of Defense's own internal search tool.

Garvin concludes with the expert web searcher's maxim -- use more than one search tool!

February 21, 2005

Y! - Yahoo's Attempt to Improve Searching (sort of)

Earlier this month, Yahoo! unveiled Y!Q, its new beta service. Designed to improve concept searching, the service lets users grab selections of words -- from web pages, added keywords/concepts, "more like this" options, and more -- and quickly try out various search options using those selections.

It's an interesting way to generate a variety of results lists in a large search engine, and will be attractive to the vast majority of end users who don't want to -- or can't -- formulate Boolean search statements.

Barbara Quint has a nice overview of the beta service in InfoToday Newsbreaks for February 14, but interested users will want to play with the tool in order to get a sense of its relative value.

February 16, 2005

"Ready to Know" Consumers Want Instant Information

A Search Portfolio staff member turned me on to a wonderful trend-spotting web site, Trendwatching.com. In the January 2005 newsletter, Trendwatching.com gave voice to "ready-to-know" -- a trend that will undoubtedly affect everyone serving the information needs of consumers. From the newsletter:

"Ready-to-know" refers to "demanding consumers [who] are in a constant 'Ready To Go, READY-TO-KNOW' state of mind, expecting any information deemed relevant to be available instantly, at their own terms. The latter is crucial: we're talking pull here, not push. Expect to see more click-and-know, more point-and-know, more text-and-know, more touch-and-know and more snap-and-know than ever before."

You can read the entire newsletter at http://www.trendwatching.com -- the READY-TO-KNOW trend has its own little page at http://www.trendwatching.com/trends/READY-TO-KNOW.htm

February 08, 2005

How Google Scholar (and others like it) help libraries

I always appreciate when readers take the time to write to me with their comments -- most of us have so little time, so the effort is doubly appreciated. Ben Toth, Director of the National Electronic Library for Health (in the UK), took me to task (very gently) on my generally positive views about libraries and rather negative views about many commercial search engines. Toth felt that my views didn't fully take into consideration the many good things that popular commercial search tools have done for the library community:

"Many of us who welcome Google do so because of the challenge it is making to traditional libraries, not because it is a superior search tool. My feeling is that we don't fully appreciated the value of the disruption that Google/Amazon etc are causing to libraries - forcing us to think outside our professional boundaries to a degree that wouldn't have happened had Amazoogle not been invented. For my money Google is working towards the dream of Paul Otlet for universal access and for that reason alone it is vital to work with Google rather than against it."

Google Scholar is a Full Year Late Indexing PubMed Content

In case you don't want to read any more, tests conducted by me on February 8, 2005 suggest that Google Scholar is currently missing almost a full year of PubMed records. The conclusion is pretty simple: No serious researcher interested in current medical information or practice excellence should rely on Google Scholar for up-to-date information.

Read on if you're interested...

I've been following the buzz lately in weblogs and around the water cooler as medical students and faculty discover that Google Scholar can search against Medline records from PubMed. One medical school's student web site has this caption beside a link to Google Scholar: "Give Google Scholar a try. It searches Medline and it's fast."

Academic librarians find themselves in a quandary -- responding to students who just want it fast, thank you -- when they know about the deep limitations of quick one-box tools like Google Scholar and other search engines. But they haven't done a very good job so far of showing just how seriously flawed these third-party tools are when it comes to serious medical or scientific research.

I've written about this in a previous SiteLines posting, but then decided that some real research into the differences might help clear up the problem, or at least give librarians more ammunition to convert users.

MYTH: Google Scholar searches Medline and it's fast.
FACT: Google Scholar is fast.
FACT: Google Scholar does not search Medline. It searches whatever Medline records NLM happened to give Google. We have no idea when NLM gave Google the records. We can't anticipate when the next batch will be delivered and the Google Scholar database updated. Remember, Google Scholar is just BETA. PubMed is...well, decidedly NOT beta, and full of the important checks and balances that make it so special.

It is possible to test the existing PubMed content in Google and draw some conclusions about currency, in order to advise students and researchers on best practices (and dangerous practices).

Here are the details of my test.

I conducted the following searches in order to compare the total number of Google Scholar records for 2004 and 2005 with those in PubMed for the same period.

Google Scholar - I searched for any link from site ncbi.nlm.nih.gov (which is the closest I could get to limiting to stuff from PubMed) AND dates anywhere in 2004 or 2005:
http://scholar.google.com/scholar?hl=en&lr=&q=site%3Ancbi.nlm.nih.gov&as_ylo=2004&as_yhi=2005&btnG=Search

PubMed - any article published between Jan 1 2004 and April 1 2005:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=PureSearch&db=pubmed&details_term=%222004/01/01%22%5BPDAT%5D%20%3A%20%222005/04/01%22%5BPDAT%5D
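For readers who want to repeat the comparison with other sites or date ranges, the two query URLs above can be assembled programmatically. This is only a minimal sketch in Python: the function names `scholar_url` and `pubmed_url` are made up for illustration, the parameter names are simply copied from the two URLs quoted above, and there is no guarantee that either service still accepts these parameters.

```python
from urllib.parse import quote, urlencode

# Base addresses copied from the two URLs quoted in the post.
SCHOLAR_BASE = "http://scholar.google.com/scholar"
PUBMED_BASE = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"


def scholar_url(site: str, year_low: int, year_high: int) -> str:
    """Build a Google Scholar query limited to one site and a year range.

    Hypothetical helper: parameter names mirror the URL in the post.
    """
    params = {
        "hl": "en",
        "lr": "",
        "q": f"site:{site}",
        "as_ylo": str(year_low),
        "as_yhi": str(year_high),
        "btnG": "Search",
    }
    return SCHOLAR_BASE + "?" + urlencode(params)


def pubmed_url(start: str, end: str) -> str:
    """Build a PubMed publication-date range query (dates as YYYY/MM/DD).

    Hypothetical helper: parameter names mirror the URL in the post.
    """
    term = f'"{start}"[PDAT] : "{end}"[PDAT]'
    params = {"cmd": "PureSearch", "db": "pubmed", "details_term": term}
    # quote (not quote_plus) encodes spaces as %20; safe="/" keeps the
    # slashes in the dates unescaped, matching the URL in the post.
    return PUBMED_BASE + "?" + urlencode(params, quote_via=quote, safe="/")


if __name__ == "__main__":
    print(scholar_url("ncbi.nlm.nih.gov", 2004, 2005))
    print(pubmed_url("2004/01/01", "2005/04/01"))
```

Called with the values used in the test, the two helpers reproduce the URLs quoted above character for character.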

Results: For the 2004-2005 period, Google delivered 29,500 records; PubMed delivered over 658,000 for the same period.

While some of these PubMed records in the larger set are PreMedline records that will ultimately be dumped from the database, there is simply no accounting for the enormous difference in the numbers, except to hypothesize that Google Scholar is missing very significant quantities of relatively recent records from PubMed.
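To put the gap in proportion, the two counts reported above can be compared directly. The sketch below is plain arithmetic in Python, with a hypothetical helper name `coverage`. Since PubMed returned "over 658,000" records, the figure of roughly 4.5 percent is, if anything, a slight overestimate of Google Scholar's share, and it inherits the caveat that the PubMed set includes PreMedline records.

```python
def coverage(scholar_hits: int, pubmed_hits: int) -> float:
    """Return Google Scholar's hit count as a fraction of PubMed's."""
    if pubmed_hits <= 0:
        raise ValueError("pubmed_hits must be positive")
    return scholar_hits / pubmed_hits


# Counts reported in the post for the 2004-2005 period.
SCHOLAR_HITS = 29_500
PUBMED_HITS = 658_000  # the post says "over 658,000", so this is a floor

if __name__ == "__main__":
    share = coverage(SCHOLAR_HITS, PUBMED_HITS)
    print(f"Google Scholar share of PubMed records: {share:.1%}")
    print(f"PubMed returned about {PUBMED_HITS / SCHOLAR_HITS:.1f} times as many records")
```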

To check my hypothesis, I searched PubMed for randomized controlled trials about breast cancer, published from mid-2003 to the end of 2004. I selected about 20 from this group of important articles (which any practitioner should be interested in), spanning the entire time period of the set.

My goal was to see which of these articles showed up in Google Scholar. By tracking the dates of the articles captured in Google Scholar against those that were missing from Google Scholar, I could venture a guess as to the last update of the PubMed content in Google.

I tested by copying and pasting a distinctive phrase from the title of each article into Google Scholar, then clicking SEARCH to see whether Google Scholar would retrieve the article. My results showed that Google Scholar failed to retrieve any PubMed content published after February-March 2004, putting Google Scholar almost one full year behind PubMed.
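The dating logic described above (track which articles were found and which were missing, then bracket the last update between them) can be sketched in a few lines of Python. Everything here is hypothetical: the names `Observation` and `estimate_cutoff` are made up, and the sample dates are invented placeholders, not the articles actually tested. The title searches themselves are assumed to have been run by hand.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Observation(NamedTuple):
    published: date  # publication date of the PubMed article
    found: bool      # True if a title-phrase search retrieved it


class CutoffEstimate(NamedTuple):
    latest_found: Optional[date]      # newest article that was retrieved
    earliest_missing: Optional[date]  # oldest article that was not
    consistent: bool                  # True if no found article postdates a missing one


def estimate_cutoff(observations: List[Observation]) -> CutoffEstimate:
    """Bracket the last index update between found and missing articles."""
    found = [o.published for o in observations if o.found]
    missing = [o.published for o in observations if not o.found]
    latest_found = max(found) if found else None
    earliest_missing = min(missing) if missing else None
    consistent = (
        latest_found is None
        or earliest_missing is None
        or latest_found < earliest_missing
    )
    return CutoffEstimate(latest_found, earliest_missing, consistent)


# Invented placeholder data for illustration only -- NOT the tested articles.
SAMPLE = [
    Observation(date(2003, 7, 1), True),
    Observation(date(2003, 11, 1), True),
    Observation(date(2004, 2, 1), True),
    Observation(date(2004, 4, 1), False),
    Observation(date(2004, 9, 1), False),
    Observation(date(2004, 12, 1), False),
]

if __name__ == "__main__":
    print(estimate_cutoff(SAMPLE))
```

If `consistent` comes back False, the found and missing articles overlap in time, and a single cutoff date is probably the wrong model for how the index was updated.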

February 02, 2005

A First Look at Google's Video Search Tool

Richard Wiggins reflects on Google's newly launched video search tool in this newsbreak from Information Today, and finds it a little too lite. Speculating that Google launched the tool mainly as a defense against competing products from Yahoo and Microsoft, Wiggins tried several searches and discovered that, for most search results, there is no actual video to watch. To which Google says, stay tuned...

Description
SiteLines is written by Rita Vine, a professional librarian, web search trainer, and lead site evaluator of the Search Portfolio web search product.

Together with other members of the Search Portfolio selection team, Rita monitors over 50 key alerting services related to web search tools, site announcements, and the business of web search. SiteLines is intended to present a distillation of the most important trends, news, and new web search tools and directories.

SiteLines is sponsored by the Search Portfolio, a licensed web desktop of the 100 top peer-reviewed web sites for searching.
