February 28, 2005
How Google's Blogspot Spreads Spyware
Interesting article today by Harvard economics student and spyware sleuth Ben Edelman. "How Google's Blogspot Helps Spread Unwanted Software" describes how many blogs hosted by Google-owned Blogspot carry JavaScript that tries to trick the user into installing (un)necessary software to power the blog -- only to install spyware on the user's machine.
February 23, 2005
Thanks Pandia! SiteLines is One to Look Out For
Thank you Pandia Search for citing SiteLines in its 2004 roundup of best weblogs on searching. (The winner was Gary Price's Resourceshelf.com -- kudos to Gary!)
Google Uncle Sam -- What It Hits, and Misses
Interested in best practices for using search engines to find US federal government information on the web? Then make haste to Peggy Garvin's excellent assessment of government web search tools in this month's LLRX.com. This is a great guide to help researchers wade through the forest of publicly accessible government information.
Although Garvin starts off with a discussion of the merits and limitations of Google Uncle Sam, the article also covers the FirstGov search appliance, and suggests Vivisimo's FirstGov cluster search enhancement and the Department of Defense's own internal search tool.
Garvin concludes with the expert web searcher's maxim -- use more than one search tool!
February 21, 2005
Y!Q - Yahoo's Attempt to Improve Searching (sort of)
Earlier this month, Yahoo! unveiled Y!Q, its new beta service. Designed to improve concept searching, the service lets users grab selections of words -- from web pages, added keywords/concepts, "more like this" options, and more -- and quickly try out various search options using those selections.
It's an interesting way to generate a variety of results lists in a large search engine, and will be attractive to the vast majority of end users who don't want to -- or can't -- formulate Boolean search statements.
Barbara Quint has a nice overview of the beta service in InfoToday Newsbreaks for February 14, but interested users will want to play with the tool in order to get a sense of its relative value.
February 16, 2005
"Ready to Know" Consumers Want Instant Information
A Search Portfolio staff member turned me on to a wonderful trend-spotting web site, Trendwatching.com. In the January 2005 newsletter, Trendwatching.com gave voice to "ready-to-know" -- a trend that will undoubtedly affect everyone serving the information needs of consumers. From the newsletter:
"Ready-to-know" refers to "demanding consumers [who] are in a constant 'Ready To Go, READY-TO-KNOW' state of mind, expecting any information deemed relevant to be available instantly, at their own terms. The latter is crucial: we're talking pull here, not push. Expect to see more click-and-know, more point-and-know, more text-and-know, more touch-and-know and more snap-and-know than ever before."
You can read the entire newsletter at http://www.trendwatching.com -- the READY-TO-KNOW trend has its own little page at http://www.trendwatching.com/trends/READY-TO-KNOW.htm
February 08, 2005
How Google Scholar (and others like it) help libraries
I always appreciate when readers take the time to write to me with their comments -- most of us have so little time, so the effort is doubly appreciated. Ben Toth, Director of the National Electronic Library for Health (in the UK), took me to task (very gently) on my generally positive views about libraries and rather negative views about many commercial search engines. Toth felt that my views didn't fully take into consideration the many good things that popular commercial search tools have done for the library community:
"Many of us who welcome Google do so because of the challenge it is making to traditional libraries, not because it is a superior search tool. My feeling is that we don't fully appreciate the value of the disruption that Google/Amazon etc are causing to libraries - forcing us to think outside our professional boundaries to a degree that wouldn't have happened had Amazoogle not been invented. For my money Google is working towards the dream of Paul Otlet for universal access and for that reason alone it is vital to work with Google rather than against it."
Google Scholar is a Full Year Late Indexing PubMed Content
In case you don't want to read any more, tests I conducted on February 8, 2005 suggest that Google Scholar is currently missing almost a full year of PubMed records. The conclusion is pretty simple: No serious researcher interested in current medical information or practice excellence should rely on Google Scholar for up-to-date information.
Read on if you're interested...
I've been following the buzz lately in weblogs and around the water cooler as medical students and faculty discover that Google Scholar can search against Medline records from PubMed. One medical school's student web site has this caption beside a link to Google Scholar: "Give Google Scholar a try. It searches Medline and it's fast."
Academic librarians find themselves in a quandary -- responding to students who just want it fast, thank you -- when they know about the deep limitations of these quick one-box tools like Google Scholar and other search engines. But they haven't done a very good job so far at showing just how seriously flawed these third-party tools are when it comes to serious medical or scientific research.
I've written about this in a previous SiteLines posting but then decided that perhaps some real research into the differences might help clear up the problem, or at least give librarians more ammunition to convert users.
MYTH: Google Scholar searches Medline and it's fast.
FACT: Google Scholar is fast.
FACT: Google Scholar does not search Medline. It searches whatever Medline records NLM happened to give Google. We have no idea when NLM gave Google the records. We can't anticipate when the next batch will be delivered and the Google Scholar database updated. Remember, Google Scholar is just BETA. PubMed is...well, decidedly NOT beta, and full of the important checks and balances that make it so special.
It is possible to test the existing PubMed content in Google and draw some conclusions about currency, in order to advise students and researchers on best practices (and dangerous practices).
Here are the details of my test.
I conducted the following searches in order to compare the total number of Google Scholar records for 2004 and 2005 with those in PubMed for the same period.
Google Scholar - I searched for any link from site ncbi.nlm.nih.gov (which is the closest I could get to limiting to stuff from PubMed) AND dates anywhere in 2004 or 2005:
http://scholar.google.com/scholar?hl=en&lr=&q=site%3Ancbi.nlm.nih.gov&as_ylo=2004&as_yhi=2005&btnG=Search
PubMed - any article published between Jan 1 2004 and April 1 2005:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=PureSearch&db=pubmed&details_term=%222004/01/01%22%5BPDAT%5D%20%3A%20%222005/04/01%22%5BPDAT%5D
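For readers who want to rerun the comparison with other date ranges, here is a minimal Python sketch that rebuilds the two search URLs above from their parts. The parameter names and values are copied from those URLs; the helper name build_url is my own, and nothing here guarantees that either service will keep accepting these parameters.

```python
from urllib.parse import urlencode, quote


def build_url(base, params):
    """Join a base URL and query parameters (build_url is a hypothetical helper).

    Values are percent-encoded, but "/" is left literal so the result
    matches the form of the URLs quoted in this post.
    """
    return base + "?" + urlencode(params, safe="/", quote_via=quote)


# Google Scholar: anything from site ncbi.nlm.nih.gov dated 2004 through 2005.
scholar_url = build_url(
    "http://scholar.google.com/scholar",
    {
        "hl": "en",
        "lr": "",
        "q": "site:ncbi.nlm.nih.gov",
        "as_ylo": "2004",
        "as_yhi": "2005",
        "btnG": "Search",
    },
)

# PubMed: publication date (PDAT) between Jan 1 2004 and April 1 2005.
pubmed_url = build_url(
    "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi",
    {
        "cmd": "PureSearch",
        "db": "pubmed",
        "details_term": '"2004/01/01"[PDAT] : "2005/04/01"[PDAT]',
    },
)

print(scholar_url)
print(pubmed_url)
```

Changing as_ylo/as_yhi and the two PDAT dates together keeps the two searches roughly comparable.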
Results: For the 2004-2005 period, Google delivered 29,500 records; PubMed delivered over 658,000 for the same period.
While some of these PubMed records in the larger set are PreMedline records that will ultimately be dumped from the database, there is simply no accounting for the enormous difference in the numbers, except to hypothesize that Google Scholar is missing very significant quantities of relatively recent records from PubMed.
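To put the gap in proportion, the two counts can be turned into a rough coverage figure. This is only back-of-the-envelope arithmetic in Python, using the 29,500 and 658,000 totals reported above; because PubMed returned "over" 658,000 records and its date range runs a little longer, the real share is, if anything, somewhat different from what this prints.

```python
# Record counts as reported in the test above.
scholar_hits = 29_500
pubmed_hits = 658_000  # reported as "over 658,000", so treat as a floor


def coverage(found, total):
    """Fraction of the larger set that shows up in the smaller one."""
    return found / total


share = coverage(scholar_hits, pubmed_hits)
gap = pubmed_hits - scholar_hits

print(f"Google Scholar share of PubMed records: {share:.1%}")
print(f"Records unaccounted for: {gap:,}")
```

By this rough measure, Google Scholar returned well under a tenth of what PubMed returned for the period.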
To check my hypothesis, I searched PubMed for randomized controlled trials about breast cancer, published from mid-2003 to the end of 2004. I selected about 20 from this group of important articles (that any practitioner should be interested in), which spanned the entire time period of the set.
My goal was to see which of these articles showed up in Google Scholar. By tracking the dates of the articles captured in Google Scholar against those that were missing from Google Scholar, I could venture a guess as to the last update of the PubMed content in Google.
I tested by copying and pasting a distinctive phrase from the title of each article into Google Scholar, then clicking SEARCH to see if Google Scholar would retrieve the article. My results showed that Google Scholar failed to retrieve any PubMed content after February-March 2004, making Google Scholar almost one full year in arrears of PubMed.
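The bookkeeping behind this kind of spot check is simple enough to sketch in Python. The function below takes a list of (publication date, found in Google Scholar) pairs and reports the latest date that was found and the earliest date that was missing; the last update presumably falls somewhere between the two. The function name estimate_cutoff and the sample dates are placeholders for illustration only, not the actual articles or results from my test.

```python
from datetime import date


def estimate_cutoff(observations):
    """Bracket the last update from (publication_date, found) pairs.

    Returns (latest_found, earliest_missing); either may be None if
    every article was found or every article was missing.
    """
    found = [d for d, hit in observations if hit]
    missing = [d for d, hit in observations if not hit]
    latest_found = max(found) if found else None
    earliest_missing = min(missing) if missing else None
    return latest_found, earliest_missing


# Placeholder observations, NOT the real test data.
sample = [
    (date(2003, 7, 1), True),
    (date(2003, 11, 1), True),
    (date(2004, 2, 1), True),
    (date(2004, 5, 1), False),
    (date(2004, 9, 1), False),
    (date(2004, 12, 1), False),
]

latest_found, earliest_missing = estimate_cutoff(sample)
print("Latest article found:    ", latest_found)
print("Earliest article missing:", earliest_missing)
```

If a found article turns up later than a missing one, the two dates will overlap, which would suggest patchy indexing rather than a clean cutoff.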
February 02, 2005
A First Look at Google's Video Search Tool
Richard Wiggins reflects on Google's newly launched video search tool in this newsbreak from Information Today, and finds it a little too lite. Speculating that Google launched the tool mainly as a defense against competing products from Yahoo and Microsoft, Wiggins tried several searches and discovered that, for most search results, there is no actual video to watch. To which Google says, stay tuned...