SiteLines - Ideas About Web Searching
http://www.workingfaster.com/sitelines/
By Rita Vine, a professional librarian, web search trainer, and lead site evaluator of the Search Portfolio web search product.

Together with other members of the Search Portfolio selection team, Rita monitors over 50 key alerting services related to web search tools, site announcements, and the business of web search. SiteLines is intended to present a distillation of the most important trends, news, and new web search tools and directories.

SiteLines is sponsored by the Search Portfolio, a licensed web desktop of the 100 top peer-reviewed web sites for searching.

en-us 2006-01-09T17:14:39-05:00

Google Scholar gets better at indexing PubMed content, but it's still several months behind.
http://www.workingfaster.com/sitelines/archives/2006_01.html#000365
More than a year after the launch of Google Scholar, I thought it was time to revisit my test of Google Scholar's indexing of PubMed content. In my SiteLines article of February 8, 2005, "Google Scholar is a Full Year Late Indexing PubMed Content," I ran a test to see how GS's coverage of PubMed stacked up. Using a randomly selected list of clinical trials on breast cancer (I wanted important articles that no physician would want to miss) spanning approximately 18 months of publication coverage, I discovered that GS was about one full year behind in coverage of PubMed....
Google ritavine 2006-01-09T17:14:39-05:00

The Challenge of Evaluating Health Search Tools
http://www.workingfaster.com/sitelines/archives/2005_10.html#000361
Tony Gentile of Healthline has posted a long and interesting comment (http://www.buzzhit.com/2005/10/rita-doesnt-dig-us.html) on my review of Healthline.com. Near the end of his article, Gentile asks, “Is your fundamental believe [sic] that only ad-free content can be trusted? If so, unfortunately, that would put many companies out of business.” I believe that there is good information on health matters provided by all kinds of web sources -- educational institutions, commercial web sources, and maybe even some wiki-style sources (although even Wikipedia’s founder has publicly admitted to serious problems with content quality). In my view, the issue of trustworthiness of...
Resources - Health ritavine 2005-10-27T12:38:22-05:00

Wikipedia Founder Admits to Serious Quality Problems
http://www.workingfaster.com/sitelines/archives/2005_10.html#000359
It's hard to believe that Wikipedia has led such a charmed life.
Encyclopedia-by-committee, even with some editorial oversight, is prone to hazards. Quality varies enormously between entries, and it's almost impossible to prevent ever-present hackers from inputting bad, wrong, or dubious information ...just for fun. So it's no surprise that the Register reports this week that Wikipedia's founder is reporting serious quality problems. Since when did information-by-committee replace serious editorial review? Read the article at http://www.theregister.co.uk/2005/10/18/wikipedia_quality_problem/....
Searching - Best Practices ritavine 2005-10-19T21:01:20-05:00

Scratching Under the Surface of a "New" Health Search Engine
http://www.workingfaster.com/sitelines/archives/2005_10.html#000357
Lots of buzz this week about Healthline.com, a new vertical search engine for medical information. Chris Sherman, in his SearchDay review, quotes the company's promotional material, which indicates that the site covers "62,000 web sites with between 45-50 million pages... [and] hosted content licensed from reliable content providers." However, my own initial examination showed a site that offers little to rival the best quality ad-and-sponsorship-free medical content on the web through sites like Medline Plus. Healthline relies principally on content from popular pre-existing third-party .com sources that could be obtained from any commercial search engine. I conducted a search...
Resources - Health ritavine 2005-10-18T15:46:55-05:00

Google Scholar Grows - An Update
http://www.workingfaster.com/sitelines/archives/2005_10.html#000356
Google Scholar's chief engineer, Anurag Acharya, contributed a presentation, “Searching Scholarly Literature: A Google Scholar Perspective,” at the 9th World Congress on Health Information and Libraries, September 23, 2005.
Some key points:
- The index has grown significantly in the last six months, although the company does not disclose the actual index size
- Coverage by category is focused on medicine and sciences -- medicine 22%; engineering 14%; biology and sociology, 13% each; physics 12%
- GS indexes full text of all publishers except for Elsevier and ACS (probably because Elsevier's competitor product, Scirus, is the publisher's preferred source of Elsevier's full-text content)...
Google ritavine 2005-10-17T16:16:19-05:00

New! Google Blog Search (http://blogsearch.google.com)
http://www.workingfaster.com/sitelines/archives/2005_10.html#000354
Not to be outdone by upstart competitors (Technorati, Blogdigger, Feedster, and more), Google has announced a beta version of its blog search. This is still a baby-beta version: it covers blog content back only to June 2005 so far, although it's reasonable to expect that the coverage will increase as uptake of the product ramps up. Unlike most Google search appliances, Google's Blog Search doesn't search the full text of blogs -- rather, it only searches the "feed" -- the part of the blog posting that an author sends out through an RSS feed. Most bloggers only send a short part...
Weblogs ritavine 2005-10-05T19:09:24-05:00

SiteLines on Hiatus Until October 1
http://www.workingfaster.com/sitelines/archives/2005_09.html#000352
No new postings to SiteLines until October 1, 2005....
News Stories ritavine 2005-09-10T14:04:08-05:00

WAY more Google real-estate devoted to ads
http://www.workingfaster.com/sitelines/archives/2005_09.html#000351
Have you noticed that Google has expanded the maximum number of horizontal ads on its results pages? In this example of a search for 360 degree feedback, it's rather startling to see just how much real estate of the initial results page is devoted to ads.
On my 1024x768 display, I'd estimate that between 35% and 40% of the initial screen contains "pure" search results, with the rest devoted to ads or header information. How interesting....
Google ritavine 2005-09-02T16:39:52-05:00

Not-So-Smart Answers at AskJeeves
http://www.workingfaster.com/sitelines/archives/2005_08.html#000348
The latest additions to AskJeeves' "Smart Answers" were reported by Gary Price in his August 22 Search Engine Watch blog. Although the latest additions expand on the existing Jeeves collection of sources that are pre-selected to handle queries for many common factual searches, they remain, like so many other "answer" capabilities of the major search engines, pretty mundane. Search Portfolio's research team tested many popular topics (e.g. marijuana, botox) to see if Smart Answers would deliver pre-selected content, and it didn't. Surely those sample searches are as common as one for burkina faso, which in AJ turns up through Smart...
Search Engines ritavine 2005-08-31T16:32:05-05:00

More Silly Search Engine Size Stories
http://www.workingfaster.com/sitelines/archives/2005_08.html#000345
Since Yahoo disclosed the jump of its index size to just over 19 billion (!) documents, I've been following a series of interesting posts at the Technologie du Langage blog from Jean Véronis, professor of Information and Technology at the University of Provence. In great detail (and in English), Véronis recounts, with good link references, the index-size story starting with Yahoo's announcement. He then systematically and persuasively refutes both the claims about database size and the research methodology of a US study comparing database sizes of Google and Yahoo. On the US study, Véronis concludes, "I find it amazing how...
Search Engines ritavine 2005-08-26T11:46:39-05:00

Google Tests "Commercial" Listings in "Organic" Search Results
http://www.workingfaster.com/sitelines/archives/2005_08.html#000342
From August 19's Clickz, an article on Google's testing of commercial listings in the 6th-8th position in "pure" search results. I replicated the test of the keywords on demand and the results are clearly visible. Interestingly, what isn't at all obvious is that the results are commercial/sponsored/paid in nature. The only apparent difference is the line above and below the commercial results, and the absence of either the CACHED or SIMILAR PAGES links that usually accompany Google results. I hate to say I told you so, but I predicted that Google would cross the line into paid-into-pure integration as early as...
Google ritavine 2005-08-22T09:56:08-05:00

Data Mining Primer
http://www.workingfaster.com/sitelines/archives/2005_08.html#000341
From the US government, Data Mining, An Overview, is a short primer for those wanting to understand what data mining is all about. By Jeffrey Siebert, an information analyst at the US Congressional Research Service. In PDF format....
Miscellaneous ritavine 2005-08-17T18:31:31-05:00

Interesting Tool: Copyscape (http://www.copyscape.com)
http://www.workingfaster.com/sitelines/archives/2005_08.html#000340
Plagiarist alert! There's an interesting online tool to help those of you who want to track those who have lifted content off your website. Copyscape uses Google API technology to identify distinctive sentences and phrases from your site and then sniffs around for other sites that use the same or similar phrases. Although Copyscape tends to sniff out a lot of blogs (probably because bloggers copy or paraphrase stuff a lot from other sources), this is a great way to track unapproved uses of your web site content. The folks who make Copyscape also produce Google Alert...
Software ritavine 2005-08-04T17:09:22-05:00

An Info Pro's View on Yahoo Search Subscriptions (http://search.yahoo.com/subscriptions)
http://www.workingfaster.com/sitelines/archives/2005_08.html#000338
In "Searching More of the Opaque Web" Mary Ellen Bates provides an excellent overview of the relative merits of Yahoo! Search Subscriptions, Yahoo's new (and still fairly modest) service selling low-cost journal and news articles from a small group of sources such as Consumer Reports, New England Journal of Medicine, Wall Street Journal, Lexis-Nexis and Factiva. Bates reminds us that the service doesn't allow for comprehensive coverage: it only allows searching of a subset of each journal/service's full text content, and the focus is on recent items....
Yahoo! ritavine 2005-08-02T13:26:46-05:00

Recommended Resource: History News Network (http://hnn.us/)
http://www.workingfaster.com/sitelines/archives/2005_07.html#000337
Think of HNN as news with a historical bent. HNN features articles by historians on current events, and as such occupies a unique position among the many web-based news aggregators available in the websphere. So much news appears without (or with utter disregard for) the historical context: this site seeks to bring a historical perspective to news reporting. There's also a "Hot Topics" section which links to relevant articles on a variety of timely issues. This is a great addition to high school and college web collections that support critical thinking on historical issues. HNN is a project of the...
Resources - Misc. ritavine 2005-07-27T18:45:21-05:00