Millie,

Google now has a "Blog search" Beta under their "more" menu. I tried searching for "Writing and the Digital Life" and it came up right at the top. Guess they have been thinking about the problem!

-Peter Ciccariello
ARTIST'S BLOG - http://invisiblenotes.blogspot.com/

-----Original Message-----
From: Millie Niss <[log in to unmask]>
To: [log in to unmask]
Sent: Mon, 19 Sep 2005 06:04:27 -0400
Subject: [WDL] googling WDL blog, the dangers of Google

I have twice in recent days (since I just moved between my two homes, and don't have access yet to my old internet bookmarks and old mail) Googled the WDL blog to try to find it. I was alarmed to discover that it is quite hard to find the blog on Google. The first few results (if one gives search terms such as "thomas", "digital" and so forth in addition to "WDL", which has many other meanings) are references to the blog, but most are from discussions of the blog before it ever existed, so they do not lead to the blog's address. The correct blog link is somewhere well down on the list, and is only there at all with certain choices of search terms (I had to try several to find the right ones).

I don't know if Sue can do anything to make the blog easier to find on Google. As you know, getting good search engine results for sites is now an entire profession, although the tricky part is supposed to be making the client's website come up at the top when relevant general categories are searched; it is really unfortunate that searching for the site by its actual name doesn't work very well...

But aside from what my experience means about marketing the WDL blog, it is worthwhile to consider that Google, usually much-praised for good reason, may be the weak link in the web's status as a useful and reliable research tool. If the only way -- or at least the usual way -- to find anything on the web is to Google it, sites which are not easy to Google can simply become lost on the web.
Not only won't they get new visitors who find them through Googling, they will slowly lose their old visitors because people will forget to bookmark the site and then will be unable to find it again when they want it. This is bad enough when you know the site exists and know many details about it, although in that case you can at least start asking around for the URL and searching in cleverer ways until you find the site again. But if someone puts up a valuable and excellent web site that isn't widely marketed, people will not discover it at all if Google doesn't lead them to it.

This problem wasn't quite as bad when there was more competition in the search engine market. If Google didn't lead you to the site, at least the people who used some other search engine might get there, and many people even used multiple search engines for the same search to get wider results. But now Google is completely dominant, and many other apparently distinct search sites are actually "powered by Google", so they won't give unique results.

I don't blame Google for this state of things -- it is understandable for them to try to beat their competition and they actually do provide a better service than most other search engines (and they aren't known for eliminating their competition in dishonest and/or unfair ways, the way Microsoft does) -- but I think it could become a really bad problem, especially regarding use of the web for academic and other serious research purposes. Too often, a web search forms the primary basis of initial research, even published research in journals, so that someone could conceivably write a survey article on something that completely omits a major point of view or even a major set of facts, if the omitted material isn't easily accessible by Google.
If the researcher were instead to use a library, subject-specific databases on CD ROM, indexes to periodicals, actual journals and their indices, published collections of abstracts, and so forth, they would be much less likely to miss something major, because those sources of data have systematic indexing systems designed by librarians (even if the index seems much less flexible than a computer search) and are also edited by human beings so as not to omit things. (The academic field I studied was math. In math, there is a monthly publication called "Current Math Publications" and it lists every paper in many journals, so that if you search the CMP index, you will find every paper on your topic, not just the ones which happen to accrete to search terms on Google by the secret algorithms of the Google webspiders.)

I really fear that there will be an increasing number of "literature survey articles" or even supposedly scientific "meta-analyses" which purport to draw conclusions about an actual subject (not just about the state of the literature that is on the web about a subject) by analyzing what all the different papers one finds on the web say about the subject.

For example, there is a respected tradition of "meta-analyses" in the medical research literature, where all the studies ever done on a certain subject are collected and the results are presented in aggregate, generally with some statistical methods which are supposed to measure how reliable the results are, weight better or bigger studies more heavily in the analysis, and so forth. Hopefully the mathematics improves the quality of the results, but clearly a simpleminded meta-analysis could yield truly worthless results.

Suppose one did a meta-analysis of whether internet use causes insanity. The meta-analysis collects a bunch of published studies on this topic.
One study might be a randomized clinical trial in which a well-balanced sample of 10,000 random people was compiled, and each person's amount of internet use was correlated with their reported episodes of mental illness and also with the results of a standardized psychiatric examination. A second study might be a study of 10 psychotic murderers (out of a bigger group of 30 psychotic murderers where the 10 were the ones who consented to be interviewed) whom an untrained investigator has asked whether or not they liked to go online before committing their crimes.

The simpleminded meta-analysis would try to make a standard coding for all the studies (all two of them in my example) and would consider that the aggregate results of the studies were equivalent to a single larger study of the total number of subjects (10,010 in our example). Of course in our example the two studies are not at all comparable -- even though they purport to answer the same question. Our results would not be total garbage only because the second, much less reliable study used many fewer subjects, so it counts for less in the final statistics. But the result of the meta-analysis would be substantially LESS reliable than the results of the better study. (Note that the 10 psychotic murderers would mess up the results more than proportionally to their number, because they are cases of actual insanity and many of them may have been internet users -- as many people in any sample are -- whereas out of the 10,000 people there would be maybe 100 psychotic people and perhaps no people AS psychotic as the psychotic murderers, and the study method, despite being well-designed, would also fail to identify many people who really were psychotic.)

Thus we can see that a simpleminded meta-analysis will give very lousy results.
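
The arithmetic of the two-study example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the message gives the sample sizes (10,000 and 10) and the rough figure of 100 psychotic people in the large sample, but the split between internet users and non-users in each study is invented here, and the names `Study` and `naive_pool` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Counts for one study: internet users vs. non-users, and how many
    in each group were found to be psychotic."""
    users: int
    users_psychotic: int
    nonusers: int
    nonusers_psychotic: int

    def risk_difference(self) -> float:
        """Rate of psychosis among users minus the rate among non-users."""
        return (self.users_psychotic / self.users
                - self.nonusers_psychotic / self.nonusers)


def naive_pool(studies):
    """The 'simpleminded' approach: add up the raw counts and treat the
    result as one big study, ignoring how each sample was chosen."""
    return Study(
        users=sum(s.users for s in studies),
        users_psychotic=sum(s.users_psychotic for s in studies),
        nonusers=sum(s.nonusers for s in studies),
        nonusers_psychotic=sum(s.nonusers_psychotic for s in studies),
    )


# Hypothetical large study: 10,000 people, 100 psychotic, with the same
# 1% rate among users and non-users (the 6,000/4,000 split is assumed).
large_study = Study(users=6000, users_psychotic=60,
                    nonusers=4000, nonusers_psychotic=40)

# Hypothetical second study: 10 psychotic murderers, 8 of whom said they
# went online (the 8/2 split is assumed). Everyone in it is psychotic.
murderers = Study(users=8, users_psychotic=8,
                  nonusers=2, nonusers_psychotic=2)

pooled = naive_pool([large_study, murderers])

print("large study alone:", large_study.risk_difference())
print("naively pooled:   ", pooled.risk_difference())
```

With these made-up counts the large study on its own shows no difference between users and non-users, while the pooled table shows a small positive one that comes entirely from the ten added cases. The effect is small because the second study is small, which matches the point above that the pooled result is not total garbage but is still less reliable than the better study alone.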
But a respectable meta-analysis has a systematic way of compiling the studies it uses; for example, every study published in every issue of a large group of journals during a fixed time period is included. It is hoped that by looking only at (supposedly) reliable sources for the studies, and then by including all of them that meet the set criteria, one exercises some quality control over the studies and does not omit important results. It is also thought that problems with one study that alter the results in one direction will be balanced out by errors in other studies that cause an opposite bias. I find the whole process to be rather suspect and can see a lot to criticize in it, but the point is that there is an accepted methodology for doing these meta-analyses, and it tries to address all the major problems with the process.

Now imagine that the meta-analysis gets all the studies it uses by a Google search. You can immediately see that there will be big problems if Google leaves out a lot of important stuff, overrepresents other things, and so forth. The method I am describing (especially using Google) sounds so terrible that it may be hard to believe that anyone would consider it to be a valid type of medical research, but unfortunately this is really the case, and some of the studies really do use internet searches. This, then, is a case where Google's faults could lead to people getting the wrong medical treatments, if decisions are made using results of meta-analyses. (Fortunately, most medical authorities don't rely much on these kinds of studies, but they are increasingly being performed and published, precisely because the internet makes these studies easy to do!)

Millie Niss

**********
* Visit the Writing and the Digital Life blog http://writing.typepad.com
* To alter your subscription settings on this list, log on to Subscriber's Corner at http://www.jiscmail.ac.uk/lists/writing-and-the-digital-life.html
* To unsubscribe from the list, email [log in to unmask] with a blank subject line and the following text in the body of the message: SIGNOFF WRITING-AND-THE-DIGITAL-LIFE