Scholarly Internet Research:
Is It Real?
by Robert H. Nigohosian
Robert H. Nigohosian is an Assistant Professor in the Department of Finance and Economics at the Salt Lake Community College.
The Internet has evolved into a powerful research tool, especially over the last several years. But in some ways it is a bizarre place that does not fit into the customary box of procedures and tools well known to the information professionals of yesteryear. This network of networks is viewed by some as an unstructured information resource with no overall control, offering a mix of services with no standard format, subject organization, or comprehensive index comparable to the printed and electronic sources familiar to librarians. Although there are subject indexes, their multiplicity and fundamental differences make using them confusing at times and raise such questions as: Which one is best? Do we need to use them at all? What exactly are these indexes searching? Is any one group responsible for their coordination and validation? [Winship]
My objective in this paper is to address several issues relating to Internet research and its ethical formation in this new era of information explosion and the resulting information anxiety. First, it may be interesting to review some situations in which misleading information has appeared. Second, the question of responsibility for managing evaluation and validation will be discussed. Finally, some examples of search tool evaluation will be reviewed with respect to their usefulness in guiding students to ethical and valid information.
Misleading Information
Over the last year, increased accessibility to Netscape has led me to encourage my students to use the World Wide Web (WWW) as a research tool in gathering information for papers they present in a course I teach: Economic History of the United States. I was amazed by the availability of sources for the various subjects the students selected to research, but one daunting feature of this method of research continued to surface as we went from search to search: the rapidly growing presence of "junk" information.
For example, one student went to Webcrawler (a so-called search engine) and typed "slavery" into the "search" box. It was fascinating to see that of the 25 "hits" that were returned, only about three were of any value at all. Among the others was a listing for "S & M Shops in Seattle"! (I guess slavery occurs there also!)
I will omit discussing what the students in my Macroeconomics class found when they did a search on "bonds". However, it is obvious that some instruction was required to get them to specify the types of bonds they were interested in and to keep them from winding up in Seattle again!
The evidence is clear: the Internet has enabled a whole new group of people to enter the world of publishing who have not learned the culture of the print publishing trade. Does anyone have the responsibility to explain the rules to these new publishers, just as the Internet community inculcates new users with the Internet etiquette rules of the road?
For example, consider a web site devoted to Gilbert & Sullivan. Hope Tillman, in a presentation at the John F. Kennedy School of Government, Harvard University, made these observations about this site:
There is, she observed, a pretty clear table of contents. The welcome message from the site's introduction points to the Savoynet listserv as well as to short bios of William S. Gilbert and Arthur S. Sullivan:
"Welcome to the Gilbert and Sullivan Archive. This archive is devoted to the works of William S. Gilbert and Arthur S. Sullivan, and is operated as a service to Gilbert & Sullivan fans by members of Savoy Net distribution list. The G & S Archive was established in September, 1993, by several Savoy Net members. It includes a variety of G & S related items, including clip art, librettos, song scores, and newsletter articles. New items are being added regularly."
An information professional might ask: Is there a reason for Boise State University hosting this web site in addition to like sites hosted by MIT and Harvard-Radcliffe? Preliminary research revealed that there is a music department at Boise State, but it does not host a Gilbert & Sullivan festival and, according to its calendar of upcoming musical performances, no Gilbert & Sullivan performances were scheduled.
It just so happens that a G & S aficionado is an associate professor in Boise State's Math Department, and he has obviously been instrumental in the hosting of the site there. What is the authority of the moderators? Jim Farron and Alex Feldman are listed. Only Alex Feldman is at Boise State. He is listed as an Associate Professor in the Department of Mathematics and Computer Science whose professional interests include theory of computation and recursion theory. From the web site, the identity of Jim Farron could not be determined. [Tillman]
Apart from the comic and apparently insignificant connection some Web sites had to my students' research subjects, the need for proper validation and evaluation of sources has become more and more apparent to my classes. This need appears to be growing daily, as numerous home pages and commercial sites proliferate across the landscape of the Web. Novice users are fair game for tricksters and tactless entrepreneurs who disguise their pages as valid sources of information when they may sometimes be nothing more than conjecture, opinion, and manipulated statistical reporting.
However, for all its vagaries and flaws, the Internet still hosts an incredible amount of useful and scholastic information. It has become the instructor's responsibility to educate student researchers regarding proper evaluation and validation tools which can guide them to scholarly and verifiable information and away from useless and confusing "junk". Consider the following suggestions for proper research evaluation of Internet information:
Some "home page" publishing may be nothing more that a form of vanity publishing. [Tillman, 1]"This may even include sites where an individual decides to share working papers or information they have been working on for a dissertation. Some "home pages" may appear as scholarly-journal-type articles, but may actually be disguised and manipulated information published by an individual posing as a professor from an institution of higher education. On the other hand, many home pages have been through a rigorous review process and should not be equated with the term "vanity".
What is a "vanity" work? It may be a very specific document that has information of great value, but it hasn't been throught the peer review process intrinsic to scholarship or disseminated by the trade publishing industry. Prior to the information explosion promulgated by the accessibility of the Internet via Netscape and other browsers, vanity and short-run publishing has been possible in print, and it can be "quality" in nature, although that may not be as easy to determine without analysis. [Tillman, 1]
Depending upon the curriculum, some instructors limit student researchers to manufacturers' "home pages". For example, in a UNIX class taught by Bruce Worthen of Salt Lake Community College, students are not allowed to use any other home pages, and are cautioned to scrutinize URLs that contain a "~", as these most often point to personal home pages. Other borderline sources to be wary of are those displaying an address containing "xmission.com", "compuserve.com", or "aol.com". These sources may indeed be validated and scholarly, but students are advised to proceed with caution when confronting such addresses.
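As a rough illustration only, a screening rule of this kind could even be automated. The following sketch (in Python) uses a hypothetical function name, and its domain list simply restates the cautions above; it flags URLs that deserve extra scrutiny rather than judging scholarly quality:

    # Illustrative sketch: flag URL patterns mentioned in the text
    # (a "~" marking a personal directory, and domains associated with
    # personal hosting). The domain list mirrors the cautions above.
    PERSONAL_HOSTS = ("xmission.com", "compuserve.com", "aol.com")

    def needs_extra_scrutiny(url: str) -> bool:
        """Return True if the URL looks like a personal rather than an
        institutional page, per the heuristics described above."""
        lowered = url.lower()
        if "~" in lowered:  # a tilde usually marks a personal directory
            return True
        return any(host in lowered for host in PERSONAL_HOSTS)

    # Examples:
    # needs_extra_scrutiny("http://www.xmission.com/~student/essay.html")       -> True
    # needs_extra_scrutiny("http://www.slcc.edu/lr/library/intwork/intwork.htm") -> False

Such a filter is only a first pass; as the paragraph above notes, pages matching these patterns may still be perfectly valid and scholarly.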
Janet Hovorka and Keith Slade, in their web site paper entitled "Evaluating and Citing Sources: Internet Truth or Fiction," cite several common E-mail hoaxes that pervade the Internet. For example, some people actually believed that Microsoft had bought the Catholic Church. Some of the false Microsoft stories stated that Microsoft was buying the church outright, while others stated that only its art collection had been secured. According to the "Good Times Virus" hoax, if you open an E-mail message with the subject line "Good Times," you will get this virus. However, E-mail comes in the form of plain text (ASCII), which cannot transmit a virus.
The methods for discovering such hoaxes include:
Evaluation
Students may streamline their searches by using four major steps in the research process:
If students cannot identify the usefulness of an information source immediately, it should be considered a low priority to save, print, or read online.
Ask the question:
Evaluate the format. Can you clearly identify what type of information it is?
Validation
The discerning student should also continue a critical evaluation of sources by examining the credentials of the author:
In many cases cost becomes an issue in evaluating research materials, so researchers should ask themselves:
Advantages of Valid Internet Research
My colleague Bruce Worthen suggests that Internet research presents a quick and easy way to verify sources listed by students in their papers. Unlike the "old" days, when professors had to sit in the library (or send their research assistants) to spot check sources in periodicals and books, they can now sit in the comfort of their office at their computer and check the addresses of cited Internet works in a matter of minutes. Indeed, even Hope Tillman, Director of Libraries at Babson College, admits:
"Sharyn Ladner and I wrote our first book surveying the Internet use of special librarians in 1991 and 1992 [and we noted that] the Internet allows all types of publishing in the broadest sense--much of the infoormation contained in the Internet resident discussion groups is transitory--and this network of networks will continue to expand exponentially so that bibliographic control will continue to be out of reach ". [Ladner and Tillman]
What a difference a couple of years makes! Ladner and Tillman now admit that their crystal ball was not very good: today there is the potential for much more bibliographic control and, at the same time, increasing complexity. Hence, there is all the more reason for information professionals to dedicate themselves to developing search tools for whatever the Internet is going to become. [Tillman]
Some of the search engines have developed into dependable vehicles for verification and evaluation of sources. Consider the following projects:
The W3 Virtual Libraries Project
The W3 Virtual Libraries' initial approach to subject guides to the Internet is purported to be a scholarly one. They sought subject experts to develop annotated lists of sites in their fields, both broadly and narrowly. The problem has become the uneven quality of the guides and even the different approaches which grew out of the creativity of their developers. While there are clues on their pages, some have not been maintained and represent an initial or periodic effort, rather than an ongoing one. Others are very up-to-date and complete. As the web has exploded, keeping up with these subject guides has become much more complex and difficult.
Clearinghouse Project
This project is led by Louis Rosenfeld, a Ph.D. candidate at the University of Michigan library school. According to information on the Clearinghouse web site, he plans to rate each of the guides according to four criteria:
Yahoo
Originally, Yahoo was started as a project by its two co-founders, who wanted to share their Web bookmarks. Although they started as graduate students at Stanford, they have since left and now reside at Netscape, where they have a staff to help them. At last glimpse, they were advertising for a cataloging librarian. They are soliciting URLs, categorizing them, and adding them to their database. They do not guarantee quality. However, one good feature of Yahoo is its technique of automatically polling sites to see if they are "up" or available.
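That polling feature amounts to a periodic availability check. The sketch below shows, in Python, one minimal way such a check could work; it assumes nothing about Yahoo's actual implementation and uses only a plain HTTP request:

    # Minimal sketch of an "is the site up?" check (not Yahoo's method).
    # Standard library only.
    import urllib.error
    import urllib.request

    def site_is_up(url: str, timeout: float = 10.0) -> bool:
        """Return True if the URL answers an HTTP request without error."""
        request = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                return 200 <= response.status < 400
        except (urllib.error.URLError, OSError):
            return False

A directory maintainer could run such a check over every catalogued URL on a schedule and flag entries that repeatedly fail as unavailable.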
In this world of meta information, or information about information, perhaps the next service to come along will be a group that provides "evaluations" of Internet groups that "evaluate" Internet resources. Are these resources that provide evaluations truly unbiased, or are they subjective in their analysis? Examine, for instance:
Point Communications
This is an independent company in New York with a staff of 10-25 reviewers. They use the Lycos search engine for the Point Search function. They claim no relationship between their advertising and their reviews of what they term "the largest and best collection of entertaining reviews of the Web on the Web". Point was recently sold to Lycos, and its staff claims to "...surf the Web daily looking for the best, smartest, and most entertaining sites around. If we review a page it means we think it is among the best 5% of all Web sites in content, presentation, and/or experience. Point makes no distinction between commercial, private, or student pages. Excellence is our only criterion."
The McKinley
The McKinley Internet Directory (Magellan) is an online directory of described, rated, and reviewed Internet resources, with other key facts instantly accessible to users as they scan their search results in the McKinley. It uses the PLS search engine. Reviews are performed by a team of highly skilled international publishers, technologists, and information specialists. According to the information on its web menu, "The McKinley currently contains over 20,000 evaluated, reviewed and rated sites, of which approximately 35 percent are international in origin..."
The Rating System:
The star rating that appears near the top of each review is an average of the ratings from each of four categories (4 stars is the maximum rating). The HTTP, Gopher, FTP, and Telnet ratings measure the completeness of the content presented in the resource; the organization of the resource; how up to date the information presented is; and the ease of access to the resource. Because of their differing functions, the ratings assigned to newsgroups, mailing lists, and listservs reflect a slightly different system than the ratings for the other sites mentioned above.
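Taken at face value, the description above implies a simple average of four category scores on a 0-to-4 scale. A minimal sketch of that arithmetic, assuming equal weighting of the four named categories (the function name is hypothetical), might look like this:

    # Illustrative sketch: average four category ratings, each on a
    # 0-4 scale, into one star rating. Equal weighting is an assumption.
    def star_rating(completeness: float, organization: float,
                    currency: float, ease_of_access: float) -> float:
        """Average four 0-4 category ratings into a single star rating."""
        scores = (completeness, organization, currency, ease_of_access)
        if not all(0 <= s <= 4 for s in scores):
            raise ValueError("each category rating must fall between 0 and 4")
        return sum(scores) / len(scores)

    # Example: ratings of 4, 3, 3, and 2 average to 3.0 stars.
    # star_rating(4, 3, 3, 2) -> 3.0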
Currently, the McKinley is free. It has been licensed by the Internet provider Netcom and also by IBM for use in its infoMarket service.
Gale Guide
This web site is an example of a publisher offering updated information online as a supplement to their print publication. It also has descriptive information for 145 specialized home pages. [Tillman]
Structure and Search Technique
Ian R. Winship, in his research at the Information Services Department, University of Northumbria at Newcastle, UK, had postulated, prior to his investigation of Internet research evaluation, that retrieval performance would be of primary importance. However, he found that record structure and search technique grew in significance as he toured the Web landscape. For example, to get some indication of the practical value of the different search engines, he carried out test searches on three subjects: ebola, Alberta, and Chirac.
Table One shows the number of items he found.
TABLE ONE
Search term | Yahoo | Worm | WebCrawler | Lycos | Harvest | Galaxy
ebola | 7 | 27 | 124 | 295 | 17 | 11
Alberta | 0 | 0 ? | 42 | 42 | 4 | 6
Chirac | 0 | 0 ? | 7 | 27 | 2 | 0
The zeros for the Worm are questioned because that system tells you that if you get a zero response it may be because the computer is too busy to process your request and not because there is nothing relevant.
It may be worthwhile to note that when the ebola search was repeated on Lycos two weeks later, there were 504 items!
Winship warns that these raw counts do not mean that only Lycos is of real use. Analysis of the results shows excessive duplication of sources, with many of them at best of marginal interest. There seem to be no more than 10 major collections of information on ebola, but there are hundreds of references to these from other related or personal home pages. Indeed, the services that score results all had only 6 or 7 items in the top half of the scoring range. The appearance of the word "ebola" in a document title or as part of the URL is more likely to indicate a precise hit than its appearance in the text of a page, so tools that search only titles and URLs will give good results. Services like Galaxy and Yahoo have a more structured collection of sources that should, in information retrieval terms, give lower recall but higher precision. These services often advise checking their classified groupings first. When there is no source specifically on a topic, as in the Chirac example, they are less helpful. [Winship]
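To make the recall/precision trade-off concrete, here is a rough calculation using the figures above. The estimate of roughly 10 substantive ebola sources comes from Winship's analysis; the structured-directory counts are assumptions for illustration only:

    # Rough precision/recall illustration using the article's estimates.
    # Lycos returned 295 "ebola" items, of which perhaps 10 are substantive.
    # A structured directory returning 7 items, 5 of them substantive,
    # is an assumed example for comparison.
    def precision(relevant_retrieved: int, total_retrieved: int) -> float:
        return relevant_retrieved / total_retrieved

    def recall(relevant_retrieved: int, total_relevant: int) -> float:
        return relevant_retrieved / total_relevant

    total_relevant = 10                            # estimated substantive sources

    lycos_precision = precision(10, 295)           # ~0.034: low precision...
    lycos_recall = recall(10, total_relevant)      # 1.0: ...but high recall

    directory_precision = precision(5, 7)          # ~0.71: higher precision...
    directory_recall = recall(5, total_relevant)   # 0.5: ...at the cost of recall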
One may agree with Winship's contention that it may be more fruitful to use browsable collections like the BUBL Subject Tree, especially if they also include gopher material, which is too easily overlooked in the Web-dominated world. We should also remember that these services are not intended for information professionals per se, and despite their deficiencies in documentation and structure, they are very popular. This leads us to these questions:
Should teachers and information professionals get involved in the design of search tools to make them more effective and usable?
Can these tools be incorporated into the mainstream of online searching as we know it?
Is it the responsibility of teachers using the Internet to provide clear instruction on evaluation and validation of sources before assigning research projects to students?
Do students and teachers have an ethical responsibility to validate all sources used in scholastic research, so as to build a new culture of tradition for Internet use, or is this just more unnecessary procedure?
Won't the information just take care of itself, or stand on its own merits?
It is clear that the Internet has exploded into a place where many of the traditional rules of research and scholarship have changed significantly. In the case of home page proliferation, gone are the publishing house rules of jury and peer review. Slipping fast are the memories of old-fashioned research using the Reader's Guide to Periodical Literature in print form, and of the microfiche parties in the dungeons of libraries, standing in line with pockets full of quarters behind other totally irritated students and researchers, waiting for a turn to obtain a poor copy of some possibly outdated piece of information. And, in many places, the card catalog has been replaced by the computer, with the distant possibility of some Luddite [*] rebels entering the libraries one day and smashing the monitors in defiance of this information takeover. Yes, the rules have changed dramatically, but have teachers and information professionals moved at the same rate of speed? Should there be some task force, some self-appointed cadre of beings, who will arise as the masters of information validation?
Imagine, for a moment, a world in which all teachers and information professionals turned their backs on these questions and paid no mind to the ethical considerations of an untended information explosion. Perhaps some utopian scenario may arise in which information continues to float in cyberspace, available to all, yet discernible by only a few, and harmless to all. A more realistic situation, however, might find our students subject to massive manipulation by the media and the information industry.
References

Dillon, Martin, et al. "Assessing Information on the Internet: Toward Providing Library Services for Computer Mediated Communication." World Wide Web. Available from: http://www.oclc.org:5047/oclc/research/publications/aii/table.html

Hanson, C. "Internet Navigator - Resource Discovery." World Wide Web. Available from: http://www.slcc.edu/lr/navigator/discovery/discover.html. Developed by a consortium of information professionals led by Ms. Hanson for an online course in Internet instruction under a grant from the Higher Education Technology Initiative, State of Utah.

Hovorka, Janet, and Slade, Keith. "Evaluating and Citing Sources: Truth or Fiction?" World Wide Web, 1996. Available from: http://www.slcc.edu/lr/library/intwork/intwork.htm

Ladner, Sharyn, and Tillman, Hope. Internet and Special Librarians: Use, Training, and the Future. Washington, D.C.: Special Libraries Association, 1993, p. 58.

Large, J.A. "Evaluating Online and CD-ROM Sources." Journal of Librarianship 21 (2), April 1989, 87-108.

Tillman, Hope. "Finding Quality on the Internet or a Needle in a Haystack?" Prepared for presentation at the NEASIS program, "Evaluating the Quality of Information on the Internet," John F. Kennedy School of Government, Harvard University, Cambridge, Massachusetts, September 6, 1995. Available from: http://www.tiac.net/users/hope/findqual.html

Winship, Ian R. "World Wide Web Searching Tools - An Evaluation." VINE (99), 1995, 49-54. Library Information Technology Centre, South Bank University, London. Also available from: http://www.bubl.bath.ac.uk/BUBL/IWinship.html
Editor's Note: Listings of numerous search engines can be found at:
Note: Inspired by Steven Ruffus, Professor of English and co-author of English 101 On-Line, a writing course offered on the World Wide Web funded by the Higher Education Technology Initiative, State of Utah.