Performance Evaluation

A performance evaluation of the 63 candidate digital libraries was conducted from October 2010 to December 2010. Since Well-Designed Digital Libraries (WDDLs) should respond quickly, accurately, and reliably, two evaluation criteria were selected: retrieval time for queries and relevance of the obtained results. Both matter greatly to users, who should be able to use a digital library without long waits and with confidence in the results it returns.

  • Response/retrieval time: how much time it takes to carry out tasks (navigate, browse, search, or obtain resources); the average time that a digital library takes to process all requests, including the link response time and the search response time.
  • Relevance of obtained results (effectiveness, efficiency, and usefulness): how precise the obtained results are for users’ requested queries.

These two criteria were selected because they are effective for performance evaluation and can be measured automatically by computer programs, which were written in the Python programming language for this project. For each evaluation criterion, a different methodology and equation are applied to evaluate the candidate digital libraries.
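
As a rough sketch of how such timing could be automated in Python, the snippet below times one link request and one search request against a hypothetical library and averages them. The URLs, the query parameter name q, and the use of the requests library are illustrative assumptions, not the project’s actual code.

```python
import time
import requests

def response_time(url, params=None):
    """Return the elapsed wall-clock time (seconds) for one HTTP request."""
    start = time.perf_counter()
    requests.get(url, params=params, timeout=30)
    return time.perf_counter() - start

# Hypothetical digital library endpoints (placeholders, not from the paper).
LIBRARY_HOME = "https://example-digital-library.org/"
SEARCH_URL = "https://example-digital-library.org/search"

link_time = response_time(LIBRARY_HOME)
search_time = response_time(SEARCH_URL, params={"q": "civil war maps"})

# Average over all measured requests, as the criterion defines it.
average = (link_time + search_time) / 2
print(f"link: {link_time:.2f}s  search: {search_time:.2f}s  average: {average:.2f}s")
```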

Performance Evaluation Criteria

Response and Retrieval Time

Response/retrieval time is the time it takes to carry out tasks such as navigating, browsing, searching, or obtaining resources. In this project, response time is also the average time that a digital library takes to process all requests, including the link response time and the search response time.

Reeves and Woo (2003) emphasize that information retrieval is important in evaluating a digital library because it reveals how effectively information is located in the library (Reeves & Woo, 2003). Nielsen (1993) argues that response times should be fast, describing three basic limits:

  • “0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.
  • 1.0 second is about the limit for the user’s flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of operating directly on the data.
  • 10 seconds is about the limit for keeping the user’s attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect” (Nielsen, 1993).
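
To make these limits concrete, a measured response time can be bucketed against Nielsen’s three thresholds, as in this small illustrative helper (an assumption for demonstration, not part of the evaluation programs):

```python
def nielsen_band(seconds):
    """Classify a response time against Nielsen's (1993) three limits."""
    if seconds <= 0.1:
        return "instantaneous (<= 0.1 s): no special feedback needed"
    if seconds <= 1.0:
        return "uninterrupted flow (<= 1.0 s): delay noticed, no feedback needed"
    if seconds <= 10.0:
        return "attention kept (<= 10 s): user stays focused on the dialogue"
    return "over 10 s: show progress feedback and expected completion time"

print(nielsen_band(0.4))  # uninterrupted flow (<= 1.0 s): ...
```
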
Relevance

Johnson (2008) defines relevance as how precise the obtained results are for users’ requested queries. Nielsen (2011) points out that bad searching, including poor relevance, reduces the performance of digital libraries. Xie (2006) points out that the performance of a digital library system is related to the relevance of the retrieval results and the efficiency of the retrieval process.

Many methods are used to measure relevance performance, such as precision, recall, fall-out, F-measure, mean average precision, and discounted cumulative gain. Google uses keyword density to measure relevance: “the keyword density tool is useful for helping webmasters and SEOs achieve their optimum keyword density for a set of key terms” (SEO).
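
For reference, the first three measures listed above have simple set-based definitions. The following minimal sketch computes them on an invented toy result set, purely for illustration:

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F-measure for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if hits else 0.0
    return precision, recall, f1

# Toy example: documents d1..d5, of which d1, d3, and d5 are truly relevant.
print(precision_recall_f1(["d1", "d2", "d3"], ["d1", "d3", "d5"]))
```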

Thus, in this project, keyword density is used to measure relevance performance, as Google does; the exact method of measuring relevance, however, differs from Google’s.

Methodology, Results, and Analyses of Response / Retrieval Time Evaluation

In the paper, response/retrieval time is defined as how much time it takes to carry out tasks such as navigating or browsing links and searching for or obtaining resources. It is calculated as the average time that a digital library takes to process all requests. Read more about Methodology, Results, and Analyses of Response / Retrieval Time Evaluation …

Methodology, Results, and Analyses of Relevance Evaluation

Lastly, the relevance of the results obtained for the requested queries is evaluated. This criterion measures the accuracy, effectiveness, reliability, and usefulness of the search engine. In the prototype, relevance measures how relevant the retrieved websites are for a query; it is computed by counting how many words in the retrieved websites match the query. This method is based on the keyword density measure of the Google Rankings Ultimate SEO Tool (SEO Tools – Keyword Density). Read more about Methodology, Results, and Analyses of Relevance Evaluation …
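
Below is a minimal sketch of such a word-matching keyword density score; the tokenization and the exact scoring rule are assumptions made for illustration and may differ from the paper’s procedure.

```python
import re

def keyword_density(page_text, query):
    """Fraction of words on a page that match any term of the query."""
    words = re.findall(r"[a-z0-9']+", page_text.lower())
    terms = set(re.findall(r"[a-z0-9']+", query.lower()))
    if not words:
        return 0.0
    matches = sum(1 for w in words if w in terms)
    return matches / len(words)

# Toy example: 3 of the 8 words match the query terms, so the score is 0.375.
print(keyword_density("NASA maps of Earth show Earth from orbit", "earth maps"))
```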

Total Results and Rank by the Performance Evaluations 

Subject Area (Score)                        Highest-Scoring Digital Library(ies)
Geography (5)                               NASA’s Visible Earth
Science (4.5)                               National Science Foundation
Military Science (4.5)                      Military History and Military Science of the Library of Congress
Military Science (4)                        United States Department of Defense
Medicine (4)                                AMA (American Medical Association)
Political Science and Law (4)               GPO Access
Agriculture (4)                             Western Waters Digital Library
Science (4)                                 U.S. Department of Energy
History of the Americas (3.5)               The National Archives Online Exhibits; University of California Digital Library (Calisphere)
Military Science (3.5)                      US Military Academy Digital Collections
Geography (3.5)                             Census Atlas of the United States
Education (3.5)                             The National Library of Education
Philosophy, Psychology, Religion (3.5)      Philosophy Resources at Harvard
Technology (3)                              National Institute of Standards and Technology
World History and History of Europe (3)     British Library Online Gallery
Agriculture (3)                             Core Historical Literature of Agriculture at Cornell University
Geography (3)                               David Rumsey Map Collection
Philosophy, Psychology, Religion (3)        Association of Religion Data Archives
Language and Literature (3)                 International Children’s Digital Library; Southeast Asia Digital Library (SADL)
History of the Americas (3)                 Library of Congress: Historic Newspapers
Science (3)                                 National Science Digital Library

*More details are in the paper, Chapter VI, Performance Evaluation. This website and the paper were developed by the same person.
