Journal of Information Science

Micro-Mining and Segmented Log File Analysis: A Method for Enriching the Data Yield from Internet Log Files

David Nicholas

Ciber (Centre for Information Behaviour and the Evaluation of Research), Department of Information Science, City University, London, nicky{at}soi.city.ac.uk

Paul Huntington

Ciber (Centre for Information Behaviour and the Evaluation of Research), Department of Information Science, City University, London

The authors propose improved ways of analysing web server log files. Traditionally, web site statistics give a big (and shallow) picture based on all transaction log entries. That picture is, however, distorted by the difficulty of resolving Internet protocol (IP) numbers to a single user and by cross-border IP registration. The authors argue that analysing extracted sub-groups and categories presents a more accurate picture of the data, and that analysing the online behaviour of selected individuals (rather than of very large groups) can add much to our understanding of how people use web sites and, indeed, any digital information source. The analysis is labelled 'micro' to distinguish it from traditional macro, big-picture transactional log analysis. The methods are illustrated using the logs of the Surgery Door (www.surgerydoor.co.uk) consumer health web site. It was found that use attributed to academic users gave a better approximation of the geographical distribution of the site's users than an analysis based on all users. This occurs because academic institutions, unlike other user types, register in their host country. Selecting log entries where each user is allocated a unique IP number can be particularly beneficial, especially to analyses of returnees. Finally, the paper tracks the online behaviour of a small number of IP numbers as an example of the application of microanalysis.
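
As a rough illustration of the segmentation idea described in the abstract, the following Python sketch filters log records down to an academic sub-group, tallies them by country, and lists hosts seen on more than one day. It is not the authors' implementation. The record layout, the sample hostnames, the rule for recognising academic hosts (a `.edu` ending, or `ac`/`edu` before a two-letter country code) and the mapping of bare `.edu` to the United States are all assumptions made for this example, and the hostnames are assumed to have been resolved from IP numbers beforehand.

```python
from collections import Counter, defaultdict

# Hypothetical, pre-parsed log records: (resolved hostname, ISO date of request).
RECORDS = [
    ("pc12.lib.example.ac.uk", "2003-01-06"),
    ("pc12.lib.example.ac.uk", "2003-01-09"),
    ("cache3.proxy.example.com", "2003-01-06"),
    ("dialup-7.example.net", "2003-01-07"),
    ("lab4.cs.example.edu", "2003-01-07"),
    ("host9.uni.example.edu.au", "2003-01-08"),
]


def is_academic(host):
    """Assumed rule: '.edu', or 'ac'/'edu' directly before a country code."""
    labels = host.lower().split(".")
    if labels[-1] == "edu":
        return True
    return (
        len(labels) >= 3
        and len(labels[-1]) == 2
        and labels[-2] in ("ac", "edu")
    )


def country_of(host):
    """Return a two-letter country code guessed from the hostname, or None."""
    tld = host.lower().rsplit(".", 1)[-1]
    if len(tld) == 2:
        return tld
    if tld == "edu":
        return "us"  # assumption: treat bare .edu hosts as US institutions
    return None  # generic domains (.com, .net) say little about location


def country_counts(records, academic_only=False):
    """Count requests per country, optionally for the academic segment only."""
    counts = Counter()
    for host, _day in records:
        if academic_only and not is_academic(host):
            continue
        counts[country_of(host) or "unknown"] += 1
    return counts


def returnees(records):
    """Hosts that made requests on more than one distinct day."""
    days_seen = defaultdict(set)
    for host, day in records:
        days_seen[host].add(day)
    return sorted(h for h, days in days_seen.items() if len(days) > 1)


if __name__ == "__main__":
    print("all users:     ", dict(country_counts(RECORDS)))
    print("academic users:", dict(country_counts(RECORDS, academic_only=True)))
    print("returnees:     ", returnees(RECORDS))
```

In this toy data the full log leaves a third of requests with no usable country, while the academic segment resolves every request to one. A returnee count of this kind is only meaningful where each host corresponds to a single user, which is why the abstract singles out uniquely allocated IP numbers.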

Key Words: websites • data mining • user profiles • user behaviour • consumer health information

Journal of Information Science, Vol. 29, No. 5, 391-404 (2003)
DOI: 10.1177/01655515030295005



This article has been cited by other articles:


P. Huntington, D. Nicholas, H. R. Jamali, and A. Watkinson
Obtaining subject data from log files using deep log analysis: case study OhioLINK
Journal of Information Science, August 1, 2006; 32(4): 299 - 308.


G. Madle, P. Kostkova, J. Mani-Saada, and A. Roy
Lessons learned from evaluation of the use of the National electronic Library of Infection
Health Informatics Journal, June 1, 2006; 12(2): 137 - 151.