Journal of Information Science
Erratum for Guojun Mao, Xindong Wu, Xingquan Zhu, Gong Chen, and Chunnian Liu: a correction to this article has been published.

Mining maximal frequent itemsets from data streams

Guojun Mao

Department of Computer Science, University of Vermont, Burlington VT 05405, USA, maoguojun{at}bjut.edu.cn

Xindong Wu

Department of Computer Science, University of Vermont, Burlington VT 05405, USA

Xingquan Zhu

Department of Computer Science, University of Vermont, Burlington VT 05405, USA

Gong Chen

Department of Computer Science, University of Vermont, Burlington VT 05405, USA

Chunnian Liu

School of Computer Science, Beijing University of Technology, Beijing 100022, P.R. China

Frequent pattern mining from data streams is an active research topic in data mining. Existing approaches often rely on a two-phase framework to discover frequent patterns: (1) scanning the stream data and storing the resulting meta-patterns in internal data structures; and (2) re-mining the meta-patterns to finalize and output the frequent patterns. The weakness of this two-phase framework is that the separation between the two stages is a barrier to finding frequent patterns dynamically and immediately in an online setting. A single-phase algorithm, by contrast, is expected to let users see patterns as soon as they become frequent. In this paper, we propose INSTANT, a single-phase algorithm for discovering frequent itemsets from data streams. INSTANT is founded on a framework theory on a set of itemsets, which is also presented in the paper. The design of INSTANT employs compact data structures to mine frequent patterns from data streams in a single phase. Our experimental results demonstrate the time and space efficiency of the proposed algorithm.
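
To make the single-phase idea concrete, the following is a minimal Python sketch of reporting an itemset at the moment its count reaches a threshold, with no separate re-mining pass. It is not the INSTANT algorithm and does not use its data structures, which the abstract does not describe. The function names, the absolute-count threshold `min_count`, and the `max_size` cap are all assumptions made for illustration. The sketch enumerates every subset of each transaction, so it is only practical for short transactions.

```python
from collections import Counter
from itertools import combinations


def stream_frequent_itemsets(transactions, min_count, max_size=3):
    """Yield (position, itemset) as soon as an itemset's count reaches min_count.

    Naive illustration only: it counts every subset (up to max_size items)
    of each incoming transaction, which is exponential in transaction length.
    """
    counts = Counter()
    for position, transaction in enumerate(transactions, start=1):
        items = sorted(set(transaction))
        for size in range(1, min(max_size, len(items)) + 1):
            for combo in combinations(items, size):
                itemset = frozenset(combo)
                counts[itemset] += 1
                # Report exactly once, at the transaction where the
                # itemset first becomes frequent.
                if counts[itemset] == min_count:
                    yield position, itemset


def maximal(itemsets):
    """Keep only itemsets that have no proper superset in the collection."""
    itemsets = set(itemsets)
    return {s for s in itemsets if not any(s < t for t in itemsets)}


# Hypothetical toy stream of four transactions.
stream = [["a", "b"], ["a", "b", "c"], ["b", "c"], ["a", "b", "c"]]
found = list(stream_frequent_itemsets(stream, min_count=2))
for position, itemset in found:
    print(position, sorted(itemset))
print("maximal:", [sorted(s) for s in maximal(s for _, s in found)])
```

Each itemset is printed at the position of the transaction that made it frequent, which is the behaviour a two-phase approach would defer to its second stage. The `maximal` helper filters the reported itemsets down to those with no frequent proper superset, matching the notion in the article title.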

Key Words: data mining • data stream • frequent itemset • set of itemsets

This version was published on June 1, 2007

Journal of Information Science, Vol. 33, No. 3, 251-262 (2007)
DOI: 10.1177/0165551506068179

