| Sign In to gain access to subscriptions and/or personal tools. |
Mining maximal frequent itemsets from data streamsDepartment of Computer Science, University of Vermont, Burlington VT 05405, USA, maoguojun{at}bjut.edu.cn
Department of Computer Science, University of Vermont, Burlington VT 05405, USA
Department of Computer Science, University of Vermont, Burlington VT 05405, USA
Department of Computer Science, University of Vermont, Burlington VT 05405, USA
School of Computer Science, Beijing University of Technology, Beijing 100022, P.R. China Frequent pattern mining from data streams is an active research topic in data mining. Existing research efforts often rely on a two-phase framework to discover frequent patterns: (1) using internal data structures to store meta-patterns obtained by scanning the stream data; and (2) re-mining the meta-patterns to finalize and output frequent patterns. The defectiveness of such a two-phase framework lies in the fact that the two stages provide barriers to dynamically and immediately finding frequent patterns with online functionalities. It is expected that a single-phase algorithm can fulfil frequent pattern mining from data streams in such a way that the users can see patterns in an immediate and dynamic manner, as soon as the patterns have become frequent. In this paper, we propose INSTANT, a single-phase algorithm for discovering frequent itemsets from data streams. The theoretical foundation of INSTANT is based on a framework theory on a set of itemsets, which is also presented in the paper. The novel design of INSTANT ensures that it employs compact data structures to mine frequent patterns from data streams in a single phase. Our experimental results demonstrate the time and space efficiency of the proposed algorithm.
Key Words: data mining data stream frequent itemset set of itemsets
This version was published on June
1, 2007 Journal of Information Science, Vol. 33, No. 3,
251-262 (2007) |
|||