I. Introduction
Cost estimation is central to query optimization. Errors in cost estimation can lead the optimizer to choose a poor execution plan for a query. A key parameter of the cost model is the distinct page count, i.e., the number of distinct pages that need to be fetched from a table. This parameter affects the estimated I/O cost of the query and plays a significant role in the choice of access methods (e.g., Index Seek vs. Table Scan) and join methods (e.g., Index Nested Loops (INL) Join vs. Hash Join). Surprisingly, unlike cardinality estimation, distinct page count estimation has received relatively little attention. The following example illustrates the importance of the distinct page count parameter.