Loading [MathJax]/extensions/MathMenu.js
A stochastic programming approach for range query retrieval problems | IEEE Journals & Magazine | IEEE Xplore

A stochastic programming approach for range query retrieval problems


Abstract:

One of the important issues in range query (RQ) retrieval problems is to determine the key's resolution for multi-attribute records. Conventional models need to be improv...Show More

Abstract:

One of the important issues in range query (RQ) retrieval problems is to determine the key's resolution for multi-attribute records. Conventional models need to be improved because of their potential degeneracy, less-than-desired computability and possible inconsistency with the partial match query (PMQ) models. This paper presents a new RQ model to overcome these drawbacks and introduces a new methodology, stochastic programming (SP), to conduct the optimization process. The model is established by using a monotone-increasing function to characterize range sizes. Three SP approaches - the wait-and-see (WS), here-and-now (HN) and scenario tracking (ST) methods - are integrated into this RQ model. Analytical expressions of the optimal solution are derived. It seems that HN has advantage over WS because the latter usually involves complicated multiple summations or integrals. For the ST method, a nonlinear programming software package is designed. Results of numerical experiments are presented that optimized a 10-dimensional RQ model and tracked both middle-size [100] and large-size (1,000) scenarios.
Published in: IEEE Transactions on Knowledge and Data Engineering ( Volume: 14, Issue: 4, July-Aug. 2002)
Page(s): 867 - 880
Date of Publication: 31 August 2002

ISSN Information:

References is not available for this document.

1 Introduction

In modern database systems, a file is a collection of records and each record has a number of attributes. A subset of the attributes acts as a key for each record. The primary key is the key that uniquely distinguishes a record from all others, while all the remainings are the secondary keys. Each attribute in a key is called a field. A query is a specification of values for zero or more fields of a record. In a partial match query (PMQ), one of the fileds is specified in a record while the other is not. In a range query (RQ), the ranges of some fields are specified. Typical approaches for solving PMQs and RQs include B-trees ([39], [44]), inverted files [9], and multiattribute hashing [41]. Our study focuses on applying multiattribute hashing to the RQ problems.

Select All
1.
K. Abdel-Ghaffar and A. Abbadi, "Optimal Disk Allocation for Partial Match Queries", ACM Trans. Database Systems, vol. 18, no. 1, pp. 132-156, 1993.
2.
A. Aho and J. Ullman, "Optimal Partial Match Retrieval when Fields are Independently Specified", ACM Trans. Database Systems, vol. 4, no. 2, pp. 168-179, 1979.
3.
L. Arge, V. Samoladas and J. Vitter, "On Two-Dimensional Indexability and Optimal Range Search Indexing", Proc. 18th ACM SIGMOD-SIGACT-SIGART Symp. PODS, pp. 346-357, 1999.
4.
J. Bentley and J. Friedman, "Data Structures for Range Searching", Computer Surveys, vol. 11, no. 4, pp. 397-410, 1979.
5.
K. Bogart, Introductory Combinatorics, Harcourt Brace Jovanovich, Inc., 1990.
6.
A. Bolour, "Optimality Properties of Multiple Key Hashing Functions", J. ACM, vol. 26, no. 2, pp. 196-210, 1979.
7.
W. Burkhard, "Hashing and Trie Algorithms for Partial Match Retrieval", ACM Trans. Database Systems, vol. 1, no. 2, pp. 175-187, 1976.
8.
W. Burkhard, "Interpolation-Based Index Maintenance", Proc. Second ACM SIGACT-SIGMOD Symp. Principals of Database Systems (PODS), pp. 76-89, Mar. 1983.
9.
A. Cardenas, "Analysis and Performance of Inverted Database Structures", Comm. ACM, vol. 18, no. 5, pp. 253-263, 1975.
10.
C. Chen, H. Lin, C. Chang and R. Lee, "Optimal Bucket Allocation Design of k-ary MKH Files for Partial Match Retrieval", IEEE Trans. Knowledge and Data Eng., vol. 9, no. 1, pp. 148-160, 1997.
11.
R. Courant, "Variational Methods for the Solution of Problems of Equilibrium and Vibration", Bull. Am. Math. Soc., vol. 49, pp. 1-23, 1943.
12.
R. Dembo, "Scenario Optimization", Annals of Operations Research, vol. 30, pp. 63-80, 1991.
13.
M. Dempster, Stochastic Programming, New York:Academic Press, 1980.
14.
H. Du and J. Sobolewski, "Disk Allocation for Cartesian Product Files on Multiple-Disk Systems", ACM Trans. Database Systems, vol. 7, no. 1, pp. 82-101, 1982.
15.
C. Faloutsos, "Access Methods for Text", Computer Surveys, vol. 17, no. 1, pp. 49-74, 1985.
16.
C. Faloutsos, "Gray Codes for Partial Match and Range Queries", IEEE Trans. Software Eng., vol. 14, no. 10, pp. 1381-1393, 1988.
17.
C. Faloutsos and R. Chan, "Fast Text Access Methods for Optical and Large Magnetic Disks: Design and Performance Comparison", Proc. 14th Int'l Conf. Very Large Databases, pp. 280-293, Sept. 1988.
18.
M. Fredman, "A Lower Bound on the Complexity of Orthogonal Range Queries", J. ACM, vol. 28, no. 4, pp. 696-705, 1981.
19.
M. Freeston, "The BANG File: A New Kind of Grid File", Proc. ACM SIGMOD Ann. Conf. Management of Data, pp. 260-269, May 1987.
20.
R. Gustafson, "Elements of the Randomized Combinatorial File Structure", ACM SIGIR Proc. Symp. Information Storage and Retrieval, pp. 163-174, Apr. 1971.
21.
E. Harris and K. Ramamohanarao, "Optimal Dynamic Multi-Attribute Hashing for Range Queries", BIT, vol. 33, pp. 561-579, 1993.
22.
J. Hellerstein, E. Koutsoupias and C. Papadimitriou, "On the Analysis of Indexing Schemes", Proc. Sixteenth ACM SIGACT-SIGMOD-SIGART Symp. Principals of Database Systems (PODS), pp. 249-256, May 1997.
23.
M. Hestenes, "Multiplier and Gradient Method", J. Optimization Theory and Applications, vol. 3, pp. 303-320, 1969.
24.
P. Kall, Stochastic Linear Programming, Berlin, Heidelberg:Springer-Verlag, 1976.
25.
P. Larson, "Linear Hashing with Partial Expansions", Proc. Sixth Conf. Very Large Data Bases, pp. 224-232, Oct. 1980.
26.
D. Lee, Y. Kim and G. Patel, "Efficient Signature File Methods for Text Retrieval", IEEE Trans. Knowledge and Data Eng., vol. 7, no. 3, pp. 423-435, 1995.
27.
W. Litwin, "Linear Hashing: A New Tool for File and Table Addressing", Proc. Sixth Conf. Very Large Data Bases, pp. 212-223, Oct. 1980.
28.
X. Liu and G. Slemon, "An Improved Method of Optimization for Electrical Machines", IEEE Trans. Energy Conversion, vol. 6, no. 3, pp. 492-496, 1991.
29.
J. Lloyd, "Optimal Partial-Match Retrieval", BIT, vol. 20, pp. 406-413, 1980.
30.
J. Lloyd and K. Ramamohanarao, "Partial-Match Retrieval for Dynamic Files", BIT, vol. 22, pp. 150-168, 1982.
Contact IEEE to Subscribe

References

References is not available for this document.