Loading web-font TeX/Math/Italic
Group Testing With Random Pools: Optimal Two-Stage Algorithms | IEEE Journals & Magazine | IEEE Xplore

Group Testing With Random Pools: Optimal Two-Stage Algorithms


Abstract:

We study the group testing of a set of N items each of which is defective with probability p. We focus on the double limit of small defect probability, p ≪ 1, and large n...Show More

Abstract:

We study the group testing of a set of N items each of which is defective with probability p. We focus on the double limit of small defect probability, p ≪ 1, and large number of variables, N ≫ 1, taking either p → 0 after N → ∝ or p = 1/Nβ with β ∈ (0,1/2). In both settings the optimal number of tests which are required to identify with certainty the defectives via a two-stage procedure, T̅(N, p), is known to scale as Np |log p|. Here we determine the sharp asymptotic value of T̅(N,p)/(Np|log p|) and construct a class of two-stage algorithms over which this optimal value is attained. This is done by choosing a proper bipartite regular graph (of tests and variable nodes) for the first stage of the detection. Furthermore we prove that this optimal value is also attained on average over a random bipartite graph where all variables have the same degree and the tests connected to a given variable are randomly chosen with uniform distribution among all tests. Finally, we improve the existing upper and lower bounds for the optimal number of tests in the case p = 1/Nβ with β ∈ [1/2,1).
Published in: IEEE Transactions on Information Theory ( Volume: 57, Issue: 3, March 2011)
Page(s): 1736 - 1745
Date of Publication: 17 February 2011

ISSN Information:


I. Introduction

The aim of Group Testing (GT) is to detect an unknown subset of defective (also called positive or active) items out of a set of objects by means of queries (the tests) in the most efficient way. More precisely we are given a set of objects, , each of which can be either defective or not. The task is to identify the subset of defectives, , by means of the fewest possible number of queries of the form “Does the pool (where is a subset of ) contain at least one defective item?” This problem was originally introduced in relation with efficient mass blood testing [1]. Afterwards, it has been also applied in a variety of situations in molecular biology: blood screening for HIV tests [2], screening of clone libraries [3], [4], sequencing by hybridization [5], [6]. Furthermore it has proved relevant for fields other than biology including quality control in product testing [7], searching files in storage systems [8], data compression [9], and more recently in the context of data gathering in sensor networks [10] and multiaccess communication. We refer to [12] and [13] for reviews on the different applications of GT. Here we deal with the very much studied gold-standard case, namely the idealized situation in which tests are perfect: there can be neither false positives nor false negatives in their outcomes. It is important to keep in mind for future work that, however, in many biological applications one should include the possibility of errors in the test answers.

Contact IEEE to Subscribe

References

References is not available for this document.