To detect a few defectives among millions of items, group testing was proposed by Dorfman [1] to reduce the huge number of tests required by individual testing. In the following decades, the theory of group testing has been further explored [2], [3], [4] and applied in various fields such as information and communication technology [5], [6], [7], [8], data science [9], [10], and pandemic detection [11], [12], [13]. In general, group testing strategies can be divided into adaptive and non-adaptive ones. The former sequentially designs each test based on the feedback from previous tests, while the latter designs all tests in advance. This paper focuses on non-adaptive group testing (NAGT) due to its parallelizability and lower computation cost [8]. Further, noisy group testing, where the output of each test can be corrupted by noise, is considered here because of its practical relevance [4].
In the scenario of noisy NAGT, the belief propagation (BP) algorithm empirically performs well in classifying the items [4], [14], [15], [16]. It is indicated in [14] that different test designs may affect the error performance of BP in NAGT. In general, the Bernoulli random test design with fixed probability p=\ln 2/K is used to extract the maximum amount of information from each test [17], where K denotes the expected number of defectives. As a result, half of the tests are positive and one bit of information is contained in each test [17]. However, it remains unstudied how much of the information from the tests is effectively used by the BP algorithm.
In this paper, we study the test design for BP to improve the error performance in NAGT, taking into account the effectiveness of the messages passed over the edges. The main contributions of this work are as follows:
We distinguish the tests in terms of their transmitted messages in BP. Based on the bipartite subgraphs corresponding to various classes of tests, we show that the positive tests containing multiple defectives are unlikely to provide effective messages to their associated items, especially in the low-noise region.
Instead of choosing the pool size to make the positive and negative tests approximately equal [15], [17], we only consider the tests with effective messages and propose a novel objective function to measure their effects over edges in the bipartite graph. Then, the optimized pool size in NAGT for BP is determined by the maximization of the objective function.
We evaluate the error performance and computational complexity of BP with various test designs. Simulations show that the bit error rate (BER) of BP with our optimized pool size is about one order of magnitude lower than that with conventional test designs [15], [17] under the binary symmetric channel (BSC) with low flipping probability. Moreover, the expected computational complexity of the former is less than that of the latter.
The outline of this paper is as follows. Section II introduces the model of group testing and the BP algorithm. Section III proposes pool size optimization for BP. In Section IV, we test the error performance of BP with optimized pool size. Last, conclusions are drawn in Section V.
In this section, we first describe the basics of group testing. Then, we review the BP algorithm that will be further analyzed in later sections.
A. Model of Group Testing
Consider the detection of a small number of defectives in N items, where each item is defective with rate \alpha. In other words, the defective items are independent and identically distributed with expected number K=N\alpha. The random binary vector \boldsymbol {X}=(x_{1},\ldots,x_{j},\ldots,x_{N})\in \{0,1\}^{N} denotes the status of the N items, where x_{j}=1 represents that the j-th item is defective and \Pr (x_{j}=1)=\alpha for 1 \leq j \leq N.
Let T be the number of tests to detect the defectives. For 1 \leq i \leq T and 1 \leq j \leq N, the i-th test is represented by \boldsymbol {V}_{i}=(v_{i1},\ldots,v_{ij},\ldots,v_{iN})\in \{0,1\}^{N}, where v_{ij}=1 if the j-th item is in the pool of that test; otherwise, v_{ij}=0. The noiseless output of the i-th test is denoted as z_{i}. If no defective item participates in that test, z_{i}=0; otherwise, z_{i}=1. Formally, we have \begin{equation*} {z_{i}}=\bigvee \limits _{j=1}^{N}{({v_{ij}}\wedge {{x_{j}}})},\tag{1}\end{equation*}
where the symbols \vee and \wedge stand for the Boolean sum and Boolean product, respectively. In practice, the output z_{i} may be corrupted by noise, and its noisy observation is y_{i}. In this paper, we consider the widely-adopted BSC noise model with flipping probability \epsilon < 0.5, denoted as BSC(\epsilon). Let n_{i} be the BSC noise; then we have \begin{equation*} {y_{i}}=z_{i} \oplus n_{i}= \Big (\bigvee \limits _{j=1}^{N}{({v_{ij}}\wedge {{x_{j}}})}\Big)\oplus n_{i},\tag{2}\end{equation*}
where the symbol \oplus denotes the XOR operation. Similarly, the noiseless outputs and their observations are denoted as \boldsymbol {Z}=(z_{1},\ldots,z_{T}) and \boldsymbol {Y}=(y_{1},\ldots,y_{T}), respectively. After observing \boldsymbol {Y}, the estimate \hat { \boldsymbol {X}} of \boldsymbol {X} is calculated to classify the N items. Three error performance metrics, namely BER, false negative rate (FNR) and false positive rate (FPR), are considered in the estimation of items, where \begin{align*} \text {FNR}=&\frac {\text {Num}(\hat {x}_{j}=0 |x_{j}=1)}{\text {Num}(x_{j}=1)}, \tag{3}\\ \text {FPR}=&\frac {\text {Num}(\hat {x}_{j}=1 |x_{j}=0)}{\text {Num}(x_{j}=0)}.\tag{4}\end{align*}
In the above equations, \text {Num}(\hat {x}_{j}=0 |x_{j}=1) is the number of defective items falsely classified as non-defective, and the other terms are defined similarly.
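The model (1)-(2) and the metrics (3)-(4) can be sketched in a few lines; this is an illustrative helper (the function names `noisy_outputs` and `fnr_fpr` are ours, not from the paper):

```python
import numpy as np

def noisy_outputs(V, x, eps, rng):
    """Noiseless outputs z_i = OR_j (v_ij AND x_j) as in (1),
    then flip each through BSC(eps) as in (2)."""
    z = (V @ x > 0).astype(int)                    # Boolean sum of Boolean products
    n = (rng.random(len(z)) < eps).astype(int)     # BSC noise
    return z ^ n

def fnr_fpr(x, x_hat):
    """FNR and FPR as in (3)-(4); guards against empty classes."""
    fn = np.sum((x == 1) & (x_hat == 0)) / max(np.sum(x == 1), 1)
    fp = np.sum((x == 0) & (x_hat == 1)) / max(np.sum(x == 0), 1)
    return fn, fp
```

For example, with V pooling items {1,2} and {2,3} and only item 1 defective, the noiseless outputs are (1, 0).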
The pool of T tests is denoted as a pooling matrix \mathbf {V}, where \mathbf {V}=[\boldsymbol {V}_{1};\ldots; \boldsymbol {V}_{i};\ldots; \boldsymbol {V}_{T}]\in \{0,1\}^{T \times N}. For 1 \leq i \leq T and 1 \leq j \leq N, the row weight \rho _{i} \in [1,N] and the column weight \gamma _{j} \in [1,T] of \mathbf {V} denote the pool size of the i-th test and the number of tests in which the j-th item participates, respectively. If \mathbf {V} has constant row weight \rho and constant column weight \gamma, it is called (\rho,\gamma)-regular.
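A \rho-regular pooling matrix with near-constant column weights can be sampled in several ways; the sketch below is a minimal random construction of our own (it only enforces regular rows, so column weights are near-constant on average), not the paper's construction:

```python
import numpy as np

def row_regular_design(N, T, rho, rng):
    """Each of the T tests pools rho distinct items chosen uniformly at
    random, giving constant row weight rho and mean column weight T*rho/N."""
    V = np.zeros((T, N), dtype=int)
    for i in range(T):
        V[i, rng.choice(N, size=rho, replace=False)] = 1
    return V
```

Stronger column-regularity (or girth properties) would require a more careful edge-assignment procedure.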
B. The BP Algorithm
From the viewpoint of graphical representation, the matrix \mathbf {V} can be described by a bipartite graph [18]. In analogy with the Tanner graphs [19] of low-density parity-check (LDPC) codes [20], we use variable nodes (VNs) and check nodes (CNs) to represent items and tests, respectively. As shown in Fig. 1, only nodes of different types can be connected by edges. Given this representation, it is convenient to estimate \boldsymbol {X} by the BP algorithm, which has been widely used for decoding LDPC codes [21]. The main difference in the message update comes from the specific operations used to calculate the outputs of group testing versus the check sums of LDPC codes. In this part, we briefly review the BP algorithm under the BSC [4], [15].
The BP algorithm iteratively estimates the posterior marginals of the items in a graph-based model. As shown in Fig. 1, the probabilistic messages passed from the VNs and CNs in the k-th iteration are denoted as {\xi }^{(k)}_{j,i}(x_{j}) and {\delta }^{(k)}_{i,j}(x_{j}), respectively. These messages are probability distributions on \{0,1\} [15], i.e., {\xi }^{(k)}_{j,i}(x_{j}) \in [{0,1}] and {\xi }^{(k)}_{j,i}(0)+ {\xi }^{(k)}_{j,i}(1) =1, and similarly for {\delta }^{(k)}_{i,j}(x_{j}). Define the index sets M_{j}=\{i:1\leq i\leq T, v_{i,j}\neq 0\} for 1\leq j\leq N and N_{i}=\{j:1\leq j\leq N, v_{i,j}\neq 0\} for 1\leq i\leq T, respectively. For k=0 with arbitrary i and j, initialize {\xi }^{(0)}_{j,i}(1) =\alpha and {\xi }^{(0)}_{j,i}(0)=1-\alpha. For k \geq 1, {\delta }^{(k)}_{i,j}(x_{j}) is first calculated in the test-to-item message update as follows. If y_{i}=0, then \begin{align*} \begin{cases} \delta _{i,j}^{(k)}(0)=\lambda \epsilon +\lambda (1-2\epsilon) \prod \limits _{j'\in {N_{i}}\backslash j}{\xi _{j',i}^{(k-1)}(0)}, \\ \delta _{i,j}^{(k)}(1) =\lambda \epsilon; \end{cases}\tag{5}\end{align*}
otherwise, \begin{align*} \begin{cases} \delta _{i,j}^{(k)}(0)=\lambda (1-\epsilon)-\lambda (1-2\epsilon)\prod \limits _{j'\in {N_{i}}\backslash j}{\xi _{j',i}^{(k-1)}(0)}, \\ \delta _{i,j}^{(k)}(1) =\lambda (1-\epsilon) \, \end{cases}\tag{6}\end{align*}
where the symbol \lambda is a positive normalizing factor. Then, {\xi }^{(k)}_{j,i}(x_{j}) is calculated in the item-to-test message update by \begin{equation*} {\xi }^{(k)}_{j,i}(b)= \lambda \alpha ^{b}(1-\alpha)^{1-b} \prod \limits _{i'\in {M_{j}}\backslash i}{\delta _{i',j}^{(k)}(b)}, ~~b\in \{0,1\}.\tag{7}\end{equation*}
Finally, the estimate of x_{j} in the k-th iteration is given by \begin{equation*} {{\hat {x}}^{(k)}_{j}}\leftarrow \arg \underset {b\in \{0,1\}}{\max }\,\left[{\alpha ^{b}(1-\alpha)^{1-b}\prod \limits _{i\in {M_{j}}}{\delta _{i,j}^{(k)}(b)}}\right].\tag{8}\end{equation*}
In the iterative process, different update schedules can be used. Compared to the flooding BP algorithm, which performs message updates in parallel, the random scheduling BP (RSBP) algorithm updates messages in a randomized order and improves the error performance [15], [16].
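The update rules (5)-(8) can be sketched as a small flooding-schedule decoder (RSBP differs only in updating one randomly chosen CN per iteration). This is a minimal sketch of our own, not the authors' implementation: the function name `bp_nagt`, the exclusion-by-division trick, and the clipping constants are our choices, and the normalization \lambda is applied implicitly.

```python
import numpy as np

def bp_nagt(V, y, alpha, eps, n_iter=30):
    """Flooding BP for noisy NAGT under BSC(eps), following (5)-(8).
    V: T x N 0/1 pooling matrix, y: length-T observed outputs."""
    V = V.astype(bool)
    # xi0[i, j] holds xi_{j,i}(0); non-edges carry a neutral 1.0
    xi0 = np.where(V, 1.0 - alpha, 1.0)
    for _ in range(n_iter):
        # --- test-to-item update, (5)-(6); exclude item j by division ---
        prod_all = xi0.prod(axis=1, keepdims=True)       # prod over N_i
        prod_ex = prod_all / np.clip(xi0, 1e-12, None)   # leave-one-out
        d0 = np.where(y[:, None] == 0,
                      eps + (1 - 2 * eps) * prod_ex,
                      (1 - eps) - (1 - 2 * eps) * prod_ex)
        d1 = np.where(y[:, None] == 0, eps, 1.0 - eps)
        d0 = np.where(V, d0, 1.0)                        # neutral off-edge
        d1 = np.where(V, d1, 1.0)
        # --- item-to-test update, (7), and posteriors for (8) ---
        p0 = (1 - alpha) * d0.prod(axis=0)               # prod over M_j
        p1 = alpha * d1.prod(axis=0)
        q0 = p0[None, :] / np.clip(d0, 1e-12, None)      # exclude test i
        q1 = p1[None, :] / np.clip(d1, 1e-12, None)
        xi0 = np.where(V, q0 / np.clip(q0 + q1, 1e-300, None), 1.0)
    return (p1 > p0).astype(int)                         # decision rule (8)
```

On a small noiseless instance with one defective item, a few iterations suffice for the negative tests to clear the non-defective items and the positive tests to flag the defective one.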
SECTION III.
Pool Size Optimization for Belief Propagation
In this section, we investigate the effects of tests in BP by considering their transmitted messages in the associated subgraphs. It is shown that positive tests containing multiple defectives can hardly transmit effective messages, especially in the low-noise region and in later iterations of BP. Therefore, we focus on the other tests, which do carry effective messages, and propose an objective function to measure their effects over the edges. Pool size optimization maximizes this objective function and hence facilitates the transmission of effective messages in BP decoding.
A. Classes of Tests
There are two types of tests in our model: those with and those without corrupted outputs. First, we show that tests with corrupted outputs usually have negative effects on their associated items in BP decoding. Specifically, if x_{j}=1 and its associated positive test is corrupted with y_{i}=0, we have \delta _{i,j}^{(k)}(0) \geq \delta _{i,j}^{(k)}(1) based on (5) and \epsilon < 0.5. Consequently, this corrupted test provides either misleading information or no information to the j-th item, depending on whether “>” or “=” holds in the above inequality. Similarly, if x_{j}=0 and its associated negative test is corrupted with y_{i}=1, we have \delta _{i,j}^{(k)}(0) \leq \delta _{i,j}^{(k)}(1) based on (6), and a consistent conclusion can be drawn. If x_{j}=0 and its associated positive test is corrupted with y_{i}=0, then although no misleading information is transmitted from the i-th test to the j-th item, the defective items associated with the i-th test are misled by its messages.
Thus, we focus on the tests with correct outputs, which can be further divided into three classes: (i) containing multiple defective items with y_{i}=1; (ii) containing only one defective item with y_{i}=1; (iii) containing no defective item with y_{i}=0. The corresponding subgraphs of the three classes of tests are depicted in Fig. 2. The number “0” or “1” marked on each node is its noiseless true value. The message update for each class of tests is analyzed below.
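The taxonomy can be made concrete with a small helper of our own (for illustration only) that labels each test from the ground truth \boldsymbol {X} and the observation \boldsymbol {Y}:

```python
import numpy as np

def classify_tests(V, x, y):
    """Label each test 'corrupted' if y_i != z_i; otherwise assign
    class (i)/(ii)/(iii) by the number of defectives it pools."""
    defectives = V @ x                    # defectives per test
    z = (defectives > 0).astype(int)      # noiseless outputs, (1)
    labels = []
    for d, zi, yi in zip(defectives, z, y):
        if yi != zi:
            labels.append('corrupted')
        elif d >= 2:
            labels.append('i')
        elif d == 1:
            labels.append('ii')
        else:
            labels.append('iii')
    return labels
```

For instance, a test pooling two defectives with y_{i}=1 is class (i), one pooling a single defective with y_{i}=1 is class (ii), and a defective-free test whose output was flipped to 1 is corrupted.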
Without loss of generality, suppose that after a certain number of BP iterations in a \rho-regular graph there still exist a few decoding errors, and the BP algorithm proceeds to the k-th iteration:
Suppose the i-th test belongs to class (i), as shown in Fig. 2-(i). First, it provides no effective messages to the associated non-defective items, since \delta _{i,j}^{(k)}(0) \leq \delta _{i,j}^{(k)}(1) based on (6). Second, as long as one of its associated defective items, with index j_{1}, is correctly decoded as {{\hat {x}}^{(k-1)}_{j_{1}}}=1 in the (k-1)-th iteration, it is likely that \xi _{{j_{1}},i}^{(k-1)}(0)\rightarrow 0, especially in the low-noise region. As a result, \delta _{i,j}^{(k)}(0) \approx \delta _{i,j}^{(k)}(1) for each item {j}\in {N_{i}}\backslash {j_{1}} based on (6). In this case, the i-th test transmits almost no effective messages to the rest of the defective items, including those with decoding errors. Third, as long as one of its associated non-defective items, with index j_{2}, is incorrectly decoded as {{\hat {x}}^{(k-1)}_{j_{2}}}=1 with \xi _{{j_{2}},i}^{(k-1)}(0)\rightarrow 0, we have \delta _{i,j}^{(k)}(0) \approx \delta _{i,j}^{(k)}(1) for each item {j}\in {N_{i}}\backslash {j_{2}} based on (6). In this case, the i-th test does not help correct the decoding errors of the j_{2}-th item or of the other defective items. Fourth, if no item is decoded as {{\hat {x}}^{(k-1)}_{j}}=1, the restriction y_{i}=1 will lead at least one item to be decoded as 1 in a later iteration. Therefore, this case reduces to the second or third case above.
Suppose the i-th test belongs to class (ii), as shown in Fig. 2-(ii). Since \delta _{i,j}^{(k)}(0) \leq \delta _{i,j}^{(k)}(1) for each {j}\in {N_{i}} based on (6), the i-th test can only transmit an effective message to the associated defective item. Specifically, if its associated non-defective items with indices \{{j'}:{j'}\in {N_{i}}\backslash j\} are correctly decoded as 0's in the (k-1)-th iteration, each \xi _{j',i}^{(k-1)}(0) is far from 0, since \delta _{i,j'}^{(k-1)}(0) \leq \delta _{i,j'}^{(k-1)}(1) and \delta _{i,j'}^{(k-1)}(0)\cdot \xi _{j',i}^{(k-1)}(0)> \delta _{i,j'}^{(k-1)}(1) \cdot \xi _{j',i}^{(k-1)}(1) based on (7) and (8). Thus, the i-th test transmits \delta _{i,j}^{(k)}(0) < \delta _{i,j}^{(k)}(1) to the only defective item with index {j}\in {N_{i}}. In particular, in the low-noise region where \xi _{j',i}^{(k-1)}(0) \rightarrow 1, the difference between \delta _{i,j}^{(k)}(1) and \delta _{i,j}^{(k)}(0) tends to \lambda (1-2\epsilon).
Suppose the i-th test belongs to class (iii), as shown in Fig. 2-(iii). Since \delta _{i,j}^{(k)}(0) \geq \delta _{i,j}^{(k)}(1) for each {j}\in {N_{i}} based on (5), the i-th test can provide effective messages to all of its \rho associated non-defective items. In particular, in the low-noise region where \xi _{j,i}^{(k-1)}(0) \rightarrow 1 for each {j}\in {N_{i}}, the difference between \delta _{i,j}^{(k)}(0) and \delta _{i,j}^{(k)}(1) tends to \lambda (1-2\epsilon).
In conclusion, the above analysis shows that tests of class (i) can hardly transmit effective messages to their associated items. In contrast, each test of class (ii) transmits an effective message only to its associated defective item, while each test of class (iii) can transmit effective messages to all \rho associated items. The occurrence probabilities of the three classes of tests are denoted as P_{1}, P_{2} and P_{3}, respectively, where P_{1}=1-\epsilon -P_{2}-P_{3}, \begin{align*} P_{2}=&(1-\epsilon)\rho \alpha (1-\alpha)^{(\rho -1)}, \tag{9}\\ P_{3}=&(1-\epsilon)(1-\alpha)^{\rho }.\tag{10}\end{align*}
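Equations (9)-(10), together with P_{1}=1-\epsilon -P_{2}-P_{3}, can be checked numerically: the three probabilities sum to 1-\epsilon, the probability that a test is uncorrupted. A small sketch (the function name is ours):

```python
def class_probs(alpha, rho, eps):
    """Occurrence probabilities of the three classes of uncorrupted
    tests, per (9)-(10) and P_1 = 1 - eps - P_2 - P_3."""
    p2 = (1 - eps) * rho * alpha * (1 - alpha) ** (rho - 1)   # one defective
    p3 = (1 - eps) * (1 - alpha) ** rho                        # no defective
    p1 = (1 - eps) - p2 - p3                                   # >= 2 defectives
    return p1, p2, p3
```

For small \alpha and moderate \rho, negative tests of class (iii) dominate, while class (i) is rare.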
B. Pool Size Optimization Via Objective Function
In conventional test designs, the parameters of the pooling matrix are usually chosen to maximize the entropy of each test. For instance, in the Bernoulli design [17], each item is included in each test independently at random with probability p=\ln 2/K. As a result, the occurrence probability of positive tests, denoted as P_{\text {pos}}, tends to 0.5 in the asymptotic regime N\rightarrow \infty. Then, the binary entropy function is used to measure the entropy of each test as follows [17]:\begin{align*} H(P_{\text {pos}})= -P_{\text {pos}}\log _{2}P_{\text {pos}}-(1-P_{\text {pos}})\log _{2}(1-P_{\text {pos}}). \\{}\tag{11}\end{align*}
Similarly, a randomly chosen (\rho,\gamma)-regular test design [15] sets \rho =\ln 2/\alpha and \gamma =T\rho /N, so that nearly half of the tests are positive and about one bit of information is contained in each test.
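The entropy argument can be verified numerically: with p=\ln 2/K, the probability that a test is positive approaches 0.5 and its entropy per (11) approaches one bit. A sketch under the stated Bernoulli model (variable names are ours):

```python
import math

# Bernoulli design [17]: each item enters a test w.p. p = ln 2 / K, K = N*alpha.
N, alpha = 1000, 0.01
p = math.log(2) / (N * alpha)
# A test is negative iff it contains no included defective item; each item is
# an included defective w.p. alpha*p, independently across the N items.
P_pos = 1 - (1 - alpha * p) ** N
# Binary entropy of the test output, (11)
H = -P_pos * math.log2(P_pos) - (1 - P_pos) * math.log2(1 - P_pos)
```

Since (1 - \ln 2/N)^{N} \to e^{-\ln 2} = 1/2, P_{\text {pos}} \to 0.5 and H \to 1 bit.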
According to the analysis in the previous subsection, we design a \rho-regular pooling matrix \mathbf {V} for BP by considering only the positive tests of class (ii) and the negative tests of class (iii). Similar to (11), a metric denoted as J_{\text {m}} is first defined to measure the information contained in each test of classes (ii) and (iii) as follows:\begin{equation*} J_{\text {m}}= -P_{2}'\log _{2}P_{2}'-(1-P_{2}')\log _{2}(1-P_{2}'),\end{equation*}
where P_{2}' is a normalization of the probability distribution:\begin{equation*} P_{2}'=\frac {P_{2}}{P_{2}+P_{3}}=\frac {\rho \alpha }{1+(\rho -1)\alpha }.\end{equation*}
Based on the previous analysis of BP, each positive test of class (ii) can provide an effective message to only one defective item, while each negative test of class (iii) can provide effective messages to \rho non-defective items. Thus, we regard the maximum amount of information provided by these tests in the test-to-item message update as \begin{equation*} J_{\text {sum}}= TP_{2}\cdot J_{\text {m}}+TP_{3}\cdot \rho J_{\text {m}}.\end{equation*}
After that, we define the objective function J_{\text {o}} as follows:\begin{equation*} J_{\text {o}}= \frac {J_{\text {sum}}}{T\rho }=(1-\epsilon)(1-\alpha)^{(\rho -1)}J_{\text {m}},\tag{12}\end{equation*}
which is related to the average of the maximum amount of information, per edge, provided by the tests of classes (ii) and (iii). Given the parameters \epsilon and \alpha, J_{\text {o}} is a function of \rho. We believe that J_{\text {o}} reflects the effectiveness of the transmitted messages in the test-to-item message update, at least in the low-noise region and in later iterations of BP. Thus, the optimized pool size is set as \begin{equation*} \rho _{\text {opt}}=\arg \underset {\rho \in \mathbb {N}_{+}}{\max } (J_{\text {o}}),\tag{13}\end{equation*}
where \mathbb {N}_{+} denotes the set of positive integers. Accordingly, the column weights of \mathbf {V} are constant or near-constant with expected value \overline {\gamma }=T\rho _{\text {opt}}/N. Simulation results in the next section show that maximizing the objective function benefits BP decoding, especially in the low-noise region.
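The maximization (13) is a one-dimensional search over integers and can be sketched directly from (12); the function name and search bound below are our own choices. Since \epsilon only scales J_{\text {o}}, it does not change the argmax, so \epsilon =0 suffices:

```python
import math

def rho_opt(alpha, eps=0.0, rho_max=2000):
    """Maximize J_o in (12) over positive integer pool sizes, cf. (13)."""
    def J_o(rho):
        p2n = rho * alpha / (1 + (rho - 1) * alpha)   # normalized P_2
        h = -p2n * math.log2(p2n) - (1 - p2n) * math.log2(1 - p2n)
        return (1 - eps) * (1 - alpha) ** (rho - 1) * h
    return max(range(1, rho_max + 1), key=J_o)
```

As a sanity check, this search reproduces the optimized pool sizes reported in Settings 1 and 2 of the next section (11 for \alpha =0.03 and 35 for \alpha =0.01).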
SECTION IV.
Simulation and Comparison
In this section, we compare the error performance and computational complexity of the RSBP algorithm [16] with different test designs. Given the parameters N, T, \alpha and \rho, the pooling matrix \mathbf {V} is generated either by the Bernoulli design [17] with fixed probability p=\ln 2/(N\alpha) or by the \rho-regular test design. For the Bernoulli design, a specific randomly constructed \mathbf {V} is generated in each of the settings below. For the \rho-regular test design, the progressive-edge-growth algorithm proposed for LDPC codes [22] is used to generate \mathbf {V}. The maximum iteration number of the RSBP algorithm is set to 10 times T, so that each message is likely to be updated several times. In addition, each simulation result is averaged over at least 3000 experiments, among which at least 100 experiments have decoding errors.
Setting 1: Given \alpha =0.03, N=100 and T=50 for the RSBP algorithm. Based on (12) and (13), the flipping probability \epsilon is merely a scaling factor and has no influence on the value of \rho _{\text {opt}}. For simplicity, the relation between J_{\text {o}} and \rho given \epsilon =0 is shown by the magenta line in Fig. 4. As a result, the optimized pool size via the objective function is \rho _{\text {opt}}=11. The pool size \rho _{0}=\lceil \ln 2/\alpha \rceil = 24, for which about half of the tests are positive and the entropy of each test is maximized [15], and two other pool sizes, 5 and 18, are investigated for comparison. The RSBP algorithm with these pool sizes is tested under the BSC, as shown in Fig. 4.
Setting 2: Given \alpha =0.01, N=1000 and T=200 for the RSBP algorithm. The relation between J_{\text {o}} and \rho given \epsilon =0 is shown by the black line in Fig. 4. Based on (13), we have \rho _{\text {opt}}=35. The pool size \rho _{0}=\lceil \ln 2/\alpha \rceil = 70 and two other smaller pool sizes, 10 and 20, are investigated for comparison. The RSBP algorithm with these pool sizes is tested under the BSC, as shown in Fig. 4.
Setting 3: Given \alpha =0.005, N=1000 and T=200 for the RSBP algorithm. The relation between J_{\text {o}} and \rho given \epsilon =0 is shown by the red line in Fig. 4. Based on (13), we have \rho _{\text {opt}}=70. The pool size \rho _{0}=\lceil \ln 2/\alpha \rceil =139 and two other smaller pool sizes, 15 and 30, are investigated for comparison. The RSBP algorithm with these pool sizes is tested under the BSC, as shown in Fig. 4.
Setting 4: Given \alpha =0.001, N=3000 and T=100 for the RSBP algorithm. The relation between J_{\text {o}} and \rho given \epsilon =0 is shown by the blue line in Fig. 4. Based on (13), we have \rho _{\text {opt}}=351. The pool size \rho _{0}=\lceil \ln 2/\alpha \rceil =694 and two other smaller pool sizes, 100 and 150, are investigated for comparison. The RSBP algorithm with these pool sizes is tested under the BSC, as shown in Fig. 4.
As shown in Figs. 4 to 7, given different values of \epsilon and \alpha, the RSBP algorithm with our optimized pool size performs best in terms of BER, FNR and FPR in the low-\epsilon region. For instance, its BER is about one order of magnitude lower than that of the RSBP algorithm with \rho _{0} or the Bernoulli design when (\epsilon,\alpha) \in \big \{(0.001,0.03), (0.001,0.01),(0.01,0.005),(0.01,0.001)\big \}, and about one to two orders of magnitude lower than those of the RSBP algorithm with smaller pool sizes.
Now we discuss the computational complexity of the message update in the RSBP algorithm. Suppose \mathbf {V} has constant row weight \rho and constant column weight \gamma. In each iteration (k\geq 1) of the RSBP algorithm, one CN with index i is randomly chosen to compute the messages {\delta }^{(k)}_{i,j}(x_{j})'s for {j}\in {N_{i}}, after which all the messages {\xi }^{(k)}_{j,i_{1}}(x_{j})'s for i_{1} \in M_{j} need to be computed. The messages on the rest of the edges are unchanged and need no update. First, consider the {\delta }^{(k)}_{i,j}(x_{j})'s based on (5) and (6). The calculation of each {\delta }^{(k)}_{i,j}(1) is omitted here for its simplicity. To compute the {\delta }^{(k)}_{i,j}(0)'s, \rho +1 real multiplications are required for (1-2\epsilon) \prod \limits _{j'\in {N_{i}}}{\xi _{j',i}^{(k-1)}(0)}, followed by one real multiplication and one real addition to obtain each {\delta }^{(k)}_{i,j}(0). Thus, 2\rho +1 real multiplications and \rho real additions are required for all the {\delta }^{(k)}_{i,j}(x_{j})'s. Then, consider the {\xi }^{(k)}_{j,i_{1}}(x_{j})'s based on (7). To compute the {\xi }^{(k)}_{j,i_{1}}(0)'s or {\xi }^{(k)}_{j,i_{1}}(1)'s, \gamma +1 real multiplications are required for (1-\alpha)\prod \limits _{i'\in {M_{j}}}{\delta _{i',j}^{(k)}(0)} or \alpha \prod \limits _{i'\in {M_{j}}}{\delta _{i',j}^{(k)}(1) }, followed by one more real multiplication to obtain each {\xi }^{(k)}_{j,i_{1}}(0) or {\xi }^{(k)}_{j,i_{1}}(1). Thus, for all the \rho VNs associated with the chosen CN, 2\rho (2\gamma +1) real multiplications are required to compute the {\xi }^{(k)}_{j,i_{1}}(x_{j})'s.
To summarize, 4\rho (\gamma +1)+1 real multiplications and \rho real additions are required for the message update in each iteration of the RSBP algorithm. Based on this result, we provide a complexity comparison between the various test designs. For the Bernoulli design, the expectations of the row weights and the column weights are \overline {\rho }=\ln 2/\alpha and \overline {\gamma }=T\ln 2/(N\alpha), respectively. For the \rho-regular test design, \overline {\rho }=\rho and \overline {\gamma }=T\rho /N. Using Setting 2 as an example, the expected computational complexity per iteration of the RSBP algorithm with the various test designs is listed in Table I. It can be seen that the expected computational complexity increases quadratically as \overline {\rho } increases. For the RSBP algorithm with \rho _{\text {opt}}, the computational complexity per iteration is about 3.6 times lower than that with \rho _{0} or the Bernoulli design.
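The per-iteration counts can be evaluated directly from the formula above; this is a small sketch (function name ours), using the Setting 2 parameters T=200 and N=1000, for which \rho _{\text {opt}}=35 gives \gamma =7 and \rho _{0}=70 gives \gamma =14:

```python
def per_iteration_ops(rho, gamma):
    """Real multiplications and additions per RSBP iteration for a
    (rho, gamma)-regular design, as derived above."""
    mults = 4 * rho * (gamma + 1) + 1
    adds = rho
    return mults, adds

# Setting 2: rho_opt = 35 (gamma = 7) vs. rho_0 = 70 (gamma = 14)
opt_mults, _ = per_iteration_ops(35, 7)     # 1121 multiplications
conv_mults, _ = per_iteration_ops(70, 14)   # 4201 multiplications
```

The resulting multiplication-count ratio, 4201/1121, is close to 3.7, consistent with the roughly 3.6-fold gap reported relative to the designs in Table I (the exact table entries are not reproduced here).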
This paper presents an empirical study on optimizing the pool size in NAGT for BP. Our analysis characterizes the effectiveness of the messages passed from tests in BP and leads to the proposed objective function for pool size optimization. Simulation results show that, in the low-noise region, the RSBP algorithm with our optimized pool size achieves lower error rates than with the pool sizes chosen to maximize the entropy of each test.