Introduction
In future, WSNs are expected to be integrated into the Internet of Things [1], where reconfigurable, flexible, and intelligent sensors join the Internet dynamically, and use it to collaborate and accomplish their tasks for a wide range of applications various domains [2]–[16], big data applications, Internet of Things, E-commerce, multimedia medical device, virtual reality & augmented reality, and environment monitoring, etc. Multimedia applications of WSNs require substantially low power consumption, higher bandwidth and more available spectrum [17]–[23]. However, today’s radio spectrum is very crowded for rapid increasing popularities of various wireless applications. Therefore, cognitive radio technology stands as a key and spectrum-efficient communication approach for resource-constrained wireless sensor networks and future wireless network [24]–[26]. When cognitive users, namely, sensor nodes in wireless sensor networks access the spectrum competitively, in order to make effective usage of the spectrum and meet the throughput demand for multimedia applications, effective mechanisms are required to coordinate cognitive users’ behaviors (transmission power control, spectrum access, et al.) [27]–[35]. At the same time, the wireless environment is gradually to be more complicated, which makes it more difficult for cognitive users to obtain the complete network environment parameters information in the wireless network system. Also, the upcoming 5G networking architectures with a super high spectrum utilization and ultra-low power consumption, tends to support massive devices with limited bandwidth and minimum content delay, which would be a big challenge for the current crowded spectrum. Hence, distributed channel access under unknown environment information has been a hot research topic in cognitive radio based wireless sensor network.
Currently, many of the existing literatures (see [36]–[41]) have studied the problem of distributed spectrum access under unknown environment statistical information. Among these literatures, [36] and [37] have researched on the Upper Confidence Bound (UCB) index algorithm with single cognitive user based on multi-armed bandit model, while [38]–[41] focused on the multi-user situation, which transform the unknown system model into Non-Bayesian multi-armed bandit model. Considering the single cognitive user situation is too simple, the current trend is to study multi-armed bandit model with multi-user and multi-channel. Meanwhile, all cognitive users in the system do not know the channel environment statistical information. There will be collisions when more than one cognitive user accesses the same channel simultaneously. To address the problem, [42] proposes an adaptive random access scheme to reduce the collision among cognitive users. The priority access scheme and fair access scheme are also adopted in literature [43] and [44], which could effectively avoid the collision among cognitive users. Based on the UCB1 index algorithm, the literature [45] puts forward the d-UCB4 algorithm, which is suitable for the multi-user scenarios. Literature [46] proposes a distributed online learning access scheme called DSLA, which can be widely applied to distribution learning system and its channels have a variety of channel available probability.
Most of the current literatures, such as [47] and [48] are limited that one cognitive user can only select one channel each time. In this situation, when the selected channel is detected to be in a busy state, the data will not be transmitted and it has to wait for the next slot. However, there may be other channels that are not occupied presently, which will give rise to a waste of spectrum resources. For this problem, we propose a scheme which is based on channel grouping. In each time slot, cognitive users can sense all of the channels in the group one by one, until finding an idle channel or all of the channels are perceived once. Here, these channels are divided into several groups according to the water-filling principle based on the modified UCB-tuned index. On the other hand, many literatures don’t take the fairness among the cognitive users into consideration. In this paper, we introduce the fairness access scheme based on the channel grouping method. Finally, the experimental results show that the proposed scheme can get a logarithmic regret with respect to time slot, and increase the selected number of the channels that has a small idle probability, so as to ensure the fairness among the cognitive users.
This paper is organized as follows. Section II introduces the system model. Then, section III introduces the proposed scheme of this paper, namely, the distributed learning algorithm, and the principle of channels grouping and the access principle. Section IV gives the scheme simulation and analysis. At last, we summarize the paper and point out the future research.
System Model
As shown in Fig. 1, a CR-WSN coexisting with a licensed system is considered. The time is slotted and the
We denote time slot as
In the distributed CR-WSN, each cognitive user can select one opportunity channel from the \begin{equation} \Re ^{\pi }( {\Theta ;n} )=n\sum \limits _{i\in \mathrm {O}_{M}^{\ast }} {\mu _{i} -E^{\pi }} \left [{ {\sum \limits _{t=1}^{n} {S_{\pi \left ({ t }\right )} ( t )}} }\right ] \end{equation}
Where, \begin{equation} S_{\pi \left ({ t }\right )} \left ({ t }\right )=\sum \limits _{m=1}^{M} {\sum \limits _{i=1}^{N} {X_{i} ( t )\times \Omega _{m,i} ( t )}} \end{equation}
In formula (2), \begin{equation} \Re ^{\pi }\left ({ {\Theta ;n} }\right )=n\sum \limits _{i\in {{ O}}_{M}^{\ast }} {\mu _{i} -} \sum \limits _{m=1}^{M} {\sum \limits _{i=1}^{N} {\mu _{i} \times E\left [{ {V_{_{m,i}}^{\pi } \left ({ t }\right )} }\right ]}} \end{equation}
The Principle of the Proposed Scheme
A. Distributed Learning Algorithm Based on Modified UCB-Tuned Index
Due to the channel environment statistical information is completely unknown to cognitive users, we need to predict the channel information by learning algorithm. The proposed distributed learning algorithm in this paper is briefly called UCBT-K that modified based on the tuned upper confidence bound (UCBT) index, which introduces a variance factor in the index and it is generalized form of the modified UCBT. The UCBT-K can select arbitrarily channel with the K-th [49] largest index value. The process of UCBT-K on
Initialization: Cognitive user
senses all the channelsone by one. At time slot$m$ , the instantaneous throughput of channel$t$ is$i$ . The sensed number of times of channel$\hat {\mu }_{m,i} \left ({ t }\right )=X_{i} \left ({ t }\right )$ is$i$ .$T_{i} \left ({ t }\right )=1$ Channel Estimation: Calculate the improved UCB-tuned index for each channel according to the mathematical formula (4):
\begin{align}&\hspace {-0.6pc} \hat {\mu }_{m,i} \left ({ {t-1} }\right )+\sqrt {\frac {\log \left ({ t }\right )}{T_{i} \left ({ {t-1} }\right )}} \notag \\[-2pt]&\times \sqrt {\min \left ({ {\frac {1}{4},\delta _{i}^{2}\left ({ {t-1} }\right )+\sqrt {2\log n/T_{i} \left ({ {t-1} }\right )}} }\right )}\qquad \end{align} View Source\begin{align}&\hspace {-0.6pc} \hat {\mu }_{m,i} \left ({ {t-1} }\right )+\sqrt {\frac {\log \left ({ t }\right )}{T_{i} \left ({ {t-1} }\right )}} \notag \\[-2pt]&\times \sqrt {\min \left ({ {\frac {1}{4},\delta _{i}^{2}\left ({ {t-1} }\right )+\sqrt {2\log n/T_{i} \left ({ {t-1} }\right )}} }\right )}\qquad \end{align}
Where,
is the reward variance of the channel$\delta _{i}^{2}\left ({ t }\right )$ after t time slots. Then, we construct a channel set$i$ that contains the K largest index values.$o_{K} $ Channel Selection: Select a channel
from$k$ based on the following mathematical formula (5):$o_{K}$ \begin{align} k=&\arg \min \limits _{i\in o_{K}} \hat {\mu }_{m,i} \left ({ {t-1} }\right )+\sqrt {\frac {\log \left ({ t }\right )}{T_{i} \left ({ {t-1} }\right )}} \notag \\[-2pt]&\times \,\sqrt {\min \left ({ {\frac {1}{4},\delta _{i}^{2}\left ({ {t-1} }\right )+\sqrt {2\log \left ({ t }\right )/T_{i} \left ({ {t-1} }\right )}} }\right )}\notag \\[-2pt] {}\end{align} View Source\begin{align} k=&\arg \min \limits _{i\in o_{K}} \hat {\mu }_{m,i} \left ({ {t-1} }\right )+\sqrt {\frac {\log \left ({ t }\right )}{T_{i} \left ({ {t-1} }\right )}} \notag \\[-2pt]&\times \,\sqrt {\min \left ({ {\frac {1}{4},\delta _{i}^{2}\left ({ {t-1} }\right )+\sqrt {2\log \left ({ t }\right )/T_{i} \left ({ {t-1} }\right )}} }\right )}\notag \\[-2pt] {}\end{align}
Update Information: if the selected channel is idle, update the average reward
of cognitive user$\hat {\mu }_{m,k} \left ({ t }\right )$ .$m$ \begin{equation} \hat {\mu }_{m,k} \left ({ t }\right )=\frac {\hat {\mu }_{m,k} \left ({ {t-1} }\right )\times T_{k} \left ({ {t-1} }\right )+X_{k} \left ({ t }\right )}{T_{k} \left ({ {t-1} }\right )+1} \end{equation} View Source\begin{equation} \hat {\mu }_{m,k} \left ({ t }\right )=\frac {\hat {\mu }_{m,k} \left ({ {t-1} }\right )\times T_{k} \left ({ {t-1} }\right )+X_{k} \left ({ t }\right )}{T_{k} \left ({ {t-1} }\right )+1} \end{equation}
And
, otherwise, there are$T_{i} \left ({ t }\right )=T_{i} \left ({ {t-1} }\right )+1$ and$\hat {\mu _{k}^{m}} \left ({ t }\right )=\hat {\mu _{k}^{m}} \left ({ {t-1} }\right )$ .$T_{i} \left ({ t }\right )=T_{i} \left ({ {t-1} }\right )$
B. The Principle of Channels Grouping
There are
Firstly, we let the channel with the largest index value in the set
C. The Access of Channel-Groups With Fairness
Since there are multiple cognitive users in the system, it is necessary to find a reasonable channel access scheme to avoid collisions among cognitive users. In literature [42], the priority access scheme is adopted to avoid collisions. First, it assigns a rank of priority access for each cognitive user. Then each cognitive user chooses the corresponding channel according to its rank. For example, for cognitive user 1 its priority is the first, so it can choose the channel with largest index value every time. In term of the cognitive user 2, its priority is the second, it will always select the one with second largest index value. And the other cognitive users select in the same way. As a result, each cognitive user has a corresponding channel. Although this scheme can effectively avoid the collision among cognitive users, it does not reflect the fairness among cognitive users. Therefore, to solve the problem of unfairness, we propose an access scheme which is based on channel grouping, as shown in Fig. 4.
As mentioned above, each cognitive user has a corresponding channel group. For arbitrary cognitive user \begin{equation} G_{j} =\left ({ {\left ({ {m+t} }\right )\bmod M} }\right )+1 \end{equation}
The algorithm implementation of the proposed scheme can be shown in Fig. 5, which includes three parts: distributed learning, channel grouping and fair access scheme.
Lemma 1:
The existence of regret value, assume that the parameter \begin{align} \Pr \left \{{ {\hat {\mu }_{j\left ({ t }\right ),n_{j\left ({ t }\right )}} \leq \mu _{j\left ({ t }\right )} -C_{t,n_{j\left ({ t }\right )}}} }\right \}\leq t^{-4} \\[3pt] \Pr \left \{{ {\hat {\mu }_{i,n_{i}} \geq \mu _{i} +C_{t,n_{i}}} }\right \}\leq t^{-4} \end{align}
Proof:
As \begin{align} \hat {\mu }_{j\left ({ t }\right ),Q_{j_{\left ({ t }\right )} (t-1)}^{m}} +C_{t-1,Q_{j_{\left ({ t }\right )}}^{m} (t-1)} \leq \hat {\mu }_{i,Q_{i}^{m} (t-1)} +C_{t-1,Q_{i_{\left ({ t }\right )}}^{m} (t-1)}\notag \\[3pt] {}\end{align}
\begin{align}&\hspace {-1.5pc}Q^{m}_{i} \left ({ n }\right )=1+\mathop \sum \limits _{t=N+1}^{n} \parallel \left \{{ {I_{i} \left ({ t }\right )} }\right \}\leq l \notag \\[3pt]&\qquad \quad ~~+\mathop \sum \limits _{t=N+1}^{n} \parallel \{I_{i} \left ({ t }\right ),Q^{m}_{i} \left ({ {t-1} }\right )\geq l\} \notag \\[3pt]\leq&l+\mathop \sum \limits _{t=N+1}^{n} \left ({ {\begin{array}{l} \parallel \{I_{i} \left ({ t }\right ),\mu _{i} <\mu _{K_{m}^{\ast }} ,Q^{m}_{i} \left ({ {t-1} }\right )\geq l\} \\[3pt] +\parallel \{I_{i} \left ({ t }\right ),\mu _{i} >\mu _{K_{m}^{\ast }} ,Q^{m}_{i} \left ({ {t-1} }\right )\geq l\} \\[3pt] \end{array}} }\right )\notag \\[3pt] {}\end{align}
Where, if the number of
According to the (10), there are:\begin{align}&\hspace {-1.2pc}\mathop \sum \limits _{t=N+1}^{n} \parallel \{I_{i} \left ({ t }\right ),\mu _{i} <\mu _{K_{m}^{\ast }} ,Q_{i}^{m} (t-1)\geq l\} \notag \\\leq&\mathop \sum \limits _{t=N+1}^{n} \parallel \left \{{ {\begin{array}{l} \hat {\mu }_{j( t ),Q_{j_{( t )}}^{m} (t-1)} +C_{t-1,Q_{j_{( t )}}^{m} (t-1)} \\ \leq \hat {\mu }_{i,Q_{i}^{m} ( {t-1} )} +C_{t-1,Q_{i}^{m} ( {t-1} )} ,Q_{i}^{m} ( {t-1} )\geq l \\ \end{array}} }\right \} \notag \\\leq&\mathop \sum \limits _{t=N+1}^{n} \parallel \Biggl \{{ \min \limits _{0<n_{j(t)<t}} \hat {\mu }_{j\left ({ t }\right ),n_{j(t)}} +C_{t-1,n_{j(t)}} }\notag \\&\qquad \qquad {\leq \max \limits _{l\leq n_{i} <t} \hat {\mu }_{i,n_{i}} +C_{t-1,n_{i}} }\Biggr \} \notag \\\leq&\mathop \sum \limits _{t=1}^{\infty } \mathop \sum \limits _{n_{j\left ({ t }\right )=1}}^{t-1} \mathop \sum \limits _{n_{i} =l}^{t-1} 1\{\hat {\mu }_{j\left ({ t }\right ),n_{j\left ({ t }\right )}} +C_{t,n_{j\left ({ t }\right )}} \leq \hat {\mu }_{i,n_{i}} +C_{t,n_{i}} \} \end{align}
From the (12), for \begin{align} \mu _{j(t)}<&\mu _{i} +2C_{t,n_{i}} \\ \hat {\mu }_{j\left ({ t }\right ),n_{j\left ({ t }\right )}}\leq&\mu _{j\left ({ t }\right )} -C_{t,n_{j(t)}} \\ \hat {\mu }_{i,n_{i}}\geq&\mu _{i} +C_{t,n_{i}} \end{align}
For \begin{align}&\hspace {-2pc}\mu _{j(t)} -\mu _{i} -2C_{t,n_{i}} \notag \\\geq&\mu _{K_{m}^{\ast }} -\mu _{i} -2\sqrt {\frac {2\Delta _{K_{m}^{\ast } ,i}^{2} \ln t}{8\ln n}} \notag \\\geq&\mu _{K_{m}^{\ast }} -\mu _{i} -\Delta _{K_{m}^{\ast } ,i} =0 \end{align}
So it is clearly false for the inequality (13). Similarly, we can get the inequality (14) and the inequality (15) hold.
According to the Chernoff-Hoeffding bound [46] for (14) and (15), we can get:\begin{align} \Pr \left \{{ {\hat {\mu }_{j\left ({ t }\right ),n_{j\left ({ t }\right )}} \leq \mu _{j\left ({ t }\right )} -C_{t,n_{j\left ({ t }\right )}}} }\right \}\leq e^{-4\ln t}=t^{-4} \\ \Pr \left \{{ {\hat {\mu }_{i,n_{i}} \geq \mu _{i} +C_{t,n_{i}}} }\right \}\leq e^{-4\ln t}=t^{-4} \end{align}
Lemma 2:
When \begin{align} \Pr \left \{{ {\hat {\mu }_{i,n_{i}} \leq \mu _{i} -C_{t,n_{i}}} }\right \}\leq t^{-4} \\ \Pr \left \{{ {\hat {\mu }_{h\left ({ t }\right ),n_{h\left ({ t }\right )}} \geq \mu _{h\left ({ t }\right )} +C_{t,n_{h\left ({ t }\right )}}} }\right \}\leq t^{-4} \end{align}
Proof:
In the case of
If
, there are:$\boldsymbol {o}_{K_{m}^{\ast }} =\boldsymbol {o}_{K_{m}^{\ast }}^{\ast } $ \begin{align}&\hspace {-2pc}\hat {\mu }_{i,Q_{i}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{i}^{m} \left ({ {t-1} }\right )} \notag \\\leq&\hat {\mu }_{K_{m}^{\ast } ,Q_{K_{m}^{\ast }}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{K_{m}^{\ast }}^{m} \left ({ {t-1} }\right )} \end{align} View Source\begin{align}&\hspace {-2pc}\hat {\mu }_{i,Q_{i}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{i}^{m} \left ({ {t-1} }\right )} \notag \\\leq&\hat {\mu }_{K_{m}^{\ast } ,Q_{K_{m}^{\ast }}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{K_{m}^{\ast }}^{m} \left ({ {t-1} }\right )} \end{align}
If
, there is at least one channel$\boldsymbol {o}_{K_{m}^{\ast }} \ne \boldsymbol {o}_{K_{m}^{\ast }}^{\ast } $ in the set$h\left ({ t }\right )\notin \boldsymbol {o}_{K_{m}^{\ast }}^{\ast } $ , we can get:$\boldsymbol {o}_{K_{m}^{\ast }} $ \begin{align}&\hspace {-2pc}\hat {\mu }_{i,Q_{i}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{i}^{m} \left ({ {t-1} }\right )}\notag \\\leq&\hat {\mu }_{h(t),Q_{h(t)}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{h(t)}^{m} \left ({ {t-1} }\right )} \end{align} View Source\begin{align}&\hspace {-2pc}\hat {\mu }_{i,Q_{i}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{i}^{m} \left ({ {t-1} }\right )}\notag \\\leq&\hat {\mu }_{h(t),Q_{h(t)}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{h(t)}^{m} \left ({ {t-1} }\right )} \end{align}
For the above two situations, \begin{align} \hat {\mu }_{i,Q_{i}^{m} ( {t-1} )} -C_{t-1,Q_{i}^{m} ( {t-1} )} \leq \hat {\mu }_{K_{m}^{\ast } ,Q_{K_{m}^{\ast }}^{m} ( {t-1} )} -C_{t-1,Q_{K_{m}^{\ast }}^{m} ( {t-1} )}\notag \\ {}\end{align}
In the same way, we can get:\begin{align} \hat {\mu }_{i,Q_{i}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{i}^{m} \left ({ {t-1} }\right )}\leq&\hat {\mu }_{h(t),Q_{h(t)}^{m} \left ({ {t-1} }\right )} -C_{t-1,Q_{h(t)}^{m} \left ({ {t-1} }\right )}\notag \\ \\ \mu _{i}<&\mu _{h\left ({ t }\right )} +2C_{t,n_{i}} \\ \hat {\mu }_{i,n_{i}}\leq&\mu _{i} -C_{t,n_{i}} \\ \hat {\mu }_{h(t),n_{h(t)}}\geq&\mu _{h\left ({ t }\right )} +C_{t,n_{h(t)}} \end{align}
From (25)–(27), there are at least one true inequality, for \begin{equation} \mu _{i} -\mu _{h(t)} -2C_{t,n_{i}} \geq \mu _{i} -\mu _{K_{m}^{\ast }} -\Delta _{K_{m}^{\ast } ,i} \geq 0 \end{equation}
According to the inequality (28), the formula (25) is obviously wrong, which means the others are true.
According to the Chernoff-Hoeffding bound for the formula (26) and (27), we can get the following formula, respectively:\begin{align} \Pr \left \{{ {\hat {\mu }_{i,n_{i}} \leq \mu _{i} -C_{t,n_{i}}} }\right \}\leq e^{-4\ln t}=t^{-4} \\ \Pr \left \{{ {\hat {\mu }_{h\left ({ t }\right ),n_{h\left ({ t }\right )}} \geq \mu _{h\left ({ t }\right )} +C_{t,n_{h\left ({ t }\right )}}} }\right \}\leq e^{-4\ln t}=t^{-4} \end{align}
Theorem 1:
The expected regret under the proposed scheme is bounded and grows logarithmically with time slot going on.
Proof:
Because the selection times of the cognitive user \begin{align}&\hspace {-1.2pc} Q^{m}_{i} \left ({ n }\right )=1+\mathop \sum \limits _{t=N+1}^{n} \parallel \left \{{ {I_{i} \left ({ t }\right )} }\right \} \notag \\\leq&l+\mathop \sum \limits _{t=N+1}^{n} \parallel \{I_{i} \left ({ t }\right ),Q^{m}_{i} \left ({ {t-1} }\right )\geq l\} \notag \\\leq&l+\mathop \sum \limits _{t=N+1}^{n} \left ({ {\begin{array}{l} \parallel \{I_{i} \left ({ t }\right ),\mu _{i} <\mu _{K_{m}^{\ast }} ,Q^{m}_{i} \left ({ {t-1} }\right )\geq l\} \\ +\parallel \{I_{i} \left ({ t }\right ),\mu _{i} >\mu _{K_{m}^{\ast }} ,Q^{m}_{i} \left ({ {t-1} }\right )\geq l\} \\ \end{array}} }\right )\notag \\ {}\end{align}
Here we need to discuss two cases: namely, \begin{align}&\hspace {-1.2pc}E\left [{ {Q_{i}^{m} \left ({ n }\right )} }\right ] \notag \\\leq&\left [{ {\frac {8\ln n}{\Delta _{\min ,i}^{2}}} }\right ]\notag \\&+\,\sum \limits _{t=1}^\infty {\sum \limits _{n_{j\left ({ t }\right )} =1}^{t-1} {\sum \limits _{n_{i} =y}^{t-1} {\left ({ {\begin{array}{l} \Pr \left \{{ {\hat {\mu }_{j\left ({ t }\right ),n_{j\left ({ t }\right )}} \leq \mu _{j\left ({ t }\right )} -C_{t,n_{j\left ({ t }\right )}}} }\right \} \\ +\Pr \left \{{ {\hat {\mu }_{i,n_{i}} \geq \mu _{i} +C_{t,n_{i}}} }\right \} \\ \end{array}} }\right )}}} \notag \\&+\,\sum \limits _{t=1}^\infty {\sum \limits _{n_{i} =y}^{t-1} {\sum \limits _{n_{h\left ({ t }\right )} =1}^{t-1} {\left ({ {\begin{array}{l} \Pr \left \{{ {\hat {\mu }_{i,n_{i}} \leq \mu _{i} -C_{t,n_{i}}} }\right \} \\ +\Pr \left \{{ {\hat {\mu }_{h(t),n_{h(t)}} \geq \mu _{h\left ({ t }\right )} +C_{t,n_{h(t)}}} }\right \} \\ \end{array}} }\right )}}} \notag \\\leq&\frac {8\ln n}{\Delta _{\min ,i}^{2}}+1+2\sum \limits _{t=1}^\infty {\sum \limits _{n_{j\left ({ t }\right )} =1}^{t-1} {\sum \limits _{n_{i} =1}^{t-1} {2t^{-4}}}} \notag \\\leq&\frac {8\ln n}{\Delta _{\min ,i}^{2}}+1+\frac {2\pi ^{2}}{3} \end{align}
For any cognitive user \begin{align} \Re ^{\pi }( {\Theta ,m,n} )\leq&\sum \limits _{i=1}^{n} {Q_{i}^{m} ( n )} \times \mu _{\max } \!+\!\sum \limits _{h\ne m} {\sum \limits _{i\in o_{m}^{\ast }} {Q_{h}^{m}}} ( n )\mu _{i}\qquad \\ \Re ^{\pi }\left ({ {\Theta ,n} }\right )=&\sum \limits _{m=1}^{M} {\Re ^{\pi }\left ({ {\Theta ,m,n} }\right )} \notag \\\leq&M\left ({ {N+M\times \left ({ {M-1} }\right )} }\right )\notag \\&\times \left ({ {\frac {8\ln n}{\Delta _{\min }}+1+\frac {2\pi ^{2}}{3}} }\right )\times \mu _{\max } \end{align}
Simulation and Analysis
A. The Complexity Analysis of Regret
According to the conclusion of Theorem 1, when \begin{align} \boldsymbol {R}^{\pi }\left ({ {\boldsymbol {\Theta },n} }\right )=&M\left ({ {N-M} }\right )\left ({ {\frac {8\ln n}{\Delta _{\min }}+1+\frac {2\pi ^{2}}{3}} }\right )\mu _{\max } \notag \\&+\,M^{3}\left ({ {1+\frac {2\pi ^{2}}{3}} }\right )\mu _{\max } \end{align}
From the formula (35), the scale of the proposed scheme is
It can be seen from table 1, the complexity of regret function produced by these regular schemes, such as
B. Simulation Results
In this experiment, we assume that these channels are mutually independent and identically distributed. The availability of channels obeys Bernoulli process with different parameters, while it is unknown to the
In the case of
At the same time, we provide the bar chart about the channel selected situation running on three strategies. The expression can be shown in Fig. 7, Fig. 8, Fig. 9, respectively.
Fig. 7 shows that there is a serious unfairness among various cognitive users. From Fig. 8, we can find the fairness between different cognitive users has been reflected, but the usage of the channels with smaller idle probability is low, such as channel
Conclusion
In this paper, a multiple cognitive users and multiple channels system is considered in CR-WSNs. The channels statistical information is completely unknown for the cognitive users. To solve the problem of channel selection in CR-WSNs, a novel fair access scheme with channel grouping is proposed. Firstly, an online learning method called modified UCB-K based on the well-known UCB is used. Then these channels are divided into several groups according to the principle of channel grouping, which can improve the usage rate of the idle spectrum. Besides, we adopted the distributed learning with fairness to avoid collision between cognitive users and at the same time embody the fairness between cognitive users. Finally, the simulation results also show the superiority of the proposed scheme. With the development of the Internet of Things, the scale of networks is larger and larger. How to obtain and deal with the large amount of channel information for larger scale wireless sensor networks may be a promising research topic, and schemes based on wireless big data may be adopted.