
Antenna Optimization Based on Co-Training Algorithm of Gaussian Process and Support Vector Machine



Abstract:

For the optimal design of electromagnetic components, surrogate models are commonly used, but obtaining labeled training samples from full-wave electromagnetic simulation software is very time-consuming. How to obtain a relatively high-precision surrogate model from relatively few labeled samples is a current research hotspot in electromagnetics. This article proposes a semi-supervised co-training algorithm based on Gaussian process (GP) and support vector machine (SVM). With a small number of initial training samples and some basic parameter settings, an initial GP model and an initial SVM model are trained. The accuracy of the two models is then improved by exploiting the differences between them and by training them jointly with unlabeled samples. To ensure the performance of the proposed algorithm, a stop criterion is set in advance to control the number of unlabeled samples introduced during co-training. This prevents the model accuracy from being degraded by too many unlabeled samples, so that the best solution can be found within a limited time. The proposed co-training algorithm is evaluated on benchmark functions and on the optimal design of a Yagi microstrip antenna (MSA) and a GPS Beidou dual-mode MSA. The results show that the proposed algorithm fits the benchmark functions well. For resonant frequency modeling of the two MSAs, with the same labeled samples, the predictive ability of the proposed algorithm is better than that of the traditional supervised learning method. Moreover, for the groups of antenna sizes that meet the design requirements, the return loss curves (S11) are fitted well. These results verify the effectiveness of the proposed co-training algorithm, which can replace time-consuming electromagnetic simulation software for prediction.


Published in: IEEE Access ( Volume: 8)

Page(s): 211380 - 211390

Date of Publication: 19 November 2020

Electronic ISSN: 2169-3536

DOI: 10.1109/ACCESS.2020.3039269


CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I.

Basic Introduction

In the optimization of electromagnetic components, it is common to combine numerical simulation or full-wave electromagnetic simulation software, such as high frequency structure simulator (HFSS) and computer simulation technology (CST), with global optimization algorithms [1], [2]. The inverse electromagnetic problem is a hotspot in computational electromagnetics [3], [4]. From the perspective of applications, it can be divided into two categories: parameter identification problems and optimization design problems. The latter has been studied more in our research group. Its essence is to specify the desired performance index of an electromagnetic system and to achieve it by optimizing the parameters. It is also a synthesis problem: the structure and size of an electromagnetic device are synthesized from its performance requirements, which is the setting of antenna optimization.

According to previous research, HFSS simulation yields high-precision results, so it can be called repeatedly to acquire labeled samples. However, if the structure of the microwave device is complex, each call requires a lot of time, and the overall computational cost is high. Using surrogate modeling methods instead of calling HFSS for analysis can effectively save time and has been a hot topic. Popular modeling methods include artificial neural network (ANN) [5], [6], support vector machine (SVM) [7], [8], kernel extreme learning machine (KELM) [9], [10], and Gaussian process (GP) [11], [12].

Electromagnetic problems are generally small-sample problems, and both GP and SVM modeling methods are widely used in antenna optimization. GP is a machine learning (ML) method developed gradually in recent years. It has a rigorous statistical basis and is suitable for small-sample, high-dimensional and nonlinear problems. SVM is also a common ML method with particular advantages for small-sample, nonlinear and high-dimensional problems. Resonant frequency is an important technical index in the optimal design of antennas, and obtaining it quickly from known structural parameters of the antenna is a common task in modern antenna design [13], [14]. Both GP and SVM models are widely used in antenna resonant frequency modeling [15], [16]. The trained model establishes a mapping between the antenna parameters and the resonant frequency, so that the resonant frequencies of other antenna parameters can be predicted and the number of HFSS calls needed for accurate results is reduced.

Existing modeling of electromagnetic behavior is based on supervised learning, and the labeled training samples are produced by simulation software such as HFSS [17], [18]. The number of full-wave electromagnetic simulations is the main factor limiting the training efficiency of models such as the GP and SVM mentioned above. Therefore, building on existing research, a semi-supervised learning (SSL) [19], [20] method is proposed to obtain satisfactory accuracy in a short time. Traditional ML techniques train on either a large number of labeled samples or a large number of unlabeled samples. In practical applications, labeled samples are difficult to obtain, while unlabeled samples are much easier to obtain. Considering that unlabeled and labeled samples are usually independent and identically distributed, SSL overcomes the limitation of using only one type of sample by mining the hidden information in unlabeled samples to assist the labeled samples in training. SSL is mainly divided into semi-supervised clustering [21], semi-supervised classification [22], semi-supervised dimensionality reduction [23], and semi-supervised regression [24], [25]. Co-training is a commonly used divergence-based SSL method with a solid theoretical foundation and a wide range of applications [26], [27]. It was first proposed by A. Blum and T. Mitchell in 1998 [28] and has since been developed continuously and applied in many fields, such as natural language processing [29] and image retrieval [30]. In the field of electromagnetics, to the best of our knowledge, there is no related research, which is why this subject is worth studying.

Traditional co-training algorithms focus on classification problems [31], and research on regression problems is lacking. Therefore, this study improves the traditional co-training method and proposes a co-training method based on GP and SVM, applying SSL to the field of electromagnetic optimization. The differences between the GP model and the SVM model are exploited: the two models use the same unlabeled samples to generate pseudo-labeled samples with which they are updated. A termination condition is set for the iteration process: if the test error of the next iteration is higher than that of the previous iteration and the test error of the previous iteration has reached the error threshold, the iteration stops. This prevents the model accuracy from being reduced by introducing too many unlabeled samples. In the case study section, two benchmark functions and the optimal design of a Yagi microstrip antenna (MSA) and a GPS Beidou dual-mode MSA are used to evaluate the effectiveness of the proposed co-training algorithm. The results show that the proposed algorithm fits the benchmark functions well. In the two antenna optimization experiments, the proposed co-training algorithm has better predictive ability than the traditional supervised learning method when the same labeled samples are used.

SECTION II.

Co-Training Algorithm

Co-training uses the differences between two different models and improves their performance by introducing unlabeled samples. This article improves the traditional co-training method to make it more suitable for antenna optimization.

A. Gaussian Process

A GP describes the covariance of the predicted outputs through the covariance of the inputs, which is specified by a kernel function. The parameters of the kernel function are called hyper-parameters. Model training consists of selecting the kernel function and determining the hyper-parameters. A link to a software package with instructions for GP modeling is given in [32].

A GP is a collection of random variables, any finite subset of which follows a joint Gaussian distribution. Its properties are determined by the mean function $m(\mathbf{x})=E\left[f(\mathbf{x})\right]$ and the covariance function $k(\mathbf{x},\mathbf{x}^{\prime})=E\left\{\left[f(\mathbf{x})-m(\mathbf{x})\right]\left[f(\mathbf{x}^{\prime})-m(\mathbf{x}^{\prime})\right]\right\}$, and it is defined as $\begin{equation*} f(\mathbf{x})\sim GP\left(m(\mathbf{x}),k(\mathbf{x},\mathbf{x}^{\prime})\right)\tag{1}\end{equation*}$ where $\mathbf{x},\mathbf{x}^{\prime}\in \mathbf{X}$ are arbitrary d-dimensional vectors.

The observed value is polluted by additive noise $\varepsilon$, which is a normally distributed random variable with a mean of 0 and a variance of $\sigma_{n}^{2}$, that is $\begin{equation*} \varepsilon \sim N(0,\sigma_{n}^{2})\tag{2}\end{equation*}$

Let $\mathbf{x}$ be the input vector and $y$ the observation value polluted by noise. The resulting prior distribution of the observations is $\begin{equation*} \mathbf{y}\sim N\left(0,K+\sigma_{n}^{2}\mathbf{I}\right)\tag{3}\end{equation*}$ The $n$ training sample outputs $\mathbf{y}$ and the $n^{\ast}$ test sample outputs $\mathbf{f}^{\ast}$ form a joint Gaussian prior distribution, that is $\begin{equation*} \left[\begin{array}{c} \mathbf{y} \\ \mathbf{f}^{\ast} \end{array}\right]\sim N\left(0,\left[\begin{array}{cc} K(\mathbf{X},\mathbf{X})+\sigma_{n}^{2}\mathbf{I} & K(\mathbf{X},\mathbf{X}^{\ast}) \\ K(\mathbf{X}^{\ast},\mathbf{X}) & K(\mathbf{X}^{\ast},\mathbf{X}^{\ast}) \end{array}\right]\right)\tag{4}\end{equation*}$ where $K(\mathbf{X},\mathbf{X}^{\ast})$ is the $n\times n^{\ast}$ covariance matrix between the $n$ training samples and the $n^{\ast}$ test samples, and $K(\mathbf{X}^{\ast},\mathbf{X}^{\ast})$ is the $n^{\ast}\times n^{\ast}$ covariance matrix of the test samples themselves.

The optimal hyper-parameters are obtained by maximum likelihood estimation. The log-likelihood function of the conditional probability is established and its derivatives with respect to the hyper-parameters are calculated. The conjugate gradient optimization method is then used to search for the optimal hyper-parameters, that is, to minimize the negative of the log-likelihood function $\begin{equation*} l=\log p(\mathbf{y}\vert \mathbf{X})=-\frac{1}{2}\mathbf{y}^{T}\mathbf{K}^{-1}\mathbf{y}-\frac{1}{2}\log \vert \mathbf{K}\vert -\frac{n}{2}\log 2\pi \tag{5}\end{equation*}$

After the optimal hyper-parameters are obtained, predictions can be made. Given the new input $\mathbf{x}^{\ast}$, the inputs $\mathbf{X}$ of the training sample set and the observed target values $\mathbf{y}$, the predictive posterior distribution is given by $\begin{equation*} \mathbf{y}^{\ast}\vert \mathbf{x}^{\ast},\mathbf{X},\mathbf{y}\sim N\left(\mathbf{m},\Sigma\right)\tag{6}\end{equation*}$ where $\mathbf{m}$ and $\Sigma$ are the predicted mean and the predicted variance, that is $\begin{align*} \mathbf{m}=&K(\mathbf{X}^{\ast},\mathbf{X})K(\mathbf{X},\mathbf{X})^{-1}\mathbf{y} \tag{7}\\ \Sigma=&K(\mathbf{X}^{\ast},\mathbf{X}^{\ast})-K(\mathbf{X}^{\ast},\mathbf{X})K(\mathbf{X},\mathbf{X})^{-1}K(\mathbf{X},\mathbf{X}^{\ast}) \tag{8}\end{align*}$

The size of the predicted variance reflects the accuracy of the model at the predicted point: the smaller the variance, the higher the model accuracy.
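The predictive equations (7) and (8) can be illustrated with a minimal NumPy sketch. The squared-exponential kernel, the function names `rbf_kernel` and `gp_predict`, and the default hyper-parameter values are assumptions made for illustration and are not taken from the software package in [32]. A small `noise_var` is added to $K(\mathbf{X},\mathbf{X})$, following the $\sigma_{n}^{2}\mathbf{I}$ term of (4), which also keeps the Cholesky factorization numerically stable.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel matrix between the rows of A and B (assumed kernel)."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_predict(X, y, X_star, noise_var=1e-6):
    """Predicted mean (7) and covariance (8) at the test inputs X_star."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))    # K(X, X) + sigma_n^2 I
    K_s = rbf_kernel(X_star, X)                          # K(X*, X)
    K_ss = rbf_kernel(X_star, X_star)                    # K(X*, X*)
    L = np.linalg.cholesky(K)                            # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)                        # L^{-1} K(X, X*)
    cov = K_ss - v.T @ v
    return mean, cov

# Toy usage: three 1-D training points, predictions at the training
# points themselves and at a point far away from all of them.
X = np.array([[0.0], [1.0], [2.0]])
y = np.sin(X[:, 0])
mean_train, cov_train = gp_predict(X, y, X)
mean_far, cov_far = gp_predict(X, y, np.array([[10.0]]))
```

In this toy case the predicted variance is close to zero at the training points and close to the prior variance far away from them, which matches the reading of the variance as a local accuracy indicator.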

B. Support Vector Machine

SVM is an ML method that seeks the best compromise between learning accuracy on the training samples and learning ability, so as to obtain the best generalization ability. For linearly inseparable cases, a nonlinear mapping converts the samples from the low-dimensional input space into a high-dimensional feature space in which they become linearly separable. The core of SVM is the selection of the kernel function. Commonly used kernel functions include the linear kernel and the Gaussian kernel. A link to a software package with instructions for SVM modeling is given in [33].

The optimal hyper-plane sought by support vector regression (SVR) minimizes the total deviation of all sample points from the hyper-plane [34]. $\varepsilon$-SVR uses the $\varepsilon$-insensitive loss function, defined as $\begin{align*} L_{\varepsilon}(t,y)=\begin{cases} \displaystyle \left|t-y\right|-\varepsilon, & \left|t-y\right|\ge \varepsilon \\ \displaystyle 0, & \left|t-y\right|\le \varepsilon \end{cases}\tag{9}\end{align*}$ where $t$ is the target value and $y$ is the estimated output. The loss function indicates that only errors whose magnitude exceeds $\varepsilon$ are penalized. Given the training sample set $\left(x_{i},t_{i}\right)_{i=1}^{N}$, $x_{i}\in R^{d}$, SVR obtains the target output as $\begin{align*} y=&\sum\limits_{j=1}^{m} w_{j}\phi_{j}(x)+b=w\cdot \phi(x)+b \tag{10}\\ \phi(x)=&\left[\phi_{1}(x),\cdots,\phi_{m}(x)\right],\quad w=\left[w_{1},\cdots,w_{m}\right]^{T}\tag{11}\end{align*}$

$\phi(x)$ maps the training data from the input space to the feature space, and the data points inside the $\varepsilon$-tube do not affect the optimization. Minimizing the empirical risk together with $\left\|w\right\|$ leads to the following convex optimization problem $\begin{align*} \begin{cases} \displaystyle \min:\frac{1}{2}\left\|w\right\|^{2} \\ \displaystyle s.t.:t_{i}-w\cdot \phi(x_{i})-b\le \varepsilon \\ \displaystyle w\cdot \phi(x_{i})+b-t_{i}\le \varepsilon,\quad i=1,\cdots,N \end{cases}\tag{12}\end{align*}$

Introducing the slack variables $\xi_{i},\xi_{i}^{\ast}$, we obtain $\begin{align*} \begin{cases} \displaystyle \min:\frac{1}{2}\left\|w\right\|^{2}+C\sum\limits_{i=1}^{N}\left(\xi_{i}+\xi_{i}^{\ast}\right) \\ \displaystyle s.t.:t_{i}-w\cdot \phi(x_{i})-b\le \varepsilon +\xi_{i} \\ \displaystyle w\cdot \phi(x_{i})+b-t_{i}\le \varepsilon +\xi_{i}^{\ast} \\ \displaystyle \xi_{i},\xi_{i}^{\ast}\ge 0,\quad i=1,\cdots,N \end{cases}\tag{13}\end{align*}$
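As a small illustration of (9) and (13), the sketch below evaluates the $\varepsilon$-insensitive loss and the soft-margin objective for a linear model, that is, with $\phi(x)=x$. It relies on the fact that, at the optimal slack values, $\xi_{i}+\xi_{i}^{\ast}$ equals the $\varepsilon$-insensitive loss of sample $i$. The function names and the toy numbers are assumptions made for illustration only.

```python
def eps_insensitive_loss(t, y, eps):
    """Epsilon-insensitive loss of Eq. (9): zero inside the tube, linear outside."""
    return max(abs(t - y) - eps, 0.0)

def svr_primal_objective(w, b, X, targets, eps, C):
    """Objective of Eq. (13) for a linear model, with slacks at their optimal values."""
    reg = 0.5 * sum(wj * wj for wj in w)                 # (1/2) * ||w||^2
    slack = 0.0
    for x, t in zip(X, targets):
        y = sum(wj * xj for wj, xj in zip(w, x)) + b     # y = w . x + b
        slack += eps_insensitive_loss(t, y, eps)         # xi_i + xi_i^* at the optimum
    return reg + C * slack

# Toy usage: the first sample lies inside the tube, the second one outside.
inside = eps_insensitive_loss(1.05, 1.0, eps=0.1)
outside = eps_insensitive_loss(2.5, 2.0, eps=0.1)
objective = svr_primal_objective([1.0], 0.0, [[1.0], [2.0]], [1.05, 2.5], eps=0.1, C=10.0)
```

The first sample contributes nothing to the objective, which is the sense in which points inside the $\varepsilon$-tube do not affect the optimization.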

The primal Lagrangian of the optimization problem is given by $\begin{align*} L_{p}\left(w,b,\xi,\xi^{\ast},\alpha,\alpha^{\ast},\gamma,\gamma^{\ast}\right)=&\frac{1}{2}\left\|w\right\|^{2}+C\sum\limits_{i=1}^{N}\left(\xi_{i}+\xi_{i}^{\ast}\right) \\ &-\sum\limits_{i=1}^{N}\alpha_{i}\left[w\cdot \phi(x_{i})+b-t_{i}+\varepsilon +\xi_{i}\right] \\ &-\sum\limits_{i=1}^{N}\alpha_{i}^{\ast}\left[t_{i}-w\cdot \phi(x_{i})-b+\varepsilon +\xi_{i}^{\ast}\right] \\ &-\sum\limits_{i=1}^{N}\left(\gamma_{i}\xi_{i}+\gamma_{i}^{\ast}\xi_{i}^{\ast}\right)\tag{14}\end{align*}$ where $\alpha,\alpha^{\ast},\gamma,\gamma^{\ast}\ge 0$ are the Lagrange multipliers. Setting the partial derivatives of $L_{p}$ with respect to $\left(w,b,\xi_{i},\xi_{i}^{\ast}\right)$ to zero gives $\begin{align*} \begin{cases} \displaystyle \sum\limits_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\ast}\right)=0 \\ \displaystyle w=\sum\limits_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\ast}\right)\phi(x_{i}) \\ \displaystyle \gamma_{i}=C-\alpha_{i},\quad \gamma_{i}^{\ast}=C-\alpha_{i}^{\ast} \end{cases}\tag{15}\end{align*}$

The corresponding dual problem is expressed by $\begin{align*} \max: L_{d}\left(\alpha_{i},\alpha_{i}^{\ast}\right)=&\sum\limits_{i=1}^{N}t_{i}\left(\alpha_{i}-\alpha_{i}^{\ast}\right)-\varepsilon \sum\limits_{i=1}^{N}\left(\alpha_{i}+\alpha_{i}^{\ast}\right) \\ &-\frac{1}{2}\sum\limits_{i=1}^{N}\sum\limits_{j=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\ast}\right)\left(\alpha_{j}-\alpha_{j}^{\ast}\right)\phi^{T}(x_{i})\phi(x_{j})\tag{16}\end{align*}$ subject to $\sum\limits_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\ast}\right)=0$ and $0\le \alpha_{i}\le C$, $0\le \alpha_{i}^{\ast}\le C$, $i=1,\cdots,N$.

Therefore, $w=\sum\limits_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\ast}\right)\phi(x_{i})$ is obtained. Introducing the kernel function $K_{SVR}(x,x_{i})=\phi(x_{i})\cdot \phi(x)$, the regression function becomes $\begin{equation*} f(x)=w\cdot \phi(x)+b=\sum\limits_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\ast}\right)K_{SVR}(x,x_{i})+b\tag{17}\end{equation*}$ Only the samples with $\alpha_{i}-\alpha_{i}^{\ast}\ne 0$, the support vectors, contribute to the sum, and their number determines the complexity of the model.

C. Co-Training Algorithm

Traditional co-training algorithms are iterative and require the two learners to differ significantly so that they can update each other. If the number of iterations is too large, the number of unlabeled samples introduced increases, noise accumulates, and the performance of the model degrades. To address these problems, this article proposes a semi-supervised co-training algorithm based on a GP model and an SVM model. A small number of labeled samples is used for initial training, and unlabeled samples are then combined with the labeled samples so that the two models update each other. A termination condition is set in advance: if the test error of the next iteration is higher than that of the previous iteration and the test error of the previous iteration has reached the set error threshold, the iteration stops. In this way, the number of unlabeled samples introduced is effectively controlled, which prevents the accuracy of the model from being reduced by too many unlabeled samples. Moreover, a satisfactory solution can be obtained in a short time.

The co-training algorithm proposed in this article consists of initial training and co-training. In the initial training process, a small number of labeled samples simulated by HFSS are used as initial training samples, and the initial GP model and the initial SVM model are each trained on them. In the co-training process, the two models use the same unlabeled samples to generate pseudo-labeled samples, which are used to cross-train and update the GP model and the SVM model; the updated models are denoted GP_time and SVM_time. The same HFSS-simulated test sample set is used to test GP_time and SVM_time, and their test errors are compared. The pseudo-labeled samples used by the model with the smaller error, together with the test sample whose index in the test sample set corresponds to the iteration number, are added to the training sample set to further train the GP model and the SVM model. The test samples added to the training sample set are removed from the test sample set, and the remaining test samples form the test sample set for the next iteration. The iterative process is repeated until the above stop criterion is met. Fig. 1 is a flowchart of the proposed algorithm, and Table 1 gives its pseudo-code.
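The loop described above can be sketched in plain Python as follows. This is one possible reading of the text, not a reproduction of Table 1. The two learners are passed in as generic training routines, and the stand-in learners `fit_mean` and `fit_nearest` only take the place of the GP and SVM models so that the sketch runs without external packages. Reading "has reached the error threshold" as "is less than or equal to the threshold", taking one unlabeled sample per iteration, and all function names are assumptions.

```python
def mre(model, test):
    """Mean relative error of a predictor over a labeled test set."""
    return sum(abs(model(x) - y) / abs(y) for x, y in test) / len(test)

def co_train(fit_a, fit_b, labeled, unlabeled, test, threshold, max_iter=100):
    train, pool, test = list(labeled), list(unlabeled), list(test)
    model_a, model_b = fit_a(train), fit_b(train)      # initial training
    prev_err = min(mre(model_a, test), mre(model_b, test))
    history = [prev_err]
    for _ in range(max_iter):
        if not pool or len(test) < 2:                  # nothing left to add or test on
            break
        x_u = pool.pop(0)                              # same unlabeled sample for both
        pseudo_a, pseudo_b = (x_u, model_a(x_u)), (x_u, model_b(x_u))
        # Cross-training: each learner is refit with the other's pseudo-label.
        err_a = mre(fit_a(train + [pseudo_b]), test)
        err_b = mre(fit_b(train + [pseudo_a]), test)
        err, pseudo = (err_a, pseudo_b) if err_a <= err_b else (err_b, pseudo_a)
        # Stop criterion: the error rises and the previous error met the threshold.
        if err > prev_err and prev_err <= threshold:
            break
        # Keep the better pseudo-labeled sample and move one test sample to training.
        train += [pseudo, test.pop(0)]
        model_a, model_b = fit_a(train), fit_b(train)
        prev_err = err
        history.append(err)
    return model_a, model_b, history

# Stand-in learners (hypothetical, used only to make the sketch runnable).
def fit_mean(train):
    m = sum(y for _, y in train) / len(train)
    return lambda x: m

def fit_nearest(train):
    data = list(train)
    return lambda x: min(data, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))[1]

labeled = [([0.0], 2.0), ([1.0], 3.0), ([2.0], 4.0)]
unlabeled = [[0.5], [1.5]]
test_set = [([0.25], 2.25), ([1.25], 3.25), ([1.75], 3.75)]
model_a, model_b, errors = co_train(fit_mean, fit_nearest, labeled, unlabeled,
                                    test_set, threshold=1e-2)
```

Passing the learners as functions keeps the loop independent of the particular GP and SVM implementations, so the stand-ins could be replaced by wrappers around real GP and SVR training code.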

TABLE 1 Pseudo Code of the Proposed Co-Training Algorithm

FIGURE 1. Flow chart of the proposed co-training algorithm.

SECTION III.

Case Studies

First, two benchmark functions are used to verify the effectiveness of the proposed co-training algorithm, and then a Yagi microstrip antenna (MSA) and a GPS Beidou dual-mode MSA are optimized by the proposed algorithm. The experimental results of resonant frequency modeling of the two MSAs are used as the basis for judging its generalization ability. Moreover, the best trained co-training model is used to fit the return loss curve (S₁₁) of one group of antenna sizes that meets the design requirements. The S₁₁ curve predicted by this model is compared with the one simulated by HFSS to verify the performance of the proposed algorithm.

A. Benchmark Functions

In this study, the Griewank function and the Quartic function are first used to test the performance of the proposed method; they are given in (18) and (19). Both functions are set to 3 dimensions, and the value interval of each independent variable is [−5, 5]. According to the complexity of the functions and the initial errors, the error threshold is set to 1e-02, which represents good precision. The maximum number of iterations is set to 100. At each iteration, the relative error (RE) is used to evaluate a single test sample, and the mean relative error (MRE) is used to evaluate the whole test sample set, as given in (20) and (21). $\begin{align*} f(x)=&\sum\limits_{i=1}^{d}\frac{x_{i}^{2}}{4000}-\prod\limits_{i=1}^{d}\cos\left(\frac{x_{i}}{\sqrt{i}}\right)+1 \tag{18}\\ f(x)=&\sum\limits_{i=1}^{d}ix_{i}^{4}+\mathrm{random}[0,1) \tag{19}\\ RE=&\frac{\vert y_{pred}-y_{test}\vert}{y_{test}} \tag{20}\\ MRE=&\frac{1}{n}\sum\limits_{i=1}^{n}\frac{\vert y_{pred,i}-y_{test,i}\vert}{y_{test,i}}\tag{21}\end{align*}$ where $y_{pred}$ is the label value predicted by the proposed semi-supervised co-training model, $y_{test}$ is the true label value of the test sample, and $n$ is the number of test samples.
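The two benchmark functions and the two error measures can be written in a few lines of Python. This is a minimal sketch of (18) to (21); the function names are chosen here for illustration, and the additive noise of the Quartic function is drawn with the standard `random` module. As written in (20), the relative error is undefined for a test label equal to zero, which is the value of the Griewank function at the origin.

```python
import math
import random

def griewank(x):
    """Griewank function, Eq. (18)."""
    s = sum(xi ** 2 for xi in x) / 4000.0
    p = math.prod(math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1))
    return s - p + 1.0

def quartic(x, rng=random):
    """Quartic function with additive uniform noise in [0, 1), Eq. (19)."""
    return sum(i * xi ** 4 for i, xi in enumerate(x, start=1)) + rng.random()

def relative_error(y_pred, y_test):
    """Relative error of one test sample, Eq. (20); requires y_test != 0."""
    return abs(y_pred - y_test) / y_test

def mean_relative_error(y_pred, y_test):
    """Mean relative error over a test set, Eq. (21)."""
    return sum(relative_error(p, t) for p, t in zip(y_pred, y_test)) / len(y_test)

# Toy usage with 3-dimensional inputs, as in the experiments.
g0 = griewank([0.0, 0.0, 0.0])
q1 = quartic([1.0, 1.0, 1.0])
err = mean_relative_error([1.1, 1.8], [1.0, 2.0])
```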

Within the value interval [−5, 5], 200 training samples are generated randomly to train the initial GP model and SVM model, after which the co-training process is conducted. During each iteration, 50 test samples are randomly generated in the interval [−5, 5] for testing.

After training, to further verify the effectiveness of the proposed algorithm, the optimal co-training model is used to predict the outputs of another 50 randomly generated test points, and the predicted outputs are compared with the real outputs. Table 2 records the iteration numbers and the termination test errors of the two benchmark functions. Fig. 2 and Fig. 3 show the iterative test results and the fitting of the test points. The error curves show that the Griewank function reaches the error threshold at the 45th iteration and the Quartic function at the 76th iteration, both within the limit of 100 iterations. The fitting curves show that the Griewank function is fitted slightly better than the Quartic function, but both benchmark functions are fitted well. These experimental results provide an initial verification of the effectiveness of the proposed co-training algorithm.

TABLE 2 Iterative Results of the Benchmark Functions

FIGURE 2. Test results of the Griewank function.

FIGURE 3. Test results of the Quartic function.

B. Yagi Microstrip Antenna

The Yagi MSA has high gain and wide beam width and has been widely used [35]. Its structure is shown in Fig. 4 (a), and the three-dimensional view in HFSS is shown in Fig. 4 (b). The model is fed with a gradual (tapered) microstrip balun. It has many variables, such as the length of the reflective array, the length and width of the excitation array, and the length and width of the guide array. The parameters that have a large impact on performance are selected as input parameters, namely the length of the excitation array $d_{r}$, the distance between the excitation array and the reflective array $g_{1}$, the distance between the excitation array and the guide array $g_{2}$, and the distance between the guide arrays $g_{3}$.

FIGURE 4. The Yagi MSA.

1) Resonant Frequency Modeling

The design indexes of the Yagi MSA are a working frequency of 2.3 GHz and a −10 dB bandwidth covering 2.2 GHz ~ 2.4 GHz. The input parameter combination is $\gamma = [d_{r}\,\,g_{1}\,\,g_{2}\,\,g_{3}]$ with $d_{r}\in [40,45]$, $g_{1}\in [15,20]$, $g_{2},g_{3}\in [8,14]$ and $g_{2}=g_{3}$; the fixed values of the other size parameters are shown in Table 3. The frequency sweep range of the Yagi MSA is 1 GHz ~ 5 GHz with a step size of 0.04 GHz, so there are 101 frequency points in total for each group of parameters. Four variables are set in total with 6 levels each, and their value ranges and sampling intervals are shown in Table 4. For simulation, the HFSS-MATLAB-API script [36] is used. HFSS is controlled through the script interface to generate the 3D model, analyze and solve it, and finally output the simulation results. In other words, scripts programmed in MATLAB call HFSS, taking the variable parameters of the antenna as input and the antenna return loss as output.
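The number of frequency points follows directly from the sweep range and the step size. The short sketch below reproduces the count of 101 points for the 1 GHz ~ 5 GHz sweep with a 0.04 GHz step; the helper name `sweep_points` is an assumption, and the number of steps is rounded to avoid floating-point truncation.

```python
def sweep_points(start_ghz, stop_ghz, step_ghz):
    """Frequency points of a linear sweep, including both end points."""
    n_steps = round((stop_ghz - start_ghz) / step_ghz)   # rounded against float error
    return [start_ghz + k * step_ghz for k in range(n_steps + 1)]

# Yagi MSA sweep: (5 - 1) / 0.04 + 1 = 101 frequency points.
yagi_freqs = sweep_points(1.0, 5.0, 0.04)
n_yagi = len(yagi_freqs)
```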

TABLE 3 Fixed Size Parameters of the Yagi MSA

TABLE 4 Experimental Samples of the Yagi MSA

Using a partial orthogonal experiment, 16 groups of samples are simulated as initial training samples, another 10 groups are simulated as the test sample set, and 10 groups are set randomly without HFSS simulation as the unlabeled sample set. The computer used in this experiment has an Intel(R) Core (TM) i5-10210U CPU @ 1.60GHz 2.11GHz processor and 8 GB of RAM. It takes 104.79 seconds to obtain one labeled sample by HFSS. After simulation, the resonant frequency point corresponding to each group of antenna sizes is obtained, and the modeling experiments can then be conducted.
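How a resonant frequency label might be read from a simulated return loss curve can be sketched as follows. The text does not spell out this step, so the sketch assumes that the resonant frequency is the sweep point with the deepest S₁₁ dip and that the bandwidth index is checked against a −10 dB level. The function names and the five-point toy curve are illustrative and are not simulation data.

```python
def resonant_frequency(freqs_ghz, s11_db):
    """Frequency of the deepest S11 dip (assumed definition of the label)."""
    idx = min(range(len(s11_db)), key=s11_db.__getitem__)
    return freqs_ghz[idx]

def covers_band(freqs_ghz, s11_db, lo_ghz, hi_ghz, level_db=-10.0):
    """True if S11 stays at or below level_db at every sweep point in [lo, hi]."""
    in_band = [s for f, s in zip(freqs_ghz, s11_db) if lo_ghz <= f <= hi_ghz]
    return bool(in_band) and all(s <= level_db for s in in_band)

# Toy curve (made-up values, for illustration only).
freqs = [2.1, 2.2, 2.3, 2.4, 2.5]
s11 = [-6.0, -11.0, -17.5, -12.0, -5.0]
f_res = resonant_frequency(freqs, s11)
ok_band = covers_band(freqs, s11, 2.2, 2.4)
```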

Using the 16 initial training samples, with $d_{r}$, $g_{1}$, $g_{2}$ and $g_{3}$ as input variables and $f_{HFSS}$ as output, the GP model and the SVM model are established. The test sample set is used to test the two models, and their initial errors are shown in Table 5. In each iteration, the GP and SVM models take one unlabeled sample for cross-training. The test sample set is used to test each of the two models, and the test error is evaluated by MRE. The two errors are then compared. The pseudo-labeled sample from the model with the smaller error, together with the i-th test sample, is added to the training sample set to further train the GP model and the SVM model. At the same time, the test sample added to the training set is deleted from the test sample set, and the remaining test samples are used in the next iteration. Based on the magnitude of the initial error and the overall error, the error threshold of this experiment is set to 1e-03, which is a relatively small error. Table 6 records the test error of each iteration.

TABLE 5 Initial Errors of Yagi MSA

TABLE 6 Iterative Results of Yagi MSA

The results show that the iteration ends at the 9th iteration and that the test error at the 8th iteration is the smallest, namely 0.0027. At this point, 7 labeled test samples have been introduced into the training sample set. To ensure a fair comparison, the updated training sample set of 23 samples is also used to train a traditional supervised GP model and a traditional supervised SVM model. These two models are tested on the same test sample set used in the 8th iteration of the co-training process. Their test errors are 0.0050 and 0.0110, both greater than 0.0027. In this experiment, 23 labeled samples are used, and the total computing time is 2410.17 seconds. Therefore, with the same labeled samples, and hence the same computing time, the prediction ability of the proposed co-training model is better than that of the traditional supervised learning models.
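The total computing time follows from the number of labeled samples and the HFSS time per labeled sample reported above. The symbols $N_{\text{labeled}}$ and $T_{\text{total}}$ are notation introduced here for this check only:

```latex
\begin{align*}
N_{\text{labeled}} &= 16 + 7 = 23 \\
T_{\text{total}} &= 23 \times 104.79\ \text{s} = 2410.17\ \text{s}
\end{align*}
```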

2) Return Loss Fitting

The experimental results of resonant frequency modeling are used as the basis for judging the generalization ability of the trained model. To show the fitting ability of the proposed model more intuitively, an antenna size that conforms to the design indexes is used and its S₁₁ curve is fitted. According to the optimization results, a set of antenna parameters whose performance satisfies the design indexes is $\gamma = [45, 18, 14, 14]$.

In Fig. 5, the X-axis is the frequency sweep range, from 1 GHz to 5 GHz, and the Y-axis is the S₁₁ value at each frequency point. S₁₁ reaches −17.5 dB at 2.3 GHz and the −10 dB bandwidth covers 2.2 GHz ~ 2.4 GHz, which meets the design requirements. To further verify the effectiveness of the proposed co-training algorithm, the trained co-training model is used to predict the S₁₁ of this group of antenna parameters, and the predictions are compared with those simulated by HFSS. The solid blue line labeled 'HFSS' shows the HFSS simulation results, and the dotted red line labeled 'Proposed' shows the predictions of the method proposed in this article. The two curves in Fig. 5 are highly consistent, indicating that the proposed model has high accuracy. The proposed model can therefore be used to predict other parameters, effectively replacing HFSS simulation.

FIGURE 5. S₁₁ fitting diagram of the Yagi MSA.

C. GPS Beidou Dual-Mode Microstrip Antenna

The GPS Beidou dual-mode MSA serves both GPS positioning systems and Beidou satellite navigation systems and is widely used in various terminals [37]. In this experiment, a square MSA is used. The planar structure is shown in Fig. 6 (a), and the three-dimensional model in HFSS is shown in Fig. 6 (b). The four sides of the patch carry branches with the same width and different lengths, corresponding to the two working modes of the GPS L1 frequency band and the Beidou B1 frequency band. The performance is affected by the radiation patch side length W, the low-frequency modal branch length L₁ and the high-frequency modal branch length L₂, and these three antenna parameters are used as input variables to establish the models.

FIGURE 6. The GPS Beidou dual-mode MSA.

1) Resonant Frequency Modeling

The design specification of the GPS Beidou dual-mode MSA is that the voltage standing wave ratio of the antenna at 1.58 GHz (Beidou B1 operating frequency) and 1.61 GHz (GPS L1 operating frequency) is less than or equal to 1.5. Several variables that have a significant impact on the performance are selected. The input parameter combination is $\gamma = [W\,\,L_{1}\,\,L_{2}]$ with $W\in [42,45]$, $L_{1}\in [5.1,6.3]$ and $L_{2}\in [3,3.9]$. The values of the other dimension parameters are fixed and are shown in Table 7. For the simulation of the GPS Beidou dual-mode MSA, the frequency sweep range is 1.2 GHz to 1.8 GHz with a step size of 0.005 GHz, so there are 121 frequency points in total for each group of parameters. Each variable is set with 4 levels. The samples are again selected by a partial orthogonal experiment. The sampling ranges and intervals of the three parameters are shown in Table 8.
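Because the simulation output is the return loss while the design specification is stated as a voltage standing wave ratio (VSWR), it can help to convert between the two. The sketch below uses the standard relations $|\Gamma| = 10^{S_{11}/20}$ and $\mathrm{VSWR} = (1+|\Gamma|)/(1-|\Gamma|)$, with S₁₁ in dB. The function names are chosen here for illustration, and the conversion is not described in the text itself.

```python
import math

def vswr_from_s11_db(s11_db):
    """VSWR for a given S11 in dB (S11 must be negative, i.e. |Gamma| < 1)."""
    gamma = 10.0 ** (s11_db / 20.0)          # reflection coefficient magnitude
    return (1.0 + gamma) / (1.0 - gamma)

def s11_db_for_vswr(vswr):
    """S11 level in dB that corresponds to a given VSWR (> 1)."""
    gamma = (vswr - 1.0) / (vswr + 1.0)
    return 20.0 * math.log10(gamma)

# A VSWR limit of 1.5 corresponds to |Gamma| = 0.2, i.e. S11 of about -13.98 dB.
s11_limit = s11_db_for_vswr(1.5)
vswr_check = vswr_from_s11_db(s11_limit)
```

Under these relations, the VSWR requirement of 1.5 is stricter than a −10 dB return loss level.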

TABLE 7 Fixed Size Parameters of the GPS Beidou Dual-Mode MSA

TABLE 8 Experimental Samples of the GPS Beidou Dual-Mode MSA
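The sweep size and the four-level design space stated above can be checked with a short script. This is a sketch under one assumption: the four levels of each variable are taken as evenly spaced over the stated ranges, which the text does not confirm (the actual intervals are given in Table 8). The helper names `sweep` and `levels` are hypothetical, and the partial orthogonal selection itself is not reproduced.

```python
from itertools import product


def sweep(start_ghz, stop_ghz, step_ghz):
    """Frequency points from start to stop inclusive, at a fixed step."""
    n = round((stop_ghz - start_ghz) / step_ghz) + 1
    return [round(start_ghz + k * step_ghz, 6) for k in range(n)]


def levels(lo, hi, n=4):
    """n evenly spaced levels over [lo, hi] (even spacing is an assumption)."""
    return [round(lo + k * (hi - lo) / (n - 1), 3) for k in range(n)]


freqs = sweep(1.2, 1.8, 0.005)          # 121 points, as stated in the text
w_levels = levels(42, 45)               # patch side length W
l1_levels = levels(5.1, 6.3)            # low-frequency branch length L1
l2_levels = levels(3, 3.9)              # high-frequency branch length L2

# Full factorial candidate set; the paper selects only a subset of it.
full_grid = list(product(w_levels, l1_levels, l2_levels))
print(len(freqs), len(full_grid))
```

The full factorial grid has 4³ = 64 candidate combinations, so a partial design that simulates only a subset of them directly reduces the number of HFSS calls.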

By using the HFSS-MATLAB-API script, each set of antenna size parameters is mapped to a set of return loss values through electromagnetic simulation. In this experiment, 15 groups of samples are simulated by HFSS as the initial training samples, 10 groups are simulated as the test samples, and another 10 groups are randomly generated without HFSS simulation as unlabeled samples. Obtaining one labeled sample by calling HFSS takes 72.93 seconds. After the simulation, the resonant frequency points corresponding to each group of antenna parameters can be computed. Because the two operating modes have two different resonant frequency points, the two modes are modeled separately by the proposed co-training algorithm to preserve modeling accuracy.
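One simple way to turn a simulated return loss sweep into the two resonant frequency labels is to take the frequency of minimum S₁₁ inside a separate band for each mode. The sketch below assumes that definition of resonance; the split point of 1.595 GHz, the helper names, and the synthetic two-dip curve are all hypothetical and are not taken from the paper or from HFSS.

```python
def resonant_frequency(freqs_ghz, s11_db, band_ghz):
    """Return (f_res, s11_min): the frequency of minimum S11 inside band_ghz."""
    lo, hi = band_ghz
    in_band = [(s, f) for f, s in zip(freqs_ghz, s11_db) if lo <= f <= hi]
    if not in_band:
        raise ValueError("no sweep points inside the requested band")
    s_min, f_res = min(in_band)
    return f_res, s_min


def dip(f, f0, depth_db, width_ghz=0.01):
    """Synthetic resonance dip centred at f0 (illustrative shape only)."""
    return -depth_db / (1.0 + ((f - f0) / width_ghz) ** 2)


# Synthetic curve on the 121-point sweep (NOT HFSS data).
freqs = [round(1.2 + 0.005 * k, 3) for k in range(121)]
s11 = [dip(f, 1.58, 20.0) + dip(f, 1.61, 16.0) for f in freqs]

# One search band per operating mode; the 1.595 GHz split is an assumption.
f_low, s_low = resonant_frequency(freqs, s11, (1.2, 1.595))
f_high, s_high = resonant_frequency(freqs, s11, (1.595, 1.8))
print(f_low, f_high)
```

Searching each mode in its own band keeps the two labels separate, which matches the choice to model the two operating modes with separate surrogate models.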

Using the 15 training samples, with W, $L_{1}$, $L_{2}$ as input variables and $f_{HFSS}$ as output, the GP model and SVM model are trained separately. Table 9 records the initial errors of these two models. In each iteration, GP and SVM take the same unlabeled sample for cross-training, and the same test sample set is used to test each of the two models. The test errors are compared in each iteration, and the pseudo-labeled sample with the smaller error, together with the ith test sample, is introduced into the training sample set to further train the GP model and SVM model. In this experiment, according to the magnitude of the initial errors of the two operating modes and the overall error trend, the error threshold is set to 1e-04 for both modes, which is a relatively small error. Table 10 records the test errors of the two models at each iteration.

TABLE 9 Initial Errors of the GPS Beidou Dual-Mode MSA

TABLE 10 Iterative Results of the GPS Beidou Dual-Mode MSA
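The co-training iteration described in this subsection can be sketched as a generic loop over two regressors. This is only an illustration under several assumptions, not the authors' implementation: two toy regressors (`NearestNeighbour`, `InverseDistance`) stand in for the GP and SVM models so that the example is self-contained, the error is taken as mean squared error, and the loop stops when the best test error no longer improves by more than `tol`. The stop criterion actually used in the paper is defined in an earlier section and is not reproduced here.

```python
import copy
import math


class NearestNeighbour:
    """Toy regressor standing in for the GP model (assumption)."""
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        nearest = min(range(len(self.X)), key=lambda i: math.dist(x, self.X[i]))
        return self.y[nearest]


class InverseDistance:
    """Toy regressor standing in for the SVM model (assumption)."""
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        w = [1.0 / (math.dist(x, xi) + 1e-9) for xi in self.X]
        return sum(wi * yi for wi, yi in zip(w, self.y)) / sum(w)


def mse(model, X, y):
    """Mean squared test error (the paper's error metric is not restated here)."""
    return sum((model.predict(x) - t) ** 2 for x, t in zip(X, y)) / len(y)


def co_train(model_a, model_b, X_lab, y_lab, X_unl, X_test, y_test, tol=0.0):
    X_lab, y_lab = list(X_lab), list(y_lab)
    X_test, y_test = list(X_test), list(y_test)
    model_a.fit(X_lab, y_lab)
    model_b.fit(X_lab, y_lab)
    history = [min(mse(model_a, X_test, y_test), mse(model_b, X_test, y_test))]
    for x_u in X_unl:
        if len(X_test) < 2:           # keep at least one sample for testing
            break
        # Both models pseudo-label the same unlabeled sample.
        y_a, y_b = model_a.predict(x_u), model_b.predict(x_u)
        # Cross-training: each model is refit with the other's pseudo-label.
        trial_a = copy.deepcopy(model_a).fit(X_lab + [x_u], y_lab + [y_b])
        trial_b = copy.deepcopy(model_b).fit(X_lab + [x_u], y_lab + [y_a])
        err_a, err_b = mse(trial_a, X_test, y_test), mse(trial_b, X_test, y_test)
        # Keep the pseudo-label that produced the smaller test error.
        err, y_keep = (err_a, y_b) if err_a <= err_b else (err_b, y_a)
        if history[-1] - err <= tol:  # assumed stop rule: no further improvement
            break
        # Add the pseudo-labeled sample and the next labeled test sample.
        X_lab += [x_u, X_test.pop(0)]
        y_lab += [y_keep, y_test.pop(0)]
        model_a.fit(X_lab, y_lab)
        model_b.fit(X_lab, y_lab)
        history.append(err)
    return model_a, model_b, X_lab, history
```

Any pair of regressors exposing `fit(X, y)` and `predict(x)` could be passed in place of the toy models, for example wrappers around library GP and SVM regressors.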

For the GPS mode, the iteration ends at the 9th iteration, and the test error at the 8th iteration is the smallest, at 8.3243e-04. At this point, 7 labeled test samples have been introduced. For a fair comparison, the updated training set of 22 samples is used to train the traditional supervised GP model and SVM model separately. On the same test sample set used in the 8th iteration, their test errors are 0.0013 and 0.0124 respectively, both greater than 8.3243e-04.

For the Beidou mode, the iteration ends at the 8th iteration, and the test error at the 7th iteration is 6.1237e-04. At this point, 6 labeled test samples have been introduced. The resulting 21 labeled samples are likewise used to train the traditional GP model and SVM model. The test sample set used in the 7th iteration of the co-training algorithm is also used to test these two models, and the test errors are 8.1689e-04 and 0.0065 respectively, both greater than 6.1237e-04.

According to the above experimental results for both operating modes, with the same labeled samples, the predictive ability of the proposed co-training model is improved compared with that of the traditional supervised learning models. To ensure that the modeling accuracy reaches a high level for both modes, the model at the 8th iteration is adopted as the best one in the GPS Beidou dual-mode MSA experiment, which costs 1604.46 seconds in total.
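The reported total appears consistent with the per-sample simulation time quoted earlier, if it is assumed to count only the 22 HFSS-labeled samples in the training set at the 8th iteration (15 initial training samples plus 7 introduced test samples). That interpretation is an inference from the numbers, not a statement in the text:

```latex
T_{\text{total}} = (15 + 7) \times 72.93\ \text{s} = 22 \times 72.93\ \text{s} = 1604.46\ \text{s}
```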

2) Return Loss Fitting

The results of the resonant frequency modeling experiment are used as the basis for judging the stability of the trained model. The trained model is then used to describe the relationship between W, L₁, L₂ and S₁₁. The group of antenna sizes used for S₁₁ fitting is [44, 5.1, 3.6], which conforms to the design requirements.

As shown in Fig. 7, the S₁₁ values are −19.5 dB at 1.58 GHz and −15.5 dB at 1.61 GHz, which meet the design requirements. The X-axis is the frequency sweep range, from 1.2 GHz to 1.8 GHz, and the Y-axis is the corresponding S₁₁ value at each frequency point, so the S₁₁ curve covers the whole frequency interval. The solid blue line labeled ‘HFSS’ shows the HFSS simulation results, and the dotted red line labeled ‘Proposed’ shows the predictions of the method proposed in this article. The two curves are basically consistent, indicating that the semi-supervised co-training model trained in this experiment has high accuracy and can replace HFSS simulation for prediction.

FIGURE 7.

S₁₁ fitting diagram of the GPS Beidou dual-mode MSA.
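The design specification is stated as VSWR ≤ 1.5, while Fig. 7 reports S₁₁ in dB. The two are related by the standard conversion |Γ| = 10^(S₁₁/20) and VSWR = (1 + |Γ|)/(1 − |Γ|), so the reported values can be checked directly. The sketch below uses hypothetical function names and assumes S₁₁ is a one-port reflection coefficient magnitude in dB.

```python
import math


def vswr_from_s11_db(s11_db):
    """Convert an S11 magnitude in dB to VSWR."""
    gamma = 10 ** (s11_db / 20.0)       # reflection coefficient magnitude
    if gamma >= 1.0:
        raise ValueError("S11 must be below 0 dB for a finite VSWR")
    return (1.0 + gamma) / (1.0 - gamma)


def s11_db_limit(vswr_max):
    """S11 level in dB that corresponds to a given maximum VSWR."""
    return 20.0 * math.log10((vswr_max - 1.0) / (vswr_max + 1.0))


# S11 values reported in the text for the two operating frequencies.
for f_ghz, s11_db in [(1.58, -19.5), (1.61, -15.5)]:
    print(f_ghz, round(vswr_from_s11_db(s11_db), 3))

print(round(s11_db_limit(1.5), 2))      # S11 threshold equivalent to VSWR = 1.5
```

A VSWR of 1.5 corresponds to an S₁₁ of about −14 dB, so both reported values lie below that level.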


SECTION IV.

Conclusion

To improve the optimization efficiency of electromagnetic components, reduce the number of HFSS calls, and save the time needed to obtain labeled samples, this article proposes a semi-supervised co-training algorithm based on the GP model and the SVM model. The GP model and SVM model are cross-trained with the same unlabeled samples. After the test errors are compared, the two models are further trained with the more accurate pseudo-labeled samples and the corresponding test samples. Iteration termination conditions are set to control the number of unlabeled samples and the number of model updates, so that the best solution is found in a relatively short time and a reduction in model accuracy is effectively prevented. Benchmark functions are used to verify the effectiveness of the proposed algorithm, and the experimental results show that it fits the Griewank function and the Quartic function well. The Yagi MSA and the GPS Beidou dual-mode MSA are optimized, and the effectiveness of the proposed algorithm is verified by the resonant frequency modeling experiments. The experimental results show that, with the same labeled samples, the proposed co-training algorithm improves the prediction ability compared with the traditional supervised learning method. To further verify the effectiveness of the proposed algorithm, the trained resonant frequency model is used to fit the S₁₁ values, and the results show that the proposed model fits the S₁₁ of both antennas with good accuracy. Therefore, the co-training algorithm proposed in this study is suitable for antenna design problems in which model training requires many labeled samples, high computational cost, and long run times. The proposed model can effectively replace electromagnetic simulation, which saves time in antenna optimization. It will further promote the application of SSL methods in the field of electromagnetic optimization.
