Basic Introduction
In the optimization of electromagnetic components, it is common to combine numerical simulation or full-wave electromagnetic simulation software, such as the High Frequency Structure Simulator (HFSS) or Computer Simulation Technology (CST), with global optimization algorithms [1], [2]. The inverse electromagnetic problem is an active topic in computational electromagnetics [3], [4]. From the perspective of applications, it can be divided into two categories: parameter identification and optimization design. Our research group has focused on the latter, whose essence is to specify the desired performance index of an electromagnetic system and then reach it by optimizing the system parameters. It is also a synthesis problem: given the performance requirements of an electromagnetic device, such as an antenna, the structure and dimensions of the device are synthesized to meet them.
Previous research shows that HFSS simulation yields high-precision results and can be called repeatedly to acquire labeled samples. However, when the structure of the microwave device is complex, each call takes a long time, so the overall computational cost is high. Replacing HFSS calls with modeling methods can save considerable time and has become a popular research topic. Widely used modeling methods include the artificial neural network (ANN) [5], [6], support vector machine (SVM) [7], [8], kernel extreme learning machine (KELM) [9], [10], and Gaussian process (GP) [11], [12].
Electromagnetic problems are generally small-sample problems, and both GP and SVM modeling methods are widely used in antenna optimization. GP is a machine learning (ML) method that has developed steadily in recent years. It has a rigorous statistical foundation and is well suited to small-sample, high-dimensional and nonlinear problems. SVM is another common ML method with particular advantages for small-sample, nonlinear and high-dimensional problems. Resonant frequency is an important technical index in antenna design, and quickly obtaining it from known structural parameters is common practice in modern antenna design research [13], [14]. Both GP and SVM models are widely used for antenna resonant frequency modeling [15], [16]. A trained model establishes a mapping between the antenna parameters and the measured resonant frequency, so that the resonant frequencies of other parameter combinations can be predicted, reducing the number of HFSS calls needed for accurate results.
Existing models of electromagnetic behavior are based on supervised learning, with labeled training samples generated by simulation software such as HFSS [17], [18]. The number of full-wave electromagnetic simulations is the main factor limiting the training efficiency of models such as GP and SVM. Therefore, building on existing research, a semi-supervised learning (SSL) [19], [20] method is proposed to reach satisfactory accuracy in a short time. Traditional ML techniques train on either a large number of labeled samples or a large number of unlabeled samples. In practical applications, labeled samples are difficult to obtain, while unlabeled samples are much easier to obtain. Since unlabeled and labeled samples are usually assumed to be independent and identically distributed, SSL overcomes the limitation of using only one type of sample by mining the hidden information in unlabeled samples to assist training with labeled samples. SSL is mainly divided into semi-supervised clustering [21], semi-supervised classification [22], semi-supervised dimensionality reduction [23], and semi-supervised regression [24], [25]. Co-training is a commonly used divergence-based SSL method with a solid theoretical foundation and a wide range of applications [26], [27]. The co-training algorithm was first proposed by A. Blum and T. Mitchell in 1998 [28]; it has since been continuously developed and applied in many fields, such as natural language processing [29] and image retrieval [30]. To the best of our knowledge, there is no related research in the field of electromagnetics, which is why this subject is worth investigating.
Traditional co-training algorithms focus on classification problems [31], and research on regression problems is lacking. This study therefore improves the traditional co-training method and proposes a co-training method based on GP and SVM, applying an SSL algorithm to electromagnetic optimization. The method exploits the differences between the GP model and the SVM model: the two models use the same unlabeled samples to generate pseudo-labeled samples for updating each other. A termination condition is set for the iteration process. If the test error of the next iteration is higher than that of the previous iteration and the test error of the previous iteration has reached the error threshold, the iteration stops. This prevents the loss of model accuracy caused by introducing too many unlabeled samples. In the case study section, two benchmark functions and the optimal design of a Yagi microstrip antenna (MSA) and a GPS Beidou dual-mode MSA are used to evaluate the effectiveness of the proposed co-training algorithm. The results show that the proposed algorithm fits the benchmark functions well. In the two antenna optimization experiments, the proposed algorithm has better predictive ability than the traditional supervised learning methods when the same labeled samples are used.
Co-Training Algorithm
The co-training algorithm exploits the differences between two models and improves their performance by introducing unlabeled samples. This article improves the traditional co-training method to make it more suitable for antenna optimization.
A. Gaussian Process
A GP describes the covariance of the predicted outputs through the covariance of the inputs, as defined by a kernel function. The parameters of the kernel function are called hyper-parameters. Model training consists of selecting the kernel function and determining the hyper-parameters. A link to a software package with instructions for GP modeling is given in reference [32].
A GP is a collection of infinitely many random variables, any finite subset of which follows a joint Gaussian distribution, mean function
The observed value is corrupted by additive noise
The optimal hyper-parameters are obtained by maximum likelihood estimation. By establishing the log-likelihood function of the conditional probability, its derivatives with respect to the hyper-parameters can be calculated. The conjugate gradient optimization method is used to search for the optimal hyper-parameters, and the negative log-likelihood function is expressed as
After the optimal hyper-parameters are obtained, predictions can be made. Given the new input
The magnitude of the predicted variance reflects the accuracy of the model at that point: the smaller the variance, the higher the accuracy.
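For orientation, the standard textbook form of GP regression is summarized here. The notation is an assumption (a zero prior mean, kernel matrix $K$, noise variance $\sigma_n^2$, hyper-parameters $\theta$) and may differ from the symbols and numbering of the original equations.

```latex
\begin{align}
  f(\mathbf{x}) &\sim \mathcal{GP}\bigl(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\bigr),
  \qquad y = f(\mathbf{x}) + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma_n^2) \\
  C &= K + \sigma_n^2 I \\
  -\log p(\mathbf{y} \mid X, \theta) &= \tfrac{1}{2}\,\mathbf{y}^{\top} C^{-1} \mathbf{y}
      + \tfrac{1}{2}\log\lvert C \rvert + \tfrac{n}{2}\log 2\pi \\
  \mu_* &= \mathbf{k}_*^{\top} C^{-1} \mathbf{y} \\
  \sigma_*^2 &= k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_*^{\top} C^{-1} \mathbf{k}_*
\end{align}
```

Here $\mathbf{k}_*$ is the vector of kernel values between the new input $\mathbf{x}_*$ and the $n$ training inputs, $\mu_*$ is the predicted mean, and $\sigma_*^2$ is the predicted variance referred to in the text.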
B. Support Vector Machine
SVM is an ML method that seeks the best compromise between learning accuracy on the training samples and learning ability, in order to obtain the best generalization ability. For linearly inseparable cases, a nonlinear mapping transforms the samples from the low-dimensional input space into a high-dimensional feature space where they become linearly separable. The core of SVM is the choice of kernel function; commonly used kernels include the linear kernel and the Gaussian kernel. A link to a software package with instructions for SVM modeling is given in reference [33].
Support vector regression (SVR) seeks the optimal hyper-plane that minimizes the total deviation of all sample points from it [34].
Introducing relaxation variables
The original Lagrange optimization problem is given by
The corresponding dual problem is expressed by
Therefore,
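For reference, the standard $\varepsilon$-insensitive SVR formulation is sketched here in commonly used notation. The symbols ($\mathbf{w}$, $b$, $C$, $\varepsilon$, $\xi_i$, $\xi_i^*$, $\alpha_i$, $\alpha_i^*$, $\phi$, $K$) are assumptions taken from the textbook form and may not match the original equations.

```latex
\begin{align}
  \min_{\mathbf{w},\, b,\, \xi,\, \xi^*} \quad
    & \tfrac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \\
  \text{s.t.} \quad
    & y_i - \mathbf{w}^{\top}\phi(\mathbf{x}_i) - b \le \varepsilon + \xi_i, \quad
      \mathbf{w}^{\top}\phi(\mathbf{x}_i) + b - y_i \le \varepsilon + \xi_i^*, \quad
      \xi_i,\, \xi_i^* \ge 0 \\
  \max_{\alpha,\, \alpha^*} \quad
    & -\tfrac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, K(\mathbf{x}_i, \mathbf{x}_j)
      - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*)
      + \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*) \\
  \text{s.t.} \quad
    & \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i,\, \alpha_i^* \le C \\
  f(\mathbf{x}) &= \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, K(\mathbf{x}_i, \mathbf{x}) + b
\end{align}
```

The first two lines are the primal problem with relaxation (slack) variables, the next two are the dual problem obtained from the Lagrangian, and the last line is the resulting regression function.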
C. Co-Training Algorithm
Traditional co-training algorithms are iterative and require significant differences between the two learners so that they can update each other. If the number of iterations is too large, more unlabeled samples are introduced, noise accumulates, and model performance degrades. To address these problems, this article proposes a semi-supervised co-training algorithm based on the GP model and the SVM model. A small number of labeled samples are used for initial training, and unlabeled samples are then combined with labeled samples so that the two models update each other. A termination condition is set in advance: if the test error of the next iteration is higher than that of the previous iteration and the test error of the previous iteration has reached the set error threshold, the iteration stops. In this way, the number of unlabeled samples introduced is effectively controlled, which prevents the accuracy loss caused by introducing too many unlabeled samples. Moreover, a satisfactory solution can be obtained in a short time.
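The termination condition can be written as a small predicate. This is a minimal sketch: the function names are hypothetical, and "has reached the error threshold" is read here as "is less than or equal to the threshold", which is an assumption.

```python
def should_stop(prev_error, next_error, threshold):
    """True when the error rises and the previous error already met the threshold.

    Assumption: "reached the threshold" means prev_error <= threshold.
    """
    return next_error > prev_error and prev_error <= threshold


def best_iteration(errors, threshold):
    """Return the 1-based iteration whose model is kept, or None if never stopped.

    errors[k] is the test error of iteration k + 1.
    """
    for k in range(1, len(errors)):
        if should_stop(errors[k - 1], errors[k], threshold):
            return k  # errors[k - 1] belongs to iteration k (1-based)
    return None


# Hypothetical error sequence, for illustration only.
print(best_iteration([0.05, 0.02, 0.008, 0.012], threshold=0.01))
```

Note that a rise in error alone does not stop the loop; the previous error must also be at or below the threshold, so early fluctuations above the threshold are tolerated.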
The proposed co-training algorithm consists of two stages: initial training and co-training. In the initial training stage, a small number of labeled samples simulated by HFSS are used to train the initial GP model and the initial SVM model. In the co-training stage, the two initial models use the same unlabeled samples to generate pseudo-labeled samples, which are used to cross-train and update the GP model and the SVM model; the updated models are denoted GPtime and SVMtime. The same test sample set, simulated by HFSS, is used to test GPtime and SVMtime, and their test errors are compared. The pseudo-labeled samples used by the model with the smaller error, together with the test sample whose index in the test sample set corresponds to the iteration number, are then added to the initial training sample set to further train the GP model and the SVM model. The test samples added to the training sample set are removed from the test sample set, and the remaining test samples form the test sample set for the next iteration. This iterative process is repeated until the stop criterion above is met. Fig. 1 is a flowchart of the proposed algorithm, and Table 1 gives its pseudo-code.
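The loop described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the two learners are simple stand-ins (a least-squares line and a 1-nearest-neighbour regressor) rather than a GP and an SVM, inputs are one-dimensional, one unlabeled sample is consumed per iteration, and all names are hypothetical.

```python
def mre(pred, true):
    """Mean relative error over a test set (true values must be nonzero)."""
    return sum(abs(p - t) / abs(t) for p, t in zip(pred, true)) / len(true)


class LinearModel:
    """Least-squares line; stand-in for one learner (NOT a GP)."""
    def fit(self, xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mx) ** 2 for x in xs)
        self.slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
        self.bias = my - self.slope * mx

    def predict(self, xs):
        return [self.slope * x + self.bias for x in xs]


class NearestNeighbour:
    """1-nearest-neighbour regressor; stand-in for the other learner (NOT an SVM)."""
    def fit(self, xs, ys):
        self.data = list(zip(xs, ys))

    def predict(self, xs):
        return [min(self.data, key=lambda d: abs(d[0] - x))[1] for x in xs]


def co_train(model_a, model_b, x_lab, y_lab, x_unl, x_test, y_test,
             threshold, max_iter):
    x_tr, y_tr = list(x_lab), list(y_lab)
    x_te, y_te = list(x_test), list(y_test)
    model_a.fit(x_tr, y_tr)          # initial training on labeled samples
    model_b.fit(x_tr, y_tr)
    errors = []
    for k in range(min(max_iter, len(x_unl), len(x_te) - 1)):
        xu = [x_unl[k]]
        # Both models pseudo-label the same unlabeled input ...
        label_a, label_b = model_a.predict(xu), model_b.predict(xu)
        # ... and each is retrained on the label produced by the other.
        model_a.fit(x_tr + xu, y_tr + label_b)
        model_b.fit(x_tr + xu, y_tr + label_a)
        err_a = mre(model_a.predict(x_te), y_te)
        err_b = mre(model_b.predict(x_te), y_te)
        err = min(err_a, err_b)
        # Stop: error rose and the previous error had met the threshold.
        if errors and err > errors[-1] and errors[-1] <= threshold:
            break
        errors.append(err)
        # Keep the pseudo-label used by the better model, plus one test sample,
        # which is removed from the test set.
        kept = label_b if err_a <= err_b else label_a
        x_tr += xu + [x_te.pop(0)]
        y_tr += kept + [y_te.pop(0)]
        model_a.fit(x_tr, y_tr)
        model_b.fit(x_tr, y_tr)
    return errors, x_tr, y_tr


def target(x):
    """Hypothetical target function standing in for the HFSS simulation."""
    return x * x + 1.0


x_lab = [0.0, 0.5, 1.0, 1.5, 2.0]
x_unl = [0.25, 0.75, 1.25, 1.75]
x_test = [0.1, 0.6, 1.1, 1.6, 1.9]
errors, x_tr, y_tr = co_train(
    LinearModel(), NearestNeighbour(),
    x_lab, [target(x) for x in x_lab], x_unl,
    x_test, [target(x) for x in x_test],
    threshold=0.01, max_iter=100)
print(len(errors), len(x_tr))
```

Each completed iteration adds exactly two samples to the training set (one pseudo-labeled, one labeled test sample), so the number of unlabeled samples introduced grows with the iteration count, which is what the stop rule is meant to bound.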
Case Studies
First, two benchmark functions are used to verify the effectiveness of the proposed co-training algorithm, and then a Yagi microstrip antenna (MSA) and a GPS Beidou dual-mode MSA are optimized with it. The resonant frequency modeling results for the two MSAs serve as the basis for judging the generalization ability of the algorithm. Moreover, the best trained co-training model is used to fit the return loss (S11) curve of one set of antenna dimensions that meets the design requirements. The S11 curve predicted by this model is compared with the one simulated by HFSS to verify the performance of the proposed algorithm.
A. Benchmark Functions
In this study, the Griewank function and the Quartic function are first used to test the performance of the proposed method; their formulas are given in (18)–(19). Both functions are set to 3 dimensions, and each independent variable takes values in [−5, 5]. Based on the complexity of the functions and the initial errors, the error threshold is set to 1e-02, which represents good precision. The maximum number of iterations is set to 100. At each iteration, the Relative Error (RE) is used to evaluate a single test sample and the Mean Relative Error (MRE) is used to evaluate the whole test sample set, as given in (20)–(21).
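A minimal sketch of the two benchmark functions and the two error measures follows. The definitions used are the commonly cited forms and are assumed, not confirmed, to match equations (18)–(21): the Quartic function is taken without its optional random noise term, and RE is taken as the absolute difference divided by the magnitude of the true value.

```python
import math


def griewank(x):
    """Griewank function: 1 + sum(x_i^2)/4000 - prod(cos(x_i / sqrt(i)))."""
    s = sum(xi ** 2 for xi in x) / 4000.0
    p = math.prod(math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1))
    return s - p + 1.0


def quartic(x):
    """Quartic function, noise-free form: sum(i * x_i^4)."""
    return sum(i * xi ** 4 for i, xi in enumerate(x, start=1))


def relative_error(pred, true):
    """Relative error of one test sample (undefined when true == 0)."""
    return abs(pred - true) / abs(true)


def mean_relative_error(preds, trues):
    """Mean relative error over a whole test sample set."""
    return sum(relative_error(p, t) for p, t in zip(preds, trues)) / len(trues)


print(griewank([0.0, 0.0, 0.0]), quartic([1.0, 1.0, 1.0]))
```

Both functions evaluate to zero at the origin, where the relative error is undefined, so test points exactly at the origin would need to be excluded or handled separately.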
Within the interval [−5, 5], 200 training samples are generated randomly to train the initial GP model and SVM model, after which the co-training process is carried out. During each iteration, 50 test samples are randomly generated in the interval [−5, 5] for testing.
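The sampling step can be sketched as below, assuming uniform sampling in each of the 3 dimensions. The helper name and the seeds are hypothetical and are used only to make the example reproducible.

```python
import random


def sample_points(n, dim=3, low=-5.0, high=5.0, seed=None):
    """Draw n points uniformly from [low, high]^dim."""
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for _ in range(dim)] for _ in range(n)]


train_x = sample_points(200, seed=0)  # initial training inputs
test_x = sample_points(50, seed=1)    # test inputs for one iteration
print(len(train_x), len(test_x), len(train_x[0]))
```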
After training, to further verify the effectiveness of the proposed algorithm, the optimal co-training model is used to predict the outputs of another 50 randomly generated test points, and the predicted outputs are compared with the true outputs. Table 2 records the number of iterations and the test error at termination for the two benchmark functions. Fig. 2 and Fig. 3 show the iterative test results and the fitting results for the test points. The error curves show that the Griewank function reaches the error threshold at the 45th iteration and the Quartic function at the 76th iteration, both within the limit of 100 iterations. The fitting curves show that the fit for the Griewank function is slightly better than that for the Quartic function, but both reach a good level. These experimental results provide initial verification of the effectiveness of the proposed co-training algorithm.
B. Yagi Microstrip Antenna
The Yagi MSA has high gain and a wide beam width and has been widely used [35]. The structure of the Yagi MSA is shown in Fig. 4 (a), and its three-dimensional view in HFSS is shown in Fig. 4 (b). The model is fed by a tapered microstrip balun. It has many variables, such as the length of the reflective array, the length and width of the excitation array, and the length and width of the guidance array. The parameters that have a large impact on performance are selected, namely the length of the excitation array
1) Resonant Frequency Modeling
The design specifications of the Yagi MSA are an operating frequency of 2.3 GHz and a −10 dB bandwidth covering 2.2 GHz to 2.4 GHz. The input parameter combination is
Using a partial orthogonal experiment design, 16 groups of samples are simulated as initial training samples, another 10 groups are simulated as the test sample set, and 10 groups are randomly set without HFSS simulation as the unlabeled sample set. The computer used in this experiment has an Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz 2.11GHz and 8 GB of RAM. It takes 104.79 seconds to obtain one labeled sample by HFSS. After simulation, the resonant frequency corresponding to each group of antenna dimensions is obtained, and the modeling experiments can then be conducted.
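One simple way to read a resonant frequency off a simulated sweep is to take the frequency at which S11 is lowest. The text does not state how the resonant frequency is extracted, so this rule is an assumption, and the sweep values below are hypothetical rather than data from the experiment.

```python
def resonant_frequency(freqs_ghz, s11_db):
    """Return (frequency, S11) at the deepest S11 dip of a sweep."""
    idx = min(range(len(s11_db)), key=lambda i: s11_db[i])
    return freqs_ghz[idx], s11_db[idx]


# Hypothetical sweep values, for illustration only.
freqs = [2.0, 2.1, 2.2, 2.3, 2.4, 2.5]
s11 = [-3.0, -6.0, -11.0, -17.0, -10.5, -5.0]
print(resonant_frequency(freqs, s11))  # (2.3, -17.0)
```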
Using the 16 initial training samples, taking
The results show that the iteration ends at the 9th iteration, and the test error at the 8th iteration is the smallest, at 0.0027. At this point, 7 labeled test samples have been added to the training sample set. To ensure a fair comparison, the updated training sample set of 23 samples is also used to train the traditional supervised GP model and SVM model separately. These two models are tested on the same test sample set used in the 8th iteration of the co-training process. Their test errors are 0.0050 and 0.0110, respectively, both greater than 0.0027. In this experiment, 23 labeled samples are used, and the total computing time is 2410.17 seconds. Therefore, with the same labeled samples, and hence the same computing time, the proposed co-training model predicts better than the traditional supervised learning models.
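The reported total is consistent with the labeling cost alone, that is, the number of labeled samples multiplied by the per-sample HFSS time. The short check below reproduces it; the function name is hypothetical, and treating the total as labeling time only is an interpretation of the reported figures.

```python
def labeling_cost(n_initial, n_added, seconds_per_sample):
    """Return (number of labeled samples, total HFSS labeling time in seconds)."""
    n_labeled = n_initial + n_added
    return n_labeled, round(n_labeled * seconds_per_sample, 2)


# 16 initial samples + 7 test samples added, 104.79 s per HFSS call.
print(labeling_cost(16, 7, 104.79))  # (23, 2410.17)
```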
2) Return Loss Fitting
The resonant frequency modeling results serve as the basis for judging the generalization ability of the trained model. To show the fitting ability of the proposed model more intuitively, one set of dimensions that meets the design specifications is used to fit the corresponding S11 curve. According to the optimization results, a set of antenna parameters whose performance satisfies the design index is [45, 18, 14, 14].
As shown in Fig. 5, the X-axis is the frequency sweep range, from 1 GHz to 5 GHz, and the Y-axis is the S11 value at each frequency point. The S11 reaches −17.5 dB at 2.3 GHz, and the −10 dB bandwidth covers 2.2 GHz to 2.4 GHz, which meets the design requirements. To further verify the effectiveness of the proposed co-training algorithm, the trained co-training model is used to predict the S11 of this set of antenna parameters, and the predictions are compared with those simulated by HFSS. The solid blue line labeled ‘HFSS’ is the HFSS simulation result, and the dotted red line labeled ‘Proposed’ is the prediction of the method proposed in this article. As Fig. 5 shows, the two curves are highly consistent, indicating that the proposed model has high accuracy. The proposed model can therefore be used for predictions with other parameter sets, effectively replacing HFSS simulation.
C. GPS Beidou Dual-Mode Microstrip Antenna
The GPS Beidou dual-mode MSA can be used in GPS positioning systems and Beidou satellite navigation systems and is widely used in various terminals [37]. In this experiment, a square MSA is used. Its planar structure is shown in Fig. 6 (a), and its three-dimensional model in HFSS is shown in Fig. 6 (b). The four sides of the patch carry branches of the same width and different lengths, corresponding to the two working modes of the GPS L1 band and the Beidou B1 band. The antenna performance is affected by the radiation patch side length W, the low-frequency modal branch length L1 and the high-frequency modal branch length L2, and these three parameters are used as input variables to build the models.
1) Resonant Frequency Modeling
The design specification of the GPS Beidou dual-mode MSA is that the voltage standing wave ratio (VSWR) of the antenna at 1.58 GHz (Beidou B1 operating frequency) and 1.61 GHz (GPS L1 operating frequency) is less than or equal to 1.5. Several variables that have a significant impact on the performance are selected. The input parameter combination is
Using the HFSS-MATLAB-API script, each set of antenna dimensions is mapped to a set of return loss values by electromagnetic simulation. In this experiment, 15 groups of samples are simulated by HFSS as the initial training samples, 10 groups are simulated as the test samples, and another 10 groups are randomly set without HFSS simulation as unlabeled samples. Each HFSS call takes 72.93 seconds to produce one labeled sample. After the simulation, the resonant frequencies corresponding to each group of antenna parameters can be computed. Since the two operating modes have different resonant frequencies, the two modes are modeled separately by the proposed co-training algorithm for the sake of modeling accuracy.
Using 15 training samples and taking W,
For the GPS mode, the iteration ends at the 9th iteration, and the test error at the 8th iteration is the smallest, at 8.3243e-04. At this point, 7 labeled test samples have been introduced. For a fair comparison, the updated training set of 22 samples is used to train the traditional supervised GP model and SVM model separately. Tested on the same test sample set used in the 8th iteration, their test errors are 0.0013 and 0.0124, respectively, both greater than 8.3243e-04.
For the Beidou mode, the iteration ends at the 8th iteration, and the test error at the 7th iteration is 6.1237e-04. At this point, 6 labeled test samples have been introduced. The resulting 21 labeled samples are also used to train the traditional GP model and SVM model. The test sample set used in the 7th iteration of the co-training algorithm is used to test these two models, and the test errors are 8.1689e-04 and 0.0065, respectively, both greater than 6.1237e-04.
These results for both operating modes show that, with the same labeled samples, the proposed co-training model predicts better than the traditional supervised learning models. To ensure that the modeling accuracy for both modes reaches a high level, the model from the 8th iteration is adopted as the best one in the GPS Beidou dual-mode MSA experiment, at a total cost of 1604.46 seconds.
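As a worked check, the reported total matches the HFSS labeling time for the 22 labeled samples available at the 8th iteration (15 initial samples plus 7 added test samples, at 72.93 seconds each). Reading the total as labeling time only is an interpretation of the reported figures; the symbols are introduced here for illustration.

```latex
\[
  N_{\text{labeled}} = 15 + 7 = 22, \qquad
  T_{\text{total}} = N_{\text{labeled}} \times 72.93\ \text{s} = 22 \times 72.93\ \text{s} = 1604.46\ \text{s}
\]
```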
2) Return Loss Fitting
The resonant frequency modeling results serve as the basis for judging the stability of the trained model. The trained model is used to describe the relationship between W, L1, L2 and S11. The set of antenna dimensions used for S11 fitting is [44, 5.1, 3.6], which meets the design requirements.
As shown in Fig. 7, the S11 values are −19.5 dB at 1.58 GHz and −15.5 dB at 1.61 GHz, which meet the design requirements. The X-axis is the frequency sweep range, from 1.2 GHz to 1.8 GHz, and the Y-axis is the S11 value at each frequency point. The S11 curve is plotted over the whole frequency interval. The solid blue line labeled ‘HFSS’ is the HFSS simulation result, and the dotted red line labeled ‘Proposed’ is the prediction of the method proposed in this article. The two curves are largely consistent, indicating that the trained semi-supervised co-training model in this experiment has high accuracy and can replace HFSS simulation for prediction.
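The design specification is stated as VSWR, while the fitted curve is in S11, so a quick conversion shows why these S11 values satisfy VSWR ≤ 1.5. The sketch uses the standard relations |Γ| = 10^(S11/20) and VSWR = (1 + |Γ|) / (1 − |Γ|); the function name is hypothetical.

```python
def s11_db_to_vswr(s11_db):
    """Convert S11 in dB (negative for a passive antenna) to VSWR."""
    gamma = 10 ** (s11_db / 20.0)  # magnitude of the reflection coefficient
    return (1 + gamma) / (1 - gamma)


for freq_ghz, s11_db in [(1.58, -19.5), (1.61, -15.5)]:
    vswr = s11_db_to_vswr(s11_db)
    print(freq_ghz, round(vswr, 2), vswr <= 1.5)
```

Under these relations the two operating points correspond to VSWR values of roughly 1.24 and 1.40, both within the limit of 1.5, which corresponds to an S11 of about −14 dB.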
Conclusion
To improve the optimization efficiency of electromagnetic components by reducing the number of HFSS calls and the time spent obtaining labeled samples, this article proposes a semi-supervised co-training algorithm based on the GP model and the SVM model. The GP model and the SVM model are cross-trained with the same unlabeled samples. After the test errors are compared, the two models are further trained with the more accurate pseudo-labeled samples and the corresponding test samples. Iteration termination conditions are set to control the number of unlabeled samples and the number of model updates, so that the best solution is found in a relatively short time and the reduction of model accuracy is effectively prevented. Benchmark functions are used to verify the effectiveness of the proposed algorithm, and the experimental results show that it fits the Griewank function and the Quartic function well. The Yagi MSA and the GPS Beidou dual-mode MSA are then optimized, and the effectiveness of the proposed algorithm is verified by resonant frequency modeling experiments. The experimental results show that, with the same labeled samples, the proposed co-training algorithm improves the prediction ability compared with the traditional supervised learning methods. To further verify the effectiveness of the proposed algorithm, the trained resonant frequency model is used to fit the S11 values, and the results show that the proposed model fits the S11 of both antennas accurately. Therefore, the co-training algorithm proposed in this study is suitable for antenna design problems in which model training requires many labeled samples, high computational cost and long computing time. The proposed model can effectively replace electromagnetic simulation, which saves considerable time in antenna optimization.
This will further promote the application of SSL methods in the field of electromagnetic optimization.