Modulation Format Identification Method Based on Multi-Feature Input Hybrid Neural Network

Impact Statement:

Our paper presents a novel Modulation Format Identification (MFI) approach using a Multi-Feature Input Hybrid Neural Network (MFHNN). It integrates constellation diagram features with Histogram of Oriented Gradients (HOG) features, refined through Multi-Scale Convolutional Neural Network (MS-CNN) and Deep Neural Network (DNN) training. This innovative fusion enables enhanced MFI accuracy, leveraging diverse modulation format features at varying neural network levels.

Abstract:

A modulation format identification (MFI) method is proposed for high-speed optical fiber communication systems employing probabilistic shaping (PS) signals in polarization division multiplexing (PDM). The approach utilizes a multi-feature input hybrid neural network (MFHNN) incorporating constellation diagram features and histogram of oriented gradients (HOG) features as dual inputs. These features are trained using a multi-scale convolutional neural network (MS-CNN) and a deep neural network (DNN) to obtain corresponding feature vectors. In the fusion layer, the two feature vectors are merged and classified through fully connected layers, thus constructing an efficient MFI model. The method enhances MFI accuracy by leveraging features of different modulation formats and representations at different neural network levels. To validate the feasibility of the proposed method, signals are collected through the construction of a simulated PDM optical fiber communication system with a fiber ...

Published in: IEEE Photonics Journal ( Volume: 16, Issue: 5, October 2024)

Article Sequence Number: 7201607

Date of Publication: 06 June 2024

DOI: 10.1109/JPHOT.2024.3410392

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I. Introduction

The rapid development of businesses such as Big Data, cloud computing, and artificial intelligence has increased demand for data transmission speed and capacity [1], [2], [3], [4], [5], [6], [7]. Consequently, high-speed optical fiber communication has become a focal area for meeting the growing data requirements and enhancing transmission performance.

In this context, probabilistic shaping (PS) technology has gradually emerged as a prominent solution. PS technology possesses the potential to significantly enhance the capacity and spectral efficiency of optical fiber communication systems [8], [9], [10], [11]. This technology optimizes data transmission by dynamically adjusting the probability distribution of signal constellation points to meet diverse transmission requirements, thereby enhancing system adaptability and performance. In high-speed optical fiber communication systems, the accurate identification of signal modulation formats becomes particularly crucial due to the combination of high-speed signal transmission and the mixing of multiple types of signals [12], [13].

To address this challenge, machine learning (ML) technology has become a crucial tool for modulation format identification (MFI) in optical fiber communication systems [14], [15], [16], [17], [18], [19]. A CNN-based method has been proposed for identifying images collected by a constellation diagram analyzer, improving MFI performance [14]. Deep neural networks that identify modulation formats in the two-dimensional Stokes plane achieve remarkably high identification accuracy even under low OSNR conditions [15]. A method that uses higher-order cumulants (HOC) for signal feature extraction, coupled with the DNN algorithm, exhibits outstanding classification performance [16]. By leveraging transfer learning (TL) and a simplified multi-task deep neural network (MT-DNN), MFI is achieved directly from detected PDM-64QAM signals, attaining high identification rates for high-order QAM formats [17]. A method employing artificial neural networks (ANN) for modulation format detection exhibits high identification rates and robustness against interference [18]. In coherent optical communication, signal constellation diagrams combined with support vector machines (SVM) have achieved precise identification of multiple modulation formats [19].

However, these ML techniques are often designed for traditional modulation formats (such as PSK and QAM) at conventional rates. In high-speed optical fiber communication systems incorporating probabilistic constellation shaping technology, the accuracy of conventional methods for MFI may be compromised. This is because the modulation format of signals shaped by probabilistic shaping might sometimes resemble other uniform shaping (US) signal formats. Therefore, there is a need to develop more efficient and intelligent MFI methods for signals generated using probabilistic constellation shaping techniques.

SECTION II. Operation Principles

The proposed MFI scheme is shown in Fig. 1. First, the signals are power normalized and then divided into a training set and a test set in a ratio of 4:1. Next, the constellation diagram feature set and the HOG feature set are input into the MFHNN for training to construct the MFI model. Then, the performance of the constructed MFI model is tested on the data in the test set. Finally, the identification results are analyzed.
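The first two steps of the scheme can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `power_normalize` and `split_4_to_1`, the random test data, and the use of a shuffled split are all assumptions.

```python
import numpy as np

def power_normalize(symbols):
    """Scale complex symbols so that their mean power equals 1."""
    return symbols / np.sqrt(np.mean(np.abs(symbols) ** 2))

def split_4_to_1(n_samples, rng):
    """Return shuffled index arrays for a 4:1 train/test split (assumed shuffled)."""
    order = rng.permutation(n_samples)
    n_train = (4 * n_samples) // 5
    return order[:n_train], order[n_train:]

rng = np.random.default_rng(0)
# Hypothetical received symbols with arbitrary power, for illustration only.
symbols = 3.0 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
normalized = power_normalize(symbols)
train_idx, test_idx = split_4_to_1(200, rng)
```

With 200 samples, the split yields 160 training indices and 40 test indices.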

Fig. 1. The proposed MFI scheme.

A. Generation of PS-QAM Signal

In conventional optical fiber communication systems, the constellation points of QAM signals are transmitted with equal probability, which prevents the channel capacity from approaching the Shannon limit. PS techniques increase the transmission probability of inner constellation points and reduce that of outer constellation points. Because low-energy symbols then occur more often than high-energy symbols, the average constellation power decreases and system performance improves.

The probability distribution of the constellation points follows the Maxwell-Boltzmann distribution, as expressed in (1) [20]:

\begin{align*} P\left(x_{i}\right)=\frac{1}{\sum _{m=1}^{M} \mathrm{e}^{-v x_{m}^{2}}} \mathrm{e}^{-v x_{i}^{2}} \tag{1} \end{align*}

where the signal $x\in [ x_{1}, x_{2},\ldots, x_{M} ]$, with $M$ being the number of signal constellation points. The normalization term in the denominator ensures that the probabilities of all signal constellation points sum to 1, and the shaping factor $v \in [ 0,1 ]$ represents the degree of probability shaping. When $v=0$, the constellation points are uniformly distributed, maximizing the entropy of the transmitter's information. A larger $v$ indicates a lower entropy value for the transmitter source and a greater degree of probability shaping for the signal constellation.
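The following sketch evaluates the Maxwell-Boltzmann distribution of (1) for a square QAM constellation and searches for the shaping factor that gives a target entropy. It rests on assumptions not stated in the paper: the constellation uses the unnormalized odd-integer grid (for example ±1, ±3 for 16QAM), $x^{2}$ is interpreted as the squared magnitude $|x|^{2}$ of the complex point, and the bisection helper `shaping_factor_for_entropy` is a hypothetical convenience. The numerical value of $v$ depends on the constellation scaling.

```python
import numpy as np

def qam_constellation(order):
    """Square QAM points on the odd-integer grid (assumed scaling)."""
    side = int(round(np.sqrt(order)))
    levels = np.arange(-(side - 1), side, 2)  # e.g. [-3, -1, 1, 3] for 16QAM
    i, q = np.meshgrid(levels, levels)
    return (i + 1j * q).ravel()

def maxwell_boltzmann(points, v):
    """Eq. (1): P(x_i) proportional to exp(-v |x_i|^2), normalized to sum to 1."""
    weights = np.exp(-v * np.abs(points) ** 2)
    return weights / weights.sum()

def entropy_bits(p):
    """Source entropy in bits per symbol."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def shaping_factor_for_entropy(points, target_bits, lo=0.0, hi=1.0, iters=60):
    """Bisection on v, using the fact that entropy decreases as v grows."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if entropy_bits(maxwell_boltzmann(points, mid)) > target_bits:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

points16 = qam_constellation(16)
v3 = shaping_factor_for_entropy(points16, 3.0)  # target: 3 b/symbol PS-16QAM
p3 = maxwell_boltzmann(points16, v3)
```

Setting `v = 0` recovers the uniform distribution with 4 b/symbol for 16QAM, and the inner points of `p3` carry a higher probability than the corner points.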

In this scheme, the probability distribution entropy for PS-16QAM is set to 3 b/symbol and 3.5 b/symbol. The probability distribution entropy for PS-64QAM is set to 4 b/symbol, 4.5 b/symbol, 5 b/symbol, and 5.5 b/symbol.

B. Design of an MFHNN

To distinguish between the US signals and PS signals in 16QAM and 64QAM, an MFHNN structure is devised in this study, as illustrated in Fig. 1. Constellation diagram features and HOG features serve as dual inputs for the MFHNN. These features are trained separately in the MS-CNN and the DNN. In the fusion layer, the two feature vectors are concatenated into a composite vector along the channel dimension and mapped into a one-dimensional vector space by a fully connected layer. The result is then passed through the Softmax function in (2):

\begin{align*} \varepsilon _{k}=\frac{\exp \left(z_{k}\right)}{\sum _{n=1}^{N} \exp \left(z_{n}\right)} \tag{2} \end{align*}

where $z_{k}=\sum _{n} w_{n k} x_{n}+b_{k}$ denotes the output of the $k$-th neuron in the network, summed over the inputs $x_{n}$ of that neuron, and $\varepsilon _{k}$ represents the probability of $z_{k}$ among all output results. This gives the probability distribution across the different modulation formats for the input sample, from which the predicted classification result is obtained.

To train the model, a loss function must be computed and minimized. Cross-entropy is a commonly used loss function [21], as shown in (3):

\begin{align*} C=-\sum _{i} u_{i} \ln y_{i} \tag{3} \end{align*}

where $u_{i}$ denotes the actual label and $y_{i}$ the predicted probability for class $i$. The loss measures the error between the predicted classification results and the actual classification results. During the model training process, the Adaptive Moment Estimation (Adam) optimization algorithm is utilized to update model parameters and minimize the loss function. This approach enables the model to gradually converge towards the optimal solution, improving its adaptation to the training data.
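As a numerical illustration of (2) and (3), the sketch below evaluates the Softmax function and the cross-entropy loss for one sample. It assumes that the label vector `u` is one-hot and that `y` is the Softmax output; the example logits, the max-subtraction for numerical stability, and the small `eps` inside the logarithm are illustrative choices, not details from the paper.

```python
import numpy as np

def softmax(z):
    """Eq. (2), with the maximum subtracted first for numerical stability."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(u, y, eps=1e-12):
    """Eq. (3): C = -sum_i u_i ln(y_i); eps guards against log(0)."""
    return float(-np.sum(u * np.log(y + eps)))

z = np.array([2.0, 1.0, 0.1])        # hypothetical outputs of three neurons
probs = softmax(z)
label = np.array([1.0, 0.0, 0.0])    # one-hot label: the true class is index 0
loss = cross_entropy(label, probs)
```

With a one-hot label, the loss reduces to the negative log-probability assigned to the true class, so it shrinks towards zero as that probability approaches 1.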

1) Training Constellation Diagram Features Using an MS-CNN

At the receiving end of the optical fiber communication system, the in-phase (I) and quadrature (Q) data of the signals are captured. As shown in Fig. 2, the signals are presented in the form of constellation diagrams for ease of visualization and analysis. The OSNR range considered in this paper is from 10 dB to 30 dB in steps of 2.5 dB, and for each signal type, 200 sets of IQ data samples are collected at each OSNR condition.
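One plausible way to turn captured IQ samples into a grayscale constellation image is a two-dimensional histogram, sketched below. The image size, the axis extent, the 8-bit scaling, and the toy 16QAM data with additive Gaussian noise are assumptions for illustration; the paper does not specify how its constellation images are rendered.

```python
import numpy as np

def constellation_image(symbols, size=64, extent=1.5):
    """2-D histogram of I/Q samples scaled to an 8-bit grayscale image."""
    edges = np.linspace(-extent, extent, size + 1)
    counts, _, _ = np.histogram2d(symbols.real, symbols.imag, bins=[edges, edges])
    if counts.max() > 0:
        counts = counts / counts.max()
    return np.round(255 * counts).astype(np.uint8)

# OSNR grid stated in the text: 10 dB to 30 dB in steps of 2.5 dB.
osnr_grid_db = np.arange(10.0, 30.0 + 1e-9, 2.5)

rng = np.random.default_rng(1)
levels = np.array([-3, -1, 1, 3]) / np.sqrt(10)   # unit-power uniform 16QAM
tx = rng.choice(levels, 4096) + 1j * rng.choice(levels, 4096)
# Toy noise level, not tied to any particular OSNR value.
rx = tx + 0.05 * (rng.standard_normal(4096) + 1j * rng.standard_normal(4096))
image = constellation_image(rx)
```

The OSNR grid contains nine operating points, so 200 sample sets per point give 1800 sample sets per signal type.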

Fig. 2. Sample 16QAM and 64QAM constellation diagrams.

In the MFHNN, the features of the constellation diagram are trained by the MS-CNN. To simplify the feature extraction process, the grayscale constellation diagram is first downsampled. Each convolutional layer applies multiple convolutional kernels of different sizes to the input constellation diagram, so that the model extracts features at different scales, captures multi-scale information in the input data, and gains representational capacity. The output of the convolutional layers undergoes a non-linear transformation through the ReLU activation function [22], defined in (4):

\begin{align*} f(x) =\max(0,x) \tag{4} \end{align*}

where $x$ represents the output of the convolution operation. The activation function enables the convolutional neural network to learn more complex features and patterns. A one-dimensional max-pooling layer is introduced in each CNN, with pooling window sizes of 127, 126, and 125, respectively. Max-pooling selects the maximum value from the pixels within each pooling window, which reduces the dimensionality of the data, decreases the computational load, and retains essential features. The feature vectors obtained from the three parallel CNNs undergo average pooling, yielding the final constellation diagram feature vector.
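A minimal sketch of the multi-scale idea follows: three parallel one-dimensional convolution branches, ReLU, max-pooling, and averaging across branches. Several details are assumptions: the input is taken to be a flattened, downsampled vector of length 128, and the kernel sizes are taken to be 2, 3, and 4, chosen only because they make the stated pooling windows of 127, 126, and 125 span each branch's full output. The filter count and random weights are placeholders, and the paper's actual layer configuration may differ.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def relu(x):
    """Eq. (4): f(x) = max(0, x)."""
    return np.maximum(0.0, x)

def conv1d_valid(x, kernels):
    """Valid cross-correlation of x (L,) with kernels (F, K) -> (F, L-K+1)."""
    windows = sliding_window_view(x, kernels.shape[1])  # shape (L-K+1, K)
    return kernels @ windows.T

def ms_cnn_features(x, kernel_sets, pool_sizes):
    """Run parallel conv branches, max-pool each, then average the branches."""
    branch_outputs = []
    for kernels, pool in zip(kernel_sets, pool_sizes):
        act = relu(conv1d_valid(x, kernels))
        n_windows = act.shape[1] // pool
        trimmed = act[:, : n_windows * pool]
        pooled = trimmed.reshape(act.shape[0], n_windows, pool).max(axis=2)
        branch_outputs.append(pooled.reshape(-1))
    return np.mean(branch_outputs, axis=0)

rng = np.random.default_rng(2)
x = rng.random(128)                                    # assumed input length
kernel_sets = [rng.standard_normal((8, k)) for k in (2, 3, 4)]
features = ms_cnn_features(x, kernel_sets, pool_sizes=(127, 126, 125))
```

Under these assumptions each branch produces one value per filter, so the averaged feature vector has eight entries.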

2) Training HOG Features by DNN

HOG features are a widely employed method for object detection [23], characterized by analyzing the gradient direction histograms of various regions in an image to represent the morphological features of objects. This method plays a crucial role in tasks such as object detection and image identification, effectively extracting texture information from images.

To obtain the HOG features, the gradient of the grayscale image is first calculated in the horizontal and vertical directions:

\begin{align*} \left\lbrace \begin{array}{l}U_{i}(i,j) =H(i+1, j)-H(i-1, j) \\ U_{j}(i,j) =H(i, j+1)-H(i, j-1) \end{array} \right. \tag{5} \end{align*}

In (5), at the pixel point $(i,j)$, $U_{i}(i,j)$ represents the gradient in the horizontal direction, $U_{j}(i,j)$ represents the gradient in the vertical direction, and $H(i,j)$ represents the pixel value. The gradient magnitude and direction at the pixel point $(i,j)$ are then calculated using (6):

\begin{align*} \left\lbrace \begin{array}{l}G(i, j)=\sqrt{U_{i}(i, j)^{2}+U_{j}(i, j)^{2}} \\ \theta (i, j)=\arctan \frac{U_{j}(i, j)}{U_{i}(i, j)} \end{array} \right. \tag{6} \end{align*}

where $G(i, j)$ is the magnitude of the gradient, and $\theta (i, j)$ is the gradient direction of the pixel.

Next, the image is divided into equally sized feature cells. Within each feature cell, the gradient direction is partitioned into nine regions, and the gradient magnitudes of the pixels in the cell are accumulated into the region that corresponds to each pixel's gradient direction. The resulting distribution over the nine regions forms a histogram, which constitutes the HOG feature of that feature cell. Finally, multiple feature cells are combined into a feature block. Within each feature block, the HOG features of all feature cells are concatenated and the resulting feature vector is normalized to obtain the HOG feature of that feature block. The HOG features of all feature blocks are concatenated to form the HOG features for the entire image.
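The HOG procedure described above can be sketched in NumPy as follows. The cell size, the block size, the use of overlapping blocks, unsigned orientations over 0 to 180 degrees, and L2 normalization are assumptions, since the paper does not state these settings. The code follows the index convention of (5), with `i` as the first array index.

```python
import numpy as np

def hog_features(image, cell=8, block=2, bins=9, eps=1e-6):
    """Histogram-of-oriented-gradients sketch following Eqs. (5) and (6)."""
    H = image.astype(float)
    # Eq. (5): central differences along each index (borders left at zero).
    Ui = np.zeros_like(H)
    Uj = np.zeros_like(H)
    Ui[1:-1, :] = H[2:, :] - H[:-2, :]
    Uj[:, 1:-1] = H[:, 2:] - H[:, :-2]
    # Eq. (6): gradient magnitude and (unsigned) direction in degrees.
    G = np.sqrt(Ui ** 2 + Uj ** 2)
    theta = np.degrees(np.arctan2(Uj, Ui)) % 180.0
    bin_idx = np.minimum((theta / (180.0 / bins)).astype(int), bins - 1)
    # Per-cell histograms: accumulate magnitudes into nine direction regions.
    n_ci, n_cj = H.shape[0] // cell, H.shape[1] // cell
    hist = np.zeros((n_ci, n_cj, bins))
    for ci in range(n_ci):
        for cj in range(n_cj):
            sl = (slice(ci * cell, (ci + 1) * cell),
                  slice(cj * cell, (cj + 1) * cell))
            hist[ci, cj] = np.bincount(bin_idx[sl].ravel(),
                                       weights=G[sl].ravel(), minlength=bins)
    # Blocks of cells: concatenate the cell histograms and L2-normalize.
    feats = []
    for bi in range(n_ci - block + 1):
        for bj in range(n_cj - block + 1):
            v = hist[bi:bi + block, bj:bj + block].ravel()
            feats.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
    return np.concatenate(feats)

# Toy image with a single vertical edge, for illustration only.
image = np.zeros((32, 32))
image[:, 16:] = 255.0
features = hog_features(image)
```

For a 32 × 32 image with these settings there are 4 × 4 cells and 3 × 3 overlapping blocks, each contributing 36 values.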

In the MFHNN, a DNN comprising an input layer and a hidden layer is used to train the HOG features. The input layer receives the HOG feature inputs, and the input data enters the hidden layer after a linear combination. The function $f(x)$ in the hidden layer represents the activation function, which performs a non-linear transformation on the input data to obtain the HOG feature vector. As in the training of the constellation diagram features with the MS-CNN, the ReLU function is used as the activation function.
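Putting the two branches together, the forward pass of the hybrid network might look like the sketch below: the HOG input passes through one ReLU hidden layer, the result is concatenated with the constellation feature vector, and a fully connected layer followed by Softmax produces probabilities for the eight formats. All dimensions, the random weights, and the names `mfhnn_head` and `params` are hypothetical; only the overall structure is taken from the text.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def mfhnn_head(constellation_vec, hog_input, params):
    """DNN branch on the HOG input, feature concatenation, FC layer, Softmax."""
    hog_vec = relu(params["W_hidden"] @ hog_input + params["b_hidden"])
    fused = np.concatenate([constellation_vec, hog_vec])   # fusion layer
    logits = params["W_fc"] @ fused + params["b_fc"]
    return softmax(logits)

rng = np.random.default_rng(3)
n_const, n_hog_in, n_hidden, n_classes = 8, 324, 16, 8   # assumed sizes
params = {
    "W_hidden": 0.01 * rng.standard_normal((n_hidden, n_hog_in)),
    "b_hidden": np.zeros(n_hidden),
    "W_fc": 0.1 * rng.standard_normal((n_classes, n_const + n_hidden)),
    "b_fc": np.zeros(n_classes),
}
probs = mfhnn_head(rng.random(n_const), rng.random(n_hog_in), params)
predicted_class = int(np.argmax(probs))
```

In training, these weights would be learned jointly by minimizing the cross-entropy loss; the random values here only exercise the data flow.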

SECTION III. Simulation Setup

To validate the effectiveness of the algorithm, this scheme integrates the VPI optical communication system simulation software with MATLAB to construct a 50 GBaud PDM-QAM coherent optical transmission system, as shown in Fig. 3. At the transmitter, a continuous wave (CW) laser with a wavelength of 1550 nm and a linewidth of 100 kHz generates an optical carrier. The light is then split by a polarization beam splitter (PBS) and enters two modulators separately. A pseudo-random binary sequence (PRBS) is used for QAM mapping, producing 16QAM and 64QAM signals. The uniform signals undergo probabilistic shaping using the probability distribution given in (1). The real and imaginary parts of the generated QAM signals enter two IQ modulators. The output optical signals from the two IQ modulators are combined using a polarization beam combiner (PBC), resulting in 3 b/symbol PDM-PS-16QAM, 3.5 b/symbol PDM-PS-16QAM, 4 b/symbol PDM-US-16QAM, 4 b/symbol PDM-PS-64QAM, 4.5 b/symbol PDM-PS-64QAM, 5 b/symbol PDM-PS-64QAM, 5.5 b/symbol PDM-PS-64QAM, and 6 b/symbol PDM-US-64QAM signals.

The input signals are power-controlled by an erbium-doped fiber amplifier (EDFA) to achieve an output power of 0 dBm, and the OSNR is adjustable within the range of 10 to 30 dB. The optical signal is transmitted through an optical fiber link composed of 80 km of standard single-mode fiber and an EDFA with a gain of 16 dB. The single-mode fiber has an attenuation of 0.2 dB/km, a dispersion of 16 ps/(nm$\cdot$km), and a nonlinear refractive index coefficient of $26\times 10^{-21}$ $\mathrm{m}^{2}/\mathrm{W}$.

At the receiver, the optical signal, after passing through an optical filter, enters a coherent receiver along with the local oscillator light for heterodyne mixing, obtaining four signals in dual-polarization states. Finally, digital signal processing (DSP) techniques are employed to compensate for the distortions that occur during fiber transmission. Before MFI, the signal undergoes a series of processing steps that are independent of the modulation format, including analog-to-digital conversion (ADC), chromatic dispersion (CD) compensation, constant modulus algorithm (CMA) equalization, and carrier frequency recovery. The proposed MFI scheme is then used to identify the different modulation formats. After MFI, the signals undergo additional processing steps such as carrier phase recovery and decoding.
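The link parameters quoted above can be checked with a short calculation. The variable names are illustrative, and the reading that the in-line EDFA is set to offset the span loss is an inference from the numbers rather than a statement in the paper.

```python
# Link parameters as quoted in the simulation setup.
span_km = 80.0
attenuation_db_per_km = 0.2
dispersion_ps_per_nm_km = 16.0
edfa_gain_db = 16.0

# Total fiber loss over the span, and what remains after the in-line EDFA.
span_loss_db = span_km * attenuation_db_per_km
net_gain_db = edfa_gain_db - span_loss_db

# Accumulated chromatic dispersion that CD compensation has to remove.
accumulated_cd_ps_per_nm = span_km * dispersion_ps_per_nm_km
```

The span loss comes to 16 dB, which the 16 dB EDFA gain matches, and the accumulated dispersion is 1280 ps/nm.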

Fig. 3. Simulation setup for high-speed fiber optical communication system.

SECTION IV. Analysis of the Results

During the training of the MFHNN model, cross-entropy is employed as the loss function, and the Adam optimizer is chosen for optimization. The training is performed for nineteen iterations. From the input training set, which consists of the constellation diagram feature set and the HOG feature set, 40 feature samples are extracted for each training iteration. To assess the impact of network model parameters on performance and training results, the learning rate is set to 1e-2, 1e-3, and 1e-4 in turn. This comparison aims to find the most suitable learning rate setting, enabling fast and stable convergence.

The loss function curves for different learning rates are illustrated in Fig. 4. As the number of training iterations increases, the loss function gradually decreases. There are noticeable differences between the loss function curves for different learning rates, indicating that the MFHNN designed in this study is sensitive to the adjustment of the learning rate. When the learning rate is set to 1e-2, the loss function of the network exhibits a significant sharp increase within a certain range, suggesting that this learning rate makes the model parameters update too aggressively, leading to oscillations and unstable behavior in the loss function during training. For a learning rate of 1e-3, the loss function curve shows a relatively fast and stable decline, with minor fluctuations during the training process, indicating better convergence of the model at this learning rate. When the learning rate is set to 1e-4, the loss function curve exhibits a slower descent with fluctuations, suggesting that this learning rate results in smaller updates to the model parameters, requiring more iterations for convergence to the optimal solution. Based on the performance of the loss function curves at different learning rates, a learning rate of 1e-3 is chosen, as it strikes a good balance between the speed of loss function reduction and the stability of the training process.
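For reference, the Adam update rule can be sketched on a toy one-dimensional problem with the three learning rates considered. The quadratic loss, the starting point, the step count, and the default hyperparameters (`beta1 = 0.9`, `beta2 = 0.999`, `eps = 1e-8`) are assumptions. The toy only shows how the learning rate scales each update and does not reproduce the instability reported for 1e-2 on the actual network.

```python
import numpy as np

def adam_minimize(grad_fn, w0, lr, steps, beta1=0.9, beta2=0.999, eps=1e-8):
    """Scalar Adam: bias-corrected first and second moment estimates."""
    w = float(w0)
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

def loss(w):
    return w ** 2          # toy loss with its minimum at w = 0

def grad(w):
    return 2 * w

final_loss = {lr: loss(adam_minimize(grad, 1.0, lr, steps=50))
              for lr in (1e-2, 1e-3, 1e-4)}
```

After the same number of steps on this toy problem, the smallest learning rate has moved the parameter the least, in line with the slower descent described for 1e-4.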

Fig. 4. Comparison of loss function changes at different learning rates.

The identification accuracy curves of the proposed MFI method for eight types of uniform/probabilistic distribution QAM signals with varying OSNR are illustrated in Fig. 5. It can be observed that within the OSNR range of 10 to 30 dB, the identification accuracy of these eight signal modulation formats gradually improves with increasing OSNR. When the OSNR $\ge$ 17.5 dB, the identification accuracy for all eight modulation formats stabilizes at 100%. Within the specified OSNR range, for eight modulation formats including 3 b/symbol PDM-PS-16QAM, 3.5 b/symbol PDM-PS-16QAM, 4 b/symbol PDM-US-16QAM, 4 b/symbol PDM-PS-64QAM, 4.5 b/symbol PDM-PS-64QAM, 5 b/symbol PDM-PS-64QAM, 5.5 b/symbol PDM-PS-64QAM, and 6 b/symbol PDM-US-64QAM, the minimum OSNR required to stabilize identification accuracy at 100% is 12.5 dB, 15 dB, 17.5 dB, 15 dB, 17.5 dB, 17.5 dB, 15 dB, and 15 dB, respectively. This indicates that the proposed MFI method achieves robust identification performance for these eight modulation formats.

Fig. 5. Identification accuracy for eight modulation formats of the proposed MFI method.

The confusion matrix of the average identification accuracy for the eight modulation formats is shown in Fig. 6. Across the entire OSNR range, the average identification accuracy for 3 b/symbol PDM-PS-16QAM, 3.5 b/symbol PDM-PS-16QAM, 4 b/symbol PDM-US-16QAM, 4 b/symbol PDM-PS-64QAM, 4.5 b/symbol PDM-PS-64QAM, 5 b/symbol PDM-PS-64QAM, 5.5 b/symbol PDM-PS-64QAM, and 6 b/symbol PDM-US-64QAM is 89.6%, 91%, 98.1%, 93.3%, 86.2%, 94.1%, 94.3%, and 87.5%, respectively. Furthermore, the confusion matrix shows that the proposed MFI method performs well in distinguishing between PS and US signals.
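Per-format accuracy is read from a confusion matrix as the diagonal entry divided by its row sum. The sketch below shows this on a made-up three-class matrix; the counts are invented purely for illustration and are not the paper's data, and the convention that rows index the true class is an assumption.

```python
import numpy as np

def per_class_accuracy(confusion):
    """Diagonal over row sums, assuming rows index the true class."""
    confusion = np.asarray(confusion, dtype=float)
    return np.diag(confusion) / confusion.sum(axis=1)

# Invented toy counts (rows: true class, columns: predicted class).
toy_confusion = np.array([[45, 5, 0],
                          [4, 44, 2],
                          [0, 1, 49]])
class_accuracy = per_class_accuracy(toy_confusion)
overall_accuracy = np.trace(toy_confusion) / toy_confusion.sum()
```

Off-diagonal entries show which formats are mistaken for one another, which is how confusion between PS and US signals would appear.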

Fig. 6. Confusion matrix of average identification accuracy for eight modulation formats.

Two comparative methods are chosen in this paper to demonstrate the superiority and stability of the proposed MFI method. These include an identification method based on constellation diagram features and CNN and an identification method based on HOG features and SVM. The identification accuracy of these three methods is compared under the same optical fiber channel conditions as the OSNR changes. The overall identification accuracy trends of the eight modulation formats based on different methods with varying OSNR are shown in Fig. 7. When the OSNR ranges from 10 to 30 dB, the proposed MFI method shows better identification performance and stability compared to the other two methods. Specifically, for the method using constellation diagram features and CNN, the overall identification accuracy within the entire OSNR range is 84.5%, and the accuracy stabilizes at 100% when the OSNR $\ge$ 30 dB. For the method using HOG features and SVM, the overall identification accuracy within the entire OSNR range is 75.1%, but the identification accuracy is unstable within the entire OSNR range. In comparison, the proposed MFI method achieves an overall identification accuracy of 91.6% within the entire OSNR range, and the accuracy stabilizes at 100% when the OSNR $\ge$ 17.5 dB. Based on these results, under the same optical fiber channel conditions, our proposed MFI method performs significantly better in overall identification accuracy.

Fig. 7. Comparison of overall identification accuracy of different MFI methods.

The identification accuracy of the different methods for each modulation format under the same conditions is shown in Fig. 8. As the OSNR increases, the identification accuracy of the proposed MFI method for each of the eight modulation formats gradually stabilizes. Within the set OSNR range, however, the identification accuracy of both the method based on constellation diagram features and CNN and the method based on HOG features and SVM is unstable for most modulation formats. This is because the features of the US signals and PS signals overlap to a certain degree, so a method that relies on a single feature type cannot classify them accurately in the feature space. Through feature fusion and multilevel learning strategies, the proposed MFI method achieves more stable and higher identification accuracy for the eight modulation formats than the other two methods. In addition, for PS signals, the proposed MFI method reaches 100% identification accuracy at a lower OSNR.

Fig. 8. Comparison of identification accuracy of different methods for different modulation formats under the same conditions. (a) PDM-PS-16QAM (3 bit/symbol), (b) PDM-PS-16QAM (3.5 bit/symbol), (c) PDM-US-16QAM (4 bit/symbol), (d) PDM-PS-64QAM (4 bit/symbol), (e) PDM-PS-64QAM (4.5 bit/symbol), (f) PDM-PS-64QAM (5 bit/symbol), (g) PDM-PS-64QAM (5.5 bit/symbol), (h) PDM-US-64QAM (6 bit/symbol).

To evaluate the computational complexity of the proposed MFI method, the CPU running time required for signal feature processing is measured and compared with that of the other two methods, as detailed in Fig. 9. The running time is measured on a personal computer equipped with an Intel i5-10210U processor (1.60 GHz clock speed), 16 GB RAM, and the Windows 10 Home operating system. Taken together with the results in Figs. 7 and 8, the proposed MFI method achieves significant improvements in both identification accuracy and stability at the cost of only a slight increase in computational complexity. This offers a good trade-off between identification performance and computational cost. In future research, we will continue to optimize the method to reduce complexity further, making it applicable to a wider range of scenarios.
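CPU running time of a processing step can be measured with the standard library, as sketched below. The helper `cpu_time_of`, the stand-in workload `toy_feature_step`, and the best-of-five policy are hypothetical; the paper does not describe its measurement procedure in this detail.

```python
import time
import numpy as np

def cpu_time_of(fn, *args, repeats=5):
    """Best-of-N CPU time in seconds, measured with time.process_time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.process_time()
        fn(*args)
        best = min(best, time.process_time() - start)
    return best

def toy_feature_step(image):
    """Stand-in workload: gradient magnitude summed over an image."""
    gi, gj = np.gradient(image.astype(float))
    return np.hypot(gi, gj).sum()

elapsed = cpu_time_of(toy_feature_step, np.ones((64, 64)))
```

Using `time.process_time` counts CPU time of the current process only, which makes comparisons less sensitive to other activity on the machine than wall-clock timing.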

Fig. 9. Comparison of complexity of different methods.

SECTION V. Conclusion

In this paper, an MFI method based on MFHNN is proposed, which employs constellation diagram features and HOG features as dual input features to the MFHNN. These features are trained using an MS-CNN and a DNN to obtain corresponding feature vectors. In the fusion layer, the two feature vectors are merged and classified through fully connected layers, thus constructing an efficient MFI model. The method enhances MFI accuracy by leveraging features of different modulation formats and representations at different neural network levels. To validate the feasibility of the proposed method, signals are collected through the construction of a simulated PDM optical fiber communication system with a fiber length of 80 km and a symbol rate of 50 GBaud. The gathered data is then utilized with the proposed MFI to identify six PS-QAM signals (PS-16QAM with 3 b/symbol and 3.5 b/symbol, PS-64QAM with 4 b/symbol, 4.5 b/symbol, 5 b/symbol, and 5.5 b/symbol) and two uniform shaping (US) QAM signals (US-16QAM with 4 b/symbol and US-64QAM with 6 b/symbol). Simulation results demonstrate that the MFI model constructed by the proposed method achieves an overall identification accuracy of 91.6% for the eight modulation formats when the OSNR is within the range of 10 to 30 dB. Compared to traditional MFI methods, our approach significantly improves both MFI accuracy and convergence speed.
