Introduction
Community Question Answering (CQA) serves as a platform where users can exchange knowledge and find answers to their questions [1]. Expert finding in CQA identifies users with relevant knowledge and recommends them to the askers of target questions by analyzing behavioral data such as users’ historical questioning, answering, liking, and answer adoption on the website [2]. As shown in Fig 1, keywords are first extracted from the CQA data and the new questions and transformed into vector representations using word embedding techniques. The similarity between the new question vectors and the expert vectors is then calculated to recommend the expert who best matches the new question.
At the heart of expert finding is the precise matching of an expert’s interests to a target question. Each expert tends to have diverse interest domains, which are reflected not only in their history of answered questions but also in the latent relationships among experts. In addition, semantic encoding at different granularities can more effectively capture and refine the semantic correlation between experts and target questions, thus improving the effect of expert finding.
However, most current expert finding systems analyze experts’ historically answered questions to extract a single feature vector per expert and use it as the basis for matching with target questions, which may ignore the diversity of experts’ interests and the importance of deep semantic information. For example, Liu et al. introduced a topic-sensitive probabilistic model that takes into account the link structure and topic similarity among users to identify experts [3]. In 2019, Li et al. used the long short-term memory (LSTM) [4] architecture and a heterogeneous network embedding algorithm to jointly learn question content, questioner, and answerer representations, embed them into the same latent space, and then compute similarity scores via a convolutional scoring function [5]. Ghasemi et al. proposed the UserEmb model [6], which employs node embedding to model user relationships in CQA and uses this information in a joint model to compute the similarity of text and nodes. Peng et al. focus primarily on analyzing questions historically answered by experts to learn representations of their interests, which are then used to match target questions [7].
Although existing research has made progress in improving the accuracy of CQA expert finding, shortcomings remain in modeling experts’ expertise and the latent connections among them. Most existing methods focus only on extracting feature vectors of experts and questions from textual information, ignoring the complex graph-structured relationships among experts and the important contribution of these connections and interactions to recommendation effectiveness. In addition, how to make full use of experts’ historical answer data to model and match at multiple levels and granularities remains an open problem: by fully learning the semantic matching correlation between experts and target questions at different granularities, the association between experts and questions can be captured more comprehensively.
To address the above problems, this study proposes a multi-granularity encoding community question answering expert finding method that fuses GNNs [8]. An expert interest and expertise relationship graph is constructed; graph convolutional networks are then used to capture latent connections among experts, and multi-granularity encoding techniques are used to learn semantic matching information between experts and questions at different levels. Specifically, we adopt graph convolutional networks that contain neither nonlinear activation functions nor linear transformations, i.e., lightweight graph convolutional networks (LightGCN) [9]. This design not only simplifies model complexity and reduces the risk of overfitting, but also improves training efficiency and enhances the model’s generalization ability. Applying LightGCN to the expert-question heterogeneous graph enables a purer focus on the interaction history between experts and questions, effectively mining experts’ interests in answering questions. A multi-granularity encoding technique is employed to capture matching scores between the expert and the target question: the semantic similarity between the expert’s historically answered questions and the target question is calculated at the word, question, and expert levels, and the similarity scores from the different levels are combined in a prediction model to compute a more comprehensive match.
The study’s contributions can be summarized as follows:
Constructing a graph of the relationships between experts’ interests and expertise. By linking experts who have answered the same questions, the LightGCN method enriches the graph structure information to obtain more comprehensive expert vectors.
Utilizing a multi-granularity encoder to enhance the correlation matching signals between experts’ historical answers and target questions. This approach improves the matching correlation between experts and target questions more fully by considering three aspects: word-level, question-level, and expert-level.
Experiments verify the effectiveness and superiority of the LG-ERMG model. Compared to the baseline method, the proposed model achieves better performance in community question answering expert finding.
Related Work
We design a multi-granularity encoding community question answering expert finding method that fuses graph neural networks. This method deeply explores the correlation between target questions and experts for more accurate expert finding. This section reviews current research progress in the field of expert finding, including the different research methods and their challenges.
A. Traditional Expert Finding Methods
Traditional expert finding methods are primarily divided into two categories: feature engineering-based methods and topic modeling-based methods. In the past, feature engineering methods were the mainstream approach to expert finding [10], [11], [12], [13], [14]. For example, Zhou et al. extracted local and global features to capture various aspects of questions, users, and their relationships, and fed these features into an SVM [15] to recommend suitable answerers [14]. In 2018, Roy et al. innovatively utilized quantitative user data to construct exhaustive user profiles and pinpoint potential respondents accordingly, significantly improving the effectiveness of question answering [16]. However, these methods are limited by the tedious process of manual feature extraction, which is not only time-consuming and laborious but also requires profound domain expertise [2]. Topic modeling approaches capture topic features from textual content, represent the expert and the question with these features, and perform expert finding by calculating the similarity between them [17], [18], [19], [20], [21]. The most widely used model in topic modeling is latent Dirichlet allocation (LDA) [22]. For example, Zhu et al. [23] incorporated not only the information of the target category but also that of other relevant categories, which were identified and extracted based on similarity using the LDA topic model. In 2017, Zhu et al. proposed MLQR, a multi-objective learning-to-rank model for question routing that deftly balances the likelihood and quality of answering a question to achieve dual optimization [24]. In 2022, Krishna’s team proposed a novel expert finding strategy that closely models the dynamics of a user’s activities in a topic-specific community, providing users with expert resources more relevant to their needs [25].
B. Deep Learning-Based Methods for Expert Finding
Deep learning has been extensively applied in various fields in recent years, leading to the emergence of deep learning models [30]. For example, autoencoder-based models and neural autoregressive models have been applied in recommender systems [31]. Moreover, deep learning has been widely employed in expert finding because of its capacity to handle multimodal heterogeneous features. On CQA websites, many works have begun using neural networks to enhance expert finding performance [32]. For example, Chen et al. used a convolutional neural network (CNN) to recommend an expert for a given question [33]. Zhou et al. approached the expert finding problem with ranked metric network embedding to address the sparsity of CQA data [34]. TCQR [27] designed a temporal, dynamic, multi-granularity context-aware model that exploits the temporal information in questions. In 2023, Krishna innovatively proposed a graph diffusion-based modeling framework that blends semantic and temporal dimensions of information, aiming to accurately capture the dynamic changes in users’ interests and activity levels for expert finding [35]. In 2024, Tang et al. proposed EPAN-SERez, a knowledge graph-based software expert finding model that effectively addresses the challenge of insufficient expert engagement; on the StackOverflow dataset, the model outperformed the baselines and provided a new solution for expert finding in the software domain [36]. FA-CPQAER effectively copes with the data sparsity challenge in CQA by aligning and transferring features and information in a cross-platform environment [37]. In 2024, Peng [38] proposed an expert finding framework combining time-aware interest modeling with a personalized Transformer encoder to effectively capture experts’ diverse knowledge and interests.
C. Comparison to Current Methods
We present a multi-granularity encoding community question answering expert finding method that fuses graph neural networks. The model leverages deep learning techniques to enhance the matching accuracy between the target question and the experts, integrating an attention mechanism, expert ID features, and word-level, question-level, and expert-level matching features. At the same time, latent connections between experts are modeled with the help of the LightGCN model. The model thus not only learns deep correlations in text content but also exploits the community structure embedded in graph data to further improve the accuracy and efficiency of expert finding. Unlike most existing methods, the model extracts features deeply from textual content through word-level, question-level, and expert-level matching to fully capture the semantic similarity between experts and questions, and, by introducing a graph neural network (the LightGCN model), it learns and mines the latent connections among experts from complex graph data. This combination captures comprehensive semantic relationships from textual data and explores complex network relationships from graph structures, making LG-ERMG capable of a more comprehensive and in-depth understanding of the multi-dimensional relationships between experts and target questions. Table 1 shows that LG-ERMG contains five features, enabling a more comprehensive match between the target question and the experts.
LG-ERMG Model Building
This section describes the multi-granularity encoding community question answering expert finding model that incorporates GNNs. The model building process includes the problem definition, a question feature extractor, expert vector modeling, and multi-granularity encoding techniques. The model structure is illustrated in Fig 2. The question feature extractor derives semantic features from experts’ historically answered questions and from target questions. To enhance the expert vector representation, this study designs an attention mechanism that learns the expert’s vector representation from historical questions and introduces the LightGCN model to capture the graph structure information in the CQA data. Information at different granularities is captured by three matching encoders: word-level, question-level, and expert-level. The model improves the accuracy of matching information and provides a more comprehensive understanding of the correlation between target questions and experts.
A. Definition of the Problem
The problem of expert finding on a CQA website is defined as follows. Given a target question and a set of candidate experts, where each expert is associated with the $n$ questions they have historically answered (each question title containing $l$ words), the goal is to compute a matching score between the target question and every candidate expert, and to recommend the expert most likely to provide the accepted answer.
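For concreteness, the following minimal sketch (our own illustration, not from the paper) shows the shape of the task’s inputs and output; the names Question, Expert, and rank_experts are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Question:
    title_words: List[str]       # tokenized question title (l words)

@dataclass
class Expert:
    expert_id: int
    history: List[Question]      # the n questions the expert has answered

def rank_experts(target: Question, candidates: List[Expert]) -> List[Expert]:
    """Expert finding: order the candidates so that the expert most likely
    to provide the accepted answer to `target` is ranked first."""
    raise NotImplementedError    # scoring is defined in the following sections
```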
B. Question Feature Extractor
The question feature extractor extracts semantic features of questions from the candidate experts’ historically answered questions and from target questions. First, the dataset is partitioned into question sets, and each word in a target question or a historically answered question is mapped to a low-dimensional feature vector through a word embedding layer. The resulting word vectors are then fed into a multi-layer Transformer encoder to capture contextual semantics.
Specifically, each encoder layer contains a multi-head self-attention sub-layer and a position-wise feed-forward sub-layer. \begin{align*} F_{i} & = F_{i} + (F_{i})_{p}, \quad i \in \{1, 2, \ldots , l\} \tag {1}\\ \alpha _{i,j}^{k} & = \frac {\exp (F_{i} Q_{k} (F_{j} K_{k})^{T})}{\sum _{x=1}^{l} \exp (F_{i} Q_{k} (F_{x} K_{k})^{T})} \tag {2}\\ h_{i,k} & = \left ({{ \sum _{j=1}^{l} \alpha _{i,j}^{k} F_{j} }}\right ) V_{k} \tag {3}\end{align*} where $F_{i}$ is the embedding of the $i$-th word, $(F_{i})_{p}$ is its positional encoding, $Q_{k}$, $K_{k}$, and $V_{k}$ are the query, key, and value projection matrices of the $k$-th attention head, $\alpha_{i,j}^{k}$ is the attention weight between words $i$ and $j$, and $h_{i,k}$ is the output of head $k$ for word $i$.
The contextual semantic features $\mathbf {E}$ of the question output by the encoder are aggregated by mean pooling to obtain the question-level feature vector $\mathbf {G}$:\begin{equation*} \mathbf {G} = \text {mean}(\mathbf {E}) \tag {4}\end{equation*}
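A minimal PyTorch sketch of the question feature extractor described by Eqs. (1)–(4); it uses the paper’s stated settings (dimension 100, 2 heads, 2 layers), but learned positional embeddings and all class and variable names are our assumptions.

```python
import torch
import torch.nn as nn

class QuestionFeatureExtractor(nn.Module):
    def __init__(self, vocab_size, d=100, n_heads=2, n_layers=2, max_len=15):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d)
        self.pos_emb = nn.Embedding(max_len, d)          # (F_i)_p in Eq. (1)
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, token_ids):                        # (batch, l) int64
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        F = self.word_emb(token_ids) + self.pos_emb(positions)  # Eq. (1)
        E = self.encoder(F)                              # Eqs. (2)-(3)
        G = E.mean(dim=1)                                # Eq. (4): mean pooling
        return E, G          # word-level features E and question vector G
```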
C. Expert Vector Modeling
In this study, we consider that experts who have answered the same questions are similar and have latent connections with one another; therefore, we model the experts together with their histories of answered questions, as shown in Fig 3. Each expert is taken as a node, and an edge is created between two experts whose histories contain the same answered question, yielding the expert relationship graph. This graph is fed into the LightGCN module, and after 3 graph convolution layers the final feature vector representation of each expert node is obtained.
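The graph construction step can be sketched as follows (our illustration; the input format answers, mapping each question id to the ids of the experts who answered it, is an assumption):

```python
from collections import defaultdict
from itertools import combinations

def build_expert_graph(answers: dict) -> set:
    """answers: question id -> list of expert ids who answered it.
    Returns the undirected edge set linking experts who co-answered."""
    edges = set()
    for _question, experts in answers.items():
        for u, v in combinations(sorted(set(experts)), 2):
            edges.add((u, v))      # one edge per co-answering expert pair
    return edges
```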
The aggregation function of the graph convolution is calculated as:\begin{equation*} e_{u}^{(k+1)} = \sum _{i \in N_{u}} \frac {1}{\sqrt {|N_{u}|} \sqrt {|N_{i}|}} e_{i}^{(k)} \tag {5}\end{equation*} where $N_{u}$ denotes the neighbor set of node $u$ and $e_{i}^{(k)}$ is the embedding of neighbor $i$ at layer $k$.
Unlike most existing graph convolutions, LightGCN aggregates only the connected neighbors without integrating the target node itself (i.e., without self-connections). After $K$ layers of propagation, the embeddings produced at each layer are combined with weights $\alpha _{k}$ to obtain the final expert representation:\begin{equation*} e_{u} = \sum _{k=0}^{K} \alpha _{k} e_{u}^{(k)} \tag {6}\end{equation*}
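A minimal dense-matrix sketch of Eqs. (5)–(6), for illustration only (a real implementation would use a sparse adjacency matrix). Uniform layer weights $\alpha_k = 1/(K+1)$, the common LightGCN choice, are assumed here.

```python
import torch

def lightgcn(adj: torch.Tensor, emb0: torch.Tensor, K: int = 3) -> torch.Tensor:
    """adj: (n, n) 0/1 adjacency; emb0: (n, d) initial node embeddings."""
    deg = adj.sum(dim=1).clamp(min=1)                      # |N_u|, guard isolated nodes
    norm = deg.rsqrt().unsqueeze(1) * adj * deg.rsqrt().unsqueeze(0)
    layers, e = [emb0], emb0
    for _ in range(K):
        e = norm @ e                                       # Eq. (5): neighbor aggregation only
        layers.append(e)
    alpha = 1.0 / (K + 1)
    return alpha * torch.stack(layers).sum(dim=0)          # Eq. (6): layer combination
```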
D. Multi-Granularity Coding Techniques
A schematic diagram of the multi-granularity expert finding process is shown in Fig 4. Information is extracted, embedded, and matched between the target question and the expert’s historically answered questions at different granularity levels, and the similarity between the two is then calculated to recommend a suitable expert to the questioner more accurately.
The multi-granularity encoder captures and analyzes the similarity information between the expert’s historical answer questions and the target questions at different granularity levels to improve the accuracy of expert finding. This encoder consists of three components: a word-level encoder, a question-level encoder and an expert-level encoder.
E. Word-Level Encoder
A word-level encoder efficiently matches the word embeddings of an expert’s historically answered questions against those of the target question to produce word-level matching signals between the expert and the question. Specifically, given an expert and a target question, the target question’s word features $\mathbf {G}^{t}$ are first linearly projected:\begin{equation*} \mathbf {G}_{w}^{t} = \mathbf {W}_{w} \mathbf {G}^{t} + \mathbf {b}_{w} \tag {7}\end{equation*}
The projected target features are then matched against the word features $\mathbf {F}^{i}$ of each historically answered question $i$, and max pooling over the word-word similarities yields a per-question matching score:\begin{equation*} \mathbf {F}_{w}^{i} = \max (\mathbf {F}^{i} \times \mathbf {G}_{w}^{t}), \quad i \in \{1, 2, \ldots , n\} \tag {8}\end{equation*}
Finally, a maximum pooling operation is performed over the $n$ per-question scores to obtain the word-level matching signal:\begin{equation*} \mathbf {F}_{w} = \max (\mathbf {F}_{w}^{1}, \mathbf {F}_{w}^{2}, \ldots , \mathbf {F}_{w}^{n}) \tag {9}\end{equation*}
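A sketch of the word-level encoder (Eqs. (7)–(9)); the shapes and names are our assumptions: E_t is the target question’s word feature matrix (l × d) and hist stacks the word features of the n historical questions (n × l × d).

```python
import torch
import torch.nn as nn

class WordLevelEncoder(nn.Module):
    def __init__(self, d=100):
        super().__init__()
        self.proj = nn.Linear(d, d)                    # W_w, b_w in Eq. (7)

    def forward(self, E_t: torch.Tensor, hist: torch.Tensor) -> torch.Tensor:
        G_t = self.proj(E_t)                           # Eq. (7): (l, d)
        sim = torch.einsum('nld,md->nlm', hist, G_t)   # word-word similarities
        per_question = sim.amax(dim=(1, 2))            # Eq. (8): one score per history question
        return per_question.max()                      # Eq. (9): scalar signal F_w
```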
F. Question-Level Encoder
Question-level encoders pay more attention to the contextual semantic information of question titles, enabling a more precise understanding of the same words in different contexts. For example, the semantic features of “Dream of the Red Chamber” differ between the sentences “I like to read Dream of the Red Chamber” and “The actors in the Dream of the Red Chamber TV series are great.” It is therefore necessary to capture contextual semantic features. Specifically, given an expert’s historically answered questions and the target question, their question-level feature vectors $\mathbf {G}$ (Eq. (4)) are matched to produce the question-level matching signal $F_{q}$.
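The extract above does not reproduce the question-level equations, so the following is only a plausible sketch, under the assumption that the question-level matching mirrors the word-level encoder but operates on the pooled question vectors G from Eq. (4); all names are ours.

```python
import torch
import torch.nn as nn

class QuestionLevelEncoder(nn.Module):
    def __init__(self, d=100):
        super().__init__()
        self.proj = nn.Linear(d, d)                  # assumed projection

    def forward(self, G_t: torch.Tensor, G_hist: torch.Tensor) -> torch.Tensor:
        """G_t: (d,) target question vector; G_hist: (n, d) history vectors."""
        scores = G_hist @ self.proj(G_t)             # similarity to each history question
        return scores.max()                          # question-level signal F_q
```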
G. Expert-Level Encoder
The expert-level encoder computes the overall matching information. Since experts have different interests and influences in different domains, this study employs an attention mechanism to aggregate semantic features of experts’ historical answers to questions to obtain the overall feature vector of experts. Subsequently, the expert vector representation is further enriched and enhanced in conjunction with a graph neural network to more accurately reflect the correlations and differences among experts. Ultimately, the enhanced expert vector representation is matched with the target question to achieve more accurate expert finding or question answering. The specific steps are as follows:
Step 1: Calculate the attention weight $\alpha _{u}^{i}$ of each historically answered question using the expert’s ID embedding $E_{x}$:\begin{equation*} \alpha _{u}^{i} = \frac {\exp (l_{u}^{i})}{\sum _{j=1}^{n} \exp (l_{u}^{j})}, \quad l_{u}^{i} = (E_{x})^{T} \odot G_{u}^{i} \tag {10}\end{equation*}
Step 2: Aggregate the historically answered questions according to the attention weights to obtain the overall expert feature vector $u$:\begin{equation*} u = \sum _{i=1}^{n} \alpha _{u}^{i} G_{u}^{i} \tag {11}\end{equation*}
Step 3: The expert vector $u$ is concatenated with the LightGCN expert embedding $e_{u}$ to obtain the enhanced expert representation $V_{u}$:\begin{equation*} V_{u} = \text {concat}(e_{u}, u) \tag {12}\end{equation*}
Step 4: The target question feature $\mathbf {G}^{t}$ is linearly projected and matched against the enhanced expert vector to obtain the expert-level matching score $F_{e}$:\begin{align*} \mathbf {G}_{e}^{t} & = \mathbf {W}_{e} \mathbf {G}^{t} + \mathbf {b}_{e} \tag {13}\\ F_{e} & = (V_{u})^{T} \odot \mathbf {G}_{e}^{t} \tag {14}\end{align*}
Step 5: The matching scores at the different granularities are fed into the integrator and accumulated to produce the overall matching score $F_{c}$:\begin{equation*} F_{c} = W_{w}' F_{w} + W_{q}' F_{q} + W_{e}' F_{e} \tag {15}\end{equation*}
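A sketch of Steps 1–5 (Eqs. (10)–(15)); the variable names and the use of 1-D tensors for a single expert-question pair are our assumptions.

```python
import torch
import torch.nn as nn

class ExpertLevelEncoder(nn.Module):
    def __init__(self, d=100):
        super().__init__()
        self.proj = nn.Linear(d, 2 * d)            # W_e, b_e in Eq. (13)
        self.w = nn.Parameter(torch.ones(3))       # W'_w, W'_q, W'_e in Eq. (15)

    def forward(self, G_hist, e_id, e_gcn, G_t, F_w, F_q):
        """G_hist: (n, d) history question vectors; e_id: (d,) expert ID
        embedding E_x; e_gcn: (d,) LightGCN embedding e_u; G_t: (d,)."""
        alpha = torch.softmax(G_hist @ e_id, dim=0)     # Eq. (10): attention weights
        u = alpha @ G_hist                              # Eq. (11): expert feature vector
        V_u = torch.cat([e_gcn, u])                     # Eq. (12): enhanced expert vector
        F_e = V_u @ self.proj(G_t)                      # Eqs. (13)-(14): expert-level score
        return self.w[0] * F_w + self.w[1] * F_q + self.w[2] * F_e   # Eq. (15)
```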
H. Model Training
To improve the model’s ability to distinguish positive from negative samples, this study trains with negative sampling. For each question, the provider of the “accepted answer” (i.e., the expert) is treated as the positive sample, while $K$ randomly sampled answerers who did not provide the accepted answer serve as negative samples. The $K+1$ matching scores are normalized with a softmax, and the model is optimized with the cross-entropy loss:\begin{align*} \bar {F}_{c} & = \frac {\exp (F_{c})}{\sum _{i=1}^{K+1} \exp (F_{i})}, \quad c \in \{1, 2, \ldots , K+1\} \tag {16}\\ \text {Loss} & = -\sum _{c=1}^{K+1} \hat {F}_{c} \log \bar {F}_{c} \tag {17}\end{align*} where $\hat {F}_{c}$ is the ground-truth label (1 for the positive expert and 0 otherwise).
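In implementation terms, Eqs. (16)–(17) amount to a softmax cross-entropy over the $K+1$ candidate scores. A minimal sketch (placing the positive expert in column 0 is our convention, not the paper’s):

```python
import torch
import torch.nn.functional as F

def ranking_loss(scores: torch.Tensor) -> torch.Tensor:
    """scores: (batch, K+1) matching scores F_c, positive expert in column 0.
    cross_entropy fuses the softmax of Eq. (16) with the negative
    log-likelihood of Eq. (17)."""
    targets = torch.zeros(scores.size(0), dtype=torch.long, device=scores.device)
    return F.cross_entropy(scores, targets)
```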
Experiments and Analysis
A. Data and Experiment Settings
In this paper, six real-world CQA datasets were selected from the StackExchange website: Artificial Intelligence, Print, History, Biology, English, and Bioinformatics. The data were pre-processed following previous work [4]. Each dataset includes all questions up to June 2019, with corresponding titles and answers, covering the various answerers and the experts who provided the “accepted answers”. Questions with fewer than 5 answers were excluded to avoid cold-start questions. For each question, a candidate set of 20 experts was created, comprising the original answerers (one of whom is the expert) and other experts randomly selected from the pool of answerers. Detailed dataset statistics are presented in Table 2. Each dataset was split into training, validation, and test sets at ratios of 80%, 10%, and 10%, respectively. The number of historically answered questions per expert was set to 30 (i.e., n = 30), and the length of each question title was set to 15 words (i.e., l = 15).
In the experiments, the validation set was used to tune the hyperparameters. Specifically, the feature dimensions for the word, question, and expert levels were set to 100, the number of Transformer heads to 2, the number of Transformer encoder layers to 2, the number of LightGCN layers to 3, and the batch size to 64. Each experiment was run independently five times, and average results are reported. The experiments were conducted on an RTX 3090 GPU server with 16 GB of memory using the PyTorch framework. To prevent overfitting, dropout [40] was used with the dropout rate set to 0.25. The model was trained with the Adam optimizer [41] at a learning rate of 0.001 and a weight decay of 0.0005.
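For reference, the settings above can be collected into a single configuration sketch; the values are those stated in the paper, but the key names are our own.

```python
# Hyperparameters as reported in the experimental setup (key names are ours).
CONFIG = {
    "feature_dim": 100,       # word-, question-, and expert-level dimensions
    "transformer_heads": 2,
    "transformer_layers": 2,
    "lightgcn_layers": 3,
    "batch_size": 64,
    "dropout": 0.25,
    "optimizer": "Adam",
    "learning_rate": 1e-3,
    "weight_decay": 5e-4,
    "n_history": 30,          # n: historical questions per expert
    "title_len": 15,          # l: words per question title
    "candidates": 20,         # candidate experts per question
}
```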
B. Evaluation Metrics
To validate the model’s effectiveness, the mean reciprocal rank (MRR) [42], precision (P@k), and normalized discounted cumulative gain (NDCG@k) [43] were used as evaluation metrics. Specifically, MRR averages the reciprocal of the correct expert’s rank among the candidate answerers for each question; a higher MRR indicates that the correct expert is recommended more efficiently. The formula is:\begin{equation*} \text {MRR} = \frac {1}{N} \sum _{i=1}^{N} \frac {1}{r_{i}} \tag {18}\end{equation*} where $r_{i}$ is the rank of the correct expert for the $i$-th question and $N$ is the total number of test questions.
P@k is the fraction of questions for which the correct expert appears in the top $k$ positions of the ranking:\begin{equation*} \text {P@}k = \frac {|\{ i \in \{1, \ldots , N\} \mid r_{i} \leq k \}|}{N} \tag {19}\end{equation*}
NDCG measures the overall ranking quality of the model’s recommendations. The discounted cumulative gain (DCG@k) is computed as:\begin{equation*} \text {DCG}_{k} = \sum _{i=1}^{k} \frac {r_{i}}{\log _{2}(i+1)} \tag {20}\end{equation*} where $r_{i}$ here denotes the relevance of the candidate at position $i$.
Normalizing by the ideal discounted cumulative gain (IDCG@k) yields NDCG@k:\begin{equation*} \text {NDCG@k} = \frac {\text {DCG}_{k}}{\text {IDCG}_{k}} \tag {21}\end{equation*}
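A sketch of Eqs. (18)–(21) for the common case of one correct expert per question (as in the candidate-set setup above); ranks holds that expert’s 1-based rank for each test question, and the function names are ours.

```python
import math

def mrr(ranks: list) -> float:
    return sum(1.0 / r for r in ranks) / len(ranks)               # Eq. (18)

def precision_at_k(ranks: list, k: int) -> float:
    return sum(r <= k for r in ranks) / len(ranks)                # Eq. (19)

def ndcg_at_k(ranks: list, k: int) -> float:
    # With a single relevant expert, IDCG = 1 / log2(2) = 1, so NDCG@k
    # reduces to the discounted gain of the correct expert's position.
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)
```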
C. Baseline Models
In this study, we compared our model with the following models:
BM25 [13]: An information retrieval algorithm that assesses the relevance of a document to a given query by evaluating the degree of word match between them.
Doc2Vec [44]: This method recommends experts by analyzing the similarity between their historical answers and the target question, considering the content and structure of the text.
CNTN [26]: A convolutional neural network (CNN) is used to learn the deep semantic features of a question and calculate the similarity score between the question and the expert.
NeRank [5]: This method uses heterogeneous network embedding techniques to learn the features of questions, questioners and experts and combines them with convolutional neural networks (CNNs) to route questions for personalized queries.
TCQR [27]: TCQR provides comprehensive expert assessment by capturing the expertise and competence of experts over time through a temporal context-aware model.
RMRN [28]: This model mines deep correlations between questions and candidate experts through a recurrent memory reasoning network to recommend appropriate experts.
UserEmb [6]: This method takes into account the experts’ relationships in the community and the similarities between their answered questions and answers, aiming to optimize the accuracy of expert findings.
MPQR [29]: This model is a comprehensive scoring method based on user interest and expertise assessment and community voting information, aiming to find the right answerer for a question more efficiently and improve the quality and efficiency of question answering.
EFPT [38]: This method dynamically captures changes in experts’ interests and expertise by encoding the timestamps and voting scores of their historical answers, combined with a personalized Transformer and an additive attention interaction encoder.
D. Experimental Results and Analysis
1) Comparison Analysis of Experimental Results
The results on the Stack Exchange datasets, using MRR, P@1, and NDCG@10 as evaluation metrics, are summarized in Table 3. From the experimental data, this study draws the following conclusions:
Neural network models such as CNTN, NeRank, TCQR, and RMRN usually outperform the traditional BM25 method on the expert finding task. This is primarily because neural network models can effectively capture complex semantic information from sentences, enabling more accurate semantic-level interactions between questions and experts’ answering histories. Moreover, through multiple layers of nonlinear transformations, neural networks can learn complex patterns in the data, which is crucial for learning vector representations of experts and questions.
The Doc2Vec and UserEmb models lag behind in performance, mainly because they fail to fully utilize the semantic correlations between texts: both models focus on independent representations of a single text or user, ignoring the semantic connections between different texts and thus underutilizing the available information.
Models based on graph structures (e.g., NeRank, UserEmb, MPQR, and LG-ERMG) generally perform well, thanks to the ability of graph neural networks to capture and model potential connections among experts, and these relationships provide an important complement to the richness of expert feature vectors. Unlike models that focus only on localized textual content, graph models are able to examine interactions between experts from a global perspective, resulting in a more comprehensive portrayal of expert features.
The EFPT model shows unique innovation and effectiveness in expert finding, and its performance is second only to the model proposed in this paper. Its advantages mainly lie in the fact that EFPT considers not only experts’ histories of answered questions but also integrates timestamps and voting scores, providing a more comprehensive perspective on expert characteristics. By incorporating temporal information, it can capture the evolution of experts’ interests over time, which is valuable for predicting experts’ performance on future questions.
Models such as RMRN and LG-ERMG achieve further performance improvement by introducing question-level encoder matching information. The reason behind this is that question-level coding can capture the similarities and differences between the questions answered by experts historically and the target questions more finely, thus realizing more accurate expert finding. Moreover, different experts show different expertise and interests in different questions, and the question-level coding mechanism enables the model to dynamically adjust the recommendation strategy to adapt to the needs in different scenarios.
2) Comparison of Ablation Experiments
This subsection examines the impact of the attention mechanism, word-level encoder, question-level encoder, expert-level encoder, and graph neural network components on the model’s effectiveness. As shown in Table 4, the experimental results reveal that each component affects model performance differently.
After removing the attention mechanism from the model, the evaluation metrics decreased to varying degrees on all six datasets. Specifically, on the AI dataset, MRR decreased by 1.9%, P@1 by 2.7%, and NDCG@10 by 3.1%; a similar downward trend was observed on the other datasets. This result demonstrates the importance of the attention mechanism in CQA expert finding: it helps the model focus on key information when processing large amounts of data, thereby improving the accuracy of expert finding.
First, after removing the word-level encoder, model performance decreased on all metrics, but by a relatively small margin, indicating that while the word-level encoder contributes to performance, its importance may be less significant than that of the other components. Second, performance decreases more significantly after removing the question-level encoder, especially on the P@1 and NDCG@10 metrics, suggesting that the question-level encoder plays an important role in capturing the correlation between questions and experts. Finally, removing the expert-level encoder also decreases performance, with a drop between those of the word-level and question-level encoders, suggesting that the expert-level encoder is crucial for fully understanding and learning expert features. The multi-granularity encoders therefore work together to enhance model performance; without multi-granularity encoding, the model cannot fully exploit the multi-level matching information, reducing the accuracy of expert finding.
After removing the graph neural network, model performance decreases significantly on all six datasets. This suggests that the graph neural network plays an important role in enhancing the expert vector representation and capturing latent connections between experts: with it, the model can better model these connections and thus learn expert vectors more efficiently and accurately.
3) Hyperparameter Analysis
This subsection examines the impact of hyperparameters, including the feature dimensions, the number of LightGCN layers, and the number of Transformer layers. In the experiments, the validation set was used to tune the hyperparameters with a controlled-variable approach: when one hyperparameter was varied, all others were fixed at the values that achieved the best result.
First, the effect of the feature dimensions of the expert and target question vectors was explored. As illustrated in Fig 5, model performance first increases and then decreases as the feature dimension grows, reaching its optimum at a dimension of 100. When the feature dimension is too small, the model cannot comprehensively capture the key information required for matching target questions with experts; when it is too large, overfitting may occur. Based on this observation, the feature dimension was set to 100.
Second, the impact of the number of LightGCN layers on model performance was examined. Fig 6 illustrates performance for 0 to 4 layers. The results indicate that increasing the number of layers generally enhances performance, with a particularly significant improvement when going from 0 to 1 layer. Optimal performance is typically achieved at 3 layers; consequently, the number of LightGCN layers was set to 3.
Third, the impact of the number of Transformer layers was analyzed. According to Fig 7, model performance improves as the number of Transformer layers increases, reaching its optimum at 2 layers; further increasing the number of layers degrades performance. This phenomenon may stem from two factors. On the one hand, with too few Transformer layers, the encoder may be unable to extract rich information from the question content, limiting the model’s ability to understand and process complex questions. On the other hand, as the number of layers increases, the model may become too complex and overfit the training data, reducing its generalization ability. Based on this analysis, the number of Transformer layers was set to 2.
4) Training Time Efficiency Analysis
In this subsection, training time efficiency is analyzed experimentally by comparing three baseline neural network models (TCQR, RMRN, and NeRank) in the same environment; the results are shown in Fig 8. The proposed model outperforms all three baselines in training time. TCQR uses the BERT model, which, despite the strong semantic understanding gained from large-scale pretraining, carries many parameters and a high computational cost. RMRN uses the ELMo model, which builds a language model from scratch during training and tends to require more time and computational resources. The NeRank model suffers from an excessive number of parameters, leading to inefficient training; too many parameters not only increase computational cost but also introduce a risk of overfitting. In summary, the proposed model outperforms the other three baselines in training time, mainly due to its judicious model design choices.
5) Weight Analysis
In this section, the allocation of weights across the different granularities is analyzed. As shown in Table 5, the weights of the relevance signals at each level adapt flexibly to different datasets and question characteristics. For example, in the AI dataset, the question-level and word-level relevance signals play an important role in recommending Expert 1, possibly because the expert has researched deeply and published widely in the field of deep learning, making his/her historical answers highly technically relevant to the target question.
In the Print dataset, the expert-level relevance signal dominates, possibly because Expert 2 has unique insights and extensive practical experience in 3D printing technology, especially its application in the medical field.
The case of the Bioinformatics dataset, on the other hand, demonstrates the importance of word-level relevance signals, especially when it comes to specialized terms and algorithms. The exact matching of these terms is crucial for question answering, so Expert 3 is recommended for answering this question due to his deep professional background.
The case of the History dataset shows that in some cases, the question-level and expert-level relevance signals are relatively balanced and together determine the final recommendation. Expert 4 became the recommended expert for the question due to his extensive historical knowledge coverage and deep understanding of the question.
The case of the Biology dataset also demonstrates the balancing effect of question-level and expert-level relevance signals. In this case, Expert 5 is recommended as the best answerer due to his extensive knowledge of biology and research experience related to the question.
The English dataset emphasizes the combination of word-level and question-level signals: interpreting literary works requires an in-depth understanding of the specifics of the text and the author’s writing style, so Expert 6 was recommended to answer the question due to his in-depth research and unique insights into Shakespeare’s works.
Discussion
When reconsidering the limitations of the CQA model and the direction of future improvement, the time factor becomes a dimension that cannot be ignored. The current model performs well when dealing with static datasets, but its limitations gradually emerge in the face of new questions and changes in expert activity brought about by the passage of time. Data sparsity and cold-start problems are particularly prominent on the timeline, as new questions and new experts require time to accumulate sufficient data support. In addition, experts’ expertise and interests may change over time, and current models may not adequately account for this temporal dynamic when selecting experts. Text representations also need to accommodate time-sensitive content, such as the emergence of new vocabulary and emerging trends.
Therefore, future directions for improvement should include: first, constructing time-series analysis models to capture the trends of questions and experts over time; second, developing algorithms that can dynamically update experts’ knowledge and activeness; and third, exploring time-sensitive text representation techniques to better reflect the current context and background of questions. At the same time, data over longer time spans are collected to support in-depth research on the impact of time factors.
Conclusion
This study proposes a model that integrates multi-granularity encoding with GNNs to enhance expert finding performance on CQA websites. The model employs a multi-granularity encoder to extract word-level, question-level, and expert-level matching information between experts and target questions; this hierarchical approach enables the model to capture the complex relationships between them more comprehensively. In addition, the introduction of graph neural networks effectively exploits the correlations between experts and questions: expert vectors are modeled with the graph neural network, enriching the representation of experts’ expertise. The effectiveness of fusing GNNs and multi-granularity encoding for expert finding is demonstrated through experimental validation on six publicly available datasets.