Introduction
Artificial Intelligence (AI) is transforming the drug discovery process by speeding up the development of new therapeutics and reducing high costs, extended timelines, and high failure rates inherent in traditional pharmaceutical innovation [1], [2]. The successful integration of AI-driven approaches in pharmaceutical research relies on several critical factors. High-quality data availability and preprocessing remain fundamental to AI effectiveness, as inconsistencies in biomedical datasets can significantly impact model performance and reproducibility [3].
Machine learning (ML) applications enhance all stages of drug discovery by enabling AI to support data-driven decision-making and boost clinical trial success rates. However, challenges remain, including the interpretability of ML models and the necessity for systematic, high-dimensional biological datasets, which can hinder AI’s reliability in real-world applications [4].
Deep learning (DL)-based approaches have demonstrated remarkable advancements in predicting drug-target interactions (DTIs), virtual screening, and de novo drug design [5]. By leveraging neural networks and molecular representations, AI can accelerate lead optimization while enhancing the precision of drug selection. However, the integration of AI in drug discovery is still hindered by data biases, generalization limitations, and regulatory constraints [6]. Addressing these challenges requires interdisciplinary collaboration and improved data-sharing frameworks to maximize AI’s impact on pharmaceutical innovation. Additionally, regulatory concerns surrounding AI-driven drug development are becoming increasingly significant as evolving policies struggle to keep pace with technological advancements [7]. Given that traditional drug development timelines exceed 12 years and cost more than
AI models have demonstrated their potential to optimize key stages of drug development, such as target identification, hit generation, and chemical optimization [9], [10]. ML and DL methodologies have shown remarkable promise in preclinical research by leveraging computational modeling to predict outcomes and discover novel compounds [11], [12]. Despite these advancements, challenges such as data scarcity, heterogeneity, and class imbalance impede the practical deployment of AI models in real-world pharmaceutical applications [13], [14]. Moreover, many advanced AI systems function as opaque “black boxes,” making it difficult for healthcare professionals and regulatory agencies to fully trust their predictions without enhanced model interpretability [14], [15].
The ethical application of AI in drug discovery introduces further complexities, such as ensuring algorithmic fairness, protecting data privacy, and maintaining regulatory compliance to guarantee accountability and reliability [16], [17]. Aligning AI-generated predictions with experimental validation is a significant challenge, as discrepancies between computational outcomes and clinical trial results can undermine the reliability of AI-driven drug candidates [18]. Industrial-scale AI-powered platforms have sought to integrate multi-omics, ML, and computational precision medicine to refine target selection and reduce the risk of clinical failure, yet challenges remain in standardizing data integration and model interpretability [18]. ML-based drug repositioning methods also leverage large-scale datasets and predictive modeling to identify novel therapeutic applications for existing drugs. Still, the lack of interpretability in model decisions continues to be a limiting factor [19]. The quality, completeness, and accessibility of bioactivity data further influence the reliability of AI-based drug discovery, as inconsistency in experimental assays and deposition protocols hinders reproducibility and cross-validation [20]. While techniques such as transfer learning and explainable AI (XAI) have been introduced to enhance model adaptability to novel drug targets [21], [22], the absence of standardized benchmarks and incomplete bioassay datasets continue to restrict the scalability and generalization of AI models across diverse disease areas [5], [19].
To address these multifaceted challenges, this study presents a strategic framework that uses the Analytic Hierarchy Process (AHP) to identify and prioritize essential success criteria in AI-driven drug discovery. The proposed framework is driven by expert insights and offers a systematic way to allocate resources and improve the effectiveness of AI-driven drug discovery. Unlike previous studies that focus on isolated computational advancements, this research integrates expert-driven prioritization to evaluate the most critical factors influencing AI-driven drug discovery. By applying a structured decision-making approach, the study aligns AI methodologies with real-world pharmaceutical constraints, strengthening their scientific validity and industry relevance. This contribution supports resource allocation and the development of AI models that meet technical and regulatory requirements, improving the practical applicability of AI-driven drug discovery frameworks. It assesses the key challenges and limitations highlighted in previous studies by prioritizing critical criteria identified from the literature. This comprehensive approach aims to alleviate bottlenecks in the drug development pipeline, enhance predictive performance, improve model interpretability, and support regulatory compliance. Ultimately, the framework seeks to accelerate drug discovery, improve approval rates for new therapeutic candidates, and streamline the path toward innovative treatments.
This paper contributes to the growing body of research on strategic planning for AI-driven drug discovery by:
Bridging the Gap Between AI and Pharmaceutical Sciences: This research establishes a scientific framework that aligns AI-driven strategies with the practical needs of drug discovery. By integrating expert opinions from machine learning, deep learning, and AI applications, the study enhances the applicability and robustness of AI in pharmaceutical research.
Providing a Structured Decision-Making Framework: The study offers the scientific community a structured decision-making framework for AI-driven drug discovery. Using an AHP-based prioritization process, it systematically evaluates and ranks critical technical, regulatory, and operational factors. This approach helps researchers and industry professionals navigate the complexities of integrating AI into drug development.
Advancing Multi-Criteria Analysis in AI Implementation: This study enriches the existing literature by introducing a comprehensive multi-criteria analysis that encompasses six critical areas:
Data Quality and Management (DQM).
Algorithm Performance and Optimization (APO).
Interpretability and Explainability (IE).
Regulatory Compliance and Ethical Considerations (RCEC).
Computational Efficiency and Scalability (CES).
Validation and Experimental Confirmation (VEC).
By identifying 24 interrelated success factors within these categories, the research offers a more holistic understanding of the challenges in AI-driven drug discovery and can serve as a reference for future studies in computational drug development.
Enhancing Transparency and Reliability in AI-Driven Drug Discovery: A significant challenge in AI adoption for drug discovery is ensuring transparency, interpretability, and regulatory compliance. This study addresses this issue directly by structuring expert-driven prioritization to enhance the reliability of AI applications in pharmaceutical research. The proposed framework bases AI adoption on expert-prioritized criteria rather than on ad hoc or isolated technical advancements.
Facilitating Efficient Resource Allocation and Strategic Planning: This study offers pharmaceutical companies and regulatory bodies a strategic roadmap to optimize resource allocation. By employing AHP for pairwise comparisons, decision-makers can prioritize factors that most significantly impact the success of AI adoption in drug discovery. This structured approach promotes efficiency and cost-effectiveness in pharmaceutical research and development.
Laying a Foundation for Future Research and Policy Development: This research provides a foundation for future advancements in AI-driven drug discovery, allowing policymakers, researchers, and industry leaders to build on a scientific framework. It serves as a benchmark for evaluating AI strategies in pharmaceutical sciences and paves the way for further refinement through empirical validation and real-world implementation.
The AHP-based framework fills a crucial research gap by creating a comprehensive, expert-driven system that systematically evaluates and prioritizes key factors for AI-driven drug discovery. It transcends technical discussions to offer an integrative approach encompassing scientific, regulatory, and operational considerations. This methodology ensures a balanced and validated AI adoption strategy, significantly advancing theoretical and practical aspects of modern pharmaceutical development.
Many experts with diverse research interests were invited to ensure the AHP framework reflects a well-rounded and practical understanding of the challenges in AI-driven drug discovery. This diversity encompasses specialists in ML, DL, and other computational methods, all of whom have significantly contributed to drug discovery through their research and publications. The selection process prioritized academic achievements, professional experience, and active involvement in developing computational models, ensuring a broad spectrum of perspectives and practical expertise was captured.
From the invited pool, 21 experts completed the AHP questionnaire, representing the variety of research interests necessary to address the multifaceted challenges in this domain. The chosen number of experts aligns with the AHP methodology, which relies on qualitative, pairwise comparisons rather than large statistical samples to derive meaningful results [23]. This approach ensures critical factors are prioritized accurately based on expert knowledge and experience. The participants’ diverse backgrounds enhance the proposed framework’s robustness and applicability, fostering innovative solutions to optimize AI-driven drug discovery.
Table 1 summarizes the 21 experts, emphasizing their contributions and research focus in computational drug discovery. The table shows the diverse expertise involved in the study, which spans a wide range of topics. Integrating these research areas strengthens the robustness of the AHP framework used in the research.
The structure of this study is outlined as follows: Section II provides an in-depth literature review on key research areas and methodologies related to AI-driven drug discovery; Section III outlines the study’s methodology, detailing the approach and analysis steps taken; Section IV presents and discusses the results obtained; Section V highlights the implications, limitations, and future research directions stemming from the findings; Section VI concludes the paper with a summary of contributions and insights.
Literature Review
This section defines and analyzes the current state of AI-driven methods in drug discovery, addressing challenges, limitations of previous studies, diverse applications of AHP, and its potential to improve drug discovery efforts.
A. Drug Discovery
The discovery and development of a new drug is a complex and expensive process, costing up to
Drug discovery typically begins with medicinal chemists generating a library of lead compounds, which are assessed through structure-activity relationships (SARs) to evaluate their in vitro efficacy and preclinical in vivo safety. Subsequently, promising candidates are subjected to formulation, stability assessments, scale-up production, and chronic safety studies in animal models before advancing to clinical trials [40], [41]. Despite these rigorous efforts, many drug candidates fail to reach the market due to formulation challenges, even when demonstrating potent biological activity. Figure 1 illustrates the drug development processes.
Recent AI advancements promise to enhance drug discovery by using ML and DL techniques to expedite hit identification and lead optimization, which are critical early phases of drug development. These technologies enable rapid screening of millions of compounds, providing predictive insights into binding affinities and pharmacological properties, thus improving the precision and efficiency of the selection process [42]. Furthermore, AI-based models play a pivotal role in refining the lead optimization phase by predicting drug-likeness and potential toxicities, thereby improving the probability of selecting viable candidates. Despite the significant potential of AI-driven approaches, challenges remain, including the need for high-quality datasets, regulatory barriers, and the requirement for experimental validation to confirm computational predictions [16], [19]. Overcoming these limitations is crucial for effectively integrating AI into the pharmaceutical industry’s workflows and expediting the delivery of safe and effective therapeutics to market.
B. Computational Drug Discovery Approaches
Computational methods such as ML and DL have emerged as indispensable tools in modern drug discovery. These advanced techniques enable the comprehensive analysis of vast amounts of biological data, molecular interactions, and chemical structures with remarkable efficiency and accuracy [2]. Molecular docking, virtual screening, and quantitative structure-activity relationship (QSAR) modeling are widely used to predict drug efficacy and safety profiles [43]. Public databases like ZINC, BindingDB, and PubChem provide valuable datasets that enable computational techniques to predict interactions between drugs and biological targets. These databases give access to vast chemical and biological data repositories essential for in-silico drug discovery [44], [45]. However, heterogeneous data sources pose a significant challenge due to the varying formats and quality of available datasets, making it difficult to integrate and analyze them seamlessly [13], [46]. Moreover, data warehousing tools such as SWISS-PROT, BIOMOLQUEST, and others have been developed to address data integration issues, yet achieving data standardization and curation remains critical [47].
In this context, Figure 2 provides an overview of computational approaches categorized into three main groups: ML, DL, and other methods. It includes examples of popular techniques such as random forest classification, k-nearest neighbors (k-NN), recurrent neural networks (RNNs), network-based methods, and Bayesian optimization strategies.
The figure illustrates how public databases like ZINC and PubChem [44], [45] supply these models with extensive chemical and biological data, enhancing training and prediction reliability. Researchers can address heterogeneity by integrating data from multiple sources and utilizing data warehousing tools, ultimately accelerating drug discovery through AI-driven solutions. The following subsections explore these approaches.
1) Machine Learning (ML)
ML is now crucial in computational drug discovery. By analyzing complex biological datasets, ML techniques provide powerful predictive abilities. These techniques significantly enhance drug-target interaction (DTI) prediction, virtual screening, and lead compound identification [14]. Popular algorithms such as Random Forest, Support Vector Machines (SVM), and k-nearest neighbors (k-NN) identify patterns within large-scale, high-dimensional biological data, streamlining the drug discovery process [13]. Despite their effectiveness, these models often struggle when trained on small or imbalanced datasets, where overfitting limits their generalizability to new data [6], [48].
Several strategies have been implemented to improve model robustness and address these limitations. Data augmentation and resampling techniques balance class distributions, while cost-sensitive learning introduces penalties for misclassifications, enhancing the performance of ML models in drug discovery applications [48], [49]. By incorporating these advancements, ML-based frameworks can provide more accurate and reliable predictions, contributing to more efficient and targeted drug development pipelines.
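The two ideas named above, resampling and cost-sensitive learning, can be sketched in a few lines. The snippet below is a minimal illustration and not the procedure of any cited study: it computes inverse-frequency class weights (a common heuristic for setting misclassification penalties) and performs naive random oversampling of the minority class. The function names, the toy labels, and the compound identifiers are all hypothetical.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, cnt in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - cnt):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


# Hypothetical toy screening set: 8 inactive compounds (0), 2 active (1).
labels = [0] * 8 + [1] * 2
samples = [f"compound_{i}" for i in range(len(labels))]

weights = balanced_class_weights(labels)
resampled_x, resampled_y = random_oversample(samples, labels)
```

In this toy set the rare active class receives a weight four times larger than the inactive class, so an error on an active compound would be penalized more heavily during training. Oversampling instead equalizes the class counts before training.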
2) Deep Learning (DL)
DL represents a significant breakthrough in AI, fundamentally transforming drug discovery through its advanced modeling capabilities that effectively learn intricate and non-linear patterns from extensive datasets [50]. These techniques empower researchers to process and interpret complex biological data, driving more effective and innovative drug development approaches. A prominent innovation in DL is the application of graph-based neural networks, which enhance prediction accuracy by representing molecular structures as graphs [5].
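To make the graph representation concrete, the following sketch encodes a small molecule as nodes (atoms) and edges (bonds) and applies one neighbor-aggregation step, the basic operation that graph neural networks build on. It is an illustrative assumption, not the architecture of any cited model: the ethanol example, the two-element vocabulary, and the unweighted sum update are chosen only for brevity.

```python
# Ethanol heavy-atom graph (hydrogens omitted): C - C - O
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]
ELEMENTS = ["C", "O"]  # hypothetical minimal element vocabulary


def one_hot(symbol):
    """Encode an element symbol as a one-hot feature vector."""
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]


def adjacency(n_atoms, bond_list):
    """Build an undirected adjacency list from bond pairs."""
    adj = {i: [] for i in range(n_atoms)}
    for a, b in bond_list:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def message_passing_step(features, adj):
    """Update each atom's features as its own vector plus the sum
    of its bonded neighbors' vectors (no learned weights)."""
    updated = []
    for i, h in enumerate(features):
        agg = list(h)
        for j in adj[i]:
            agg = [x + y for x, y in zip(agg, features[j])]
        updated.append(agg)
    return updated


features = [one_hot(a) for a in atoms]
adj = adjacency(len(atoms), bonds)
updated = message_passing_step(features, adj)
```

After one step, each atom's vector counts the carbon and oxygen atoms in its immediate neighborhood, itself included. A trained model would replace the plain sum with learned transformations and stack several such steps.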
Furthermore, DL architectures such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have demonstrated exceptional performance in predicting drug-target interactions, accelerating drug repurposing efforts, and optimizing molecular properties [51], [52].
However, with these advancements, DL models often face the challenge of limited interpretability, commonly called the black-box problem, where the decision-making processes remain opaque, hindering acceptance in clinical and regulatory environments [53]. To improve transparency, Explainable AI (XAI) approaches such as Local Interpretable Model-Agnostic Explanations (LIME) have been introduced. These offer insights into model predictions and foster trust in AI-driven outputs [21]. Addressing the interpretability issue can help DL frameworks achieve broader adoption and integration in real-world drug discovery workflows, paving the way for safer and more efficient pharmaceutical innovations.
3) Other Approaches Used in AI-Driven Drug Discovery
Building upon the foundations of ML and DL, various complementary methodologies have emerged, playing critical roles in advancing drug discovery processes. Network-based approaches, for instance, enable the identification of novel drug-disease associations by analyzing complex relationships among drugs, diseases, and biological targets. This network-centric analysis aids in uncovering previously unrecognized connections, thereby bolstering drug repurposing efforts and expediting the identification of new therapeutic applications for existing drugs [54].
Transfer learning is pivotal, especially when limited labeled data are available. By leveraging models pre-trained on extensive datasets, transfer learning allows these models to be fine-tuned for specific tasks within drug discovery, thereby enhancing efficiency and precision. This approach enables researchers to overcome data scarcity by transferring relevant knowledge across related domains, ultimately accelerating hypothesis generation and validation [55]. Multi-task learning further improves learning outcomes by training models across multiple related tasks, which also helps mitigate data scarcity. Additionally, Bayesian optimization and ensemble learning techniques are often used to boost model performance and tackle overfitting, enhancing scalability and robustness in AI-driven drug discovery processes [56], [57]. Ontology-based integration tools, like the Web Ontology Language (OWL) and RDF Schema, address data heterogeneity by integrating biomedical data from diverse sources. However, scalability remains challenging with large datasets of varying formats [58], [59].
Explainable AI (XAI) techniques, such as Local Interpretable Model-agnostic Explanations (LIME), have emerged as crucial tools for increasing transparency, fostering model acceptance, and addressing ethical and regulatory requirements in clinical applications [60], [61]. Furthermore, big data analytics play a significant role in managing and analyzing diverse datasets, supporting pipelines that integrate data from major public databases like ZINC, BindingDB, PubChem, and DrugBank, which are essential for evaluating drug properties comprehensively and optimizing drug discovery pipelines [62], [63].
C. Shortcomings of Previous Studies
Past research on computational drug discovery has revealed challenges and limitations that impede the broader applicability and effectiveness of these methods. A significant issue relates to data quality and management: healthcare and biomedical data frequently exhibit high noise levels, inconsistency, and heterogeneity [53], [64]. These problems reduce the reliability of ML models and result in poor performance, especially when datasets are small, imbalanced, or lack standardization. In many cases, the datasets used for drug discovery models are limited, leading to overfitting and difficulty in generalizing to new, unseen data [22], [65]. Furthermore, integrating heterogeneous data from various sources creates bottlenecks in effectively leveraging this information for predictive modeling [66].
Algorithm performance and optimization present another key challenge. ML models, especially DL, offer advanced capabilities, but they often struggle with overfitting due to the limited data available for training, particularly in drug-target interaction prediction [19], [67]. Many of these models require extensive fine-tuning of complex hyperparameters, which is time-consuming and computationally expensive. This limits the scalability of models to large datasets, an essential requirement for real-world drug discovery applications [68].
Despite their impressive performance, DL models still face limitations. DL can achieve high accuracy owing to advances in feature learning, but this performance depends on having a large training dataset. When data are limited, DL techniques struggle to generalize and often produce biased estimates of model performance. In these cases, traditional shallow ML methods may outperform DL models, as they are less prone to overfitting and require fewer computational resources [69].
A critical issue in interpretability and explainability is the “black-box” nature of many ML and DL models [70], [71]. These models are opaque in their decision-making processes, making it difficult for researchers, clinicians, and regulators to understand the mechanisms behind their predictions [15]. The lack of transparency is especially problematic in drug discovery, where mechanistic insights are needed for regulatory approval and clinical adoption [72]. Recent efforts in explainable AI (XAI) aim to address this issue, but achieving a balance between model complexity and interpretability remains a significant hurdle [3], [73].
Regulatory compliance and ethical considerations further complicate the adoption of AI in drug discovery. AI models raise questions about accountability, fairness, and privacy, especially when handling sensitive patient data [70]. Ensuring that AI models comply with ethical standards and regulatory guidelines is essential for their wider adoption in clinical settings. However, the challenges of meeting these requirements limit the speed at which AI models can be integrated into real-world drug discovery pipelines [17], [18].
Training ML models on large-scale datasets is computationally expensive and requires significant resources such as high-performance GPUs or cloud computing platforms [74]. Smaller research institutions often lack access to these resources, limiting their ability to deploy advanced ML techniques effectively. Additionally, the time complexity of training and optimizing DL models restricts their scalability, making them less feasible for practical drug discovery applications [52], [75]. Moreover, validation and experimental confirmation are critical aspects often overlooked in computational studies. Many models are developed and tested using synthetic or pre-clinical data, but the actual test of their predictive power lies in experimental validation, particularly in clinical trials [18]. Discrepancies between computational predictions and real-world outcomes often slow the drug development process, as the lack of experimental corroboration reduces the reliability and applicability of these models [19], [52]. Thus, bridging the gap between computational predictions and experimental results is essential for advancing AI-driven drug discovery.
This study seeks to prioritize critical challenges in drug discovery through the AHP. Systematic ranking of these issues within a scientific framework enables our research to provide valuable insights and methodologies that can meaningfully enhance AI model development in this field.
D. Analytic Hierarchy Process (AHP)
AHP is a widely recognized decision-making tool that facilitates multi-criteria analysis by decomposing complex problems into a structured, hierarchical model. Initially developed by Thomas Saaty in the late 1970s [76], AHP provides a systematic approach to decision-making that integrates quantitative and qualitative factors, making it highly adaptable across various fields. Its ability to prioritize and weigh diverse criteria has led to its implementation in healthcare, environmental management, construction, and business, where decisions often require balancing competing objectives with significant real-world implications [77].
AHP has proven particularly effective in healthcare and medical decision-making, helping practitioners and policymakers assess and prioritize treatment options, medical technologies, and patient care strategies. Research by Liberatore and Nydick illustrates AHP’s capacity to support systematic evaluations based on key criteria such as cost, treatment efficacy, and patient outcomes, thereby enhancing the transparency and consistency of decisions [78]. Similarly, AHP has been instrumental in environmental management, prioritizing sustainable solutions and providing a framework for balancing economic, social, and ecological criteria. Studies underscore AHP’s value in facilitating decisions that align with sustainable development goals by enabling stakeholders to analyze and weigh factors critical to environmental preservation [79].
AHP is also extensively applied in construction and engineering [80], [81], where it aids in selecting materials, risk assessment, and resource management. By enabling structured prioritization of factors such as cost, quality, and sustainability, AHP contributes to more robust project planning and execution. Furthermore, in the business sector, AHP supports managerial decision-making by offering a structured approach to evaluate both tangible and intangible factors, such as financial metrics, customer satisfaction, and corporate social responsibility [77]. This structured approach highlights AHP’s unique ability to assess criteria that impact strategic business outcomes objectively, thus fostering a holistic approach to decision-making in corporate environments [82].
Beyond traditional sectors, AHP has found critical applications in natural resource management, where it balances the need for resource utilization with conservation imperatives. Research in this area shows how AHP facilitates the integration of expert opinions, aiding in the development of policies that sustain biodiversity and resource availability [83]. This flexibility has also led to the application of AHP in emerging fields such as technology and innovation management, where structured decision-making is essential to navigate the complexities of modern advancements.
Given the versatility and structured approach of AHP, this study is pioneering its application in AI-driven drug discovery. As AI transforms drug development, AHP systematically prioritizes critical factors like data quality, model interpretability, and ethical considerations. This is the first application of AHP in this field, driven by closely related experts, and it establishes a framework that could significantly improve decision-making in AI-driven pharmaceutical research. The structured and quantifiable insights provided by AHP could lead to more efficient, transparent, and ethical development processes, highlighting its potential as a transformative tool for drug discovery in the era of AI.
The Methodology of Study
AHP, developed by Thomas Saaty in the 1970s, is a robust multi-criteria decision-making method [76], [84], [85]. This methodology enables decision-makers to structure and prioritize complex issues by organizing them into a hierarchy, allowing for systematic comparisons and quantification of subjective assessments. This study applies AHP to prioritize critical success factors affecting AI-driven drug discovery, employing geometric mean calculations to combine individual judgments into a group judgment matrix. This approach effectively synthesizes inputs from multiple stakeholders, offering a balanced perspective on the importance of each factor [86], [87].
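The aggregation step described above can be sketched as follows. Each entry of the group matrix is the geometric mean of the corresponding entries in the individual experts' matrices. The code is a minimal illustration under stated assumptions: the two expert matrices are hypothetical, the judgments are assumed to use Saaty's 1 to 9 comparison scale, and `aggregate_judgments` is an illustrative name rather than the study's implementation.

```python
import math


def aggregate_judgments(matrices):
    """Element-wise geometric mean of equally sized pairwise
    comparison matrices, one matrix per expert."""
    n = len(matrices[0])
    k = len(matrices)
    return [
        [math.prod(m[i][j] for m in matrices) ** (1.0 / k) for j in range(n)]
        for i in range(n)
    ]


# Two hypothetical experts comparing three criteria.
# Entry [i][j] states how much more important criterion i is than j.
expert_a = [[1, 3, 5], [1 / 3, 1, 2], [1 / 5, 1 / 2, 1]]
expert_b = [[1, 1 / 3, 5], [3, 1, 2], [1 / 5, 1 / 2, 1]]

group = aggregate_judgments([expert_a, expert_b])
```

The geometric mean is a natural choice here because it preserves reciprocity: if every expert matrix satisfies a_ji = 1 / a_ij, the group matrix does as well, which an arithmetic mean would not guarantee. In the example, the two experts disagree symmetrically on the first pair (3 versus 1/3), so the group entry for that pair is 1.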
This study uses an AHP hierarchy with three levels. The top level represents the goal of identifying the critical success factors influencing AI-driven drug discovery. The second level contains the broad criteria, and the third level details the specific sub-criteria within each criterion. Unlike traditional AHP studies, this model does not include a level of alternatives; it focuses on identifying the most influential factors.
AHP’s capability to address complex decision-making problems makes it well-suited for this study. Its hierarchical structure facilitates a logical breakdown of the problem, and pairwise comparisons help decision-makers evaluate the relative importance of each factor. Using geometric means to aggregate individual assessments ensures that the group judgment matrix reflects a balanced consensus. Although other MCDM techniques like ANP, TOPSIS, or PROMETHEE may be suitable in different contexts [87], [88], [89], AHP’s structured framework and proven success in hierarchical analysis make it particularly fitting for this research.
Figure 3 illustrates a flowchart that outlines the research workflow, from defining the research goal to obtaining the results. This visual representation highlights the interconnected steps and the study’s systematic approach.
The research methodology begins by defining the primary goal and developing a hierarchical structure of relevant criteria and sub-criteria. Questionnaires and pairwise comparison matrices are created to capture expert input and evaluate the relationships among the criteria and their sub-criteria. These pairwise comparison matrices are then normalized to calculate priority weights. Afterwards, the consistency ratio (CR) is computed and checked to ensure the comparisons are logically consistent. Finally, the priority scores are aggregated to rank the critical success factors. This structured approach enhances the reliability and validity of the study's outcomes.
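The normalization and consistency steps can be sketched as below. This is a hedged illustration, not the study's code. It assumes the common column-normalization approximation of the priority vector (normalize each column, then average each row). It also assumes Saaty's standard definitions, CI = (lambda_max - n) / (n - 1) and CR = CI / RI, with his widely cited random index values, and the conventional acceptance threshold of CR < 0.10, which the text above does not state. The example matrix is hypothetical.

```python
# Saaty's random consistency index (RI) by matrix size n.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def priority_weights(matrix):
    """Normalize each column to sum to 1, then average each row."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
        for i in range(n)
    ]


def consistency_ratio(matrix, weights):
    """CR = CI / RI, with lambda_max estimated as the mean of (A w)_i / w_i."""
    n = len(matrix)
    if n <= 2:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical 3x3 comparison matrix with mild inconsistency.
judgments = [[1, 3, 5], [1 / 3, 1, 2], [1 / 5, 1 / 2, 1]]
w = priority_weights(judgments)
cr = consistency_ratio(judgments, w)

# A perfectly consistent matrix built from known weights (a_ij = w_i / w_j).
true_w = [0.6, 0.3, 0.1]
consistent = [[wi / wj for wj in true_w] for wi in true_w]
w_consistent = priority_weights(consistent)
cr_consistent = consistency_ratio(consistent, w_consistent)
```

For the perfectly consistent matrix, the procedure recovers the weights it was built from and yields a CR of essentially zero. The mildly inconsistent example stays well under the conventional 0.10 threshold. A matrix exceeding that threshold would normally be returned to the expert for revision.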
A. Define the Goal
The first and most crucial step in AHP is to define the primary objective. Following an extensive literature review, this study sets out to assess and rank the critical success factors influencing AI-driven drug discovery. Although many studies recognize these factors as challenges, limitations, and concerns in AI-driven drug discovery, none have explored them comprehensively or established a framework for systematically identifying and prioritizing their significance.
This study addresses this gap by introducing a solid AHP-based framework driven by insights from 21 domain experts with extensive knowledge in AI-driven drug discovery. These experts ensure that all aspects and concerns in the domain are captured comprehensively, enabling a systematic evaluation and prioritization process. By defining this goal, the framework integrates expert perspectives to address key challenges, delivering actionable insights that advance the field of AI-driven drug discovery.
B. Hierarchical Structure and Selected Criteria
The study systematically organizes and selects criteria and sub-criteria based on extensive literature reviews addressing key challenges in the domain. Six main criteria and 24 sub-criteria were identified, representing critical success factors. These criteria ensure a comprehensive evaluation framework, balancing scientific insights and practical challenges.
Table 2 outlines the six primary criteria and their corresponding descriptions and sub-criteria (factors), forming a comprehensive framework for evaluating critical success factors in AI-driven drug discovery. These criteria encompass Data Quality and Management (DQM), Algorithm Performance and Optimization (APO), Interpretability and Explainability (IE), Regulatory Compliance and Ethical Considerations (RCEC), Computational Efficiency and Scalability (CES), and Validation and Experimental Confirmation (VEC). Each criterion is further elaborated with specific sub-criteria, detailed in Table 3, including in-depth descriptions and relevant references for each factor.
DQM’s sub-criteria include Accuracy (ACC), Completeness (CMP), Integration of Heterogeneous Data (IHD), and Class Imbalance Handling (CIH).
The APO criterion addresses key aspects such as Generalizability (GEN), Hyperparameter Tuning (HT), Control Overfitting (CO), and Transfer Learning (TL).
Similarly, the IE criterion comprises Explainable AI (XAI), Black-box Model Challenges (BBMC), Alignment with Domain Knowledge (ADK), and Predictive Sensitivity Testing (PST).
Within the RCEC criterion, the sub-criteria include Adherence to Standards (ATS), Bias and Fairness (BF), Data Privacy and Security (DPS), and Ethical AI Use (EAU).
The CES criterion emphasizes Infrastructure Readiness (IR), Scalability of Solutions (SOS), Optimization of Resources (OOR), and Future Technologies (FT).
Lastly, the VEC criterion comprises Experimental Validation (EV), Uncertainty Estimation (UE), Benchmarking Models (BM), and Feedback Loop (FL).
These factors were identified through a comprehensive literature review based on their significant relevance to challenges, limitations, and unresolved issues highlighted in previous studies on AI-driven drug discovery. While some studies directly address AI methodologies applied to drug development, others provide insights into related processes that influence discovery outcomes. Tables 2 and 3 offer a structured framework, emphasizing the critical role of these criteria in the current research.
Figure 4 illustrates the hierarchical structure of factors influencing AI-driven drug discovery using the AHP method. After establishing criteria and sub-criteria, we developed a structured decision framework to systematically rank and prioritize these factors across three levels.
FIGURE 4. Hierarchical levels of AHP for prioritizing critical factors in AI-driven drug discovery.
The top level represents the main objective: prioritizing critical success factors vital for AI-driven drug discovery. The second level organizes six broad criteria: DQM, APO, IE, RCEC, CES, and VEC. The third level details specific sub-criteria within each criterion, facilitating a structured evaluation.
C. Questionnaire Construct and Pairwise Comparison Matrices
In this research phase, a structured questionnaire was developed to systematically evaluate and prioritize the chosen criteria and sub-criteria, ensuring alignment with the identified factors. The complete set of questionnaires is organized to capture the relevant factors and establish an evaluation framework. Experienced experts in AI-driven drug discovery were chosen to validate the framework, and their insights ensured that all critical aspects were represented.
Based on expert inputs, pairwise comparison matrices for the criteria and sub-criteria were developed to assess their relative importance. Each criterion was compared with others in pairs, and sub-criteria within each criterion were similarly evaluated. The decision-makers used Saaty’s scale of 1 to 9 to rate the relative importance of elements.
The geometric mean is applied to each pairwise comparison to aggregate these individual judgments into a single group judgment matrix. For example, if $K$ stakeholders provide comparisons $c_{ij}^{(k)}$, $k=1,\ldots,K$, between elements $i$ and $j$, the aggregated group judgment is \begin{equation*} c_{ij}=\left(\prod_{k=1}^{K} c_{ij}^{(k)}\right)^{\frac{1}{K}} \tag{1}\end{equation*}
The pairwise comparison matrix C for the criteria takes the form:\begin{align*} C=\left[c_{ij}\right]=\left[\begin{array}{cccc} 1 & c_{12} & \cdots & c_{1m} \\ c_{21} & 1 & \cdots & c_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ c_{m1} & c_{m2} & \cdots & 1 \end{array}\right]\end{align*}
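The aggregation in Eq. (1) can be illustrated with a minimal Python sketch. The function name `aggregate_judgments` and the two expert matrices below are hypothetical and are not taken from the study's questionnaire data; they only show how an element-wise geometric mean preserves the reciprocal structure of the group matrix.

```python
import math

def aggregate_judgments(expert_matrices):
    """Element-wise geometric mean of K pairwise comparison matrices (Eq. 1)."""
    k = len(expert_matrices)
    m = len(expert_matrices[0])
    group = [[1.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            # Product of the K individual judgments for the pair (i, j).
            product = math.prod(mat[i][j] for mat in expert_matrices)
            group[i][j] = product ** (1.0 / k)
    return group

# Hypothetical judgments from two experts on three criteria (Saaty's 1-9 scale).
expert_a = [[1, 3, 5], [1 / 3, 1, 2], [1 / 5, 1 / 2, 1]]
expert_b = [[1, 5, 7], [1 / 5, 1, 3], [1 / 7, 1 / 3, 1]]

group_matrix = aggregate_judgments([expert_a, expert_b])
for row in group_matrix:
    print([round(value, 3) for value in row])
```

Because each individual matrix is reciprocal, the geometric mean of entry (i, j) is the reciprocal of the geometric mean of entry (j, i), which an arithmetic mean would not guarantee.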
D. Normalize the Matrices and Calculate Priority Weights
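The group judgment matrices are normalized to create priority weights, representing the relative importance of each element. For each column $j$ in the criteria matrix $C$, every entry is divided by the column sum to obtain the normalized entry $n_{ij}=c_{ij}/\sum_{l=1}^{m} c_{lj}$. The priority weight $p_{i}$ of criterion $i$ is then the average of row $i$ of the normalized matrix: \begin{equation*} p_{i}=\frac{1}{m}\sum_{j=1}^{m} n_{ij} \tag{2}\end{equation*}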
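As an illustration of the normalization step and Eq. (2), the following Python sketch computes priority weights by column normalization followed by row averaging. The function name `priority_weights` and the 3x3 example matrix are assumptions made for illustration, not values from the study.

```python
def priority_weights(matrix):
    """Column-normalize a pairwise matrix and average each row (Eq. 2)."""
    m = len(matrix)
    # Sum of each column j of the comparison matrix.
    col_sums = [sum(matrix[i][j] for i in range(m)) for j in range(m)]
    # Normalized entries: each entry divided by its column sum.
    normalized = [[matrix[i][j] / col_sums[j] for j in range(m)] for i in range(m)]
    # Priority weight of element i is the mean of row i.
    return [sum(row) / m for row in normalized]

# Hypothetical 3x3 group judgment matrix.
example = [[1, 3, 5], [1 / 3, 1, 2], [1 / 5, 1 / 2, 1]]
weights = priority_weights(example)
print([round(w, 4) for w in weights])
```

Since every normalized column sums to one, the resulting weights also sum to one and can be read directly as relative importance shares.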
E. Consistency Ratio (CR) Verification
The consistency ratio (CR) is calculated to confirm the reliability of the group judgments. First, the weighted sum matrix W is derived by multiplying each element in the criteria matrix C by its corresponding priority weight in P; summing the entries of row $i$ gives $w_{i}=\sum_{j=1}^{m} c_{ij}p_{j}$. The maximum eigenvalue $\lambda_{max}$ is then estimated as \begin{equation*} \lambda_{max}=\frac{1}{m}\sum_{i=1}^{m} \frac{w_{i}}{p_{i}} \tag{3}\end{equation*}
The consistency index (CI) is obtained from $\lambda_{max}$ and the matrix order $m$: \begin{equation*} CI=\frac{\lambda_{max}-m}{m-1} \tag{4}\end{equation*}
The consistency ratio is derived by dividing CI by the random index (RI) for a matrix of order $m$:\begin{equation*} CR=\frac{CI}{RI(m)} \tag{5}\end{equation*}
A CR of less than 0.1 indicates acceptable consistency, while higher values suggest that the comparisons should be revised for greater reliability.
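The consistency check in Eqs. (3) to (5) can be sketched in Python as follows. The random index table uses Saaty's commonly tabulated values, which is an assumption because the values used in this study are not listed here; the helper names and the example matrix are likewise hypothetical.

```python
# Commonly tabulated random index values by matrix order (assumed, after Saaty).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def priority_weights(matrix):
    """Column-normalize the matrix and average each row (Eq. 2)."""
    m = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(m)) for j in range(m)]
    return [sum(matrix[i][j] / col_sums[j] for j in range(m)) / m for i in range(m)]

def consistency_ratio(matrix, weights):
    """Return (lambda_max, CI, CR) for a pairwise comparison matrix."""
    m = len(matrix)
    # Weighted sum for each row: w_i = sum_j c_ij * p_j.
    weighted_sums = [sum(matrix[i][j] * weights[j] for j in range(m)) for i in range(m)]
    # Eq. (3): average of w_i / p_i estimates the maximum eigenvalue.
    lambda_max = sum(weighted_sums[i] / weights[i] for i in range(m)) / m
    # Eq. (4): consistency index.
    ci = (lambda_max - m) / (m - 1)
    # Eq. (5): consistency ratio; RI is zero for orders 1 and 2, so CR is set to 0.
    ri = RANDOM_INDEX[m]
    cr = ci / ri if ri > 0 else 0.0
    return lambda_max, ci, cr

# Hypothetical 3x3 group judgment matrix.
example = [[1, 3, 5], [1 / 3, 1, 2], [1 / 5, 1 / 2, 1]]
lambda_max, ci, cr = consistency_ratio(example, priority_weights(example))
print(f"lambda_max={lambda_max:.4f}, CI={ci:.4f}, CR={cr:.4f}, acceptable={cr < 0.1}")
```

For this nearly consistent example, the estimated maximum eigenvalue stays close to the matrix order, so the CR falls well below the 0.1 threshold.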
F. Aggregate Priority Scores
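Once the priority vectors for the criteria P and the sub-criteria are obtained, the global priority of each sub-criterion is computed by multiplying its local weight by the weight of its parent criterion. The resulting global scores are then ranked to prioritize the critical success factors.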
Discussion and Results
This study identifies key success factors for AI-driven drug discovery, illustrating their relative importance using a structured AHP framework. This section examines results from pairwise comparisons of criteria and sub-criteria (factors) to optimize AI methodologies in drug discovery.
The study applied the AHP to evaluate six main criteria and their factors to prioritize critical success factors. Table 4 presents the weights and significance of these categories.
Data Quality and Management (DQM) emerged as the most critical, with a weight of 35.7%, and its sub-criterion, Accuracy (ACC), received the highest score at 40.6% of DQM’s weight. This highlights the necessity of reliable and accurate data for practical AI applications in drug discovery. Completeness (CMP), Class Imbalance Handling (CIH), and Integration of Heterogeneous Data (IHD) further underline the importance of comprehensive, balanced, and effectively integrated data.
Algorithm Performance and Optimization (APO) ranks second, with Generalizability (GEN) weighted at 52.4%, indicating the need for robust models across diverse datasets. Other factors like Hyperparameter Tuning (HT), Control Overfitting (CO), and Transfer Learning (TL) also stress adaptability, robustness, and efficiency.
Validation and Experimental Confirmation (VEC), at 14.7%, highlights the importance of Experimental Validation (EV) (57.9% of VEC), reinforcing the need for rigorous experimental support for AI predictions. Complementary sub-criteria like Uncertainty Estimation (UE), Benchmarking Models (BM), and Feedback Loop (FL) emphasize confidence quantification and iterative refinement.
Interpretability and Explainability (IE), with a weight of 11.3%, showcases the importance of Explainable AI (XAI) alongside Predictive Sensitivity Testing (PST), Alignment with Domain Knowledge (ADK), and addressing Black-Box Model Challenges (BBMC).
Regulatory Compliance and Ethical Considerations (RCEC) carries a weight of 7.5%. It focuses on Bias and Fairness (BF), along with Data Privacy and Security (DPS), Ethical AI Use (EAU), and Adherence to Standards (ATS), to assess ethical AI applications.
Computational Efficiency and Scalability (CES), the least weighted at 5.9%, stresses the significance of scalable and efficient AI systems, emphasizing Scalability of Solutions (SOS) and Optimization of Resources (OOR), supported by Infrastructure Readiness (IR) and advancements in Future Technologies (FT). These insights present a structured understanding of critical factors and their interdependencies for optimizing AI-driven drug discovery.
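The global percentages reported for individual factors appear consistent with multiplying each criterion weight by the local weight of its sub-criterion. The sketch below checks this for the two pairs whose criterion and local weights are both stated above (DQM with ACC, and VEC with EV); the function name `global_weights` is hypothetical, and the remaining local weights are not given in this section, so they are not filled in.

```python
# Criterion weights and local sub-criterion weights quoted in the text above.
criterion_weights = {"DQM": 0.357, "VEC": 0.147}
local_weights = {"DQM": {"ACC": 0.406}, "VEC": {"EV": 0.579}}

def global_weights(criterion_weights, local_weights):
    """Global weight = parent criterion weight x local sub-criterion weight."""
    return {
        sub: criterion_weights[criterion] * weight
        for criterion, subs in local_weights.items()
        for sub, weight in subs.items()
    }

result = global_weights(criterion_weights, local_weights)
for name, value in sorted(result.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value * 100:.1f}%")
```

Under this reading, ACC evaluates to roughly 14.5% and EV to roughly 8.5%, matching the global scores listed for those factors.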
The pairwise comparison matrices provide a granular view of the relationships and relative significance among criteria and sub-criteria. Table 5 shows a pairwise matrix for criteria.
Table 6 presents pairwise comparisons of the sub-criteria within each of the six categories, highlighting their relationships and relative significance.
Table 7 below summarizes the consistency analysis results for seven pairwise comparison matrices used in this research. Each matrix was evaluated to determine its maximum eigenvalue.
This research’s results are based on responses to structured questionnaires completed by experts in AI-driven drug discovery. Table 8 and Figure 4 highlight the prioritization of the 24 factors, emphasizing the systematic approach needed to address the critical success factors in practice.
Figure 5 complements the insights from Table 8 by presenting a clear, ranked overview of the crucial factors influencing AI-driven drug discovery. This visualization offers an intuitive comparison that emphasizes key priorities and areas for further improvement, reinforcing the strategic emphasis on AI in pharmaceutical research.
Although experts assigned lower importance to certain factors, their significance should not be overlooked, as many studies highlight their relevance. Factors like ethical AI use, infrastructure readiness, adherence to standards, and future technologies may not directly impact the immediate success of AI-driven drug discovery. Still, they are vital for long-term sustainability and regulatory compliance. Additionally, optimizing resources and ensuring data privacy are essential for maintaining AI systems’ efficiency, security, and trustworthiness over time.
The expert assessment reveals that these lower-weight factors, though not primary concerns, function as supportive factors rather than key determinants of AI success in drug discovery. This underscores the need for a prioritization framework that allows decision-makers to allocate efforts and resources efficiently toward the most impactful factors while remaining mindful of secondary, yet potentially influential, considerations.
The top 10 factors have been analyzed for their significance and impact. The following subsections examine these factors, discussing their importance, challenges, and potential effects on optimizing AI-driven drug discovery systems.
A. Accuracy (ACC) – 14.5%
Accuracy is paramount in AI-driven drug discovery. High accuracy ensures reliable predictions and reduces false positives and negatives during drug candidate screening, which is vital for cost-effectiveness and clinical success. Inaccurate data and model outputs can impede downstream processes like experimental validation and regulatory approval. Therefore, enhancing data preprocessing, addressing noise, and refining model architecture are crucial for achieving this goal.
B. Generalizability (GEN) – 13.0%
Generalizability measures an algorithm’s effectiveness across diverse datasets, which is crucial in drug discovery because of the variability in biological data and the complexity of molecular structures.
A generalizable model lessens reliance on large, curated datasets and ensures strong performance in varied contexts, such as rare disease treatments or cross-species analysis. This highlights the importance of thorough training and validation across heterogeneous datasets.
C. Experimental Validation (EV) – 8.5%
Experimental validation provides the bridge between computational predictions and real-world applicability.
This factor’s high ranking highlights the necessity of experimental testing to confirm the reliability of AI-generated insights, such as lead compound identification or toxicity predictions. Strong experimental validation ensures that computational outcomes are not merely theoretical but actionable, reducing the risks associated with clinical trials.
D. Completeness (CMP) – 8.2%
Completeness refers to the availability and comprehensiveness of datasets used in AI training. In drug discovery, data gaps, whether due to missing molecular properties or incomplete pharmacokinetic profiles, can lead to erroneous predictions. The ranking of this factor signifies the importance of addressing missing data through imputation techniques or by integrating diverse data sources.
E. Class Imbalance Handling (CIH) – 8.0%
Class imbalance handling is critical in drug discovery, where datasets often have a disproportionate representation of active versus inactive compounds. Models trained on imbalanced datasets tend to favor the dominant class, reducing the chances of identifying novel drug candidates. Proper techniques such as oversampling, undersampling, or advanced algorithms like SMOTE (Synthetic Minority Oversampling Technique) are essential for ensuring balanced and unbiased predictions.
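A minimal sketch of the simplest of these techniques, random oversampling, is shown below. It duplicates minority-class samples until all classes are equally represented; it is not SMOTE, which synthesizes new points by interpolation. The function name `random_oversample`, the compound identifiers, and the labels are hypothetical.

```python
import random

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra samples with replacement to reach the target count.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Hypothetical screening set: 8 inactive compounds and 2 active ones.
compounds = [f"cmpd_{i}" for i in range(10)]
activity = ["inactive"] * 8 + ["active"] * 2

balanced_x, balanced_y = random_oversample(compounds, activity)
print(balanced_y.count("inactive"), balanced_y.count("active"))
```

In practice, oversampling would be applied only to the training split, so that duplicated samples do not leak into the data used for evaluation.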
F. Use of Explainable AI (XAI) – 6.1%
Explainability in AI models is crucial for trust and transparency, especially in sensitive domains like drug discovery. By providing insights into how predictions are made, XAI enables researchers to understand the underlying decision-making processes, facilitating regulatory approvals and stakeholder acceptance. For example, feature importance analysis can reveal which molecular properties drive efficacy predictions, making XAI an indispensable tool.
G. Hyperparameter Tuning (HT) – 5.3%
Hyperparameter tuning is pivotal in optimizing AI models, as it directly impacts performance metrics such as accuracy and generalizability. The ranking of this factor reflects the need to fine-tune parameters such as learning rates, dropout rates, and the number of hidden layers to achieve optimal results. Automated approaches like grid search or Bayesian optimization can streamline this process, saving time and computational resources.
H. Integration of Heterogeneous Data (IHD) – 5.0%
Drug discovery often combines data from various sources, including chemical properties, biological assays, and clinical trials. Integrating heterogeneous data ensures a holistic view, enabling models to capture complex relationships across domains. This factor’s ranking signifies the growing importance of multi-modal data processing in AI-driven drug discovery pipelines.
I. Control Overfitting (CO) – 4.4%
Overfitting occurs when an AI model performs well on training data but fails to generalize to unseen data. This issue is particularly relevant in drug discovery, where datasets are often limited in size. Controlling overfitting through regularization techniques, dropout, or cross-validation is critical for building robust and reliable models that can adapt to new challenges.
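To illustrate the cross-validation idea mentioned above, the following sketch builds k-fold train and validation index splits in plain Python. The function name `k_fold_indices` is hypothetical, and the splits are contiguous without shuffling, which is a simplifying assumption; a real pipeline would usually shuffle or stratify first.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, validation) index pairs."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds

# Hypothetical dataset of 10 compounds evaluated with 3 folds.
splits = k_fold_indices(10, 3)
for train, validation in splits:
    print(len(train), validation)
```

Each sample is held out exactly once, so averaging validation scores over the folds gives a performance estimate that is less sensitive to one particular split of a small dataset.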
J. Bias and Fairness (BF) – 3.8%
Bias and fairness in AI systems are emerging concerns in drug discovery. Biased models can lead to inequitable outcomes, such as favoring specific drug candidates over others due to dataset imbalances or algorithmic predispositions. This factor emphasizes the importance of addressing biases in training data and ensuring equitable performance across diverse populations, ultimately leading to fairer and more inclusive drug development processes.
The top-ranked factors provide a structured roadmap for advancing AI-driven drug discovery. Enhancing data accuracy, algorithm generalizability, and experimental validation is critical for ensuring reliable and actionable AI outputs. Addressing class imbalance, overfitting, and bias strengthens model fairness and robustness.
This study advances the field by providing a structured, expert-driven prioritization of AI success factors, addressing critical gaps in data integrity, algorithmic transparency, and scalability. Unlike conventional AI adoption strategies focusing on isolated improvements, our approach systematically ranks the most influential criteria, ensuring that AI methodologies are both scientifically valid and practically applicable. By integrating domain expertise into AI evaluation, this research offers a decision-making framework that enhances strategic planning for AI-driven drug discovery, bridging the gap between computational innovations and real-world pharmaceutical requirements.
This study’s framework is a decision-making tool, enabling efficient resource allocation and a strategic focus on impactful areas. It also identifies future research opportunities, including ethical considerations, computational efficiency, and leveraging emerging technologies to develop scalable, robust, and ethically sound AI systems for the pharmaceutical domain.
Implications, Limitations, and Future Research Directions
This study outlines a comprehensive framework for assessing key success factors in AI-driven drug discovery, acting as a strategic resource to enhance decision-making processes in pharmaceutical research. By emphasizing six core criteria, the findings highlight the importance of accurate data, generalizable models, and thorough experimental validation in bridging the gap between computational predictions and their practical applications.
A notable contribution of this research is its systematic prioritization of factors, which offers a clear path for enhancing AI methodologies in alignment with pharmaceutical objectives. Strengthening data management practices, improving algorithm performance, and fostering transparency through interpretability and compliance can address common obstacles, such as data heterogeneity, scalability issues, and trust in AI-based systems. This structured approach equips stakeholders with actionable insights for resource allocation and pipeline optimization.
However, some limitations exist that merit further consideration. The proposed framework was built using a specific set of datasets and expert evaluations, which may limit its adaptability to other contexts within the pharmaceutical domain. AHP is a valuable tool for multi-criteria analysis; however, it assumes that expert judgments are consistent. This assumption may not adequately reflect the complexities of rapidly advancing AI technologies. Furthermore, the dependence on curated datasets emphasizes the importance of considering dynamic, real-world situations and emerging data sources.
Future studies could broaden the framework’s scope by incorporating more diverse datasets and expert evaluations and by expanding its application to various stages of drug discovery. Exploring adaptive weighting methods could improve its responsiveness to shifting priorities in AI research and drug development. Integrating domain-specific constraints and advancements, such as generative AI and quantum computing, could further refine the framework’s capabilities. Real-world validation in AI-driven drug discovery research, as a first step toward global initiatives, will also be crucial for assessing the framework’s scalability and robustness. Additionally, introducing multi-objective optimization and adaptive feedback loops could enhance its ability to address complex challenges in real-time settings.
Conclusion
This research introduces a framework for identifying key success factors in AI-driven drug discovery and proposes strategies to enhance AI’s impact on the pharmaceutical industry. The study highlights critical areas such as data quality, algorithm performance, and experimental validation, underscoring their importance for the effective integration of AI in drug development.
The results suggest that AI’s success in drug discovery depends on a balance of technical innovation, ethical considerations, and adaptability. The framework addresses key challenges, including data gaps, algorithmic bias, and regulatory constraints, to meet the growing demand for more efficient and fair therapeutic solutions.
This work contributes to better strategic planning for AI applications in drug discovery by providing actionable insights and a structured approach. It sets the stage for further research to refine AI methods in different pharmaceutical contexts, potentially accelerating drug development and improving patient outcomes. The framework offers a fresh perspective, informed by expert contributions, and encourages continued innovation, collaboration, and ethical progress.
ACKNOWLEDGMENT
The authors sincerely thank the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University, whose support was essential for completing this research. This support highlights the University’s dedication to advancing scientific innovation and impactful research in emerging fields. The authors also appreciate the collaborating experts, the conducive environment, and the resources that significantly contributed to this work, including the Study Lounge library in Amman.