I. Introduction
Estimating students' academic success is important in higher education. Institutions need to identify which students are likely to perform well in the semester final exam, so that scholarships can be awarded, and which students are likely to fail the semester assessments, so that they can be offered remediation. A student's academic achievement depends on several factors. These include past academic records, such as performance in English, particularly when English is not the student's mother tongue but is the medium of instruction [1], as well as the learner's family history and mid-semester test results. On the basis of these factors, classification models built with machine learning algorithms (MLA) can be configured to predict student outcomes. Various MLAs [2] have been used for this purpose, including Naive Bayes, Decision Tree, Random Forest, and Logistic Regression.

This research aims to collect data on new students' pre-college academic performance: their English-language scores, which are essential for non-native English speakers studying an English-medium curriculum, their Math scores, and their high school scores. The cleaned data will be used to train four models, Naive Bayes, Decision Tree, Random Forest, and Logistic Regression, to identify the key features that affect student performance at the next academic level. The associations among the features in the sample will be evaluated to identify the most significant factor behind poor academic achievement (at-risk cases) and the main factor contributing to outstanding academic success.
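As a rough illustration of how the association between each pre-college feature and the at-risk outcome might be ranked, the following Python sketch computes Pearson correlation coefficients. Pearson correlation is an assumption made here for illustration; the text does not state which association measure is used. The function names `pearson` and `rank_features`, the field names, and the sample records are all hypothetical and are not data from this study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_features(records, feature_names, target_name):
    """Return (feature, r) pairs sorted by |r|, strongest association first."""
    target = [row[target_name] for row in records]
    scores = [
        (name, pearson([row[name] for row in records], target))
        for name in feature_names
    ]
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


# Made-up illustrative records (assumption): NOT data from the study.
# at_risk = 1 marks a student who failed the semester assessment.
records = [
    {"english": 45, "math": 60, "high_school": 58, "at_risk": 1},
    {"english": 50, "math": 48, "high_school": 61, "at_risk": 1},
    {"english": 62, "math": 75, "high_school": 66, "at_risk": 1},
    {"english": 70, "math": 55, "high_school": 72, "at_risk": 0},
    {"english": 81, "math": 80, "high_school": 79, "at_risk": 0},
    {"english": 90, "math": 68, "high_school": 88, "at_risk": 0},
]

ranking = rank_features(records, ["english", "math", "high_school"], "at_risk")
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

A negative coefficient here would mean that higher scores go with a lower chance of being at risk. With a binary outcome, Pearson correlation is only one of several reasonable measures, and a ranking of this kind does not replace the feature analysis obtained from the trained classifiers.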