Modeling Students' Academic Performance During Covid-19 Based on Classification in Support Vector Machine

Abstract: This study investigates the pattern of students' academic performance before and after the move to online learning under the Movement Control Order (MCO) during the pandemic outbreak, and models students' academic performance using classification with a Support Vector Machine (SVM). The data sample was taken from undergraduate students of the Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris (UPSI). Students' Grade Point Averages (GPA) were obtained to develop a model of academic performance during the Covid-19 outbreak, and the model was used to predict the academic performance of university students while online classes were conducted. The SVM algorithm has two important parameters, C (the misclassification tolerance parameter) and epsilon, which must be identified before further analysis. The parameters were applied to four types of kernel: linear, radial basis function, polynomial and sigmoid. The best accuracy achieved by SVM was 73.68% with the linear kernel, and the worst was 67.99% with the sigmoid kernel, with misclassification tolerance C = 128 and epsilon = 0.6.


Introduction
The global higher education landscape has changed dramatically over the past few months due to the spread of the coronavirus (COVID-19). The Movement Control Order (MCO) has meant that schools and institutions of higher learning were not allowed to operate during the MCO period [5]. On 27 May 2020, the Ministry of Higher Education (MOHE) announced that all teaching and learning activities for students must be conducted online until 31 December 2020 [6]. Universiti Pendidikan Sultan Idris (UPSI) is implementing online learning and teaching (PdP) in line with the directive of the Higher Education Ministry (KPT) [7].

Face-to-face learning is a process of learning and teaching that takes place directly or indirectly between teachers or lecturers and students. Face-to-face learning is also known as conventional learning [8]. Online learning can be defined as learning that occurs partially or entirely through internet access [9]. Online learning is not new to university students. However, delivering lectures entirely in digital form because of the pandemic outbreak is seen to have a huge impact on university students, especially on academic achievement [10]. In [11], however, Smith and Stephens found that online students tended to have a higher pre-course GPA, which they regarded as a positive, since online learning requires discipline and self-motivation. Many studies have examined online learning [12], but studies of online learning during pandemics are very hard to find. According to [13], online learning affects students' learning outcomes, with student interaction during online learning having a significant effect on academic results. Recent studies show that interactive activities are one of the factors that affect student results [14]. In addition, Oye et al. and Keshavarz stated that e-learning has a positive impact on students' academic achievement because it reduces costs, saves time and increases the accessibility of education [15].

Shahiri et al. [16] stated that several techniques are used to evaluate the academic performance of students. Data mining is one of the most familiar techniques used to examine students' academic performance. This research uses the Support Vector Machine (SVM) algorithm, a supervised learning technique, to perform prediction and to analyse the data through classification. The authors of [17] proposed SVM, which is a very useful technique for data classification. The study in [18] investigates the prediction of students' academic performance using a support vector machine. It examines the association between students' pre-admission academic profile and their final academic performance, and in it the support vector machine outperforms other machine learning algorithms. Besides, [19] proposed classifying students based on quality of life and academic performance using SVM, and [20] used SVM as a machine learning tool because of its ability to improve the accuracy of the classification procedure, especially in data mining. Moreover, the results in [21] show that SVM is one of the models capable of predicting with high accuracy, not less than 92%.

This study focuses mainly on finding the pattern of students' academic performance before and during online learning due to the COVID-19 pandemic outbreak by referring to their Grade Point Average (GPA). It also analyses the prediction of the academic performance of UPSI undergraduate students after they have completed one whole semester of online study, based on classification with a Support Vector Machine (SVM).

Methodology
Figure 1 shows the procedure used to find the pattern of students' academic performance before and after online learning and the proposed modelling of students' academic performance. The dataset used in this study came from undergraduate students of Universiti Pendidikan Sultan Idris (UPSI). The data were collected using a questionnaire distributed through an online platform. The data collected are the students' GPA before and during online learning, the students' ages and their current Cumulative Grade Point Average (CGPA). The questionnaire was answered by undergraduate students from semester 3 to semester 7 of the Faculty of Science and Mathematics, UPSI. There were 225 respondents in total, of whom 82.5% were female and 17.5% were male. By department, 72.5% were from Mathematics, 5.7% from Biology, 12.2% from Chemistry, 3.1% from Physics and 6.6% from Science.

Time Series Plot
To find the pattern of students' academic performance before and after online learning, a graph was constructed using Microsoft Excel. The graph plots the students' GPA before and during online learning.
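The comparison behind such a plot can be sketched in a few lines of Python. This is a minimal illustration and not the authors' Excel workflow: the GPA values and the function name `summarise_gpa_change` are hypothetical.

```python
from statistics import mean


def summarise_gpa_change(before, during):
    """Compare paired GPA values (same student order in both lists)."""
    if len(before) != len(during):
        raise ValueError("both lists must cover the same students")
    # Positive difference means the GPA rose during online learning.
    diffs = [d - b for b, d in zip(before, during)]
    improved = sum(1 for x in diffs if x > 0)
    return {
        "mean_before": mean(before),
        "mean_during": mean(during),
        "share_improved": improved / len(diffs),
    }


# Hypothetical GPA values for five students (not data from the study).
gpa_before = [3.10, 3.45, 2.90, 3.70, 3.20]
gpa_during = [3.40, 3.50, 3.25, 3.65, 3.55]
summary = summarise_gpa_change(gpa_before, gpa_during)
```

The `share_improved` value corresponds to how often one line sits above the other in a before/during plot.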

Support Vector Machine (SVM)
The Support Vector Machine (SVM) is a powerful machine learning tool that was proposed by [22] and has attracted growing interest from machine learning researchers and the wider community. The SVM algorithm has been proven effective for both regression and classification. The study in [23] reported that SVM is generally able to produce the best classification accuracy compared with other methods. SVM can also perform linear and nonlinear classification with high efficiency. However, the challenge in using SVM for classification or regression is finding the best penalty parameter and kernel parameters, because SVM is very sensitive to the parameters used.

Consider that the dataset from PCs is divided into two sets, training data and testing data. The training data with two classes are $[(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)]$, where the input vector is $x_i$ and the output is $y_i$. The output is labelled by $\{+1, -1\}$. The classifier for the binary classification problem is

$$f(x) = \operatorname{sign}\left(w^{T}\varphi(x) + b\right), \tag{1}$$

where the input vector $x$ is mapped into a feature space by the non-linear function $\varphi(x)$, and $w$ and $b$ are the classifier parameters. Determining the SVM classifier is equivalent to solving the optimization problem

$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^{2} + C\sum_{i=1}^{n}\xi_i \quad \text{subject to} \quad y_i\left(w^{T}\varphi(x_i) + b\right) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1, \ldots, n, \tag{2}$$

where $\xi_i$ are non-negative slack variables that influence the objective function when data are misclassified, and $C$ is a penalty parameter with a positive value. The optimization problem is solved using Lagrange multipliers $\alpha_i$, where $0 \le \alpha_i \le C$. Hence, by a series of mathematical derivations, the classifier becomes

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{n}\alpha_i y_i\,\varphi(x_i)\cdot\varphi(x) + b\right). \tag{3}$$
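The derivation itself is not spelled out in the text. As a sketch, the standard soft-margin SVM derivation passes through the dual problem below. The symbols $w$, $b$, $\xi_i$, $\alpha_i$ and the sample count $n$ follow the usual textbook notation and are assumed here rather than taken from the source.

```latex
\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i
  - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}
    \alpha_i \alpha_j\, y_i y_j\, \varphi(x_i)\cdot\varphi(x_j)
\quad \text{subject to} \quad
\sum_{i=1}^{n}\alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C .
```

Setting the derivative of the Lagrangian with respect to $w$ to zero gives $w = \sum_{i}\alpha_i y_i\,\varphi(x_i)$, and substituting this into the decision rule yields the form referred to as eq. (3).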
The kernel function

$$K(x_i, x_j) = \varphi(x_i)\cdot\varphi(x_j) \tag{4}$$

was introduced to calculate the inner products. Quite a number of kernels can be used in SVM classification, but the standard kernels are linear, polynomial, radial basis function and sigmoid. The most popular and capable kernel is the radial basis function, which has its own kernel parameter.
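The four standard kernels can be written out directly. The sketch below uses the common textbook forms; the parameter names `gamma`, `coef0` and `degree` and their default values are assumptions for illustration, since the text does not give them.

```python
import math


def dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    # Depends only on the squared Euclidean distance between x and z.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)
```

Each function takes two input vectors and returns the value of $K$ for that pair, so any of them can stand in for the inner product $\varphi(x_i)\cdot\varphi(x_j)$.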
In the final classifier, only the nonzero Lagrange multipliers take part, as indicated in eq. (3). The data points whose corresponding Lagrange multipliers are nonzero are called support vectors. The classifier can then be written as

$$f(x) = \operatorname{sign}\left(\sum_{s=1}^{N_s}\alpha_s y_s K(x_s, x) + b\right), \tag{5}$$

where $x_s$ is a support vector and $N_s$ is the number of support vectors. In the support vector classifier, there are two parameters to calibrate, $C$ (the misclassification tolerance parameter) and epsilon [24]. Once these parameters are determined, the Lagrange multipliers $\alpha_s$ and the parameter $b$ in eq. (5) can be found by the SVM algorithm.

Figure 2 shows the students' GPA before and during the pandemic outbreak for the two types of classes. The blue line represents the students' GPA when classes were conducted face-to-face, and the orange line represents the students' GPA when classes were conducted online. The x-axis represents the students and the y-axis represents the GPA. The pattern in Figure 2 clearly shows that most students obtained excellent results when classes were conducted online: the orange line lies mostly above the blue line, which indicates that the majority of students scored higher during online classes than during face-to-face classes. The prediction model was therefore used to estimate the students' GPA performance for the coming semester if online classes continue.

In this study, an SVM classification model is used to predict students' academic performance. It is therefore important to identify the best parameters for the SVM classification, since selecting the best parameters improves the accuracy of classification. This selection process is called parameter tuning. Table 1 shows the parameter tuning process; the best parameters were chosen according to the smallest misclassification error. However, the misclassification errors in Table 1 differ only slightly from one another.
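The tuning step amounts to a grid search: evaluate every (cost, epsilon) pair and keep the one with the smallest misclassification error. The sketch below shows that selection logic only. The function `toy_error` is a hypothetical stand-in for training and validating an SVM, constructed so that its minimum sits at the pair reported in the text, and the grid values are illustrative rather than those of Table 1.

```python
from itertools import product


def tune(costs, epsilons, error_fn):
    """Return (best_error, best_cost, best_epsilon) over the full grid."""
    best = None
    for cost, eps in product(costs, epsilons):
        err = error_fn(cost, eps)
        # Keep the first pair that achieves the smallest error so far.
        if best is None or err < best[0]:
            best = (err, cost, eps)
    return best


def toy_error(cost, eps):
    # Hypothetical error surface, NOT measured values from the study.
    return 0.26 + 0.001 * abs(128 - cost) / 128 + 0.05 * abs(eps - 0.6)


costs = [2 ** k for k in range(2, 8)]                  # 4, 8, ..., 128
epsilons = [round(0.1 * k, 1) for k in range(0, 11)]   # 0.0, 0.1, ..., 1.0
best_error, best_cost, best_epsilon = tune(costs, epsilons, toy_error)
```

In practice `error_fn` would fit the classifier with the given pair and return its misclassification error on held-out data.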
Because the misclassification errors in the table are so close, plotting the performance of the parameter tuning helps to identify the best pair of parameters. Figure 3 shows cost plotted against epsilon together with the resulting misclassification error, which is read from the colour scale on the right side of the figure. Figure 3 clearly shows that the colour becomes darker when epsilon reaches 0.6, and is darkest of all as the cost approaches its higher values. The parameter tuning therefore indicates that the best pair of parameters is 128 for the $C$ value and 0.6 for epsilon. With this pair of parameters, the SVM classifier gives the best classification accuracy.

The choice of kernel is also a main point of this study in obtaining the best result from SVM. The kernels used in this study are radial basis function, sigmoid, polynomial and linear. Each kernel function has a particular parameter that must be optimized to obtain the best performance [25]. Table 2 reports the number of support vectors and the misclassification error obtained by SVM for each kernel. The number of support vectors reflects how many data points lie close to or far from the hyperplane during classification. A suitable number of support vectors is a medium value, since the number indicates whether the classification is overfitting or underfitting. Based on Table 2, the highest number of support vectors is 60 and the lowest is 56, so the medium values are 58 and 59, which correspond to the radial basis function and linear kernels respectively. The accuracy of the classification model is then determined by subtracting the misclassification error from 1. The misclassification errors were obtained from the confusion matrix between the predicted and actual classes. The best accuracy was achieved with the linear kernel, at 73.68%, and the worst with the sigmoid kernel, at 67.99%.
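The accuracy calculation can be illustrated with a small confusion matrix. The counts below are hypothetical, chosen only to show the arithmetic of accuracy = 1 - misclassification error; they are not the study's confusion matrix.

```python
def misclassification_error(confusion):
    """confusion[i][j] = number of class-i samples predicted as class j."""
    total = sum(sum(row) for row in confusion)
    # Correct predictions lie on the diagonal of the matrix.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


# Hypothetical 2x2 confusion matrix (rows: actual, columns: predicted).
confusion = [[40, 10],
             [6, 24]]
error = misclassification_error(confusion)
accuracy = 1 - error
```

Here 64 of the 80 hypothetical samples fall on the diagonal, so the error is 16/80 and the accuracy is the remainder.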

Conclusion
In this study, students' GPA under face-to-face learning (before the MCO) and during online learning (during the MCO) was studied to find the pattern of students' academic performance. The results show that students performed better when classes were conducted online than when they were conducted face-to-face: most students achieved a higher GPA during online learning than in the preceding semester, before online learning. SVM classification was then applied to predict students' academic performance based on their current CGPA and their ages. The best model accuracy, 73.68%, was achieved with the linear kernel, using a misclassification tolerance $C$ of 128 and an epsilon of 0.6. The best number of support vectors for this study is 59, which indicates neither overfitting nor underfitting. This SVM classification model therefore successfully predicted students' academic performance using the misclassification tolerance $C$ and epsilon as parameters. In future work, the accuracy of predicting students' academic performance in this study can be compared with that of other machine learning models such as Artificial Neural Network (ANN) and Relevance Vector Machine (RVM). Predicting students' academic performance can be very useful in many contexts, especially in management, such as identifying excellent students for scholarship programmes and admissions, and quickly identifying students who are unlikely to graduate.