Machine Learning Methods Performance Evaluation

In this paper, we describe an approach to air pollution modeling under data incompleteness, when the sensors cover the monitoring area only partially. The fundamental calculus and metrics used with machine learning modeling algorithms are presented. Moreover, the indicators and metrics for evaluating the performance of machine learning methods are described. Based on the conducted analysis, conclusions on the most appropriate evaluation approaches are drawn.


Introduction
Modern society is at an active stage of globalization, and the needs of the economic, political, cultural, and other spheres are constantly growing. Growing needs also entail an increase in the scale of production activities. However, the large production processes that are an integral part of the world economy can be sources of environmental pollution and affect both the health of the population of nearby territories and the state of the flora and fauna.
Research in the field of environmental monitoring has been conducted for many years by scientific communities from different countries. Of particular interest in this area is the development of methods for making short-term forecasts of air pollution, the results of which can be used for operational response to industrial emissions and for forming a control action that prevents the spread of pollution beyond a certain controlled area.
In this paper, we solve the task of controlling the concentration of harmful substances emitted into the air during the operation of a critical facility and of organizing a timely response to such emissions. The activity of the facility under consideration is coal transportation.
The monitoring module is based on the concept of distributed self-organizing cyber-physical systems, represented as a set of various elements: sensors, data transmission means, computing devices, etc. The main role here is played by the system's sensors, which are responsible for collecting meteorological data and data on individual pollutant concentrations in the atmosphere over the territory of the sanitary zone.

Approach for Machine Learning Methods Performance Evaluation
Below we present and discuss metrics that can be used to assess the effectiveness of various artificial intelligence methods in predicting environmental pollution [1][2][3].
MSE (Mean Square Error). The MSE is the mean of the squared deviations of the actual values from the calculated values. However, squaring a deviation greatly inflates the values that lie far from all others and shrinks the deviations whose magnitude is between 0 and 1.
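The effect of squaring can be seen in a minimal sketch in plain Python. The function name `mse` and the sample values are illustrative assumptions and are not taken from the study.

```python
def mse(actual, predicted):
    """Mean of the squared deviations between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values: three deviations of 0.5 and one deviation of 4.0.
actual = [1.0, 2.0, 3.0, 10.0]
predicted = [0.5, 1.5, 2.5, 6.0]

# Squaring shrinks the sub-unit deviations (0.5 -> 0.25)
# and inflates the large one (4.0 -> 16.0).
squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
print(squared)                 # [0.25, 0.25, 0.25, 16.0]
print(mse(actual, predicted))  # 4.1875
```

In this example, the single large deviation contributes 16.0 of the 16.75 total, so it dominates the score.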
RMSE (Root Mean Square Error). The advantage of RMSE over MSE is that the error is expressed in the same units and on the same order of magnitude as the estimated values, which makes the effectiveness of a predictive model easier to judge from RMSE than from MSE.
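A short sketch of the difference in scale between the two metrics follows. The function name `rmse` and the values, given in arbitrary concentration units, are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Square root of the mean squared deviation."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mean_sq)


# Hypothetical values: the deviations are 1, 7, 1 and 7 units.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 13.0, 31.0, 33.0]

# MSE is (1 + 49 + 1 + 49) / 4 = 25 squared units,
# while RMSE is 5 units, directly comparable with the data.
print(rmse(actual, predicted))  # 5.0
```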
MAE (Mean Absolute Error) estimates the mean absolute error of the prediction results. The undoubted advantage of MAE is that taking the absolute value of a deviation does not amplify the deviations that are considered outliers. Therefore, this estimate is more robust than MSE. The constant forecast that minimizes MAE is the median of the data, whereas the one that minimizes MSE is the mean.
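The link between MAE and the median can be checked numerically. The sketch below compares two constant forecasts on a hypothetical series containing one outlier; the function name `mae` and the data are assumptions made for illustration.

```python
from statistics import mean, median


def mae(actual, predicted):
    """Mean of the absolute deviations between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical series whose last value is an outlier.
actual = [1.0, 2.0, 3.0, 4.0, 100.0]

# Two constant forecasts: the median (3.0) and the mean (22.0) of the series.
by_median = [median(actual)] * len(actual)
by_mean = [mean(actual)] * len(actual)

print(mae(actual, by_median))  # 20.2
print(mae(actual, by_mean))    # 31.2
```

The median forecast gives the lower MAE because the outlier pulls the mean toward itself but leaves the median in place.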
The determination coefficient (R²) reflects the share of the variance in the data that is explained by the model. The determination coefficient is used in regression analysis more than in other forecasting methods, therefore it can be used when evaluating extrapolation models. Moreover, it is scale-free. If the model fits the data series perfectly, the R² value is 1. If the model does not describe the series at all and merely reproduces its mean as a horizontal line, the coefficient of determination equals 0. For nonlinear models, the coefficient can also become negative, in which case it cannot be interpreted as a share of variance.
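The three cases mentioned above (perfect fit, a constant line at the mean, and a negative value) can be reproduced with the standard definition of the coefficient, one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the data are illustrative assumptions.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0, 5.0]

print(r_squared(actual, actual))                     # 1.0: perfect fit
print(r_squared(actual, [3.0] * 5))                  # 0.0: constant line at the mean
print(r_squared(actual, [5.0, 4.0, 3.0, 2.0, 1.0]))  # -3.0: worse than the mean
```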
However, the determination coefficient is calculated from the training part of the sample, which means it only shows how well the model describes the data. Accuracy of description does not guarantee accuracy of forecasts. Therefore, this coefficient can be used to assess the adequacy of the model rather than its forecasting accuracy.
MAPE (Mean Absolute Percentage Error). This ratio can be expressed as a fraction or as a percentage and is interpreted as the relative deviation of the forecasts from the actual values.
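\mathrm{MAPE} = \frac{1}{h} \sum_{j=1}^{h} \frac{|e_{T+j}|}{|y_{T+j}|},
where e_{T+j} is the forecast error at step j, y_{T+j} is the actual value at that step, and h is the number of forecast steps.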
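A minimal sketch of MAPE as a fraction follows. The function name `mape` and the sample values are hypothetical. The guard reflects the fact that the ratio is undefined whenever an actual value is zero, which matters for data containing zero concentrations.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values: the relative errors are 0.5, 0.25 and 0.0.
actual = [2.0, 4.0, 8.0]
predicted = [1.0, 5.0, 8.0]

score = mape(actual, predicted)
print(score)        # 0.25 as a fraction
print(100 * score)  # 25.0 as a percentage
```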
Statistical methods are used to describe the acceptance or rejection of the null hypothesis in forecasting tasks. Such methods distinguish errors of the first and second type, acceptance of a correct null hypothesis, and rejection of an incorrect null hypothesis.
A tested hypothesis counts as a True Positive if the null hypothesis is correctly accepted, that is, the experimental result is consistent with it. If the null hypothesis is correctly rejected, the outcome is a True Negative.
During statistical hypothesis testing, errors of the first and second type may occur. A False Positive (type I error) means that the null hypothesis was rejected incorrectly, and a False Negative (type II error) means that the null hypothesis was incorrectly accepted.
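For a binary classifier, the four outcomes can be counted directly from the true and predicted labels. The sketch below assumes that the label 1 marks the positive class; the function name `confusion_counts` and the label vectors are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for one class treated as positive."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["TP"] += 1
        elif pred == positive:
            counts["FP"] += 1  # predicted positive, actually negative
        elif truth == positive:
            counts["FN"] += 1  # predicted negative, actually positive
        else:
            counts["TN"] += 1
    return counts


# Hypothetical labels for eight observations.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

print(confusion_counts(y_true, y_pred))
# {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```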
In the forecasting module being developed, the level of coal dust concentration is determined in two successive stages:
• classification;
• regression.
The classification stage defines two categories of values: zero and non-zero coal dust concentrations. A zero value means a value significantly less than the resolution of the sensor. If an observation is classified as non-zero, the second stage, regression, is carried out to determine a specific concentration value.
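The control flow of the two stages can be sketched as follows. The classifier and regressor here are toy stand-ins (a threshold and a linear rule on a single hypothetical feature) and do not represent the models used in the module. All names are assumptions.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def predict_concentration(
    features: Features,
    is_nonzero: Callable[[Features], bool],
    regress: Callable[[Features], float],
) -> float:
    """Stage 1 decides zero vs non-zero; stage 2 runs only for non-zero."""
    if not is_nonzero(features):
        return 0.0  # below sensor resolution, so no regression is needed
    return regress(features)


# Toy stand-ins for trained models (hypothetical rules, not from the study).
def toy_classifier(features: Features) -> bool:
    return features[0] > 2.0


def toy_regressor(features: Features) -> float:
    return 0.25 * features[0]


print(predict_concentration([1.0], toy_classifier, toy_regressor))  # 0.0
print(predict_concentration([5.0], toy_classifier, toy_regressor))  # 1.25
```

Passing the two models as callables keeps the stages independent, so each can be evaluated with its own metrics.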
The quality of the classification is assessed with Accuracy, Precision, Recall, and F1. When building classification models for the forecasting module, the F1 macro metric was chosen as a priority for the following reasons:
• the data sample is not balanced;
• F1 is a metric that combines the Precision and Recall metrics.
To assess the quality of the regression, the MSE and MAE metrics were taken into account; however, preference was given to MAE, since the work is carried out mainly with numbers less than one, whose squared deviations MSE shrinks.
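The sketch below illustrates why an unbalanced sample motivates F1 macro. It uses hypothetical labels in which a model always predicts the majority class. The function names are assumptions, and scoring an undefined Precision or Recall as 0.0 is a convention chosen here, not one stated in the study.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Assumption: undefined ratios are scored as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Share of observations whose predicted label matches the true one."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical unbalanced sample: 8 zero and 2 non-zero observations,
# with a model that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10

print(accuracy(y_true, y_pred))            # 0.8
print(round(f1_macro(y_true, y_pred), 3))  # 0.444
```

Accuracy looks acceptable although the minority class is never detected, while F1 macro drops because the F1 of the minority class is zero.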

Conclusion
In this paper, we described and discussed machine learning performance evaluation methods related to the task of environmental pollution modeling and forecasting. This study is part of the development of an environmental monitoring module, and here we presented methods and calculus for forecasting performance evaluation. Metrics such as Mean Square Error, Root Mean Square Error, Mean Absolute Error, the determination coefficient, and Mean Absolute Percentage Error were described. To assess the classification performance of the machine learning models, indicators such as Accuracy, Recall, and F1 were considered, which depend on the number of True Positive, True Negative, False Positive, and False Negative classification cases. Based on the conducted analysis, Mean Absolute Error was chosen as the most relevant indicator in our case, as numbers less than 1 are prevalent in our project.