Research in Machine Learning
Using Neural Networks in Science
One of the research areas we are interested in is how to use machine learning (ML) for science (such as physics, chemistry, etc.). We work on developing methods and processes that make it possible to use ML techniques to better understand the world around us in a scientific way. Our goal is to develop methodologies that can be applied to different sciences. The first paper that deals with these questions is
Michelucci, U.; Venturini, F. Estimating Neural Network’s Performance with Bootstrap: a Tutorial. Preprints 2021, 2021010431 [Get PDF]
It deals with the question of how to correctly estimate the performance of a neural network, given its stochastic components. The paper also discusses at length the Central Limit Theorem and its relevance to neural networks.
In this paper we show how the distribution of an averaging statistical estimator (MSE, accuracy, etc.) tends to follow a normal distribution, which allows us to use the average and the standard deviation to characterise the performance of neural networks completely.
A numerical demonstration of the Central Limit Theorem. Panel (a) shows the asymmetric chi-squared distribution of random values for k = 10, shifted so that its average is equal to zero; panels (b), (c) and (d) show the distribution of the average of the random values for sample sizes n = 2, n = 10 and n = 200 respectively. More information can be found in Michelucci, U.; Venturini, F. Estimating Neural Network’s Performance with Bootstrap: a Tutorial. Preprints 2021, 2021010431 [Get PDF].
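The numerical demonstration described in the caption can be reproduced roughly as follows. This is a minimal sketch and not the code used in the paper: the number of repetitions, the random seed, and the helper names `centered_chi2_means` and `skewness` are assumptions made for illustration. It draws chi-squared values with k = 10, shifts them to zero average, and shows that the distribution of their average becomes more symmetric (skewness closer to zero) as the sample size n grows.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed, chosen only for reproducibility

K = 10               # degrees of freedom of the chi-squared distribution
N_REPEATS = 20_000   # number of averages computed per sample size (assumed value)


def centered_chi2_means(n, repeats=N_REPEATS):
    """Return `repeats` averages of n chi-squared(K) draws, shifted to zero mean."""
    # The mean of a chi-squared distribution with K degrees of freedom is K.
    draws = rng.chisquare(df=K, size=(repeats, n)) - K
    return draws.mean(axis=1)


def skewness(x):
    """Sample skewness; it is zero for a symmetric distribution such as the normal."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))


# n = 1 corresponds to the raw distribution, n = 2, 10, 200 to the averages.
results = {n: centered_chi2_means(n) for n in (1, 2, 10, 200)}
for n, means in results.items():
    print(f"n={n:>3}  std={means.std():.3f}  skewness={skewness(means):.3f}")
```

The standard deviation of the averages should shrink roughly as 1/sqrt(n), and the skewness should approach zero, which is the behaviour the Central Limit Theorem predicts.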
The results of neural networks depend strongly on the training data, the weight initialisation, and the chosen hyper-parameters. Determining the distribution of a statistical estimator, such as the Mean Squared Error (MSE) or the accuracy, is fundamental to evaluating the performance of a neural network model (NNM). For many machine learning models, such as linear regression, it is possible to obtain information such as the variance or confidence intervals of the results analytically. Neural networks are not analytically tractable due to their complexity. Therefore, it is not possible to easily estimate the distributions of statistical estimators. When estimating the global performance of an NNM by estimating the MSE in a regression problem, for example, it is important to know the variance of the MSE. Bootstrap is one of the most important resampling techniques for estimating averages and variances, among other properties, of statistical estimators. In this tutorial, the application of two resampling techniques (including bootstrap) to the evaluation of neural networks’ performance is explained from both a theoretical and a practical point of view. Pseudo-code of the algorithms is provided to facilitate their implementation. Computational aspects, such as the training time, are discussed, since resampling techniques always require running simulations many thousands of times and are therefore computationally intensive. A specific version of the bootstrap algorithm is presented that allows the distribution of a statistical estimator to be estimated in a computationally effective way when dealing with an NNM. Finally, the algorithms are compared on synthetically generated data to demonstrate their performance.
Figure 1: Distribution of the MSE values obtained by evaluating a trained NNM on 1800 bootstrap samples generated from a validation dataset. The Neural Network Model (NNM) used is a small neural network with two layers, each having 4 neurons with sigmoid activation functions, trained for 250 epochs with a mini-batch size of 16 and the Adam optimizer.
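The kind of evaluation shown in Figure 1 can be sketched as follows. This is an illustration under stated assumptions, not the algorithm from the paper: the validation targets and the predictions are synthetic stand-ins for the output of an already trained NNM, and the function name `bootstrap_mse` is hypothetical. The model is evaluated once, and only the validation set is resampled with replacement, so no retraining is needed.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed, chosen only for reproducibility

# Hypothetical stand-ins: in practice y_true is the validation target vector
# and y_pred holds the predictions of the trained network on the same inputs.
n_val = 500
y_true = rng.normal(size=n_val)
y_pred = y_true + rng.normal(scale=0.3, size=n_val)


def bootstrap_mse(y_true, y_pred, n_bootstrap=1800, rng=rng):
    """Return the MSE computed on n_bootstrap resamples of the validation set."""
    n = len(y_true)
    squared_errors = (y_true - y_pred) ** 2
    # Each row holds the indices of one bootstrap sample, drawn with replacement.
    idx = rng.integers(0, n, size=(n_bootstrap, n))
    return squared_errors[idx].mean(axis=1)


mse_samples = bootstrap_mse(y_true, y_pred)
mse_mean = mse_samples.mean()
mse_std = mse_samples.std(ddof=1)
ci_low, ci_high = np.percentile(mse_samples, [2.5, 97.5])

print(f"MSE = {mse_mean:.4f} +/- {mse_std:.4f}")
print(f"95% percentile interval: [{ci_low:.4f}, {ci_high:.4f}]")
```

A histogram of `mse_samples` gives a plot of the same kind as Figure 1. Because the MSE is an average, its bootstrap distribution is expected to be close to normal, so the average and standard deviation summarise it well.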