Here I have described some highlights of my research. For further details, see my list of my publications.
When can machines learn from just a few examples?
Machine Learning and Artificial Intelligence have seemingly become synonymous with Big Data – we imagine neural networks and other tools being trained on vast datasets to learn to perform tasks. But is it possible for these systems to learn from small datasets consisting of just a few (e.g. 5 or 10) examples of a new class of objects?
In a recently submitted article we investigate just this question. We build on existing work showing that implicit properties of high dimensional spaces can make learning from few examples possible in certain circumstances using very simple linear classifiers (a so-called blessing of dimensionality). We are particularly interested in whether nonlinear feature maps, i.e. nonlinear mappings of the dataset into much higher dimensional spaces, can accelerate the onset of this blessing of dimensionality, thereby enabling algorithms to learn from few examples. We are able to provide specific relationships between the distributions of the data classes we are learning, the geometry of the feature mappings, and the probability of successfully learning from just a few examples. To illustrate our results, we consider the MNIST handwritten digit recognition problem. We first train a (relatively simple) neural network to recognise 9 of the 10 digits. We then build a simple Fisher linear discriminant classifier in feature space to recognise the remaining (previously unseen) digit using just 10 training examples. Surprisingly, the AI is able to successfully recognise the extra digit, even though it never saw it while the underlying network was being trained and only saw a few examples of it during calibration!
To find out more about how this works and when it is possible, see our arXiv preprint:
O. J. Sutton, A. N. Gorban and I. Y. Tyukin (2022). Towards a mathematical understanding of learning from few examples with nonlinear feature maps, arXiv:2211.03607, .
Adaptive polygonal meshes for parabolic problems, using virtual element methods
We have extended our previous work (described below) on numerical methods using adaptive polygonal meshes to (time dependent) parabolic problems. This means that aggressive mesh coarsening becomes a real possibility, since patches of elements with low error can simply be merged together. In particular, this allows the shape of the mesh to directly match the shape of the solution being approximated, as shown in the videos here. The key to the argument turned out to be in deriving new error estimators which were applicable even when the finite element spaces may be completely different between time-steps - i.e. they are non-hierarchical. Our estimates are therefore also applicable to moving mesh methods and other schemes presenting similar non-hierarchicalities.
The full details of this research can be found in our article:
A. Cangiani, E. H. Georgoulis, and O. J. Sutton (2021). Adaptive non-hierarchical Galerkin methods for parabolic problems with application to moving mesh and virtual element methods, Mathematical Models and Methods in Applied Sciences, 31 (4) 711-751.
To ensure that the estimator effectivities remain bounded with respect to time, the analysis adopts the time accumulation framework developed in my earlier paper:
O. J. Sutton (2020). Long-time L∞(L2) a posteriori error estimates for fully discrete parabolic problems, IMA Journal of Numerical Analysis, Volume 40, Issue 1, 498–529..
New pattern forming mechanisms in reaction-diffusion systems, discovered using an efficient computational framework
By developing an efficient computational framework, we have recently discovered entirely new pattern forming mechanisms in a time dependent reaction-diffusion system with a cyclic competition structure. Such systems describe a wide variety of natural phenomena, particularly the competion between certain groups of organisms. These interactions are structured like a game of rock-paper-scissors: species A outcompetes species B, species B outcompetes species C, and species C in turn outcompetes species A.
Some examples of these new patterns are shown here, with a different colour indicating the regions dominated by each of the three species. The patterns typically feature zones of chaos and zones of stable travelling structures. To produce these simulations, we developed an efficient computational framework based around an automatic adaptive algorithm for the computational mesh driven by rigorous new a posteriori estimates of the numerical simulation error.
A snapshot of these patterns was featured on the front cover of the May 2018 issue of Proceedings of the Royal Society A. For the full details of the patterns and the framework we developed to simulate them, see our paper:
A. Cangiani, E. H. Georgoulis, A. Yu. Morozov, and O. J. Sutton (2018). Revealing new dynamical patterns in a reaction-diffusion model with cyclic competition via a novel computational framework, Proceedings of the Royal Society A, 474 20170608.
Solving elliptic problems using adaptive polygonal meshes, with virtual element methods
Numerical methods which use polygonal meshes bring with them the exciting prospect of very general adaptive meshing, allowing computational effort to be tightly focussed on areas of the domain where it is required. For instance, coarsening areas of the mesh becomes as simple as merging patches of elements, as demonstrated by the sequence of adapted meshes shown here.
This is something we have been recently exploring for elliptic problems by developing rigorous and computable local error estimators for virtual element methods. The full details can be found in our article:
A. Cangiani, E. H. Georgoulis, T. Pryer, and O. J. Sutton (2017). A posteriori error estimates for the virtual element method, Numerische Mathematik, 137(4), 857-893.
The analysis is based on the numerical scheme we developed in the article:
A. Cangiani, G. Manzini, and O. J. Sutton (2017). Conforming and nonconforming virtual element methods for elliptic problems, IMA Journal of Numerical Analysis, 37(3), 1317-1354.
The virtual element method: now in just 50 lines of Matlab!
The virtual element method is a recently introduced method for approximating solutions to partial differential equations using general polygonal meshes. To help clarify the design and implementation of the method, and show how the theory translates into practice, I have developed and dissected an accessible 50 line Matlab implementation for solving Poisson’s problem.
The code itself is available online, and has been downloaded over 700 times (it is also available through Netlib as the package na45 in the numeralgo library). The details of the formulation of the virtual element method for this problem, and a discussion of how each component may be implemented, is given in my article:
O. J. Sutton (2017). The virtual element method in 50 lines of MATLAB, Numerical Algorithms, 75(4), 1141-1159.