Discovering novel algorithms with AlphaTensor
Published 5 October 2022
Alhussein Fawzi, Matej Balog, Bernardino Romera-Paredes, Demis Hassabis, Pushmeet Kohli
First extension of AlphaZero to mathematics unlocks new possibilities for research
Algorithms have helped mathematicians perform fundamental operations for thousands of years. The ancient Egyptians created an algorithm to multiply two numbers without requiring a multiplication table, and the Greek mathematician Euclid described an algorithm to compute the greatest common divisor, which is still in use today.
During the Islamic Golden Age, the Persian mathematician Muhammad ibn Musa al-Khwarizmi designed new algorithms to solve linear and quadratic equations. In fact, al-Khwarizmi's name, translated into Latin as Algoritmi, led to the term algorithm. But despite the familiarity with algorithms today, used throughout society from classroom algebra to cutting-edge scientific research, the process of discovering new algorithms is incredibly difficult, and another example of the human mind's remarkable reasoning abilities.
Our paper, published today in Nature, introduces AlphaTensor, the first artificial intelligence (AI) system for discovering novel, efficient, and provably correct algorithms for fundamental tasks such as matrix multiplication. This sheds light on a 50-year-old open question in mathematics about finding the fastest way to multiply two matrices.
This paper is a stepping stone in DeepMind's mission to advance science and unlock the most fundamental problems using AI. Our system, AlphaTensor, builds upon AlphaZero, an agent that has shown superhuman performance on board games such as chess, Go, and shogi.
Matrix multiplication
Matrix multiplication is one of the simplest operations in algebra, commonly taught in high school maths classes. But outside the classroom, this humble mathematical operation has enormous impact in the contemporary digital world and is ubiquitous in modern computing.
Example of the process of multiplying two 3×3 matrices.
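To make the operation concrete, here's a minimal sketch (ours, not from the paper) of the standard algorithm in Python: each entry of the product is the dot product of a row of the first matrix with a column of the second, which costs n³ scalar multiplications for two n×n matrices.

```python
import numpy as np

def matmul_standard(A, B):
    """Textbook matrix multiplication: C[i, j] = sum over k of A[i, k] * B[k, j].
    For two n x n matrices this performs n**3 scalar multiplications."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must match"
    C = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i, j] += A[i, k] * B[k, j]
    return C

A, B = np.random.rand(3, 3), np.random.rand(3, 3)
assert np.allclose(matmul_standard(A, B), A @ B)
```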
It's used for processing images on smartphones, recognising speech commands, generating graphics for computer games, running simulations to predict the weather, compressing data and videos for sharing on the internet, and much more. Companies around the world spend large amounts of time and money developing computing hardware to efficiently multiply matrices. So, even minor improvements to the efficiency of matrix multiplication can have a widespread impact.
For centuries, mathematicians believed that the standard matrix multiplication algorithm was the best one could achieve in terms of efficiency. But in 1969, the German mathematician Volker Strassen shocked the mathematical community by showing that better algorithms do exist.
The standard algorithm compared to Strassen's algorithm, which uses one less scalar multiplication (7 instead of 8) for multiplying 2×2 matrices. Multiplications matter much more than additions for overall efficiency.
By studying very small matrices (size 2×2), Strassen discovered an ingenious way of combining the entries of the matrices to yield a faster algorithm. Despite decades of research following his breakthrough, larger versions of this problem have remained unsolved, to the extent that it's not known how efficiently it's possible to multiply two matrices as small as 3×3.
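Strassen's scheme is published and well known; the sketch below (ours) spells out its seven products for 2×2 matrices and checks the result against NumPy's built-in multiplication.

```python
import numpy as np

def strassen_2x2(A, B):
    """Strassen's algorithm for 2x2 matrices: 7 multiplications instead of 8."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # The four entries of C are recovered using only additions and subtractions.
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4,           m1 - m2 + m3 + m6]])

A, B = np.random.rand(2, 2), np.random.rand(2, 2)
assert np.allclose(strassen_2x2(A, B), A @ B)
```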
In our paper, we explored how modern AI techniques could advance the automatic discovery of new matrix multiplication algorithms. Building on the progress of human intuition, AlphaTensor discovered algorithms that are more efficient than the state of the art for many matrix sizes. Our AI-designed algorithms outperform human-designed ones, which is a major step forward in the field of algorithmic discovery.
The process and progress of automating algorithmic discovery
First, we converted the problem of finding efficient algorithms for matrix multiplication into a single-player game. In this game, the board is a three-dimensional tensor (array of numbers), capturing how far from correct the current algorithm is. Through a set of allowed moves, corresponding to algorithm instructions, the player attempts to modify the tensor and zero out its entries. When the player manages to do so, this results in a provably correct matrix multiplication algorithm for any pair of matrices, and its efficiency is captured by the number of steps taken to zero out the tensor.
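To make the win condition concrete, here's a small sketch (ours; the flattening convention is our choice, not necessarily the paper's exact encoding). The 2×2 matrix multiplication tensor has eight nonzero entries, and Strassen's seven products, expressed as rank-one "moves", zero it out in seven steps, one fewer than the standard algorithm needs.

```python
import numpy as np

# The 'board': a 4x4x4 tensor T where T[a, b, c] = 1 exactly when entry a of A
# times entry b of B contributes to entry c of C (entries flattened row-major).
T = np.zeros((4, 4, 4), dtype=int)
for i in range(2):
    for j in range(2):
        for k in range(2):
            T[2*i + j, 2*j + k, 2*i + k] = 1  # C[i, k] += A[i, j] * B[j, k]

# Strassen's algorithm written as seven moves: move r is the rank-one tensor
# built from coefficient vectors u_r (over A), v_r (over B), and w_r (over C).
U = np.array([[1,0,0,1], [0,0,1,1], [1,0,0,0], [0,0,0,1], [1,1,0,0], [-1,0,1,0], [0,1,0,-1]])
V = np.array([[1,0,0,1], [1,0,0,0], [0,1,0,-1], [-1,0,1,0], [0,0,0,1], [1,1,0,0], [0,0,1,1]])
W = np.array([[1,0,0,1], [0,0,1,-1], [0,1,0,1], [1,0,1,0], [-1,1,0,0], [0,0,0,1], [1,0,0,0]])

residual = T.copy()
for u, v, w in zip(U, V, W):
    residual = residual - np.einsum('a,b,c->abc', u, v, w)  # one game move
assert not residual.any()  # zeroed in 7 moves: a 7-multiplication algorithm
```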
This game is incredibly challenging: the number of possible algorithms to consider is much greater than the number of atoms in the universe, even for small cases of matrix multiplication. Compared to the game of Go, which remained a challenge for AI for decades, the number of possible moves at each step of our game is 30 orders of magnitude larger (above 10³³ for one of the settings we consider).

Essentially, to play this game well, one needs to identify the tiniest of needles in a gigantic haystack of possible algorithms. To tackle the challenges of this domain, which departs significantly from traditional games, we developed multiple crucial components: a novel neural network architecture that incorporates problem-specific inductive biases, a procedure to generate useful synthetic data, and a recipe to leverage symmetries of the problem.
We then trained an AlphaTensor agent using reinforcement learning to play the game, starting without any knowledge about existing matrix multiplication algorithms. Through learning, AlphaTensor gradually improves over time, re-discovering historical fast matrix multiplication algorithms such as Strassen's, and eventually surpassing the realm of human intuition to discover algorithms faster than previously known.
The single-player game played by AlphaTensor, where the goal is to find a correct matrix multiplication algorithm. The state of the game is a cubic array of numbers (shown as grey for 0, blue for 1, and green for -1), representing the remaining work to be done.
For example, if the traditional algorithm taught in school multiplies a 4×5 by a 5×5 matrix using 100 multiplications, and this number was reduced to 80 with human ingenuity, AlphaTensor has found algorithms that do the same operation using just 76 multiplications.
The algorithm discovered by AlphaTensor using 76 multiplications, an improvement over state-of-the-art algorithms.
Beyond this example, AlphaTensor's algorithm improves on Strassen's two-level algorithm in a finite field for the first time since its discovery 50 years ago. These algorithms for multiplying small matrices can be used as primitives to multiply much larger matrices of arbitrary size.
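As an illustration of how a small primitive scales up, here's a minimal sketch (ours) of the classical recursive construction: the same seven-product scheme applied with matrix blocks in place of scalars, assuming square matrices whose size is a power of two.

```python
import numpy as np

def strassen(A, B):
    """Block-recursive Strassen: the 2x2 scheme where each 'entry' is a matrix block."""
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

A, B = np.random.rand(8, 8), np.random.rand(8, 8)
assert np.allclose(strassen(A, B), A @ B)
```

Each level of recursion replaces 8 block multiplications with 7, which is why the savings compound for large matrices.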
Moreover, AlphaTensor also discovers a diverse set of algorithms with state-of-the-art complexity, up to thousands of matrix multiplication algorithms for each size, showing that the space of matrix multiplication algorithms is richer than previously thought.
Algorithms in this rich space have different mathematical and practical properties. Leveraging this diversity, we adapted AlphaTensor to specifically find algorithms that are fast on a given hardware, such as the Nvidia V100 GPU and the Google TPU v2. These algorithms multiply large matrices 10-20% faster than the commonly used algorithms on the same hardware, which showcases AlphaTensor's flexibility in optimising arbitrary objectives.
AlphaTensor with an objective corresponding to the runtime of the algorithm. When a correct matrix multiplication algorithm is discovered, it's benchmarked on the target hardware, and the result is fed back to AlphaTensor, in order to learn more efficient algorithms for that hardware.
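The hardware-in-the-loop idea can be pictured with a minimal timing harness (hypothetical and ours; benchmark and candidate_matmul are illustrative names, not the paper's interface): the measured runtime of a discovered algorithm on the target device, relative to a baseline, becomes the learning signal.

```python
import time
import numpy as np

def benchmark(matmul_fn, n=2048, repeats=10):
    """Median wall-clock runtime of a candidate matmul on this machine."""
    A, B = np.random.rand(n, n), np.random.rand(n, n)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        matmul_fn(A, B)
        times.append(time.perf_counter() - start)
    return float(np.median(times))

baseline = benchmark(lambda A, B: A @ B)  # the commonly used algorithm
# candidate_matmul would be a discovered algorithm; its speedup over the
# baseline is fed back to the agent as the reward to maximise:
# reward = baseline / benchmark(candidate_matmul)
```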
Exploring the impact on future research and applications
From a mathematical standpoint, our results can guide further research in complexity theory, which aims to determine the fastest algorithms for solving computational problems. By exploring the space of possible algorithms in a more effective way than previous approaches, AlphaTensor helps advance our understanding of the richness of matrix multiplication algorithms. Understanding this space may unlock new results for helping determine the asymptotic complexity of matrix multiplication, one of the most fundamental open problems in computer science.

Because matrix multiplication is a core component of many computational tasks, spanning computer graphics, digital communications, neural network training, and scientific computing, AlphaTensor-discovered algorithms could make computations in these fields significantly more efficient. AlphaTensor's flexibility to consider any kind of objective could also spur new applications for designing algorithms that optimise metrics such as energy usage and numerical stability, helping prevent small rounding errors from snowballing as the algorithm works.

While we focused here on the particular problem of matrix multiplication, we hope that our paper will inspire others to use AI to guide algorithmic discovery for other fundamental computational tasks. Our research also shows that AlphaZero is a powerful algorithm that can be extended well beyond the domain of traditional games to help solve open problems in mathematics. Building on our research, we hope to spur a greater body of work, applying AI to help society solve some of the most important challenges in mathematics and across the sciences.
More details can be found in the AlphaTensor GitHub repository.
Acknowledgments
We thank Francisco J. R. Ruiz, Thomas Hubert, Alexander Novikov and Alex Gaunt for feedback on the blog post, and Sean Carlson, Arielle Bier, Gabriella Pearl, Katie McAtackney and Max Barnett for help with the text and figures. This work was done by Alhussein Fawzi, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Francisco J. R. Ruiz, Julian Schrittwieser, Grzegorz Swirszcz, David Silver, Demis Hassabis and Pushmeet Kohli.