Research towards AI models that can generalize, scale, and accelerate science
Next week marks the start of the 11th International Conference on Learning Representations (ICLR), taking place May 1 to 5 in Kigali, Rwanda. This will be the first major artificial intelligence (AI) conference hosted in Africa and the first in-person event since the start of the pandemic.
Researchers from around the world will gather to share their cutting-edge work in deep learning, spanning the fields of AI, statistics, and data science, and applications including machine vision, gaming, and robotics. We are proud to support the conference as a Diamond sponsor and DEI champion.
Teams from across DeepMind are presenting 23 papers this year. Here are a few highlights:
Open questions on the path to AGI
Recent progress has shown AI's incredible performance in text and images, but more research is needed for systems to generalize across domains and scales. This will be a crucial step on the path to developing artificial general intelligence (AGI) as a transformative tool in our everyday lives.
We present a new approach in which models learn by solving two problems in one. By training models to look at a problem from two perspectives at the same time, they learn how to reason on tasks that require solving similar problems, which is beneficial for generalization. We also explored the ability of neural networks to generalize by comparing them to the Chomsky hierarchy of languages. By rigorously testing 2,200 models across 16 different tasks, we found that certain models struggle to generalize, and that augmenting them with external memory is crucial to improving performance.
Another challenge we tackle is how to make progress on longer-term tasks at an expert level, where rewards are few and far between. We developed a new approach and an open-source training dataset to help models learn to explore in human-like ways over long time horizons.
Innovative approaches
As we develop more advanced AI capabilities, we must ensure that current methods work as intended and efficiently in the real world. For example, although language models can produce impressive answers, many cannot explain their responses. We introduce a method for using language models to solve multi-step reasoning problems by exploiting their underlying logical structure, providing explanations that humans can understand and check. Adversarial attacks, on the other hand, are a way of probing the limits of AI models by pushing them to produce wrong or harmful outputs. Training on adversarial examples makes models more robust to attacks, but can come at the cost of performance on "regular" inputs. We show that by adding adapters, we can create models that let us control this trade-off on the fly.
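To make the idea of an adversarial input concrete, here is a minimal sketch, assuming a toy logistic-regression classifier and the fast gradient sign method (FGSM). FGSM is a standard textbook attack used here purely for illustration; it is not the adapter-based method from the paper, and the weights, input, and `epsilon` are hypothetical values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss(w, b, x, y):
    """Binary cross-entropy for one example with label y in {0, 1}."""
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, epsilon):
    """Shift each feature by epsilon in the direction that increases the loss."""
    # For logistic regression, d(loss)/dx_i = (p - y) * w_i.
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input, chosen only for illustration.
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.5, -0.3, 0.8], 1
x_adv = fgsm(w, b, x, y, epsilon=0.6)

print("clean probability of class 1:      ", round(predict(w, b, x), 3))
print("adversarial probability of class 1:", round(predict(w, b, x_adv), 3))
```

In this toy setup the bounded perturbation is enough to push the prediction across the 0.5 decision threshold. Training on such perturbed inputs is what improves robustness, and fitting them is also what can reduce accuracy on clean inputs.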
Reinforcement learning (RL) has proved successful for a range of real-world challenges, but RL algorithms are usually designed to do one task well and struggle to generalize to new ones. We propose algorithm distillation, a method that enables a single model to generalize efficiently to new tasks by training a transformer to imitate the learning histories of RL algorithms across diverse tasks. RL models also learn by trial and error, which can be very data-intensive and time-consuming. Our model Agent57 needed nearly 80 billion frames of data to reach human-level performance across 57 Atari games. We share a new way to train to this level using 200 times less experience, vastly reducing computing and energy costs.
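The data side of imitating learning histories can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the source algorithm is a hypothetical epsilon-greedy learner on two-armed bandits, the function names and task list are invented for the example, and no transformer is trained here. The sketch only shows how a logged learning history becomes (context, next action) examples that a sequence model could be trained on.

```python
import random

def run_source_learner(arm_probs, steps, rng, epsilon=0.1):
    """Run an epsilon-greedy bandit learner and log its whole learning history."""
    counts = [0] * len(arm_probs)
    values = [0.0] * len(arm_probs)
    history = []  # (action, reward) pairs in the order they occurred
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(arm_probs))  # explore
        else:
            action = max(range(len(arm_probs)), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < arm_probs[action] else 0.0
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]  # running mean
        history.append((action, reward))
    return history

def make_training_pairs(history, context_len):
    """Turn one learning history into (context, next_action) examples."""
    pairs = []
    for t in range(1, len(history)):
        context = history[max(0, t - context_len):t]
        next_action = history[t][0]
        pairs.append((context, next_action))
    return pairs

rng = random.Random(0)
# Hypothetical tasks: each is a list of per-arm reward probabilities.
tasks = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.6]]

dataset = []
for arm_probs in tasks:
    history = run_source_learner(arm_probs, steps=200, rng=rng)
    dataset.extend(make_training_pairs(history, context_len=50))

print(len(dataset), "training examples")
```

Because each context covers many consecutive steps of the source learner, a model trained on these examples sees how the learner's choices change as rewards accumulate, which is the sense in which the learning history, rather than only the final policy, is imitated.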
AI for science
AI is a powerful tool for researchers to analyze vast amounts of complex data and understand the world around us. Several papers show how AI is accelerating scientific progress, and how science is advancing AI.
Predicting a molecule's properties from its 3D structure is critical for drug discovery. We present a denoising method that achieves a new state of the art in molecular property prediction, allows large-scale pre-training, and generalizes across different biological datasets. We also introduce a new transformer that can make more accurate quantum chemistry calculations using data on atomic positions alone.
Finally, with FIGNet, we draw inspiration from physics to model collisions between complex shapes, such as a teapot or a donut. This simulator could have applications across robotics, graphics, and mechanical design.
See the full list of DeepMind papers and the schedule of events at ICLR 2023.