Your own teacher: how algorithms learn without human help and make drones better

Unmanned vehicles, digital twins and automatic control of telecommunications are no longer predictions of science fiction writers but the near future. It is being brought closer by scientists who work in applied artificial intelligence and do research in reinforcement learning. Hi-Tech talked about the future of technology with Oleg Svidchenko, Alexander Grishin and Alexey Shpilman, winners of the annual Segalovich Prize.

How AI learns without a mentor

Reinforcement learning (RL) assumes that the AI itself interacts with an environment - for example, a Go board, or the outside world if a robot is moving through it. The system has to identify common patterns and rely on them when performing tasks. Supervised learning (learning "with a teacher"), by contrast, requires a person to indicate the correct action on which the AI will train.

“The essence of RL is that the machine or, as we say, the agent, learns through constant practice,” notes Oleg Svidchenko, laureate of the Yandex Science Prize. “The AI is placed in certain conditions and told: act. This is similar to a mouse searching for cheese in a maze. Having taken a wrong turn, the animal runs into a wall, comes back, tries again, and so on. In reinforcement learning, correct steps are rewarded. The more correct the action, the more points the AI receives. If the choice turns out to be wrong, the agent loses points. During training, the machine remembers which combination of actions was more profitable, and next time it will use it.”
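The mouse-and-cheese picture can be sketched in a few lines of code. The example below is a minimal illustration, not the method used by the researchers quoted here: it assumes a tabular Q-learning agent in a hypothetical five-cell corridor, with the cheese in the last cell. The reward values (+10 for the cheese, -1 for hitting the wall) and the hyperparameters are arbitrary choices made for the illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; the "cheese" is in cell 4 (assumed layout)
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # illustrative hyperparameters

def step(state, action):
    """Return (next_state, reward, done) for one move in the corridor."""
    nxt = state + action
    if nxt < 0:                      # bumped into the wall: lose a point
        return state, -1.0, False
    if nxt == N_STATES - 1:          # found the cheese: gain points
        return nxt, 10.0, True
    return nxt, 0.0, False

def train(episodes=200, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Mostly pick the best-known action, sometimes try a random one.
            if rng.random() < EPSILON:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward "reward now + discounted best future value".
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
    return q

q_table = train()
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)   # ['right', 'right', 'right', 'right']
```

After training, the table of "points" tells the agent to head right, toward the cheese, from every cell: it has remembered which actions paid off.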

Searching for a solution independently allows the agent to surpass humans sooner or later. This was shown, for example, by DeepMind's MuZero algorithm, which learned to play dozens of classic Atari video games as well as board games such as chess and Go. It builds on the company's earlier work: for example, AlphaGo, which beat Go champion Lee Sedol, and AlphaZero, which plays chess. The improved algorithm extracts more information from less data - it now needs half as many training steps.

Reinforcement learning algorithms can be useful in a variety of industries. For example, in medicine - for organizing personalized dynamic treatment, in the entertainment industry - for automated testing of computer games, or in aviation - for autonomous control of a stratospheric balloon.

In which areas AI will come to the aid of people

Digitalization of retail: fully automated stores

Machine learning is adopted first in industries where the collection and digitization of large amounts of data is already well established. In retail, for example, all information passes through cash registers, which means that AI has something to work with. According to Alexey Shpilman, AI algorithms will make it possible to open fully automated stores everywhere, in which all processes run without human intervention.

Amazon tested this format back in 2016. The buyer takes a cart, puts goods in it and simply leaves - the money for the purchase is debited from the card automatically. In Russia, a similar project was developed by Azbuka Vkusa.

Telecommunications management: identifying network faults 

Thanks to reinforcement learning, technological breakthroughs can occur in the management of various networks - telecommunications, heating and electric power. Many processes here are fairly easy to automate, since they involve little interaction with people.

Automation will lead to systems that make better-informed decisions and optimize energy consumption. For example, an HVAC (heating, ventilation and air conditioning) controller based on RL algorithms is being developed - a system that controls room temperature and ventilation. Using this technology in businesses will help both reduce energy consumption and cut carbon emissions.

Unmanned Vehicles: Testing Technology and Legislation

Another area that is waiting for a breakthrough thanks to reinforcement learning is transportation. Unmanned vehicles and delivery robots can already be found on the streets today. Despite technological advances in the industry, McKinsey analysts predict that unmanned vehicles will not become mainstream until 2030 at the earliest. Adoption is complicated by the need to develop regulations. In Singapore and the United States, automated vehicles are already driving on highways, and Russia has recently granted permission to test an unmanned taxi.

“Automation almost always improves safety, but people greet the introduction of such technologies with fear,” Oleg Svidchenko is sure. “If you replace all transport with unmanned Teslas, the number of road accidents will drop several times over. But every accident will raise many questions. We cannot say for sure, as we can with a human driver, what caused the accident. And people are afraid of this unknown.”

How digital twins will be useful to mankind

Reinforcement learning algorithms have made it possible to create digital twins - virtual prototypes of objects, processes and even people that have the same properties and characteristics as the originals. Industrial enterprises use this technology, for example, to check that all processes are properly tuned before launching a new conveyor. Of course, you can simply plug the equipment in right away, but if a failure occurs, fixing it will take time and resources. So the conveyor is first launched on a computer.

With human digital twins, everything is much more difficult, because a living organism is a more complex system. And yet scientists continue to master the technology, creating virtual copies of both individual organs and the whole organism. For example, a Boston hospital uses a digital twin of the heart to plan surgeries. In the future, this will make it possible to test treatments on a virtual patient and predict diseases, and it may well amount to a revolution in medicine.

“The development of AI, including RL, could lead to people understanding themselves better,” suggests Alexey Shpilman. “Man is a closed system, because we use our own brain for self-knowledge. But is this tool enough for us? Even in psychology, two people are needed for reflection, and we are closed within ourselves. Globally, in the context of the Universe, humanity is still alone, which means that we have no one to talk to in order to learn something new about ourselves and see ourselves from the outside. Perhaps, thanks to reinforcement learning, we will create some kind of entity outside of ourselves. It will not be limited by our brain and consciousness and will be able to give people new answers and meanings.”

Why the widespread implementation of RL is still limited

Despite the progress that scientists have made, the practical application of RL is still limited. The system takes a long time to learn and makes a lot of mistakes, so implementing the algorithm everywhere is difficult and unprofitable.

“The agent needs many repetitions, so the learning process takes quite a long time,” explains Alexander Grishin. “Moreover, it is not enough for the AI to perform the action that currently looks best. It needs to explore the environment, since a big reward may be hidden behind moves that look unattractive at the moment. The whole logic of reinforcement learning comes down to the AI learning to sacrifice short-term benefits for long-term success. To do this, you need to think ahead and calculate possible scenarios. For example, when the agent gives up a knight to capture the queen, the scientists will be very happy.”
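The knight-for-queen trade can be put into numbers. The sketch below is only an illustration under stated assumptions: it uses the conventional chess piece values (pawn 1, knight 3, queen 9) as rewards, two hypothetical two-move plans, and a discount factor `gamma` that controls how much the agent cares about future rewards. None of these figures come from the researchers.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, with the reward at step t weighted by gamma**t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total

# Hypothetical plans: rewards received on the first and second move.
plans = {
    "grab_pawn": [1.0, 0.0],          # small gain now, nothing later
    "sacrifice_knight": [-3.0, 9.0],  # lose the knight now, win the queen next
}

def best_plan(gamma):
    """Pick the plan with the highest discounted return for a given gamma."""
    return max(plans, key=lambda name: discounted_return(plans[name], gamma))

print(best_plan(0.0))   # grab_pawn: a short-sighted agent sees only the first move
print(best_plan(0.9))   # sacrifice_knight: -3 + 0.9 * 9 = 5.1 beats 1.0
```

An agent that ignores the future (`gamma = 0`) takes the pawn, while a far-sighted one (`gamma = 0.9`) accepts the short-term loss because the later reward more than makes up for it.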

The task of scientists is to make AI learn faster and analyze better. But one mundane problem prevents rapid progress: R&D laboratories and IT companies are short of staff. To close the gap, universities are creating laboratories and research centers, and technology giants are opening specialized courses.

“Research in machine learning is now very much in demand. The industry is developing rapidly, and the shortage of personnel grows every day,” says Alexey Shpilman. “Specialists have a great chance to get involved in processes that will change the world beyond recognition. There is a lot of interesting work. We are at the very beginning of the path, but we have already achieved good results. Can you imagine what prospects RL will open up for humanity?”
