Bots will be made to be polite: an anti-toxicity algorithm has been developed for them

Researchers at the University of California, San Diego have developed algorithms that remove offensive language from the speech generated by online bots.

Experts have previously tried different approaches to cleaning up the speech of bots, but they turned out to be ineffective. A list of toxic words misses words that seem normal on their own and out of context, but become offensive when combined with others. Removing toxic speech from the training data is time-consuming and far from reliable. Similar problems arise when developing a neural network to detect toxic speech.
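The weakness of a word list can be shown with a minimal sketch. The list and the sample sentences below are made-up placeholders, not material from the researchers' work:

```python
# Hypothetical blocklist for illustration only; not taken from the study.
BLOCKLIST = {"idiot", "moron"}

def blocklist_flags(text: str) -> bool:
    """Return True if any single word in the text appears on the blocklist."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return any(w in BLOCKLIST for w in words)

# An openly insulting sentence contains a listed word and is caught.
explicit = "You are an idiot."
# A hostile sentence built only from ordinary words slips through.
implicit = "People like you should not be allowed to speak."

print(blocklist_flags(explicit))
print(blocklist_flags(implicit))
```

The second sentence passes the check because each of its words is harmless in isolation, which is the gap described above.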

Now computer scientists from the University of California, San Diego have tried a new method. First, they fed “harmful” prompts into a pre-trained language model to force it to generate toxic content. The researchers then trained a model they called “evil” to predict the likelihood that content would be offensive. Finally, they trained the “good model,” which was taught to avoid any content that the “evil model” rated as highly likely to be offensive.
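The idea of one model scoring content so that another can avoid it can be sketched roughly as follows. This is not the researchers' implementation: the scorer here is a toy phrase counter standing in for the trained “evil model,” and instead of training a “good model” the sketch simply rejects candidate replies at answer time. All names, phrases and the threshold are assumptions made for illustration.

```python
from typing import Callable, List

# Hypothetical stand-in for the "evil" model. A real scorer would be a
# trained classifier; this toy version only counts flagged phrases.
FLAGGED_PHRASES = ("shut up", "nobody wants you")

def evil_score(text: str) -> float:
    """Return a toy offensiveness score between 0.0 and 1.0."""
    lowered = text.lower()
    hits = sum(phrase in lowered for phrase in FLAGGED_PHRASES)
    return min(1.0, hits / len(FLAGGED_PHRASES))

def pick_reply(candidates: List[str],
               scorer: Callable[[str], float],
               threshold: float = 0.3) -> str:
    """Return the first candidate scored below the threshold.

    If every candidate is at or above the threshold, fall back to the
    lowest-scoring one. The threshold value is an arbitrary assumption.
    """
    for candidate in candidates:
        if scorer(candidate) < threshold:
            return candidate
    return min(candidates, key=scorer)

candidates = [
    "Shut up, nobody wants you here.",
    "I disagree, but thanks for sharing your view.",
]
reply = pick_reply(candidates, evil_score)
print(reply)
```

Filtering at answer time is simpler than what the article describes, where avoidance is learned during training, but it shows the same division of roles: one component rates content, the other steers away from what is rated highly.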

As a result, the researchers report that their “good model” proved more effective than state-of-the-art methods. They presented their work at the AAAI Online Conference on Artificial Intelligence.
