Visual recognition of phishing: how to stop cybercriminals using AI and computer vision

Phishing is a well-known social engineering technique that comes in many forms: phone calls, smishing (SMS phishing), phishing emails, and websites. Cybercriminals use it to trick people into providing sensitive data, such as credit card details, logins, and passwords.

Phishing links to malicious sites are often contained in emails that appear to come from trusted sources. They are forwarded in messages on social networks and in apps like Facebook and WhatsApp. They can even appear in search results, misleading users. It can be quite difficult to tell whether a site is a phishing site, because many of these resources are almost identical to the sites they copy. Phishing emails are usually less effective, as modern technology recognizes many of them as spam. However, some of them still end up in the inbox.

The reason for the popularity of phishing is clear: cybercriminals can attack a large number of people at once. To counter mass attacks on users, Avast uses artificial intelligence (AI) technology.

Phishing Detection with AI

To deceive people, cybercriminals create websites that closely resemble real, reliable resources. Visual similarity is often enough to mislead unwary users, who readily hand over their credentials and other sensitive information.

In theory, cybercriminals can use the same images on phishing pages as on the original resources. However, if a phishing page links to images hosted on the original site, the owners of that site can see those links on their servers. In addition, creating an exact copy of a site takes time and effort: cybercriminals would have to reproduce the design on the phishing resource, paying attention to every pixel. As a result, they approach the task creatively and build pages that are very similar to the originals but have minor differences that are barely noticeable to the average user.

Avast has a network of hundreds of millions of sensors that supply data to the AI. Avast scans and scrutinizes every site that users visit. Other factors, such as the website certificate, domain age, and the presence of suspicious tokens in the URL, are also considered when assessing whether a page should be loaded.

The lifetime of a phishing site is usually extremely short, and search engines do not have time to index it. This is reflected in the domain's rating. A domain's popularity and history can also be the first signs of whether a page is safe or malicious. After checking this information and comparing it with the visual characteristics, the system concludes whether the site can be trusted.

[Images: the phishing version of the Orange.fr login page; the original version of the Orange.fr login page]

Side by side, these pages look completely different. The malicious version uses the outdated Orange site design, while the original site has a more modern and secure one: the password is requested in a second step, rather than on the same page as the login.

The domain of the phishing site clearly has a very low level of popularity, whereas Orange.fr has a rating of 7/10. Although the design of the phishing resource is very similar to the previous version of the Orange.fr site, it is not hosted on Orange.fr or on another popular domain. This information, which points to the potential danger of the fake website, triggers a more thorough examination.

[Image: analysis of the domain orangefrance.weebly.com, the phishing version of the site. This data can be used to assess its popularity.]
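As a rough illustration, the pre-screening signals mentioned above (suspicious tokens in the URL, domain age, domain popularity) could be combined as in the Python sketch below. The brand list, the thresholds, and the function names are assumptions made for this example; the article does not describe how Avast actually weighs these signals.

```python
from urllib.parse import urlparse

# Hypothetical mapping of brand tokens to their official domains (illustration only).
BRAND_DOMAINS = {"orange": "orange.fr", "google": "google.com", "apple": "apple.com"}


def suspicious_tokens(url):
    """Return brand names that appear in the URL although the host is not that brand's domain."""
    host = (urlparse(url).hostname or "").lower()
    text = url.lower()
    hits = []
    for brand, official in BRAND_DOMAINS.items():
        on_official = host == official or host.endswith("." + official)
        if brand in text and not on_official:
            hits.append(brand)
    return hits


def needs_visual_check(url, domain_age_days, popularity):
    """Decide whether a page should go on to the visual comparison stage.

    popularity is a 0-10 rating; all thresholds are assumed values.
    """
    score = 0
    if suspicious_tokens(url):
        score += 2  # a brand name on a foreign domain is the strongest signal here
    if domain_age_days < 30:
        score += 1  # phishing domains tend to be short-lived
    if popularity < 3:
        score += 1  # unpopular, unindexed domain
    return score >= 2
```

In this sketch, a page at orangefrance.weebly.com mentions the Orange brand without being hosted on orange.fr, so it would be passed on to the visual comparison stage.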

The next stage is design verification. At first glance, a pixel-by-pixel comparison of a fake website with a real one might seem sufficient. It is not. A different approach uses image hashes. In this method, the original image is compressed to a smaller size while retaining the necessary detail. The result is a fixed-size bit vector that can be compared with a simple metric. With this approach, the AI compares images of the same type within a given statistical deviation. However, this technology turned out to be less reliable and less error-tolerant than expected.
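The article does not say which hash function is used. One common choice that fits the description is the average hash, sketched below in plain Python under that assumption. The image is shrunk to a small grid by block averaging, each cell becomes one bit depending on whether it is brighter than the mean, and two hashes are compared by counting differing bits (the Hamming distance). The function names and the 8x8 grid size are illustrative.

```python
def average_hash(pixels, size=8):
    """Compute a size*size-bit average hash of a grayscale image.

    pixels is a 2D list of brightness values; the image must be at least
    size x size pixels. This is an assumed hash, not necessarily Avast's.
    """
    h, w = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        for c in range(size):
            # Average the block of source pixels that maps to grid cell (r, c).
            rows = range(r * h // size, (r + 1) * h // size)
            cols = range(c * w // size, (c + 1) * w // size)
            block = [pixels[y][x] for y in rows for x in cols]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return [1 if v > mean else 0 for v in cells]


def hamming(a, b):
    """Number of bit positions in which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))
```

A small Hamming distance suggests the two images are near-duplicates. The hash only captures coarse brightness layout, which is consistent with the limitation noted above.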

A much more effective method turned out to be the use of computer vision. It gathers information about images by looking in detail at specific pixels and their surroundings. To do this, descriptors are used: numerical descriptions of the relative changes in the fragment around a pixel. This process makes it possible to assess the variability of gray shades more accurately, including detecting the presence of a gradient and determining its intensity.
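The article does not name the descriptor that is used. As a simplified stand-in, the sketch below builds a histogram of gradient directions in the patch around a pixel, weighted by gradient strength and normalized so that it describes relative rather than absolute changes. The function name, patch radius, and number of bins are assumptions for illustration.

```python
import math


def gradient_descriptor(gray, y, x, radius=2, bins=8):
    """Describe the neighborhood of pixel (y, x) by its gradient directions.

    gray is a 2D list of brightness values; (y, x) must lie at least
    radius + 1 pixels away from every image border.
    """
    hist = [0.0] * bins
    for yy in range(y - radius, y + radius + 1):
        for xx in range(x - radius, x + radius + 1):
            # Central differences approximate the brightness gradient.
            dx = gray[yy][xx + 1] - gray[yy][xx - 1]
            dy = gray[yy + 1][xx] - gray[yy - 1][xx]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0:
                continue  # flat area, no gradient to record
            angle = math.atan2(dy, dx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    total = sum(hist)
    # Normalizing makes the descriptor independent of overall contrast.
    return [v / total for v in hist] if total else hist
```

Because of the normalization, doubling the contrast of a patch leaves its descriptor unchanged, which is one way to capture "relative changes" around a pixel.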

The pixels selected by the algorithm, the so-called points of interest, can be checked against an updated descriptor database after they are retrieved. However, the mere fact that an image contains pixels similar to those of another image is not enough to conclude that the picture corresponds to the one in the database. For this reason, the "spatial verification" method is used to compare the spatial relationships between individual pixels in an image.

[Images: a) an example of a valid spatial configuration of pixels; b) an example that was rejected as invalid]

Spatial verification yields reliable data, but additional steps are added to eliminate possible false positives, including checks on image hashes.
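To make the idea of spatial verification concrete, here is a deliberately simplified sketch. It assumes that a genuine match differs from the reference only by a shift, so every correctly matched point of interest should agree on roughly the same offset. Real systems usually fit a richer transformation (with scale and rotation); the function name, tolerance, and inlier threshold below are assumptions.

```python
from collections import Counter


def spatially_consistent(matches, tolerance=4, min_inliers=3):
    """Check whether matched points agree on one common offset.

    matches is a list of ((xq, yq), (xr, yr)) pairs: a point in the page
    being checked and its matched point in the reference image.
    """
    if not matches:
        return False
    votes = Counter()
    for (xq, yq), (xr, yr) in matches:
        # Quantize the offset so that near-identical shifts share one vote.
        cell = (round((xr - xq) / tolerance), round((yr - yq) / tolerance))
        votes[cell] += 1
    best_count = votes.most_common(1)[0][1]
    return best_count >= min_inliers
```

Matches whose offsets point in unrelated directions never accumulate enough votes, so the candidate is rejected even though individual descriptors looked similar.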

Analyzing points of interest in a picture that contains text is problematic. Such images have a large number of gradients by default, as the letters and text elements create many edges. Even a small section of a picture with letters contains many points of interest, which often leads to false matches. Spatial verification is powerless here.

To solve this problem, Avast developed software capable of analyzing image fragments for text. In these cases, the AI does not use points from such regions when comparing pictures.
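The filtering step could look roughly like the sketch below. It assumes that a separate text detector (not shown, and not described in the article) has already returned bounding boxes for text regions; the function name and box format are hypothetical.

```python
def drop_points_in_text(points, text_boxes):
    """Remove points of interest that fall inside detected text regions.

    points are (x, y) tuples; text_boxes are (left, top, right, bottom)
    rectangles assumed to come from a text detector.
    """
    def inside(point, box):
        left, top, right, bottom = box
        return left <= point[0] <= right and top <= point[1] <= bottom

    return [p for p in points if not any(inside(p, b) for b in text_boxes)]
```

Only the remaining points would then take part in descriptor matching and spatial verification.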

The entire verification procedure is fully automatic. The system can recognize a phishing site in less than ten seconds, after which access to it is blocked for Avast's connected users.

Phishing sites detected

Modern phishing sites are skilled deceivers. Cybercriminals go to great lengths to make them look like the real thing. The examples below show how closely a phishing site can resemble the original.

[Images: phishing site; old version of the Google login page. There are slight differences in the colors of the user's avatar and in the selection points of the gray login module.]

Over the years, phishing sites have improved significantly and now look very convincing. Some even use the HTTPS protocol, and the “green lock” in the browser bar gives users a false sense of security.

[Images: phishing site; Apple ID login page. The icons of the fake Apple authorization site are slightly different from the original ones.]

Small flaws in a phishing page become apparent only when it is compared with the original, reliable resource. By themselves, they do not attract attention. Try to remember what the login page of a service you use often looks like. You are unlikely to be able to picture the design in all its details, and that is exactly what fraudsters who create fake websites count on.

How is the threat spread?

Links to phishing sites are most commonly sent in phishing emails, but they can also be found in paid advertisements that appear in search results.

Most often, cybercriminals create fake emails from well-known companies that users trust: banks, airlines, and social networks.

Another attack vector is a technique called clickbait. Cybercriminals usually use this technique on social networks: users see a tempting headline like “Get a free phone” or “N brand with an incredible discount” and click on the malicious link.

In addition, attackers can hijack the accounts of popular people, or create fake ones, and place malicious links in their profiles and posts.

What happens after the victim takes the bait?

The goal of phishing, like that of almost any other cyberattack, is financial gain. By obtaining a user's login credentials through a phishing site, a cybercriminal can profit in several ways. If the site is a malicious copy of the website of a financial institution, such as a bank or a company like PayPal, the attacker gets direct access to the money of the deceived person.

A fraudulently obtained login and password for the website of a shipping company, such as UPS or FedEx, will of course not bring immediate profit. Instead, the attacker may try to use the details to gain access to other accounts with more valuable information, for example by trying to break into the victim's email. It is well known that people often use the same password for different services. Another source of income for cybercriminals is selling stolen personal data on the darknet.

This is the so-called “untargeted attack” mechanism. There are many outdated WordPress sites on the Internet. They can be hacked at low cost and used for phishing campaigns. The average cost of deploying phishing tools is $26.

How to protect yourself

There is usually a delay between a successful phishing attack and the moment cybercriminals use the stolen credentials. The sooner the threat is eliminated, the more potential victims we will be able to protect. If a username and password have already been stolen, the user should change them as quickly as possible.

How to protect yourself from phishing, one of the most successful cyberattack techniques:

  • First of all, install antivirus on all your devices - PC, Mac, smartphones and tablets. Antivirus software is a safety net that protects network users.
  • Do not follow links in suspicious emails and do not download files attached to them. Do not reply to such an email, even if, at first glance, it came from a person or organization that you trust. Instead, contact the sender through a different communication channel to confirm that the message actually came from them.
  • Whenever possible, type the address of the site directly into the browser - this will protect you from accidentally landing on a version created by scammers.
  • HTTPS and the “green lock” are not a guarantee of security. The icon only indicates that the connection is protected by encryption. The site you are on may still be fake. Cybercriminals implement encryption on phishing sites to deceive users, so it is especially important to verify the authenticity of the resource that you use.

In 2018, Avast investigated malicious emails sent from compromised MailChimp accounts, sextortion scams, and fraudulent campaigns related to the introduction of the GDPR. According to experts, the volume of phishing attacks will grow in the future, and new ways will emerge to disguise the actions of attackers aimed at stealing confidential user data.