How does a spam filter for emails work?

Author: HOSTTEST Editorial   | 28 Jan 2020

Spam emails are not only a constant annoyance for email address owners; in many cases they also pose a concrete danger. In so-called "phishing" attacks, criminals use targeted spam to try to obtain sensitive data such as logins, passwords, credit card numbers, or TANs. A reliable spam filter therefore offers concrete security benefits in addition to convenience.

 

What is SPAM?

The term spam originates from a sketch by the British comedy group Monty Python, in which a couple in a restaurant keeps being served Spam regardless of what they actually want to order. Spam is a brand of canned meat whose name is commonly explained as a contraction of "spiced ham". The first dishes on the menu contain only one serving of Spam - later items include a plate with ten servings of Spam and baked beans.

From this context, the term spam evolved to refer to unwanted, disruptive, and unsolicited emails. Unlike viruses and trojans, spam itself is not dangerous if it is promptly deleted. However, it is crucial never to click on links or open attachments in such messages. Both can lead to malware that infects a computer unnoticed.

What is a Spam Filter?

A spam filter's task is to evaluate the messages sent to a specific address and to sort out the harmful or unwanted ones. There are different approaches to this task, ranging from sorting by origin to artificial intelligence. In general, a filter analyses each incoming message using the data available and estimates how likely it is to be spam rather than a "genuine" message with relevant information. Based on this assessment, it decides whether to deliver the email, move it to a special folder, or block it entirely.
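The decision step described above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and both threshold values are assumptions chosen for the example, not settings taken from any particular filter.

```python
def decide(spam_probability, folder_threshold=0.5, block_threshold=0.95):
    """Map a spam probability (0.0 to 1.0) to an action.

    Both thresholds are illustrative assumptions; real filters
    let administrators or users tune them.
    """
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if spam_probability >= block_threshold:
        return "block"          # almost certainly spam: reject outright
    if spam_probability >= folder_threshold:
        return "spam_folder"    # probably spam: keep it, but out of the inbox
    return "inbox"              # probably genuine: deliver normally


for p in (0.10, 0.70, 0.99):
    print(p, decide(p))
```

Keeping a middle band that moves mail to a folder instead of blocking it means a wrongly classified message can still be recovered by the recipient.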

What Methods Does a Spam Filter Use?

To distinguish spam from genuine messages, spam filters utilise different features and algorithms. Some common procedures and techniques include:

  • Sorting emails based on the sender
  • Analysing the content for specific keywords commonly used in spam
  • Evaluating the address and metadata
  • Content assessment through artificial intelligence
  • User-trained classification (Bayes or Markov filter)
  • Comparing email addresses and links with a database

Each method has specific advantages and disadvantages that affect its reliability and usability. Advanced spam filters use multiple approaches simultaneously to increase their accuracy.
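How several approaches might be combined can be shown with a small Python sketch. The two checks, the keyword list, the domain `spam.example`, and the weights are all hypothetical values made up for this example; real filters use many more signals and tune their weights carefully.

```python
def keyword_score(text, keywords=("viagra", "lottery", "winner")):
    """Return a score from 0.0 to 1.0 based on how many spam keywords appear."""
    words = [w.strip(".,!?:;").lower() for w in text.split()]
    hits = sum(1 for w in words if w in keywords)
    return min(1.0, hits / 3)   # three or more hits count as the maximum


def sender_score(sender, known_bad_domains=("spam.example",)):
    """Return 1.0 if the sender's domain is on a (hypothetical) bad list."""
    domain = sender.rsplit("@", 1)[-1].lower()
    return 1.0 if domain in known_bad_domains else 0.0


def combined_score(sender, text, weights=(0.6, 0.4)):
    """Weighted sum of both checks; the weights are assumptions."""
    w_sender, w_keywords = weights
    return w_sender * sender_score(sender) + w_keywords * keyword_score(text)


print(combined_score("promo@spam.example", "You are a lottery winner!"))
print(combined_score("bob@example.org", "Meeting at noon"))
```

A weighted sum is only one way to merge the results; the point is that a single weak signal no longer decides the outcome on its own.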

Spam filters through data matching: Black and White Lists

These spam filters extract information - whether it's key terms, domains, or server providers - and compare it against a database where allowed or blocked parameters are stored. There are various ways to configure the filter:

  • Emails must meet all or specific conditions (Whitelist)
  • Emails must not contain known spam senders or keywords (Blacklist)
  • Blocking all messages that do not allow classification
  • Delivering unknown emails or moving them to a special folder

The main issue with this method is that it can only make an assessment based on known data. It fails when a sender or term appears in neither the whitelist nor the blacklist. Depending on the database used, there is a moderate to high risk that it either does not reliably detect spam or incorrectly categorises genuine messages. Furthermore, the result is a plain match or no match rather than a probability, so borderline cases cannot be graded. Most Exchange Hosting offerings already include a spam filter.
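A list-based check of this kind can be sketched as follows. The names `Verdict` and `check_lists` and all addresses are hypothetical; the sketch only matches on the sender address and domain, whereas real lists may also cover keywords or server IPs.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    UNKNOWN = "unknown"   # found in neither list


def check_lists(sender, whitelist, blacklist):
    """Compare the sender address and its domain against both lists."""
    address = sender.strip().lower()
    domain = address.rsplit("@", 1)[-1]
    if address in whitelist or domain in whitelist:
        return Verdict.ALLOW
    if address in blacklist or domain in blacklist:
        return Verdict.BLOCK
    return Verdict.UNKNOWN


whitelist = {"example.org"}
blacklist = {"spam.example", "bad@mail.example"}

for sender in ("alice@example.org", "promo@spam.example", "new@other.example"):
    print(sender, check_lists(sender, whitelist, blacklist).value)
```

The `UNKNOWN` case is exactly where this method has no answer: the configuration has to decide whether such mail is delivered, moved to a special folder, or blocked.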

Spam filters through experience: Artificial Intelligence and individual training

This class of spam filter uses a self-learning algorithm that considers selected, characteristic features of an email. It needs initial training and improves its detection over time through continuous training. Whether the user has to train the filter or it draws on an existing database depends on the program used. As with speech recognition, some services move the analysis to the cloud - this has several advantages:

  • Reliable optimisation through large amounts of data
  • External servers handle computationally intensive operations
  • Quick response to new variants of spam
  • Universal and international coverage
  • Good detection immediately after setup

For the end user of an email address, artificial intelligence also has disadvantages. An individually trained filter needs at least 1000 messages before the algorithm becomes reliable, with a single-digit error rate. Using a cloud service eliminates this initial work, but customisation is limited. This leads to problems when a majority of users classify messages with specific content as spam, but an individual user wants to receive them.
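To make the idea of user training more concrete, here is a minimal sketch of a naive Bayes classifier with add-one smoothing. It is a simplified illustration, not the implementation of any particular mail program; the class name `NaiveBayesFilter` and the four training messages are invented for the example.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Tiny word-based naive Bayes classifier (illustrative sketch)."""

    LABELS = ("spam", "ham")   # "ham" is the usual term for genuine mail

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(text):
        words = (w.strip(".,!?:;\"'").lower() for w in text.split())
        return [w for w in words if w]

    def train(self, text, label):
        """Record one message that the user marked as spam or ham."""
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in self.LABELS:
            # Smoothed class prior, then one smoothed factor per word.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_scores[label] = score
        # Turn the two log scores into a normalised probability.
        top = max(log_scores.values())
        odds = {label: math.exp(s - top) for label, s in log_scores.items()}
        return odds["spam"] / (odds["spam"] + odds["ham"])


f = NaiveBayesFilter()
f.train("Cheap pills buy now", "spam")
f.train("Win money now", "spam")
f.train("Meeting agenda for Monday", "ham")
f.train("Lunch on Monday", "ham")
print(f.spam_probability("cheap pills"))
print(f.spam_probability("Monday meeting"))
```

With only four training messages the estimates are crude, which illustrates why such a filter needs a substantial amount of mail before its error rate becomes acceptable.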

Spam Filters on Multiple Levels

Many providers, such as Exchange Hosting services, run their own spam filters on the server to screen incoming mail before it is accepted, which reduces data transfer and storage consumption. Various databases on the internet, such as Spamhaus, record IP addresses, providers, and domains that stand out for sending large volumes of spam. Some Internet Service Providers (ISPs) automatically block servers or URLs that appear on these organisations' blacklists. When a listed sender is identified, the mail server terminates the connection after the metadata has been transmitted and reports an error. With services like Exchange Hosting, this happens automatically.
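Blacklists of this kind are usually queried over DNS: the octets of the sending server's IPv4 address are reversed and prepended to the list's zone name, and an answer to the lookup means the address is listed. The sketch below shows this convention. The zone `zen.spamhaus.org` is used as an assumed default, the function names are hypothetical, and the example IP comes from a documentation-only address range.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name used to look up an IPv4 address in a blacklist zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 addresses only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="zen.spamhaus.org"):
    """Return True if the lookup resolves (needs network access).

    Simplification: any answer counts as "listed". Some lists use
    special return codes to signal query errors, so check the
    documentation of the list you actually use.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False   # no record found: the address is not listed


print(dnsbl_query_name("192.0.2.99"))
```

A mail server can run such a lookup as soon as a connection comes in, which is why listed senders can be turned away before the message body is transferred.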

The next level involves programs that monitor the inbox for spam on the user's own computer. Depending on their functionality, they either refuse to download the message at all or move it to the appropriate folder. Similarly, plug-ins make it possible to integrate additional spam filters into email clients like Outlook or Thunderbird. They can be installed afterwards and apply various methods.

Spam Filters: A Useful Addition

Unlike placing an advertisement or posting a letter, sending an email incurs practically no cost. For this reason, spammers often send messages to millions of addresses, some of which they generate randomly. Additionally, lists can be purchased on the black market that combine a valid email address with further information such as contact details. These lists are compiled from illegally obtained and public data and make it possible to personalise spam. An important safeguard against spam is therefore to keep a private email address confidential and to ensure the security of personal computers and smartphones.

Photo: Clker-Free-Vector-Images from Pixabay
