
E-mail Filtering With The Junk Mail Analyzer

The problem:
Unsolicited e-mail (sometimes called spam) is a growing problem for most e-mail users. Recent studies show that up to half of all mail messages on the Internet may be unwanted. These messages represent lost productivity and, since many also contain computer viruses, may cause data loss as well.

The Office of Information Technology is acutely aware of the magnitude of the unsolicited e-mail situation here at the University of Maryland. The system load of our servers is affected by the number and size of these undesired e-mails. Our Help Desk also feels the burden when e-mail users receive infected e-mails.

In an effort to alleviate the effects of these undesired e-mails, OIT is undertaking several initiatives on the enterprise mail system (Mail@UMD). These measures include system-level virus filtering (currently in place and functioning) as well as server-level, rules-based e-mail filtering.

Unfortunately, system-level junk mail filtering is not available on all OIT e-mail systems. With some restrictions, however, OIT is able to offer the Junk Mail Analyzer for most of our e-mail services. If you use your @umd.edu e-mail address, or use WAM, Glue or DEANS, and have consolidated your IDs, you may take advantage of the Junk Mail Analyzer.

Additional filtering is available because many desktop e-mail clients (including the Mail@UMD web client) provide their own message filtering. These filters trigger automatic actions based on keywords or phrases in the e-mail headers or message body. The actions may include sorting, marking, or even automatic deletion of unwanted e-mail.

Server-level filters:
Undesired mail often has certain characteristics such as many lines in all capital letters, or malformed return addresses. These characteristics can be used to indicate the probability that a certain message is junk mail without having to understand the meaning of the actual message body or headers.
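
As a rough illustration of how such surface characteristics can be checked without interpreting the message, the Python sketch below implements two hypothetical tests. The function names and the 50% cutoff are assumptions made for this example; they are not the actual tests used by the Junk Mail Analyzer.

```python
from email.utils import parseaddr


def mostly_capitals(body, cutoff=0.5):
    """Return True if more than `cutoff` of the non-blank lines are all capitals."""
    lines = [line for line in body.splitlines() if line.strip()]
    if not lines:
        return False
    # str.isupper() is True only when a line has at least one cased letter
    # and no lowercase letters.
    shouting = sum(1 for line in lines if line.isupper())
    return shouting / len(lines) > cutoff


def malformed_return_address(from_header):
    """Return True if the From: value lacks a plausible user@domain address."""
    _name, address = parseaddr(from_header)
    local, at, domain = address.partition("@")
    return not (local and at and "." in domain)
```

Neither check looks at what the message says; each only inspects its form, which is why such tests can run quickly on every incoming message.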

The Junk Mail Analyzer system optionally runs several tests on the headers and body of each message. Each test that matches alters the “junk mail score” of that message. If the score of the message exceeds a user-defined threshold, the message subject can be altered (if this option is selected), or the message can be delivered as an attachment to a new message reporting that the attached message was marked as probably unwanted. This behavior is selectable by the individual e-mail user.
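
The scoring and tagging behavior can be sketched in Python as follows. This is a minimal illustration, not the analyzer's implementation: the three tests, their weights, the default threshold of 5.0, and the "[JUNK]" tag are all invented for the example.

```python
from email.message import EmailMessage


def body_text(msg):
    """Return the plain-text body of the message, or "" if there is none."""
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part is not None else ""


# Hypothetical tests: (description, weight, predicate on the message).
TESTS = [
    ("subject is all capitals", 2.0, lambda m: (m["Subject"] or "").isupper()),
    ("no From: header", 3.0, lambda m: not m["From"]),
    ("body mentions toner", 1.5, lambda m: "toner" in body_text(m).lower()),
]


def junk_score(msg):
    """Add up the weights of every test that matches."""
    return sum(weight for _desc, weight, test in TESTS if test(msg))


def process(msg, threshold=5.0, tag="[JUNK]", wrap=False):
    """Return the message to deliver: unchanged, subject-tagged, or wrapped."""
    score = junk_score(msg)
    if score <= threshold:
        return msg
    subject = msg["Subject"] or ""
    if wrap:
        # Deliver a new report message with the original attached.
        report = EmailMessage()
        report["Subject"] = f"{tag} {subject}"
        report.set_content(
            f"The attached message scored {score} and is probably unwanted."
        )
        report.add_attachment(msg)
        return report
    # Otherwise alter the subject line in place.
    del msg["Subject"]
    msg["Subject"] = f"{tag} {subject}"
    return msg
```

A message whose score stays at or below the threshold passes through untouched; only the user's choice of `wrap` decides which of the two markings is applied to the rest.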

The altered message may then be processed by client-side filters to allow sorting or message deletion as desired. Please note that some messages will not be tested by the Junk Mail Analyzer.

Additionally, each user may define some From: or To: addresses as unconditional indications that the message is unwanted. Likewise, certain From: or To: addresses can be defined to unconditionally indicate that the message is wanted.
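
One way to picture how such address lists interact with the score is the sketch below. The function and parameter names are hypothetical, and the order in which the two lists are consulted (the "wanted" list first) is an assumption for this example; the analyzer's actual precedence is not stated here.

```python
from email.utils import getaddresses


def classify(headers, score, threshold, always_junk, never_junk):
    """Return True if the message should be marked as junk.

    `headers` maps "From" and "To" to raw header strings. An address on
    either list decides the outcome before the score is considered.
    """
    raw = [headers[name] for name in ("From", "To") if headers.get(name)]
    addresses = {addr.lower() for _name, addr in getaddresses(raw) if addr}
    if addresses & never_junk:
        return False          # unconditionally wanted
    if addresses & always_junk:
        return True           # unconditionally unwanted
    return score > threshold  # otherwise fall back to the junk mail score
```

For example, with `never_junk = {"dean@umd.edu"}`, a message from that address is left alone even when its score is well above the threshold.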

The various tests that our system uses are described at: http://www.spamassassin.org/tests.html

These tests will undoubtedly evolve over time as the senders of unsolicited e-mail modify their messages in attempts to deceive our system.


Getting Started:
There are three things to do to get started with the Junk Mail Analyzer:
  1. Elect to have your e-mail scored
  2. Set up your personalized analysis configuration
  3. Set up your client-side filtering

How to opt in and set up the server-side analyzer:
The configuration page for the Junk Mail Analyzer is at: https://www.oit.umd.edu/email/junkmail/. Using this page, you can opt in to this service and configure the custom subject-line tag. You can also choose addresses to unconditionally mark as junk or to unconditionally leave alone. We have some information about the various configuration items at Setting up the Junk Mail Analyzer.

Client filters:
Please refer to the documentation for your particular mail client. Clients capable of filtering e-mail include Thunderbird, Outlook and Outlook Express, Eudora, and even Pine.

Client-side filtering is an excellent way to block mail from a particular address or domain. Look for a pattern to match, then set your client filter to match the From:, Reply-to:, or Received: header fields. Some clients support matching on any header field.

You can also define client-side filters to look for characteristic words or phrases that are unique to undesired messages. Candidates for this type of filter are: “FREE”, “DOLLARS”, “TONER CARTRIDGE” and exclamation points in the subject line. There are many other examples.
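
Mail clients express these filters through their own settings screens, but the underlying logic resembles the Python sketch below. The blocked domain, the folder names, and the function names are hypothetical; the keyword list is taken from the candidates above.

```python
import email

BLOCKED_DOMAIN = "bulk-offers.example"  # hypothetical sender domain
SUBJECT_KEYWORDS = ("FREE", "DOLLARS", "TONER CARTRIDGE", "!")


def matches_blocked_sender(msg):
    """True if the blocked domain appears in From:, Reply-To: or any Received: field."""
    values = []
    for field in ("From", "Reply-To", "Received"):
        values.extend(msg.get_all(field, []))
    return any(BLOCKED_DOMAIN in str(value).lower() for value in values)


def matches_subject_keyword(msg):
    """True if the subject contains a characteristic word or an exclamation point."""
    subject = str(msg.get("Subject", ""))
    return any(keyword in subject for keyword in SUBJECT_KEYWORDS)


def choose_folder(raw_message):
    """Return the folder a message would be sorted into."""
    msg = email.message_from_string(raw_message)
    if matches_blocked_sender(msg) or matches_subject_keyword(msg):
        return "Junk"
    return "Inbox"
```

The keyword match here is case-sensitive and matches substrings, so "FREE" would also catch "FREEDOM"; a real filter should be tested against legitimate mail before it is allowed to delete anything.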

Client-side filters can work in concert with the server-level filtering described above to classify or delete messages according to defined rules.

We have a page that provides guidance on configuring your e-mail client for client-side filtering.


Mailing list:

OIT will maintain a listserv list to publish system announcements for the Junk Mail Analyzer. That list is called: junk-mail-analyzer@listserv.umd.edu. To subscribe, enter your e-mail address in the form below, and click the "Sign Up" button.



Definitions:

Client: The computer in a client/server architecture that requests files or services. The computer that provides services is called the server. In the case of e-mail, the client is the program on your computer -- e.g., Outlook Express, Thunderbird, Eudora -- that handles mail retrieved from, or about to be sent to, the mail server.

Server: A system that provides network services such as disk storage and file transfer, or a program that provides such a service. In the case of e-mail, there are two kinds of servers. One kind, the incoming mail server (POP or IMAP), is the machine that handles mail delivered to your e-mail address until you retrieve it; if you are using an IMAP server, it also stores your mail. The other kind, the outgoing mail server (SMTP), handles the mail that you wish to send to other people.

Client / Server: A method of communication between computers in which one computer can get information from another. The client is the computer that asks for the information (data, software, or services). The server supplies the information requested by the client. Either machine can be anything from a personal computer to a mainframe.

E-mail headers: E-mail headers are used to deliver a message over the Internet, and are included in every message you receive. You are probably familiar with some of the lines in a header (To, From, Subject, Date, Cc, etc.) that are commonly shown by mail programs. The "full" (or "Internet") header also contains many other lines, which provide a record of the specific route taken by the e-mail. Most mail programs have a method for specifying whether to display this full header or a shorter version.
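
To make the difference between the short and full header concrete, the Python sketch below parses a made-up message. All addresses, host names, and dates in it are invented for illustration.

```python
import email

RAW_MESSAGE = """\
Received: from relay.example.net by mx.example.edu; Mon, 1 Jan 2024 10:00:05 -0500
Received: from sender.example.com by relay.example.net; Mon, 1 Jan 2024 10:00:01 -0500
From: Pat Example <pat@example.com>
To: you@example.edu
Subject: Meeting notes
Date: Mon, 1 Jan 2024 10:00:00 -0500

See attached.
"""

msg = email.message_from_string(RAW_MESSAGE)

# The short header that most mail programs display.
short_header = {name: msg[name] for name in ("From", "To", "Subject", "Date")}

# Each server that handles the message adds a Received: line at the top,
# so reversing the list gives the route in the order it was travelled.
route = list(reversed(msg.get_all("Received", [])))
```

Here `route` holds the two Received: lines, starting with the hop closest to the sender and ending with the server that delivered the message.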

Rules-based filtering: A method of filtering that evaluates an e-mail message based on certain "rules" that you establish. Rules generally specify what characteristics of a message to look for, and what action to take if the message matches those characteristics. Which characteristics can be used depends on the sophistication of a program's rule-making capability. Among other things, they might include particular text strings found on various header lines (e.g., addresses present on or missing from To or From lines, undesirable topics on Subject lines, etc.), the presence and quality of other header lines, or how the body of the message is constructed.

Office of Information Technology University of Maryland