Thursday 3 March 2016

SMTP Routing in Exchange 2010

Message flow

In Exchange Server 2010 all messages are always routed through the Hub Transport Server. This is the case for:
  • Messages routed to and from the Internet, with and without an Edge Transport Server;
  • Messages routed to other Active Directory sites within the Exchange organization;
  • Messages between mailboxes on the same Mailbox Server, even mailboxes within the same Mailbox Database;
Yes, you read that last one correct. If there’s one Exchange 2010 Mailbox Server with only two mailboxes on it, and the first mailbox sends a message to the second mailbox, it is routed through the Hub Transport Server.
If the Hub Transport Server is not available, the message will not be sent and the message will not leave the mailbox!

Transport internals

To get a better idea of the message transport in Exchange Server 2010 we have to identify several services in the Exchange environment that play an important role in message routing.
  1. Mail Submission Service – when a message is created and the Send button is clicked, the new message is placed in the mailbox outbox. There’s a service running on the Mailbox Server role called the “Exchange Mail Submission Service” which notifies the Hub Transport Server that a new message is awaiting for processing. The Mailbox Server has an internal list of Hub Transport Servers in the same Active Directory site (the submission server list) which is updated every 10 minutes. This is done by the server discovery process. A round robin mechanism is responsible for load balancing the SMTP traffic across these Hub Transport Servers;
  2. Store Drivers – the Hub Transport Server’s Store Driver retrieves the message from the Outbox and puts it in the Submission Queue on the Hub Transport Server. The Store Driver uses RPC to retrieve the message from the Mailbox Server. There’s no traffic on port 25 (i.e. SMTP) between the Hub Transport Server and the Mailbox Server;
  3. Submission Queue – this is a queue, located on the Hub Transport Server where all messages are stored that need to be processed. Not only the Store drivers can store messages in the submission queue, but this can also be done through a receive connector or the pickup directory.
  4. Categorizer – the categorizer retrieves messages from the submission queue and determines where the message needs to be sent to. This can be an internal Active Directory recipient or an external recipient. The categorizer also expands distribution groups and identifies alternative recipients or forwarding addresses.
  5. Pickup Directory – this is a directory that is checked once every 5 seconds for new messages. When a message is in the correct EML format it is picked up from this directory and when the process is completed the file is deleted from the pickup directory.

Figure 1: Overview of the Hub Transport Server Role

Queues

A queue is a temporary location where messages that are waiting for processing are stored. The submission queue is already discussed, but there are more queues:
  • Mailbox Delivery Queue – This queue stores messages that will be delivered to mailboxes on a Mailbox Server. Again, communication between the Hub Transport Server and the Mailbox Server for mailbox delivery is encrypted RPC. Note: a mailbox delivery queue only exists on Hub Transport Servers and not on Edge Transport Servers;
  • Remote Delivery Queue – this is a queue where messages are stores that need to be delivered remotely.  Remote delivery queues can exists on Hub Transport Servers as well as Edge Transport Servers. There’s a remote delivery queue for every domain the Transport Server is sending messages to. So, if there are outbound messages sent to for example jaapwess@hotmail.com, then a remote delivery queue exists for the domain hotmail.com. These remote delivery queues only exist for a small amount of time. After successful delivering of the message to hotmail.com, the queue for this domain is deleted, but only when no other messages are sent within a couple of minutes. Besides a remote delivery queue for every external domain, there’s also a remote delivery queue for every Active Directory site that contains other Hub Transport Servers;
  • Poison Message Queue – There are specific messages that Exchange considers to be harmful to the Exchange system after a server failure. If such messages are encountered then they are stored in the poison message queue. The poison message is empty during normal operations and therefore it does not appear in the Queue viewer.
  • Unreachable Queue – This queue contains messages that cannot be routed to their original destination, for example after reconfiguring the routing infrastructure.

Queue Database Files

In Exchange 2003 and earlier the queues were stored on the local disk in the c:\mailroot\queue directory, or a queue directory in the Exchange Server directory. Exchange Server 2007 and Exchange Server 2010 store their queues in an ESE database. This database is located in the “C:\Program Files\Microsoft\Exchange Server\TransportRoles\data\Queue” directory. ESE operations on this Queue Database file are similar to those of a normal ESE Mailbox Database file. So transactions are stored to the log files first and later on are committed to the Queue Database file. The checkpoint file keeps track of which transactions are already written to the Queue Database file.
Log files of the Queue Database file “circular logging” are enabled by default. This means that older log files are deleted when they are no longer needed. Because of this, it is not possible to recover old data from the log files. Since the data in the Queue Database file is volatile (i.e. it exists for a short period of time in the queue and the database) this is not an issue.
The Queue Database consists of the following files:
  • Mail.que – the actual Queue Database file;
  • Tmp.edb – temporary file for Queue Database file operations;
  • Trn*.log – the individual log files for storing Queue Database transactions;
  • Trn.chk – the checkpoint file that keeps track which transactions are already committed to the Queue Database;
  • Trnres00001.jrs and Trnres00002.jrs – two reserved empty log files that are used when the disk is full during normal log file operations.

Figure 2: The Queue directory on local disk. Already 2459 log files (99B hexadecimal) are created on this serve

Send Connector

Right out of the box, Exchange 2010 doesn’t have a way to send message to the Outside world. Before you can send message you have to create a Send Connector:
  1. Logon as an administrator on an Exchange 2010 Server and open the Exchange Management Console. Navigate to the Organization Configuration and select Hub Transport. In the results pane, select the Send Connector tab;
  2. The New Send Connector wizard appears. Enter a name for the Send Connector and select its usage (“Internet”) and follow the wizard. Enter * for the address space. This way the Send Connector will be used for all SMTP domains, except for the internal SMTP domains of course.
  3. Next is to select whether you want to use a smart host (an SMTP server with your provider for example) of whether you want to use DNS. In the latter case the Hub Transport Server will be responsible to retrieve all MX records and send the outbound messages accordingly. My personal preference is to use DNS, but you might have valid reasons to use a smart host of course;

Figure 2
  1. In the next Windows you have to select the Source Server. If you have multiple Hub Transport Servers in your Active Directory site you can enter them here. These servers all participate in the Send Connector. SMTP messages will automatically be load balanced across all source servers in the Send Connector;
  2. Finish the wizard and the Send Connector will be created.
The Hub Transport Server is now able to route messages to the Internet.

Receive Connector

By default, a Hub Transport Server will not accept anonymous connections. And connections coming form the Internet are anonymous. Microsoft’s opinion is that you need to use the Edge Transport Server (in your DMZ). The Edge Transport Server does accept anonymous connections from the Internet, and communications between Edge Transport Server and Hub Transport Server is secure. I’ll get back to using an Edge Transport Server in one of the next articles in this series.
But if you want to enable anonymous connections you have to enable this on the Default Receive Connector. Open the Exchange Management Console, Navigate to the Server Configuration and select Hub Transport. In the results pane, select your Hub Transport Server and in the low pane, select the Default Receive Connector and request its properties. Click on the Permission Groups tab to see who is allowed to connect to this Receive Connector. TheAnonymous Users is not selected by default as shown in this screenshot:

Figure 3
Check the Anonymous users checkbox and click OK to close the window. The Transport Services needs to be restarted for this setting to take effect. Open an Exchange Management Shell window and enter the Restart-Service MSExchangeTransport command.
We now have a Hub Transport Server that can send and receive SMTP messages to and from the Internet.
Please note that when you use an appliance in your DMZ, for example a Barracuda, IronPort or McAfee solution this is considered as “anonymous” from a Hub Transport Server point of view.

Mail flow to the Internet

When a client creates a new message and clicks the Send button the message is placed in the Outbox. A Hub Transport Server will pickup the message and will place it in the Submission Queue on the Hub Transport Server. In this Submission Queue the message is safely stored on the hard disk of the Hub Transport Server. The Hub Transport Server can crash and reboot, but the message is still safe.
The Categorizer would pick up the message from the Submission Queue and check the To: field in the message to determine the recipients. The routing component of the Categorizer determines the final destination for the message. For Internet delivery, the message is delivered to “SMTP Out” (part of the MSExchange Transport Service) which uses the Send Connector that was created in the previous step. Depending on the address space a routing decision is made.
A queue is created for every message that is sent. This is true for internal messages (back to a mailbox server in the current Active Directory site as well as other Hub Transport Servers in the Exchange organization) and for external messages.
You can view the queues using the Queue Viewer. Open the Exchange Management Console and navigate to theToolbox section. Select the Queue Viewer in the results pane.

Figure 4
You can see the individual queues, the status, the number of messages in the queue and when there are some issues, the error is shown as well. Using the Queue Viewer you need to connect to every Hub Transport Server so see messages in queues on other servers.
You can also use the Exchange Management Shell to get queue information using the Get-Queue cmdlet. When combined with the Get-TransportServer cmdlet you can see all queues on all Hub Transport Servers:

Figure 5
What’s a reasonable amount of messages in a queue and how long do they stay there? That’s not an easy question to answer. You’ll see quite a number of queues as a result of NDR’s (when spam is sent to non existing users) that cannot be delivered. Typically these queues have only 1 or 2 messages. When there are problems using delivery on the other SMTP hosts, the number of messages will increase. This is the time to start troubleshooting

Active Directory sites

Exchange Server 2003 used the concept of Routing Groups for routing SMTP messages, where Active Directory used the Active Directory sites for exchanging information. In Exchange Server 2007 this was combined and Routing Groups are no longer used and this is true for Exchange Server 2010 as well.
An Active Directory site is associated with one or more IP subnets. Exchange Servers are automatically assigned to a particular site as soon as their IP address is part of one of these subnets. Exchange Server is a so called “site aware” application, which means it has knowledge the site the Exchange Server is a member of, and when it needs services of another server or role (think of a Domain Controller or Global Catalog server) it queries its own site first for these services. Since Exchange Server is site aware it can also use the Active Directory topology for routing messages. It also uses the Active Directory topology for determining other Exchange servers in the same site, possibly with other Server Roles on it. The Active Directory site is the “service discovery boundary” which means the Exchange Server will not look outside its own Active Directory site for other Exchange Servers.
Site membership is determined with a series of DNS queries during startup of the server; the Netlogon service is responsible for this. When you check the DNS entries of your Active Directory you’ll see not only the standard A records but also service records. These service records are used to find other services in the Active Directory site. Site membership is determined by comparing the local IP address with information found in DNS. The Active Directory Topology service also queries Active Directory, by default every 15 minutes, to retrieve a list of Domain Controllers and Global Catalog Server in the Active Directory site. This information is written to the Application Eventlog:
Log Name:      Application
Source:        MSExchange ADAccess
Date:          24-1-2011 10:41:59
Event ID:      2080
Task Category: Topology
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      EXCH01.infra.local
Description:
Process MAD.EXE (PID=2484). Exchange Active Directory Provider has discovered the following servers with the following characteristics: 
 (Server name | Roles | Enabled | Reachability | Synchronized | GC capable | PDC | SACL right | Critical Data | Netlogon | OS Version) 
In-site:
DC01. infra.local CDG 1 7 7 1 0 1 1 7 1
DC02. infra.local CDG 1 7 7 1 0 1 1 7 1
 Out-of-site:
AMS-DC01. infra.local   CDG 1 7 7 1 0 1 1 7 1
AMS-DC02. infra.local   CDG 1 7 7 1 0 1 1 7 1
Clearly visible: this Exchange Server has two Domain Controllers (and Global Catalog) in his site, and two other Domain Controller (and Global Catalog) in the Amsterdam Active Directory site.
Every Exchange Server has a property called msExchServerSite. This property is populated with the Distinguished Name of the Active Directory site the Exchange Server is a member of. The Active Directory Topology service is also responsible for populating this property.
Every site that has an Exchange Server 2010 Mailbox must have at least one Hub Transport Server and at least one Client Access Server in the same site. The Mailbox Server uses the Hub Transport Server to send messages to other Mailbox Servers in other sites. So the Mailbox Servers themselves never communicate with each other.
Suppose we have two Active Directory sites and each sites has a Mailbox Server, a Hub Transport Server and a Client Access Server (I’ve not drawn the CAS in this picture).

Figure 1:
 Direct delivery from one Active Directory site to another Active Directory site
User1 has a mailbox on MBX01 and he sends a message two User2 who has a mailbox on MBX02. When the message is sent it is picked up by HUB01, the categorizer on HUB01 determines where the recipient is located. User2 is in the same Active Directory, but in another site. Since HUB01 cannot communicate with MBX02 the message is sent to HUB02 in the 2nd site. On HUB02 the messages is stored in the submission queue, picked up by the categorizer and the categorizer determines that the message is a local delivery. The message is now stored in adelivery queue for MBX02 and the message is delivered in the correct mailbox.

Queue at point of failure

A sending Hub Transport Server always tries to connect to the corresponding Hub Transport Server in the other site directly. But when the receiving Hub Transport Server is not available, the sending Hub Transport Server tries to deliver the message as close as possible by determining a so called backoff path. By using the backoff path the Exchange Server tries to deliver the message as close as possible to the recipient’s Hub Transport Server.
Suppose a situation like this: HUB01 wants to send an SMTP message to HUB03 but unfortunately HUB03 is not available. HUB01 determines a backoff path where HUB02 is the next hop in the routing path. The message is delivered to HUB02 where it is queued. The queue will be in a retry state and will try to deliver following the retry settings. As soon as HUB03 is available again the message is delivered. If in the end the message cannot be delivered in a timely manner, the message will expire and an NDR will be sent back to the original sender. This mechanism is called “Queue at Point of Failure”.

Figure 2:
 Queue at point of failure. The message is queued on HUB02

Delayed Fan-Out

A typical message can contain not one, but maybe multiple recipient email addresses. Suppose a user in Site-1 wants to send a message to users in Site-3 and Site-4. The message is picked-up by Hub Transport Server HUB01. HUB01 tries to determine the most cost effective way to send all messages and it determines the routing path for each message (i.e. recipient). All routing paths are compared, and HUB01 sends a message to a point where the recipients can be split. This splitting is known as bifurcation in Exchange Server and bifurcation occurs at the point where the message is split.
So, HUB01 sends the message to HUB02 where bifurcation occurs. HUB02 sends a message to HUB03 and HUB04 where the messages will be delivered. This mechanism is called “delayed fan out”.

Figure 3:
 Delayed Fan-Out. The message is split on HUB02

Unreachable Queue

There’s one queue I haven’t mentioned so far, which is the unreachable queue. When Exchange isn’t able to determine a valid route for a given recipient the message is placed in the unreachable queue. When Exchange recalculates the routing table and a valid path is determined for this recipient the message will be sent. If no routing path can be determined, the message will expire and an NDR will be sent to the original sender.

Shadow Queue

When you check the queues on a Hub Transport Server using the Queue Viewer of the Get-Queue cmdlet you’ll see message stuck in a shadow queue. You’ll see these messages in this particular queue, even when the messages are successfully delivered.
Shadow queues are a new feature in Exchange Server 2010 which facilitates redundancy on the transport layer of SMTP messages, so it’s a Hub Transport Server feature. This is achieved by deleting sent messages from a queue after the next hop in transit has successfully delivered the SMTP message. If it’s not successful the Hub Transport Server will try to find another route to the final destination.
For example, a Hub Transport Server will send a message to another Exchange 2010 Server, in this example an Edge Transport Server. When the message is delivered to the Edge Transport Server, the Hub Transport Server will storethe message in a shadow redundancy queue. The Edge Transport Server will deliver the message to another SMTP Server on the Internet. When the message is delivered, the message will be stored in a shadow redundancy queue on the Edge Transport Server. The Hub Transport Server will periodically poll the Edge Transport Server for successful delivery. If it is successful the message is deleted from the shadow redundancy queue on the Hub Transport Server.

Figure 4
If the Edge Server cannot deliver the message, the Hub Transport Server will know because of this polling. It will try another route and will send the message to the other Edge Transport Server.

Queue Viewer

As explained in SMTP Routing in Exchange 2010 part 1 that all messages are always stored in Queues. Using theQueue Viewer’s tool it is possible to view the queues and the properties of these queues. Please, note that you can also check the properties of the messages but you CANNOT view the actual messages.
To open the Queue Viewer, open the Exchange Management Console and navigate to the Tools section. There you’ll find the Queue Viewer. When you open the Queue Viewer you can see the SMTP queues on this particular server.

Figure 1:
 Queue Viewer on Exchange 2007. It's the same on Exchange 2010
You can see the status of the queue, the number of messages and the last error the SMTP Service has encountered for this queue. When the status is “ready” and the message count is “0” the message is successfully delivered and the queue will be removed shortly.
The number of messages in a particular queue can grow, but except for NDR’s as a result of spam (sent from a non-existing domain) the queue will shrink automatically within a reasonable amount of time when all is well. If messages cannot be delivered in a timely manner, the messages will be deleted from the Queue, typically within 48 hours and an NDR will be sent to the sender (in your Exchange environment that is). There can be a number of reasons why a legitimate message cannot be delivered. A DNS error, network issues on your side, network issue on the recipient side, server issue on the recipient side, just to name a few…
The queue viewer only shows the queues for one Hub Transport Server. If you want to check another Hub Transport Server you have to select “Connect to Server…” in the Actions Pane and select another Hub Transport Server.
You can also use the Exchange Management Shell to check the queues on the Hub Transport Server (my personal preference) using the Get-Queue cmdlet on the Hub Transport Server:

Figure 2:
 You can use the Get-Queue cmdlet to check the queues
When you combine the Get-TransportServer cmdlet with the Get-Queue cmdlet you can see all queues on all Hub Transport Servers. This will give a quick overview, and it is much faster than using the Exchange Management Console.

Figure 3:
 Check multiple Hub Transport Servers with the Get-Queue cmdlet
Normally, the SMTP queues will appear and disappear, queues will grow and shrink. If you have a particular queue that’s growing you may want to troubleshoot.
Johan Veldhuis wrote an interesting article regarding Exchange 2010 Queues and the status queues can have. You can find his article here.
The first question of course is “is this a legitimate domain and are there legitimate e-mails?” When you have 150 messages destined for a domain called “iwanttobuybigbluepills.com” or something like that you can be pretty sure there are 150 NDR’s waiting to be delivered as a result of SPAM.
The queue will be automatically cleaned up typically after 48 hours or so, but if you want to empty the queue immediately, open the Exchange Management Shell and enter the following CMDLets:
Get-Message -Queue "<<server>>\435" | Remove-Message -Confirm:$false
Assuming of course that the messages are in queue 435, but this will vary per situation. If the queue is for a valid domain check the last error of the queue. When it’s a DNS issue check the MX records using the NSLOOKUP utility. Open the NSLOOKUP utility and set the query type to MX and enter the domain name. It will show the MX records and the accompanying A records of the mail servers:

Figure 4:
 Use the NSLOOKUP tool to gather information regarding external mail servers.
The next step is to check if there’s network connectivity to these mail servers by using the PING command. When successful use the Telnet Client utility to open a connection on port 25. The Telnet Client utility is not installed by default (unfortunately) but use the Windows 2008 (R2) Server Administrator to install the Telnet Client utility. It can be found under the Features option.
Using the Telnet Client utility, open a connection to the recipients SMTP host on port 25:

Figure 5:
 Use TELNET to open a connection and check if you can send a test message
When checking the mailbox we can see the message has actually arrived:

Figure 6:
 The test message from TELNET has arrived!
So, using the Get-Queue cmdlet combined with the NSLOOKUP, PING and TELNET client you have a very powerful set of tools for troubleshooting the message flow. If you’re not familiar with the Telnet possibilities in combination with the SMTP Service please check out the following Microsoft article here.

Transport Logging
Another way of troubleshooting the SMTP transport is by means of protocol logging. The SMTP service has the ability to log all actions it performs. Because it logs everything this functionality is disabled. Enabling it can be useful but should only be used for troubleshooting purposes. If leaving this enabled in a production environment the local disks will quickly fill-up with logging information, and a completely filled boot and system disk generally means a dismounted server.
To enable protocol logging on an SMTP connector, open its properties in the Exchange Management Console and set the “Protocol Logging Level” to verbose on the General tab:

Figure 7:
 Enable logging on the connector
The SMTP Transport logs are now stored on the following locations:
C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\ProtocolLog\SmtpSend or C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\ProtocolLog\SmtpReceive
There are tons of information in these log files and they can be very useful for troubleshooting the SMTP communication between servers.
There’s one nifty issue though! Check out the following screen shot from the Exchange Management Console where an Edge Server is used in combination with a Hub Transport Server:

Figure 8:
 Only Edge related connectors are visible, not the connector between Hub and Edge Servers
If SMTP transport logging is enabled on these connectors, logging will occur on the Edge Transport Servers. There’s no way here to enable logging from the Hub Transport Server. The same is true for SMTP traffic between Hub Transport Servers in different Active Directory sites.
For these purposes a hidden connector is used, and protocol logging should be enabled on this hidden connector. This can only be performed using the Exchange Management Shell by entering the following cmdlet:
Set-TransportServer EXCH01 –IntraOrgConnectorProtocolLoggingLevel Verbose
Now logging will be enabled on connectors between the Hub Transport Server and the Edge Transport Server and between Hub Transport Servers in different Active Directory sites.

Routing Table Viewer

In my previous article regarding SMTP Routing I talked about the Routing Table on a Hub Transport and Edge Transport Server. This Routing Table is an in-memory table containing the Exchange organization’s routing infrastructure. This Routing Table is automatically calculated based on information found in Active Directory or sent from the Hub Server to the Edge Server in case of an Edge Transport Server.
The SMTP Service takes a snapshot of the Routing Table periodically and saves it to disk. This snapshot is taken:
  • After a fixed amount of time;
  • When the Transport services is (re)started;
  • When the routing configuration changes.
To view these snapshots, open the Exchange Management Console, navigate to the Tools section and open the “Routing Log Viewer”. There’s information regarding the Active Directory Sites, Routing Groups, Exchange servers, Send Connectors and Address spaces.

Figure 9:
 A snapshot of the Routing Table on a Hub Transport Server

Tracking Log Explorer

All messages that are sent or received by an Exchange organization are logged in so called “Message Tracking” log files. These log files are large text files (stored in C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\MessageTracking) and can be searched by a tool called “Message Tracking Explorer” which can be found in the Tools section of the Exchange Management Console.
Note:
There’s also a tool called “message tracking” under the Tools section, but this one will redirect to the Exchange Control Panel. This part is about the Tracking Log Explorer, a separate entry under the Tools section in the Exchange Management Console.
The Message Tracking Explorer can search on Sender, Recipient, Subject, Message ID etc. And EventID can be selected, like Send, Received, Submitted, etc.
The Message Tracking Explorer can be used for troubleshooting when users complain about messages that are not received or got lost or something. Of course these messages can only be searched within the Exchange organization. When messages are sent outside the Exchange organization they’re ‘out of sight’ and further message tracking is no longer possible.

Figure 10:
 Message Tracking in Exchange 2010. Note the corresponding Powershell command.
One side step regarding the message tracking log files. From a compliancy perspective it can be necessary to back up the message tracking log files in your (daily) backup cycle. This way the message tracking log files don’t get lost in case of a server failure. As mentioned earlier, the message tracking logs can be found in the directory C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\MessageTracking. Be aware that these log files can grow significantly!

Conclusion

In a series of articles I explained the functionality of the SMTP Transport Service in Exchange Server 2007 and Exchange Server 2010, also sometimes referred to as the Transport Pipeline. We’ve seen internal routing, external routing, send and receive connectors and in this final article I explained a bit more regarding troubleshooting and logging on the transport service. By default most functionality is enabled, but for troubleshooting purposes it is needed to tweak a bit more.

No comments:

Post a Comment