01Oct/18

Interesting Gateway AD Routing Failure with Skype for Business Server

Routing calls based on Active Directory LDAP lookups can be a handy feature to utilize.  Personally speaking, I don’t typically use LDAP for much besides authentication with Sonus and Audiocodes devices, but that’s just me.  In some cases you inherit a previous person’s configuration where it was enabled, and when things break….well….you have to unravel the ball of yarn.

With all that being said, the summary of the configuration at hand was:

  • Two T-1 PRIs connected to an Audiocodes Mediant 1000B Gateway from an upstream provider.
    • Note:  The PRIs are all located in the United States
  • Two T-1 PRIs connected to the same Audiocodes Mediant 1000B Gateway from an Inter-Tel Axxess 9000 TDM PBX.
  • As a result, the Audiocodes was downstream from the provider, but upstream of the legacy PBX
  • A single IP Group/Proxy Set was configured for a Skype for Business Server 2015 deployment, consisting of three Front End Servers with collocated Mediation roles.
  • To “simplify” routing & configuration on the Inter-Tel for when a user is moved to Skype4B, AD routing was enabled on the Audiocodes gateway.

This is a fairly standard configuration, but AD routing lookup matches were failing for certain numbers on calls inbound from the provider PRI trunk.  The numbers were absolutely assigned to an object in Skype for Business Server and were enabled for Enterprise Voice.  As a result, there were what seemed like intermittent cases where call flows were being sent to the legacy PBX instead of Skype4B.

Alright, challenge accepted….

The Initial Config

LDAP configuration on Audiocodes can be fairly simplistic and doesn’t necessarily require the use of Call Setup Rules.  This particular configuration had them in place, so I took a look at the initial config:

The PRI provider was sending the full 11 digits for the called party (DstPn) so the Call Setup Rule was effectively configured to take the 11 digits and then search the msRtcSip-Line attribute to see if there was a user that matched.  Additionally, there was a (rightly configured) fail-safe configuration in place that also made sure the user was actually enabled for Enterprise Voice by searching the msRtcSip-OptionFlags attribute.

Note:  Greig Sheridan has a good write-up of the “fail-safe” configuration for Sonus SBCs, located here.
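Under the hood, the two lookups amount to a standard LDAP filter that ANDs the line match with the fail-safe flag check.  A rough sketch in Python (the attribute names and the ‘385’ value come from the configuration described above; the tel: URI format and the helper function itself are my own assumptions for illustration):

```python
def build_ad_routing_filter(digits: str) -> str:
    """Build an LDAP filter matching a user whose Line URI holds the dialed
    digits AND whose OptionFlags indicate Enterprise Voice.
    The tel: prefix and the '385' flag value follow the post's configuration;
    the exact URI format in your AD may differ (extensions, normalization)."""
    line_uri = f"tel:+{digits}"
    return (
        "(&"
        f"(msRTCSIP-Line={line_uri})"
        "(msRTCSIP-OptionFlags=385)"
        ")"
    )

# e.g. an 11-digit US number as delivered by the PRI provider
print(build_ad_routing_filter("16155551234"))
# (&(msRTCSIP-Line=tel:+16155551234)(msRTCSIP-OptionFlags=385))
```

Any LDAP client (or the gateway’s own Call Setup Rule engine) evaluating this filter returns a match only when both conditions are true, which is exactly the fail-safe behavior described above.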

Given that the configuration seemed to be correct, I thought…alright….let’s do some tests and gather some logs.

Testing Commences

  • I called test number #1 and the call worked as expected with AD routing successfully finding the user in AD.
  • I called test number #2 and the call did not work as expected.  AD routing did not detect the user in AD.
  • I called test number #1 again and the call worked as expected with AD routing successfully finding the user in AD.
  • I called test number #3 and the call did not work as expected.  AD routing did not detect the user in AD.
  • I called test number #4 and the call worked as expected with AD routing successfully finding the user in AD.
  • I called test number #5 and the call worked as expected with AD routing successfully finding the user in AD.

Head, meet wall

At this point I was definitely confused because it was very apparent the LDAP lookups were at least working in some capacity.  Caching was not enabled, so these lookups were hitting AD in real-time for numbers that had been assigned for 2-3 days, and AD replication was up-to-date.  Beginning to have a hunch that the issue was AD-lookup related, I began to examine the actual user accounts that had the matching LineUris, and after sorting through a few of the accounts, I noticed why the call was not matching.  I’ll give you the data to see if you can catch it….

Test Number #1

Test Number #2

Test Number #3

Test Number #4

Test Number #5

Have you figured it out?  If not, I’ll give you a little nugget of help…

Schema attributes for Skype for Business

Root Cause

The Call Setup Rule configuration on the Audiocodes gateway was explicitly looking for a value of ‘385’ for the msRtcSip-OptionFlags attribute.  The call tests that were failing to route, however, showed that the AD attribute of the account had a value of ‘1409’ for the msRtcSip-OptionFlags attribute.  Since the rule didn’t match the value…the call didn’t route according to the AD matching rules.

The immediate fix was to simply edit the Call Setup Rule with a logical ‘or’ statement that searches for both values:




ldap.attr.msRTCSIP-OptionFlags == '385' or ldap.attr.msRTCSIP-OptionFlags == '1409'

With the additional condition statement in there, I called all the test numbers again and all calls were routed as expected.  AD routing was now truly functional for all account types.

But why ‘1409’?

The OptionFlags value is a bit-mask: individual feature values are added together to show what features the account is enabled for in Skype4B (or Lync or OCS…).  For all the tests conducted, I saw that the following account types had the expected ‘385’ value:

  • Common Area Phone accounts
  • User Accounts enabled for EV
  • Response Groups
  • Analog Phone accounts
  • Exchange UM Subscriber Access accounts
  • Announcement Service

There was one, and only one, account type that was not showing up with the expected ‘385’ value:

  • Exchange UM Auto Attendant accounts

Indeed, every single test number that was failing was isolated to an Exchange UM Auto Attendant contact object that had been created.  Initially I thought it might have been a side effect of using the OcsUmUtil.exe application, but this same configuration occurs if you create the contact object using the New-CsExUmContact cmdlet.  It appears that either method of creating AD contact objects for Exchange UM Auto Attendants stamps the OptionFlags attribute with a value of ‘1409’.
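Since OptionFlags is a bit-mask, decomposing the two observed values into their component bits shows exactly which flag differs.  A quick sketch (pure arithmetic; no claim is made here about what each individual bit means):

```python
def decompose_flags(value: int) -> list[int]:
    """Return the individual power-of-two bits that sum to the bit-mask value."""
    return [1 << bit for bit in range(value.bit_length()) if value & (1 << bit)]

print(decompose_flags(385))   # [1, 128, 256]
print(decompose_flags(1409))  # [1, 128, 256, 1024]
# 1409 is simply 385 with one extra bit (1024) set, which is why an
# exact-match condition on '385' fails for the Auto Attendant contacts.
```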

Wrap-Up

Unfortunately I’m not entirely sure why the ‘1409’ value gets set the way it does.  Even so, if you are doing AD routing and using that fail-safe configuration – either on Audiocodes or Sonus – you’ll want to update your rule conditions to also filter for the ‘1409’ value.  If you don’t, you may end up with inconsistent AD routing results.  Given that I don’t typically configure AD routing, I didn’t initially think to examine this as a potential root cause, but you can guarantee that I will be sure to do so from here on out!

22Aug/18

Odd Cause Code 50 with Skype for Business Server

For all the grief that IT architects and engineers give the SIP RFC “standard” for causing interoperability issues between different systems/vendors, I could argue that ISDN suffers from similar – although not quite as widespread – deficiencies.  ISDN protocol errors are fairly descriptive, but some do not clearly correlate to a particular root cause and could be returned for any number of reasons for a call failure.  A recent PSTN gateway replacement caused a day of head scratching before I realized what the issue was.

Background Info

The pertinent details of this situation are as follows:

  • This was a Skype for Business SBA deployment for a US site.
  • The site was changing from one PRI circuit to another PRI circuit – effectively a PRI cutover.
  • The site was changing from another gateway vendor to an Audiocodes Mediant 1000B
    • The configuration on the existing gateway vendor was a cobbled mess to the point that you really could not rely on anything that was there.  Burn it to the ground and effectively start from scratch.
  • The site did not have documented specifics of what the PRI carrier required for ANI/DNIS presentation for outbound calls
    • This was despite my plea for the information and clear communication that this could cause issues on Go-Live
  • Cutover day arrived and the circuit was swapped
    • Inbound calls functioned
    • Outbound calls did not

The Error

Every single outbound PSTN call attempt from Skype for Business resulted in the PRI carrier rejecting the call:

It did not matter what type of number was dialed, nor the format of the digits presented on either the Calling or Called number – local, mobile, US national, international – the call was released by the carrier.  Syslog messages showed the reason for the rejection was an odd Cause Code 50 error:

local0.notice  [S=10351] [SID:2010590319] (   lgr_psbrdex)(     10161)   pstn recv <-- CALL_RELEASED Trunk:0 Conn:0 RetCause:104 NetCause:50
local0.notice  [S=10352] [SID:2010590319] (   lgr_psbrdif)(     10162)   Abnormal Disconnect cause:50#GWAPP_REQUESTED_FAC_NOT_SUBSCRIBED Trunk:0 Conn:0

The “Requested Facility Not Subscribed” error was not one I had previously encountered.  As any “normal” person would do, I turned to Bing/Google in the hopes that I could find a similar case to help guide me, but few searches were helpful.  It seemed that Cause Code 50 could be returned for various scenarios where you attempt to use a feature that isn’t supported on the trunk.  My unsuccessful initial troubleshooting involved changing the Called Number or Calling Number format (including the TON/NPI indicators), thinking that the carrier didn’t “like” how we were presenting those digits.  My “ah-HA!” moment came when I stumbled across a Cisco Community forum post with a similar issue and realized it had nothing to do with digits at all….
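For reference, Cause Code 50 comes from the ITU-T Q.850 cause value set used by ISDN call clearing.  A few of the commonly encountered causes, sketched as a small lookup table (a partial selection of the Q.850 definitions, not an exhaustive list):

```python
# Partial table of ITU-T Q.850 cause values commonly seen on PRI trunks.
Q850_CAUSES = {
    1: "Unallocated (unassigned) number",
    16: "Normal call clearing",
    17: "User busy",
    21: "Call rejected",
    28: "Invalid number format (address incomplete)",
    34: "No circuit/channel available",
    50: "Requested facility not subscribed",
    102: "Recovery on timer expiry",
}

print(Q850_CAUSES[50])  # the cause seen in the Syslog above
```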

The Cisco forum post was undeniably helpful in opening my eyes to the fact that this error could be caused by Calling Name information in the outbound call request to the carrier.  Examining the Syslog entries again, I could see that a Calling Name was being presented:

pstn send --> PlaceCall: Trunk:0 BChannel:9 ConnID:0 SrcPN=+1714XXXXXXX SrcSN= DstPN=01161298702200 DstSN= SrcNT=0 SrcNP=0 SrcPres=0 SrcScrn=0 DstNT=0 DstNP=0 ServiceCap=M RdrctNum= RdNT=0 RdNP=0 RdPres=0 RdScrn=0 RdRsn=-1 Excl=1 Display=Test3 UserNMC3 IE= UUIE=0, RawData:0 CLIRReason:-1 OrigPN= OLI=-1 OffhookInd=0 SendingComplete=1 MoreDigitsInNext=0

The format didn’t contain any hidden or non-standard ASCII characters from what I could see.  I went so far as to attempt modifying the Calling Name to be a different value than the initial attempts, but those calls were still rejected.  Re-reading and focusing on the Cisco article again, I realized the true root cause…the carrier did not want Calling Name at all.

The Fix

There are a few ways of potentially removing Calling Name on Audiocodes appliances for PRI calls:

  • Configuration->VoIP->GW and IP to IP->Digital Gateway->Digital Gateway Parameters->Remove Calling Name=Enable
  • Configuration->VoIP->PSTN->Trunk Settings->Remove Calling Name=Enable
  • Configuration->VoIP->GW and IP to IP->Manipulations->Calling Name IP->Tel

While any of these approaches would have effectively done the trick, I typically don’t like to set Global parameters unless absolutely required.  My personal preference is to set parameters as explicitly as possible so I went with the last option and created a quick manipulation table entry to remove the Calling Name:

I tested a call after setting the configuration above…and voila!  Outbound calls to the carrier now proceeded and connected without issue:

pstn send --> PlaceCall: Trunk:0 BChannel:10 ConnID:0 SrcPN=+1714XXXXXXX SrcSN= DstPN=01161298702200 DstSN= SrcNT=0 SrcNP=0 SrcPres=0 SrcScrn=0 DstNT=0 DstNP=0 ServiceCap=M RdrctNum= RdNT=0 RdNP=0 RdPres=0 RdScrn=0 RdRsn=-1 Excl=1 Display= IE= UUIE=0, RawData:0 CLIRReason:-1 OrigPN= OLI=-1 OffhookInd=0 SendingComplete=1 MoreDigitsInNext=0

pstn recv <-- CALL_CONNECTED Trunk:0 Conn:0 BChannel:10 Number: SubAddr:

In the end it was as simple as Caller-ID Name being sent on the call….who woulda-thunk!?

I suppose for some carriers, Calling Name support may still be a purchased “feature” on a voice circuit, which is why the error of “Requested Facility Not Subscribed” made sense after-the-fact but didn’t necessarily set off any light bulbs at the outset of the issue.  With the configuration in place, production outbound calls started working as expected and we were able to complete the site deployment.  Definitely an odd issue – probably not encountered all that often – so here’s hoping that this will help someone else down the road!

22Jan/18

Retrieving Office365 IPs and URLs via PowerShell

Full Disclosure

I truly stink at scripting and rather dread it.  I knew early on in my career that programming was not a strength of mine, so I chose to gravitate towards networking.  I will admit that PowerShell has been a bit easier to grasp than C#, or C++, or VBScript, but even so…I am no master.  All that being said, in this fast-paced IT world it is nearly impossible to avoid some level of PowerShell scripting, so I have been forced to learn by doing.  Even so, it was a fool’s errand to go it alone and I have been fortunate and unbelievably grateful to have some ridiculously smart people – such as Kendra Thorpe, Mark Gossard, and Jeff Brown – teach me new tricks along the way.

The Headache

Office365 is a beast.  There are hundreds of IPv4 blocks that Microsoft operates the service under.  There are hundreds of DNS domains that Microsoft operates the service under.  It changes on a regular basis, typically once a month, with IP addresses and URLs changing to support new features or ongoing maintenance and optimization of the service.  Microsoft publishes nearly all of the operational information in a fairly nice-and-tidy Office Support article:

Office 365 URLs and IP address ranges

That single reference also links to an RSS feed that tracks changes that will (or have) occurred to the service, and an XML file that mirrors the IP/Domain information that’s already on the support site.  Consumers of the service are expected to monitor the page(s) and keep up with their internal change processes to make sure that things like firewall rules and proxy servers (if used) get updated to support the changes going on within Office 365.  This expectation is often where things break down, as many IT organizations simply can’t keep up with the scope of data and the amount of changes.  Without a doubt, it is a difficult task to achieve without some level of automation to assist in gathering the data.  That’s where PowerShell comes in…

Existing Solutions

There are several solutions out there that attempt to solve this problem:

  • ZScaler actually automates the entire process if you are consuming their cloud-proxy service, but they only automate it within the confines of their solution…you can’t have them update your firewalls, for instance.
  • MindMeld automates the process if you are using Palo Alto firewalls.
  • Azure Range provides an API where you can request IP information for all Microsoft online services, but for folks like me…API calls are a bit out of reach.
    • Alternatively, you can manually request things like Cisco ACLs from the website but it requires manual effort to do so.

The lowest common denominator is a PowerShell-based solution, which provides a bit more flexibility to integrate with heterogeneous IT environments.  While several existing scripts are out there, the one I initially homed in on was from Jeremy Bradshaw on TechNet:

Get Office 365 IP (v4) Ranges from Published XML

While the script itself is useful, it misses out on gathering IPv6 and URLs.  So I did what most people would do…use his code to start and then add additional pieces to fit the bill.

The Resulting Script

Given that I had never created a PowerShell module before, this was a learning curve.  I was able, however, to accomplish all my goals and end up with a PowerShell module that could easily retrieve information from the Office 365 XML without much trouble.  As a result, Get-O365Endpoints was born.  I share this module with the greater world to help folks who may need it for more “simplistic purposes”, and also because someone – somewhere – will inevitably make it better.  A few usage examples below:

Get-O365Endpoints -Products LYO
Get-O365Endpoints -Products Teams -AddressType IPv4
Get-O365Endpoints -Products LYO,EXO -AddressType URL

Download?

Version 1 of this module available for download here:

Get-O365Endpoints-v1.0

Function Get-O365Endpoints
{
<# 
    .Synopsis 
    Powershell Method for pulling updated list of IP ranges for Office 365 endpoints from Microsoft's published xml file. Initial reference came from existing script on TechNet available here:
    https://gallery.technet.microsoft.com/Get-Office-365-IP-v4-562987d5
     
    .Description 
    Explanation: 
    https://support.office.com/en-us/article/Office-365-URLs-and-IP-address-ranges-8548a211-3fe7-47cb-abb1-355ea5aa88a2 
    XML file: 
    https://support.content.office.net/en-us/static/O365IPAddresses.xml 
  
    .Parameter Products 
    One or more Office 365 products by their abbreviation in the xml file: CRLs, EOP, EX-Fed, EXO, Identity, LYO, o365, 
    Office365Video, OfficeiPad, OfficeMobile, OneNote, Planner, ProPlus, RCA, SPO, Sway, Teams, WAC, Yammer.

    Note: Parameter will need to be maintained as products are added and removed by Microsoft, at which point the parameter 
    should be updated to match the current list of products in the xml file.
 
    .Parameter AddressType
    The desired information regarding the product: IPv4, IPv6, or URL.
        
    .Example 
    Get-O365EndPoints -Products LYO -AddressType IPv4

    .Example 
    Get-O365EndPoints -Products LYO -AddressType URL
 
    .Example 
    Get-O365EndPoints -Products LYO -AddressType IPv6 | Export-Csv Office365IPRanges.csv -NoTypeInformation 
#> 
[CmdletBinding()] 
Param 
( 
    [Parameter(Mandatory = $true)] 
    [ValidateSet("o365","LYO","Planner","Teams","ProPlus","OneNote","Yammer","EXO","Identity","Office365Video","WAC","SPO","RCA","Sway","EX-Fed","OfficeMobile","CRLs","OfficeiPad","EOP")] 
    [string[]]$Products = @(),
	
    [Parameter(Mandatory = $false)] 
    [ValidateSet("IPv4","IPv6","URL")] 
    [string[]]$AddressType = @()
) 

Begin{
	try { 
		$Office365IPsXml = New-Object System.Xml.XmlDocument 
		$Office365IPsXml.Load("https://support.content.office.net/en-us/static/O365IPAddresses.xml") 
    } 
	catch { 
		Write-Warning -Message "Failed to load xml file https://support.content.office.net/en-us/static/O365IPAddresses.xml" 
		break 
    } 
	if (-not ($Office365IPsXml.ChildNodes.NextSibling)) 
    { 
		Write-Warning -Message "Data from xml is either missing or not in the expected format." 
		break 
    } 
} 

Process{
    # Use -contains for an exact name match; -match treats the name as a regex
    # and can return unintended partial matches (e.g. "o365" inside "Office365Video").
    foreach ($Product in ($Office365IPsXml.products.product | Where-Object {$Products -contains $_.Name} | Sort-Object Name)) 
            {If($AddressType -contains "IPv4")
                {
                    $IPv4Ranges = $Product | Select-Object -ExpandProperty Addresslist | Where-Object {$_.Type -match "IPv4"} 
                    $IPv4Ranges = $IPv4Ranges | Where-Object {$_.address -ne $null} | Select-Object -ExpandProperty address 
                    foreach ($Range in $IPv4Ranges) 
                    { 
                        $ProductIPv4Range = New-Object -TypeName psobject -Property @{ 
                            'Product'=$Product.Name; 
                            'IPv4Range'=$Range; 
                        } 
                        Write-Output $ProductIPv4Range | Select-Object Product, IPv4Range 
                    } 
                }
            ElseIf($AddressType -contains "IPv6")
                {
                    $IPv6Ranges = $Product | Select-Object -ExpandProperty Addresslist | Where-Object {$_.Type -match "IPv6"} 
                    $IPv6Ranges = $IPv6Ranges | Where-Object {$_.address -ne $null} | Select-Object -ExpandProperty address 
                    foreach ($Range in $IPv6Ranges) 
                    { 
                        $ProductIPv6Range = New-Object -TypeName psobject -Property @{ 
                            'Product'=$Product.Name; 
                            'IPv6Range'=$Range; 
                        } 
                        Write-Output $ProductIPv6Range | Select-Object Product, IPv6Range 
                    } 
                } 
            ElseIf($AddressType -contains "URL")
                { 
                    $URLRanges = $Product | Select-Object -ExpandProperty Addresslist | Where-Object {$_.Type -match "URL"} 
                    $URLRanges = $URLRanges | Where-Object {$_.address -ne $null} | Select-Object -ExpandProperty address 
                    foreach ($Range in $URLRanges) 
                    { 
                        $ProductURLRange = New-Object -TypeName psobject -Property @{ 
                            'Product'=$Product.Name; 
                            'URLRange'=$Range; 
                        } 
                        Write-Output $ProductURLRange | Select-Object Product, URLRange 
                    } 
                }
            Else
                { 
                    $AllRanges = $Product | Select-Object -ExpandProperty Addresslist 
                    $AllRanges = $AllRanges | Where-Object {$_.address -ne $null} | Select-Object -ExpandProperty address 
                    foreach ($Range in $AllRanges) 
                    { 
                        $AllProductRange = New-Object -TypeName psobject -Property @{ 
                            'Product'=$Product.Name; 
                            'Address'=$Range; 
                        } 
                        Write-Output $AllProductRange | Select-Object Product, Address
                    } 
                    
                }
        }
    }
}

Final Notes

Again, I fully acknowledge the Gallery script where I got the start and other helpful bits of information from Kendra, Mark, and Jeff to get this thing running.  I am fully expecting optimizations can be made so please be gentle as you provide feedback, but hopefully this will help out folks who need a better method of pulling this information than continuously copy/pasting from the Office support site!

15Jan/18

Transport Relays in Skype4B Online and Teams

Despite the major announcements around Microsoft’s change to position Teams as the first-and-foremost UC solution in the portfolio over Skype4B, the other just-as-big change was the announcement that the underlying network infrastructure used to support Skype4B within Office365 was changing to introduce TRAP (aka – Transport Relay).  Despite my love for creating in-the-weeds technical posts, I’ll defer on reinventing that wheel since there are already sufficient resources out there in the blog-o-sphere that talk about TRAP:

What I will choose to cover, however, is what those TRAP IP addresses are and how they have changed over the past few months.

TRAP at the Start

To quote Admiral Ackbar, “It’s a Trap!”…  The move towards AnyCast routing for Skype4B had a “small” but worthy goal: increase media performance by getting traffic onto the high-performance Microsoft network as soon as possible.  It was a simple beginning that consisted of just over 10 IP addresses.  Microsoft didn’t explicitly publish the IP addresses, but you could obtain bits and pieces of the IP addresses in play if you knew where to look.  For instance, if you examined the Wireshark traces of the original Skype for Business Network Assessment Tool, you could see that the 13.107.8.2 Global Relay IP address widely advertised on the Internet wasn’t actually where the media traffic ended up coming to/from:

At that time, RTP streams were actually flowing to/from a 104.44.195.154 IP address instead of the global Anycast IP address.  Even for the third-party partners such as IR Prognosis, you would see the global AnyCast IP was only part of the puzzle used for the tests:

104.44.195.X
104.44.195.X
104.44.195.X
104.44.200.X
104.44.200.X
104.44.200.X
104.44.200.X
104.44.201.X
104.44.201.X
104.44.201.X
13.107.64.X
13.107.65.X
13.107.8.2

Sadly, the only way I originally saw the list above is because IR Prognosis’s tool had them listed in the tool portal.  If it hadn’t been for that chance viewing, I don’t believe I would have been able to see them all because I never saw them listed anywhere else.  Even so, it was relatively clear that most production Skype4B traffic in Office365 wasn’t actually traversing the testing IPs and that this was a pre-cursor to a big change from Microsoft….it just wasn’t completely clear to everyone at the time what was coming:

I had suspicion that AnyCast was coming, but Microsoft’s tight-lipped-nature didn’t expose much.  Ignite 2017 changed that reality and they’ve gone full blown with AnyCast/TRAP.  Even so, the announcement was that tenants would be enabled in the future, which leaves many to wonder when this will occur and how this really changes things for them.

TRAP in CY 2018

Microsoft still does not officially publish the TRAP IP addresses for customers to examine, which is a bummer.  Even though they aren’t officially published, there is a deceptively simple way to obtain a new’ish list, which contains significant changes that go to show it truly has begun to roll out to wider audiences.  Obtaining this list is far simpler than before, thankfully, and again comes thanks to the Skype for Business Network Assessment Tool.  The testing tool was updated in late 2017 and that update contained some new functionality, one piece of which was a “/connectivitycheck” parameter.  If you add on a final “/verbose” parameter…voila!:

.\NetworkAssessmentTool.exe /connectivitycheck /verbose

If you pass all the results to “| clip” and drop it in Excel, you can eventually filter out the following list of 56 unique IP addresses:
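If you’d rather skip the clip/Excel step, the same de-duplication can be scripted.  A hedged sketch in Python (the regex only handles IPv4 and assumes the verbose output contains the addresses as plain dotted-quad text):

```python
import re

def unique_ipv4(text: str) -> list[str]:
    """Extract the sorted, de-duplicated set of IPv4 addresses from raw text."""
    candidates = re.findall(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", text)
    # Keep only syntactically valid addresses (each octet 0-255).
    valid = {ip for ip in candidates if all(0 <= int(o) <= 255 for o in ip.split("."))}
    return sorted(valid)

sample = "Relay 52.114.124.1 reachable; retrying 52.114.124.1; next 13.107.8.2"
print(unique_ipv4(sample))  # ['13.107.8.2', '52.114.124.1']
```

Piping the tool’s output to a file and feeding it through a helper like this produces the same unique list shown below.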

104.44.195.1
104.44.195.11
104.44.195.254
104.44.200.1
104.44.200.254
104.44.200.53
104.44.200.92
104.44.201.1
104.44.201.146
104.44.201.254
13.107.64.2
13.107.65.5
13.107.8.2
52.114.124.1
52.114.124.108
52.114.124.110
52.114.124.221
52.114.124.254
52.114.125.1
52.114.125.254
52.114.126.1
52.114.126.254
52.114.127.1
52.114.127.254
52.114.188.1
52.114.188.18
52.114.188.19
52.114.188.21
52.114.188.254
52.114.188.27
52.114.188.28
52.114.189.1
52.114.189.254
52.114.190.1
52.114.190.254
52.114.191.1
52.114.191.254
52.114.220.1
52.114.220.10
52.114.220.254
52.114.221.1
52.114.221.254
52.114.222.1
52.114.222.254
52.114.223.1
52.114.223.254
52.114.60.1
52.114.60.254
52.114.60.32
52.114.60.33
52.114.61.1
52.114.61.254
52.114.62.1
52.114.62.254
52.114.63.1
52.114.63.254

The list has grown considerably, but more important is the fact that it has grown into the 52.112.0.0/14 block of IP addresses that Microsoft explicitly lists as used within Skype for Business Online.  If you break out your CIDR skills and crunch the numbers, you’ll see that the range goes from 52.112.0.0 to 52.115.255.255, so it’s clear that Microsoft has begun to move the technology into the production Office365 IP ranges.  I say “production IP ranges” because most of us involved with Skype4B Online implementations know that the IP consolidation Microsoft has been performing results in a very large percentage of tenants being located in the 52.112.0.0/14 range.  If you perform a trace-route, you’ll see that these new IP addresses are given PTR DNS records under the .relay.teams.microsoft.com domain:

52-114-124-1.relay.teams.microsoft.com

52-114-60-1.relay.teams.microsoft.com

52-114-189-1.relay.teams.microsoft.com
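Both observations (membership in the 52.112.0.0 to 52.115.255.255 range, i.e. a /14, and the PTR naming convention) are easy to check programmatically.  A sketch using Python’s stdlib ipaddress module; note the PTR suffix is simply the pattern observed in the trace-routes above, not a documented Microsoft guarantee:

```python
import ipaddress

SFB_ONLINE_NET = ipaddress.ip_network("52.112.0.0/14")  # 52.112.0.0 - 52.115.255.255

def in_sfb_online_range(ip: str) -> bool:
    """Check whether an address falls inside the published Skype4B Online block."""
    return ipaddress.ip_address(ip) in SFB_ONLINE_NET

def relay_ptr_name(ip: str) -> str:
    """Reproduce the observed PTR naming convention (an observation from
    trace-routes, not a documented naming contract)."""
    return ip.replace(".", "-") + ".relay.teams.microsoft.com"

print(in_sfb_online_range("52.114.60.1"))   # True
print(in_sfb_online_range("13.107.8.2"))    # False: the older global AnyCast IP
print(relay_ptr_name("52.114.124.1"))       # 52-114-124-1.relay.teams.microsoft.com
```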

As you can see from my results, the RTTs are a bit all over the map.  If these truly are AnyCast IP addresses, then Comcast should simply be peering via BGP to the closest peer router advertising AS8075 (and thus keeping the RTT low), but you’ll notice slight routing differences between the examples.  Assuming that these are AnyCast IP addresses, I would expect the RTT for the final hop to be a bit lower, but it is possible that some of the 52.X.X.X IP addresses in the testing list are used as geographically-regionalized relays, such as NOAM, SOAM, EMEA, APAC, etc.  For instance, 52.114.60.1 seems to traverse Los Angeles (be-3-0-ibr02-laxo2 and ae72-0-lax) and Tokyo (ae26-0-tya and ae-10-0-tyo01) before it arrives at its final destination.

Note: I don’t actually know for certain that those acronyms correlate to those cities.  It is an absolute possibility that I could be wrong, but examination of the naming standard that Microsoft uses (based off other trace-route tests and from the Ignite 2017 content) seems to suggest that this routing path is likely.

If that is actually the case (geographically-regionalized relays), then it would be expected that my trace-routes have higher-than-expected RTT since the data packets must traverse a longer distance to actually arrive to the node in the remote locale.  Even so, all those trace-routes show that my client is able to ingress into the Microsoft network in roughly 20 milliseconds via an edge node in Atlanta, GA (ae3-0-atb and ae8-0-atb), which means that BGP peering is working exactly as it should in taking my data packets from Nashville, TN and allowing them to ingress at the closest Microsoft network ingress location in Atlanta, GA:

52-114-124-1.relay.teams.microsoft.com

52-114-60-1.relay.teams.microsoft.com

Finding Peering Locations

There are several places that Microsoft puts information in relation to Office365, Azure, and its global network, but they don’t make it particularly easy to find and they don’t include everything in a single one-piece-gift-with-a-bow-package.  As a result, you have to dig through several pieces of information to pull out what exactly you may be looking for.  For example…

Office365 Datacenter Locations

If you are looking for where Microsoft’s Office365 data centers are and what services are offered from each, this is an important URL to bookmark.  Even so, it does not make any mention of the Microsoft network ingress locations.

Azure CDN POP Locations

If you are looking for where the Azure Content Delivery Network Points-of-Presence are, this is an important URL to bookmark.  While Office365 does utilize some of the same CDN infrastructure as Azure, it isn’t a complete 1:1 reference and as a result it doesn’t completely match up with the Microsoft network ingress locations.

Azure Regions

If you are looking for where Microsoft’s Azure data centers are and what services are offered from each, this is an important URL to bookmark.  Even so, it does not make any mention of the Microsoft network ingress locations.

Ignite 2017 Content – BRK1005

Probably the best content on listing network ingress points comes from an Ignite 2017 presentation.  Slide 8 of the BRK1005 presentation includes a table with 69 of the peering locations – even though more than 69 are actually available – and it also includes a hyperlink to the most important piece of information…the Microsoft network Autonomous System number.

This particular Ignite 2017 session is probably the most important for network architects/engineers to watch because it truly does provide the most in-depth information about how Microsoft is expanding and enhancing their networks supporting Office365.

PeeringDB

This is probably the single most important list out there because it includes all ISPs (public and private) and geographic locations where Microsoft peers their Autonomous System Number 8075 with the Internet.  If you are looking for whether or not your office location has a peer available in-region and you cannot find it on any of the other resources, the PeeringDB list is the authoritative resource that may tell you.  You may typically filter the list by entering the name of the region/city (such as “Hong Kong” or “Brazil” or “Atlanta”) but you may have to dig a bit deeper and actually enter your ISP name if all else fails:

Hong Kong

Brazil

Atlanta

Remember that while Microsoft may advertise its ASN from these locations, it is up to each ISP in those regions to receive and honor the BGP routes that Microsoft provides.  If you find traffic routing across other carriers or potentially arriving at an ingress node that isn’t in-region, you’ll need to reach out to your ISP & Microsoft so that they can hash out the issues with peering (assuming there is an in-region Microsoft node).  For some locations – like China – all traffic should ingress via Hong Kong, so that particular routing path is not cause for alarm because it is by-design.  For most large regions in NOAM, SOAM, EMEA, and APAC, Microsoft edge nodes are available in-region and they should greatly reduce the distance required to hop on the high-speed network.

Final Notes

In any event, the migration to TRAP is very much real and Microsoft’s guidance to “drop traffic on the Internet as close as possible to the user” aims to take advantage of this network infrastructure.  This is also why Microsoft wants customers to use the Internet and not ExpressRoute for Office365, because ExpressRoute circuits are effectively centralized Internet egress points which don’t offer much advantage to users unless an ExpressRoute circuit is available at every single physical location that exists within an enterprise.  Only the largest enterprises with the biggest checkbooks could afford that type of infrastructure, so as a result most enterprises would have ER circuits only at their major data center locations, thus negating the benefits of AnyCast routing for users that may connect to those data centers from far-away geographic regions.  Microsoft will continue to enhance their network infrastructure and it’s up to ISPs and Enterprises to make sure that routing to the AS8075 network is as fast and performant as possible.

27Sep/17

Examining Network Traffic for Microsoft Teams in Office365

Updated 9/28/2017 – Including direct references to Ignite content relevant to architecture

Ignite 2017 has turned out to be quite the stir for Unified Communications…err…I mean, Intelligent Communications. The big news that Microsoft intends to (eventually) sunset Skype for Business Online in favor of Microsoft Teams has once again significantly altered the trajectory of partners and customers consuming Microsoft’s communications services.  While much can be said about the pros & cons of this approach, the end result is that customers and partners (myself included) must change and adapt.  The back-end processes and infrastructure of Microsoft Teams are a bit of a mystery, with limited technical information when compared to Lync/Skype for Business.  Given that this information will begin to come out over time as Microsoft enhances Teams with the IT-policy controls and documentation that existed for Skype4B, I realized that some insights can be gathered with some old-fashioned manual work: that’s right…simple network traces have proven to be hugely informative and provide a peek into the inner workings of Teams.  With that in mind, what follows are pieces of information I was able to glean, with the caveat that the information will be updated/corrected later on, as Microsoft begins to release official information that will supersede the info I have here.

Sign In Process Information

For any seasoned Lync/Skype admin, we all know that specific DNS records are required in order for the client to discover the FQDNs for the pool the account is homed to.  The autodiscover process is (relatively) well documented and oftentimes poorly understood (and implemented).  For Teams, there is no hybrid support – you’re all-in within the cloud.  Microsoft doesn’t explicitly document what FQDNs are used…but Wireshark or Message Analyzer will!

Upon application start, Teams initially performs a DNS A record query for:

  • pipe.skype.com

The DNS query response gives us the first clue that Microsoft’s usage of CDN networks has begun to creep into its UC (IC) platform.  Two separate CNAME records are returned for this query:

  • pipe.prd.skypedata.akadns.net
  • pipe.cloudapp.aria.akadns.net

The resulting IP address is 40.117.100.83, but given that a CDN is in play, this IP address will vary for others across the globe. Indeed, the akadns.net domain is owned by Akamai and is part of their global CDN network. An examination of the final CNAME record shows that at least 11 separate IP addresses are available across the globe!
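
The referral hops above can be reproduced by hand. As a minimal sketch (the resolver here is a plain dictionary standing in for real DNS, so no network access is required; the IP shown is just the one observed in my trace), the following walks a CNAME chain until A records are reached:

```python
def walk_cnames(name, records, max_hops=10):
    """Follow CNAME referrals until a name resolves to A records.

    `records` maps a hostname to either a CNAME target (str) or a
    list of IP addresses. Returns (chain_of_names, ip_addresses).
    """
    chain = [name]
    for _ in range(max_hops):
        target = records.get(name)
        if isinstance(target, str):        # CNAME referral: keep following
            name = target
            chain.append(name)
        elif isinstance(target, list):     # A records: end of the chain
            return chain, target
        else:
            raise LookupError(f"no record for {name}")
    raise LookupError("CNAME chain too long")

# The referral chain observed in the capture (IP is illustrative only):
records = {
    "pipe.skype.com": "pipe.prd.skypedata.akadns.net",
    "pipe.prd.skypedata.akadns.net": "pipe.cloudapp.aria.akadns.net",
    "pipe.cloudapp.aria.akadns.net": ["40.117.100.83"],
}
chain, ips = walk_cnames("pipe.skype.com", records)
```

Against live DNS, a tool like dig or Wireshark will show the same chain of referrals, though the final A records will differ by region given the CDN in play.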

There is a good deal of TLS encrypted traffic following the resolution of pipe.cloudapp.aria.akadns.net, but eventually another DNS query is triggered for:

  • teams.microsoft.com

The DNS query response gives us a separate CNAME record:

  • s-0001.s-msedge.net

The resulting IP address is 13.107.3.128, but an important note is that the FQDN of the IP is associated with the Microsoft Edge node network, msedge.net.  The IP address resolution across the globe for this FQDN is the same, which leads me to believe that Microsoft has begun to migrate some Teams traffic to utilize AnyCast, thus ensuring clients take the shortest path to ingress to the Microsoft network.  It also may be possible that there is only one ingress point for this name and Geo-DNS and/or AnyCast is not in use, but I’m not sure if that would be the case.

Following the connection to the edge node, authentication requests occur and I’m prompted for Modern Authentication credentials.  The process happens largely outside of the FQDNs and IP blocks that Microsoft lists for Teams (login.microsoftonline.com), so I won’t cover the details here.  Following completion of the authentication process, however, the client then continues communications to pipe.cloudapp.aria.akadns.net.

A few thousand packets later, another DNS query comes across:

  • us-api.asm.skype.com

The DNS query response gives another entry point into the CDN networks via another CNAME query:

  • us-api.skype-asm.akadns.net

The resulting IP address is 40.123.43.195, but given that a CDN is in play, this IP address will vary for others across the globe. An examination of the final CNAME record shows that at least 2 separate IP addresses are available across the globe.

I cannot really speculate what the us-api FQDN is for, but it sure seems like a Front End system because shortly thereafter, my client is returned a very specific geo-localized FQDN that is queried for:

  • amer-client-ss.msg.skype.com

The DNS query response gives multiple CNAME references:

  • amer-client-ss.msg.skype.com.akadns.net
  • skypechatspaces-amer-client-geo.msg.skype.com.akadns.net
  • bn3p-client-dc.msg.skype.com.akadns.net

The IP address returned is 40.84.28.125, but the amount of CNAME referrals and even the name of the FQDNs leads one to believe that several layers of CDN and/or Geo-DNS localization are potentially occurring.

Note:  I’m skipping several DNS queries just to keep things short(er), but know that there are 3-4 other FQDNs and referrals I am leaving out for brevity’s sake.

There was a critical note made during an Ignite presentation that the Teams infrastructure was built to run on Azure, and eventually a DNS query crossed the wire that proves it:

  • api.cc.skype.com

The DNS query response gives multiple CNAME references:

  • api-cc-skype.trafficmanager.net
  • a-cc-usea2-01-skype.cloudapp.net

Big Deal…So Where’s Azure?

The answer to that, is in the CNAME FQDNs above:

  • trafficmanager.net
  • cloudapp.net

Both of these domains are owned and utilized by Azure.  Each has its own purpose, mind you, as Traffic Manager is designed to direct client requests to the most appropriate endpoint based on health status and traffic routing methods, while CloudApp FQDNs are used when architects build an app or service within Azure.  This is the “proof in the pudding”, as they say, that Microsoft really is putting all their chips on Azure as the future of the cloud, folks:

The Teams service really does operate via Azure and Microsoft is using their own tools and services to optimize the traffic:

What About Skype4B Interop?

While it is true that Teams has a brand new infrastructure, the Teams client does still offer some backwards compatibility with Skype4B.  Indeed, the DNS queries prove that there absolutely is connectivity to at least some portion of the Skype4B Online infrastructure:

  • webdir.online.lync.com
  • webdir1a.online.lync.com
  • webdir2a.online.lync.com
  • webpooldm12a17.infra.lync.com

There’s no configuration in the client anywhere for the legacy webdir discovery record, so this must be a hard-coded behavior that triggers the resolution process.

But What About Media?

Of all the unknowns most interesting to me about Teams, it’s the media stack.  Lync/Skype4B had very robust media stacks that were configurable to an extent (more so for on-premises customers).  Teams, however, largely has little information known about media.  A few things we can safely assume:

  • Given that Teams & Skype4B can interop, that means ICE, STUN, and TURN are used.
  • Audio and video codecs between Teams & Skype4B offer at a minimum Silk and H.264UC, but also (hopefully) G.722 and yes, even RTAudio
  • Media is, as expected, encrypted by SRTP

Given that little can be known without examining ETL files, I’m surmising a few details and noticing a few others….  The following details were noticed when joining a Teams-native conference, including IP audio, IP video, and screen share.

1 – Skype AnyCast Servers are in the Mix

Fire up a conference and you will indeed see the Teams client fire off STUN requests to the global Skype AnyCast IP of 13.107.8.22:

The traffic itself does NOT remain there, but there were 33 packets sent to and from the AnyCast IP.  Indeed, the Skype Network Testing Tool behaves similarly, as only the first sets of packets are sent to the AnyCast IP before the traffic is offloaded to a different IP…

The second IP referenced is short-lived as well, with a total of only 51 packets in total.
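
The STUN probes seen in the capture follow standard RFC 5389 framing. As a minimal sketch of what such a Binding Request looks like on the wire (the actual send to the AnyCast relay is left commented out, and the target IP is simply the one observed in my trace):

```python
import os
import struct

STUN_BINDING_REQUEST = 0x0001
STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389

def build_binding_request():
    """Build a 20-byte RFC 5389 STUN Binding Request with no attributes."""
    transaction_id = os.urandom(12)  # 96-bit random transaction ID
    # !HHI = message type, message length (0, no attributes), magic cookie
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id

packet = build_binding_request()
# To actually probe the AnyCast address seen in the trace, one would send
# this over UDP to 13.107.8.22:3478 and parse the Binding Response, e.g.:
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   sock.sendto(packet, ("13.107.8.22", 3478))
```

The magic cookie is what makes these packets easy to spot in Wireshark: every STUN message in the capture carries 0x2112A442 at byte offset 4.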

2 – Teams Edge Servers?

What seems very interesting is that for a time STUN traffic seems to be duplicated to multiple IP address destinations:

  • 104.44.195.205
  • 23.100.65.165

The duplicate traffic flows exist for the start of the call, but then traffic settles on what appears to be a direct path to the 23.100.65.165 IP address, accounting for 8,303 packets:

The final flow above looks like a similar connection you would expect to see when an external Skype4B client is connecting to the 50K port range of a call negotiated through the external interface of an edge server.  Seems like ICE, STUN, TURN are definitely at play…

3 – Source Ports seem to be Different

For enterprise customers, Skype4B offered defined source ports you would see client traffic originated from (50,000-50,059 UDP/TCP).  Teams, it seems, (HA – unintentional rhyme) does not adhere to those same ports.  I count at least three separate source ports utilized by my client when communicating to the cloud MCU:

  • 8085->51261
  • 20878->53692
  • 26563->59150

It was difficult to determine which modality was using which source port unfortunately (and especially difficult since Teams doesn’t produce logs that can be examined in Snooper), but I’m pretty confident that 8085 was my audio stream.  The other two were video and/or desktop share.

The port change is surprising and worrisome, as enterprise customers cannot police QoS without having pre-defined ports available, such as the previous configuration in Skype4B.
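
To make the QoS concern concrete, here is a hypothetical sketch of the kind of port-based marking policy enterprises built for Skype4B (DSCP 46/EF for audio is common practice, though your policy may differ): traffic sourced from the documented 50,000-50,059 client range gets marked, while the source ports observed from the Teams client fall outside that range and would go unmarked.

```python
SKYPE4B_CLIENT_MEDIA = range(50000, 50060)  # documented Skype4B client range

def dscp_for_source_port(port):
    """Port-keyed QoS marking policy: Expedited Forwarding (DSCP 46) for
    traffic in the known media range, best-effort (0) for everything else."""
    return 46 if port in SKYPE4B_CLIENT_MEDIA else 0

# 50010 is inside the Skype4B range; 8085 and 26563 are source ports
# observed from the Teams client in the capture.
marks = [dscp_for_source_port(p) for p in (50010, 8085, 26563)]
```

With Teams picking seemingly arbitrary source ports, a policy like this simply stops matching, which is exactly why the port change is worrisome.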

4 – STUN Ports Still Standard (Mostly)

UDP 3478 is known as the port used for STUN, and the Teams client definitely uses it:

UDP 3479-3481 were recently added to Microsoft’s requirements for Teams & Skype4B, but I cannot find a single packet that used them.

Usage of these ports is likely still down the road, perhaps not yet ready for prime time?

Final Thoughts

There are so many unknowns to go through regarding the Teams infrastructure and the client.  Microsoft will definitely begin releasing this information over time now that announcements are public, and some of this information may be updated, solidified, or removed.  At a minimum, it’s an interesting dig into the product…all from a little network sniffing!

23Jan/17

Examining the Call Quality Dashboard Template in SOF

On the week of January 9, 2017, Microsoft added some considerable new offerings within the Skype Operations Framework. While SOF is helpful in many aspects, its breadth and scope make it difficult to understand what to use, where to use it, and how to use it. In more than one way, trying to understand SOF is like the old saying ‘trying to drink from a fire hose‘ – the content is all good but the sheer volume seems to get in the way. Even so, Microsoft provided a home run in the new content by giving customers a template to utilize for the Call Quality Dashboard within Skype for Business Online. If you are using Skype for Business Online today, you should go download this template and begin looking at your data, because the findings will be eye-opening and worthwhile.

What’s CQD, anyway?

For some out there, you may have no idea what CQD is.  Maybe you don’t use Skype4B.  Maybe you do but you haven’t delved into the inner-workings.  Either way, CQD can simply be described as an advanced way to analyze representations of media streams, media quality, and usage metrics. Before diving in to CQD though, you need a small history lesson…

Within on-premises deployments, you have two databases that comprise what’s known as the ‘Monitoring Databases’:

  • CDR.mdf – The CDR database contains call detail records – session information that contains who did something, what they did, and when they did it.  Examples include: SIP URI’s, modality type, timestamps, etc.
  • QoE.mdf – The QoE database contains quality metrics – specific network and performance information that contains where someone did something and how it performed. Examples include: IP addresses, modality type, packet loss, jitter, MOS, etc.

The big problem back in the Lync Server 2010/2013 era was that while the CDR/QoE information was great to have, the Monitoring Reports that MSFT provided to query the data weren’t overly robust. The pre-built reports offered value but they were not customizable (meaning they were static in the data they queried) and creating new reports required you have an intimate knowledge of SSRS, T-SQL, and an understanding of the CDR/QoE database schema. Most folks – myself included – don’t have that level of understanding so we simply used things as-is.

When Skype for Business Server 2015 landed, Microsoft offered a new solution called the ‘Call Quality Dashboard‘. There are several good things about the solution but my top three would be:

  • Reporting and analysis using the power and speed of Microsoft SQL Server Analysis Services – CQD utilizes Microsoft SQL Analysis Services to provide fast summary, filter, and pivoting capabilities to power the dashboard via an Analysis Cube. Reporting execution speed and the ability to drill down into the data can reduce analysis times dramatically.
  • New data schema optimized for call quality reporting – The Cube has a schema designed for voice quality reporting and investigations. Portal users can focus on the reporting tasks instead of figuring out how the QoE Metrics database schema maps to the views they need. The combination of the QoE Archive and the Cube provides an abstraction that reduces the complexity of reporting and analysis via CQD. The QoE Archive database schema also contains tables that can be populated with deployment-specific data to enhance the overall value of the data.
  • Built-in report designer and in-place report editing – The Portal component comes with several built-in reports modeled after the Call Quality Methodology. Portal users can modify the reports and create new reports via the Portal’s editing functionality.

It’s fast. It’s easier to use. It’s customizable. Win-win-win. Not so fast…

A significant remaining limitation was the lack of in-depth templates (and thus, guidance) for what you should be querying, but bigger than that was the complete lack of visibility to user accounts that may be hosted within Skype for Business Online. Customers were left completely in the dark and unable to examine quality issues for user accounts that were homed within Skype for Business Online. Microsoft heard the complaints though and eventually released the Call Quality Dashboard for Skype for Business Online, thus allowing customers the same data analysis that is available to CQD on-premises. Even though CQD (in both scenarios, on-premises and online) contain some pre-built reports, customers were still left scratching their heads about what other pieces of information they should be examining. What other metrics could shed light on issues they’ve been having? Enter the CQD SOF template (v1)

What’s Included?

The SOF CQD sample template is a multi-layered set of reports, with a primary mission of examining audio quality. While audio is the primary reporting factor, it does not mean you cannot duplicate reports to search for data for video or application sharing. If a customer wants to report on that data then they absolutely can, but start with audio analysis, resolve your issues, and you should see the remaining modalities start to fall in line.

The top-most report is a usage/trend report that aims at showing you the total number of streams and the percentage of those streams that classify the call as having been poor (or outright failures):

If you click ‘Edit’ to examine the query, you see the data that is being pulled:

A few things to know and understand when looking at queries in the query editor:

  • Dimensions – These are items that get put on the X-axis of the chart but those items are used as groupings to summarize the queried measurements into a better visualization of the data. Month/Year is very common but you could report on others as well, such as Network Subnet, or Building Name, etc..
  • Measurements – These are the pieces of data that make up the Y-axis of the chart. Your available query options here contain all the pieces of stream information imported from the QoE database.  Jitter, packet loss, round-trip-time, percentages, etc are all available at your disposal.
  • Filters – Filters can be used to isolate and return only specific sets of data from the larger CQD data sets. Filters impact what is returned for the ‘measurements’ and could be configured to be many things. Month/Year is common (to look at data from only a certain set of months) or you could configure a filter to look at only internal network segments, etc.

Each of the types above effectively correlate back to T-SQL, so if you can grasp your mind around T-SQL then you can work with the query editor in an easier manner:

  • Dimensions are like T-SQL GROUP BY statements
  • Measurements are like T-SQL SELECT statements
  • Filters are like T-SQL WHERE statements
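
The mapping can be made concrete with a small sketch. Using hypothetical stream records (the field names and data below are invented for illustration, not the actual CQD schema), the three query-editor concepts line up with filter, group, and aggregate operations:

```python
from collections import defaultdict

# Hypothetical stream records, mimicking rows in the CQD cube.
streams = [
    {"month": "2016-11", "inside_corp": True,  "poor": False},
    {"month": "2016-11", "inside_corp": True,  "poor": True},
    {"month": "2016-12", "inside_corp": True,  "poor": False},
    {"month": "2016-12", "inside_corp": False, "poor": True},
]

# Filter ~ WHERE: keep only streams from inside the corporate network.
inside = [s for s in streams if s["inside_corp"]]

# Dimension ~ GROUP BY: bucket the filtered streams by month.
by_month = defaultdict(list)
for s in inside:
    by_month[s["month"]].append(s)

# Measurement ~ SELECT aggregate: total streams and poor-stream percentage.
report = {
    month: {
        "streams": len(rows),
        "poor_pct": 100.0 * sum(r["poor"] for r in rows) / len(rows),
    }
    for month, rows in by_month.items()
}
```

The same shape holds no matter which dimension (building, subnet) or measurement (jitter, packet loss) you pick in the query editor: filter first, group second, aggregate last.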

To dig deeper into these reports, you simply ‘follow the rabbit hole’ by clicking on the hyperlink of the report name.

Note: If a report name is clickable, that means there are sub-reports available, otherwise the report name will not be clickable.

One-Level Deeper

The first sub-report contains a bevy of data and includes reports that offer even more sub-reports.  The second level top reports include:

Audio

This report (and its sub-reports) is where you will likely spend most of your time. We’ll dive a bit further into this report as we keep peeling back the layers of the CQD onion.

Media Reliability – Call Setup Failures

This report (and its sub-reports) is another useful report where you will likely spend some time. If you want to determine why clients aren’t able to connect to media or for supplementary information about why calls are poor, then you’ll find that additional data here. We’ll dive a bit further into this report as we keep peeling back the onion layers.

User Reported Experiences – Rate My Call

This report (and its sub-reports) is, IMHO, useless. Most people I know don’t fill out those prompts asking them to rate a call. Maybe your users do but I don’t find this report all that useful. YMMV.

Client Versions

This report is a useful way to track client versions. Guidance from MSFT is to remain no more than 4 months behind the current version of the client software and this report will help you identify folks using out-of-date versions.

Devices – Microphone

This report is a useful way to track microphones used by clients. Want to find out people who are using internal microphones instead of certified devices? This report (and sub-reports) will tell you.

Devices – Speaker

This report is a useful way to track speakers used by clients. Want to find out people who are using internal speakers instead of qualified devices? This report (and sub-reports) will tell you.

Audio Sub-Reports

You will spend most of your time here, as these reports identify specific stream paths and metrics issues that help you identify the biggest problems in your network environment. The best reports here and most useful (IMHO) are as follows:

Client-to-Server Poor Audio Streams

Use this report to easily examine the number of streams considered poor that involve things like conferences or CloudPBX calling. Dig further into the sub-reports to begin identifying what buildings and/or subnets are the most prone to issue…

Client<->Server Poor Audio Streams by Building

This report will give you the exact location (assuming you have filled out and imported your subnet locations – which you absolutely should!) of the building and network name involving your poor streams. Dig further into the sub-reports to begin identifying what made the calls poor, including metrics and/or connection type…

Client<->Server Poor Audio Streams by Building, Subnet and Network Connection

This report will give you the reason ‘why’ a call was classified as poor and where those calls are from. Instead of identifying the call as ‘poor’, the table shows you the calls that are poor by classification – packet loss, degradation, round trip, concealed ratio. Another report at this level includes additional information to allow you to potentially help identify last-hop routing issues…

Client<->Server Poor Audio Streams by Reflexive IP

This report will give you the reason of ‘why’ a call was classified as poor and where those calls are from, but also adds the reflexive IP address used in the stream. The Reflexive IP is the IP address as seen by Office365 (the NAT IP or STUN address) of the stream. Use this to help you determine if media streams are egressing from an unexpected network location or to identify if a particular network egress point is potentially saturated.

TCP Usage

This report (and sub-reports) will identify audio streams that use TCP for transport instead of UDP. These reports will effectively help you quantify and isolate firewall configurations that don’t allow the right protocol or right ports. Dive in further to determine network subnets that are the culprits…

TCP Breakdown by Building and Subnet

This report gives you subnets involved in calls using TCP as transport. If TCP is used for transport, there is a possibility that either ports or IP’s may be mis-configured in your network firewalls. Another possibility is that client streams may be egressing via an HTTP proxy…

HTTP Proxy Usage

If you have streams egressing via a proxy, be ready for some significant issues. Avoid proxies if at all possible and you can do so by ensuring traffic to Skype4B Online IP’s are bypassed. Unfortunately this report doesn’t show what network sites these calls are coming from, but one could easily build a sub-report to do so.

Peer-to-Peer Sub-Reports

Without re-posting a bunch of pictures, this report (and sub-reports) contain the same information as the Client-to-Server reports but it is filtered to provide you information for calls between endpoints (P2P) within your network. These calls should never go out to the Internet (or ExpressRoute) so you can help isolate and identify network segments that are problematic within your internal network.

Media Reliability Sub-Reports

Call Setup Failures by Building, Subnet and Reflexive IP

This report helps you identify subnets and/or external IPs that have firewall rules blocking traffic to the Skype4B Online IP ranges. It potentially also helps you identify firewall rules that may be configured for SSL/TLS inspection or DPI/IPS traffic manipulation. Use the ‘Call Setup Failure Reason’ column to help with that identification. Despite this being great, it doesn’t identify which IP addresses are in the communication failure…

Custom Report – Failures to Office365 Media Relays

You can create this custom report to identify exactly which Office365 IP addresses are in the failed communication path. Your firewall team claims that they have things right? Well…if so, this report will show nearly zero failures. If failures exist, then somewhere there is a firewall or router blocking communication to Office365 and you can show them this data to prove it.

Limitations to Note

While the data is great, you should note a few ‘gotchas’:

Limitation One

Since the client submits the QoE reports at the end of the call, the default report data may include information for elements outside of your corporate network. CDR/QoE reports include all parties for a call and/or conference, so it could include federated partners or anonymous guests. As a result, your reports and tables will include IP addresses that confuse and confound both yourself and your network team. You will almost certainly need to filter the queries to isolate your internal network using one of a few methods:

  1. Use the Second Tenant ID filter
  2. Use the Second Inside Corp filter
  3. Use the Inside Corp Pair filter

There seem to be multiple ways to try and filter the data, and unfortunately I receive varying results when using each of the queries above. You’ll likely need to play around with the queries and export the data to CSV for some manual analysis, but at the end of the day you can begin to identify network segments using one (or many) of the methods above.

Limitation Two

CQD is all historical data. You cannot use CQD to pre-emptively identify quality issues nor is CQD useful if you haven’t imported your building data. Take the time before deployments to fill out this data.

What’s Next?

The template is great. The insights are valuable. It’s not perfect, however. I’ve already built some tables and reports with content that I’d like to see, especially around actual metrics reports for streams and not just what CQD uses as classification for ‘poor calls’. Microsoft will undoubtedly continue building this template and I definitely look forward to what’s next. Kudos to MSFT for a solid foundation on this!

14Dec/16

Handling SIP OPTIONS Requests on Audiocodes SBCs

12/20/2016 – Updated to include alternate IP-to-IP Routing configuration

SIP OPTIONS requests are a crucial piece of functionality for Lync/Skype4B deployments, but even so, OPTIONS requests are utilized within other Unified Communications platforms as well.  OPTIONS requests are most commonly used as a keepalive mechanism between SIP-based systems to determine if the remote end is ‘alive’.  For many of the IT Admins out there, you’ll recognize this as the difference between performing a TCP test on Port 80:

monitoringexample-tcptest-overallfailure

vs. ensuring an HTTP 200 OK is returned to an HTTP GET request to the same server:

monitoringexample-httptest-overallsuccess

The difference is critical:  just because a port is open or a TCP connect completes, doesn’t mean the application on the remote end utilizing that port is actually functional.  In the above scenario, our web server may be functional but perhaps the IIS Application Pool isn’t running and as a result, HTTP 500 errors are being generated thus taking an e-commerce website offline.  Not good!

For most environments you’ll see monitoring systems and hardware load balancers configured for this type of in-depth check, and OPTIONS requests are the SIP-world equivalent of this more advanced monitoring capability.  Despite this common keepalive usage, OPTIONS requests as per RFC 3261 can actually be utilized to obtain much more:

The SIP method OPTIONS allows a UA to query another UA or a proxy server as to its capabilities. This allows a client to discover information about the supported methods, content types, extensions, codecs, etc. without "ringing" the other party. For example, before a client inserts a Require header field into an INVITE listing an option that it is not certain the destination UAS supports, the client can query the destination UAS with an OPTIONS to see if this option is returned in a Supported header field. All UAs MUST support the OPTIONS method.

Want to know what codecs are supported by the remote endpoint?  Check.

Note:  You typically don’t see codecs listed in most OPTIONS requests but the capability does exist within the RFC to handle it.

Want to know what content types are supported by the remote endpoint?  Check.

Bottom line:  OPTIONS requests are your baseline heartbeat for SIP user agents.  If an OPTIONS request isn’t responded to by the remote UAS, then the UA believes the remote UAS is ‘down’ and must attempt to re-route requests to another remote UAS.
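
That heartbeat logic can be sketched in a few lines. This is a simplified illustration only: the hostnames, tags, and Call-ID below are hypothetical, the transport is a stub callable rather than a real socket, and real SIP stacks vary in exactly which responses they treat as “healthy” (some accept any final response, not just a 2xx):

```python
def build_options_request(target_uri, local_uri, branch, call_id):
    """Assemble a minimal RFC 3261 OPTIONS request (headers abbreviated)."""
    return "\r\n".join([
        f"OPTIONS {target_uri} SIP/2.0",
        f"Via: SIP/2.0/TCP client.example.com;branch={branch}",
        f"From: <{local_uri}>;tag=1928301774",
        f"To: <{target_uri}>",
        f"Call-ID: {call_id}",
        "CSeq: 63104 OPTIONS",
        "Max-Forwards: 70",
        "Content-Length: 0",
        "", "",
    ])

def peer_is_alive(send):
    """Keepalive check: the peer is 'up' only if it answers with a 2xx
    final response; no answer (None) marks it down for re-routing."""
    response = send(build_options_request(
        "sip:gw1.example.com", "sip:mediation.example.com",
        "z9hG4bK776asdhds", "a84b4c76e66710"))
    return response is not None and response.startswith("SIP/2.0 2")

# Stub transports standing in for the network:
alive = peer_is_alive(lambda req: "SIP/2.0 200 OK\r\n\r\n")
dead = peer_is_alive(lambda req: None)  # request timed out, no response
```

A UA runs this check on an interval against each configured peer and pulls a peer out of its routing table once it stops answering.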

Where Skype4B Fits In

Mediation Servers in Skype4B send OPTIONS requests to all Trunks defined in Topology, which means that each Mediation Pool is checking each PSTN Gateway for status.  If SIP OPTIONS requests are not processed by the PSTN Gateway then Skype4B thinks the trunk is down and won’t attempt to route calls to it.  After all, why attempt to route a call somewhere when it may be down? In turn, the SBC typically maintains status of the remote endpoints within its configuration, such as an upstream Cisco Call Manager cluster:

optionsarchitecture-skype4bandcucm

So long as OPTIONS requests are processed by all endpoints, calls should flow without issue.  Even so, a potentially outage-inducing scenario exists if you aren’t prescriptive in your configuration…

The Audiocodes Specifics

When you look at most of the Audiocodes guides out there, you’ll notice that they don’t typically outline what is required to properly handle OPTIONS requests.  In most general documentation, you see a more ‘*‘ approach whereby all messages from Skype are simply forwarded to a remote system without much thought:

sbcconfig-wildcardroutingsample

What the rule above shows is that if any type of SIP request (Request Type=All) comes from the Skype4B Mediation Servers (Source IP Group), the message is to be routed to a remote SBC (Destination IP Group).  This is all fine and dandy – and will likely result in calls that function – but there’s a very dirty secret:  it results in OPTIONS messages being routed to the remote endpoint as well.  Instead of the SBC terminating the OPTIONS message from Skype4B, the OPTIONS message gets passed along to the remote endpoint (say CUCM, for example) and the Audiocodes SBC won’t report back status to Skype4B until it hears back from CUCM.  In effect, your OPTIONS status from Skype4B to the SBC becomes dependent upon the successful completion of the OPTIONS status being reported by another remote system upstream (CUCM, for example) as a result of the routing rule configuration.

Take my advice:  don’t use the approach above!

The correct way to handle this is to ensure that the Audiocodes SBC is configured so that it locally handles OPTIONS messages.  Instead of passing OPTIONS requests along to an upstream system on behalf of a downstream one, you want the SBC to answer each OPTIONS request locally, thus ensuring an independent view of status from the perspective of the SBC for each connected remote system.

For each system the SBC is interacting with, you need to define an IP-to-IP routing rule that resides at the very top of your rule list:

sbcconfig-optionsconfig

There are a few critical differences in this rule:

  • Request Type = OPTIONS
  • Destination Type = Dest Address
  • Destination Address = internal

With this rule in place, the SBC locally handles the OPTIONS request from Skype4B and immediately reports its own status back to Skype4B.  It does not pass the message along to an upstream system (CUCM, for example).  And yes, you need to have as many of these rules as you have SIP systems sending OPTIONS requests to the Audiocodes SBC.
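
Why the OPTIONS rule must sit at the very top of the rule list comes down to first-match evaluation, which can be sketched like so (the group names and actions are hypothetical stand-ins for the SBC's actual rule table):

```python
# Each rule: (request_type, source_ip_group, action). "*" is a wildcard.
# Order matters: the SBC walks the table top-down and takes the FIRST match.
rules = [
    ("OPTIONS", "*",       "respond-locally"),  # keepalive rule, at the top
    ("*",       "Skype4B", "route-to:CUCM"),    # general wildcard routing
]

def route(request_type, source_group):
    """First-match routing, mirroring how the SBC walks its rule table."""
    for rule_type, rule_source, action in rules:
        if rule_type in ("*", request_type) and rule_source in ("*", source_group):
            return action
    return "reject"
```

If the two rules were reversed, an OPTIONS request from Skype4B would match the wildcard routing rule first and be forwarded to CUCM, recreating exactly the dependent-status problem described above.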

12/20/2016 Update

Note: Another configuration is possible.  Instead of creating an IP-to-IP rule for each IP Group (potentially resulting in tens or hundreds of rules), you can essentially create a “catch all” rule that allows a single IP-to-IP rule to handle receiving OPTIONS requests.  Using the same base format as the rule above, you simply need to change the Source IP Group so that it is ‘Any’:

sbcconfig-optionsconfig-wildcard

This rule functions exactly the same and allows the SBC to locally handle OPTIONS requests.  The big difference is that a single rule now answers every IP Group defined in the SBC, instead of requiring an individual rule per IP Group.  Personally, I would define an IP-to-IP rule per IP Group, but choose the configuration that suits you best.

As a result of this rule, the flow changes dramatically and the SBC processes the request locally:

sbcconfig-optionsconfig-inboundrequest
sbcconfig-optionsconfig-200ok

The syslog entries show additional detail of the SBC processing the OPTIONS request locally:

sbcconfig-optionsconfig-inboundrequest-sysloginfo
sbcconfig-optionsconfig-200ok-sysloginfo

Moral of the Story

When configuring Audiocodes SBCs, make sure you have specific IP-to-IP routing rules defined, using the above as a basis, for properly handling SIP OPTIONS messages.  There are a few Audiocodes documents out there that have these settings defined, but many of the Lync/Skype4B-related documents seem to be missing this info.  It seems counterintuitive that you'd have to define special rules for OPTIONS requests, but given that the SBC is flexible enough to route OPTIONS messages at all, it does make some logical sense.  An OPTIONS message is still a SIP message, so just remember the extra steps required to properly configure an Audiocodes SBC to handle them.

As a final note, this type of logic doesn't seem to exist in the Audiocodes gateway code.  In that scenario, OPTIONS requests seem to just be handled automatically and no additional configuration is required.  IMHO, the SBC code is truly the first place this became a requirement.

31Oct/16

Musing About ‘Enterprise Control Issues’ with Office365 Networking Configuration

First off, all opinions and thoughts here are my own.  You, my dear reader, are not required to agree with me nor are you required to read the post.  Continue at your own peril.

Second, while I’m not a neurosurgeon, psychotherapist, psychologist, or sociologist, I can still use that mushy-grey-matter in between my ears to notice things and draw conclusions using deductive reasoning and critical thinking.

Third, I fully realize that I’m making some generalizations in my statements but I also realize that many of these generalizations have been proven time and time again by the customers/organizations I’ve interacted with over the past four years.

The Musing?

"A vast majority of Enterprises - but especially large(r) ones - act like petulant children when they begin their journey to 'the Cloud'."

This seems to manifest itself in multitudes of ways:

  • Features/Functionality may be different, resulting in user training challenges and thus resistance (or outright refusal) to adapt.
  • Internal Business Processes must be retooled to work around deficiencies or adapted to take advantage of enhancements, which sometimes never occurs.
  • Cost model structures typically need to change to account for simple service consumption costs instead of complex CapEx/OpEx models that were previously used.
  • Internal corporate fiefdoms battling each other to ‘maintain face’, ‘hold their ground’, or ‘preserve the way it’s always been done’, resulting in significant delays or stoppage altogether.
  • Etc…

As a result, Enterprises often act like children and throw temper tantrums, scream and cry, or go pout in a corner as a response to the changes seen as a result of their ‘Cloud journey’.

Acknowledging Reality

Despite my statements above, the reality is that all of these responses are natural to our human nature and our psychology.  We, as humans, all hate change.

https://hbr.org/2012/09/ten-reasons-people-resist-chang

http://www.huffingtonpost.com/morty-lefkoe/is-it-really-human-nature_b_906331.html

http://www.forbes.com/sites/lisaquast/2012/11/26/overcome-the-5-main-reasons-people-resist-change/#5beaf0553393

Each of the reasons I've listed in the previous section expresses a valid point.  Each one impacts the nature of the Enterprise and how it operates.  It impacts not only the business but also those employed by the business, which includes people like you and me.  As a result, we all have skin in this game.

I'm not arguing that these concerns shouldn't be fleshed out.  Concerns must be acknowledged, worked on, and resolved.  If any organization is to be successful in the journey to “the Cloud”, it must embrace Operational Change Management and solve the problems that arise.

That being said, there are one or two groups within IT that often exhibit a far greater resistance to change. They often remain steadfastly gripped to their existing ways, resisting at all opportunities, mumbling ‘over my dead body’ with each change that comes.  Coincidentally enough, these are my old peeps.  My old team members from a previous life.  Folks responsible for defending the Enterprise castle from the Barbarians that seek to take it over.  Who, you ask?

Information Security and their counterparts, Enterprise Networking.

The Enterprise Castle

For years, InfoSec and EntNet were responsible for defending the castle:

  • 10.0.0.0/8
  • 172.16.0.0/12
  • 192.168.0.0/16

We used firewalls, VPN tunnels, IDP, IPS, HTTP proxies, and other defense-in-depth ‘stuff’ to keep the bad guys out.  Our data and processes were built around keeping the Enterprise castle safe and making sure the crown jewels remained in the king and queen’s vault.
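As an aside, checking whether an address sits inside those castle walls is trivial with Python's `ipaddress` module; a small sketch using the three RFC 1918 blocks above:

```python
import ipaddress

# The RFC 1918 private ranges that defined "the castle"
CASTLE = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def inside_the_castle(addr: str) -> bool:
    """True if the address falls in any RFC 1918 private range."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in CASTLE)

print(inside_the_castle("172.31.4.20"))  # True (172.16.0.0/12 spans 172.16-172.31)
print(inside_the_castle("52.96.0.1"))    # False: public address space
```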

Where Office365 Breaks the Castle

With the advent of ‘the Cloud’ and offerings like Office365, the castle mentality fails at the outset.  The Cloud is SaaS (Software-as-a-Service) and runs in data centers outside of your control.  You, the Enterprise, interact with services that run over the public Internet, and the ‘protect our internal stuff’ mentality fails because your ‘stuff’ isn’t internal anymore.  The SaaS provider’s ‘castle’ mentality is similar to the Enterprise’s: they still protect their internal stuff just as you do, but because their castle contains jewels for multiple kings and queens, they operate in a way that completely differs from how a single Enterprise operates.

When Office365 comes into the picture, InfoSec and EntNet usually step in and issue their list of demands:

  • Communication must be restricted only to specific IPs
  • Ports must be restricted to only TCP/UDP ports required
  • HTTPS traffic must be inspected
  • We control how/where the traffic goes
  • Etc…

Now this is all fine and dandy to demand, but the reality is that this is a tall order to implement.  More importantly, some of these demands simply cannot be met, no matter how much you dislike it.  Examples?  Ok, no problem…

Office 365 URLs and IP address ranges

In the URL above, Microsoft provides customers with a centralized list of all the IPs, FQDNs, and TCP/UDP ports required to interface with their Office365 services.  They even break it down by service…how nice of them!  When this list is presented to InfoSec/EntNet, the push back is immediate and fierce:

  • There are hundreds of IP ranges listed!  Not allowed.  We need something more specific.
  • Our firewalls cannot handle DNS-based rules.  See demand #1.
  • The ports are too many (especially Skype4B)!  Not allowed.  We need to restrict them.
  • The FQDNs are too many and go across too many domains.  Not allowed.  We need to restrict them.
  • Etc.

While I ‘understand’ the asks above, the stark reality is that you either won’t get your demands or there are significant issues with them:

  • While MSFT breaks down the IP ranges by service, due to the dynamic nature of HA/DR within Office365, your data could be accessible via any of the IP ranges listed.  If you restrict IPs and a failover within Office365 shifts communication to an IP block that is not in your ‘allowed’ list, then the fault is yours, not Microsoft’s.
  • While MSFT lists the TCP/UDP port ranges by service, you risk an outage if you alter your config to not allow the listed ports.  There is a functional reason for those port ranges being required and you risk causing a service disruption for your end-users if you deny the ports.  Don’t shoot yourself in the foot.
  • MSFT lists FQDNs for each of the services because it is easier to administer by DNS than by ranges of IP blocks.  MSFT adds and removes IP blocks and FQDNs from Office365 as required, so DNS-based resolution automatically keeps up with those changes if you can implement it.  Otherwise, you – the Enterprise – must keep track of changes that occur to the service and respond accordingly.
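To put the scale of the first bullet in perspective, here's a hedged sketch of expanding a per-service endpoint list into individual allow rules.  The sample data is invented and only loosely modeled on the format Microsoft publishes, but it shows how quickly the rule count multiplies:

```python
# Hypothetical sample, loosely modeled on Microsoft's published
# per-service endpoint data; the real lists contain far more entries.
ENDPOINTS = [
    {"service": "Exchange Online", "ips": ["40.96.0.0/13", "52.96.0.0/14"],
     "tcpPorts": [443, 25]},
    {"service": "Skype for Business Online", "ips": ["13.107.64.0/18"],
     "tcpPorts": [443, 5061], "udpPorts": [3478]},
]

def allow_rules(endpoints):
    """Expand the per-service data into one allow rule per
    (service, ip-block, protocol, port) tuple - the combinatorial
    blow-up that makes hand-maintained firewall configs fragile."""
    rules = []
    for e in endpoints:
        for ip in e["ips"]:
            for port in e.get("tcpPorts", []):
                rules.append((e["service"], ip, "tcp", port))
            for port in e.get("udpPorts", []):
                rules.append((e["service"], ip, "udp", port))
    return rules

rules = allow_rules(ENDPOINTS)
print(len(rules))  # 7 rules from just two services and three IP blocks
```

Scale that up to the hundreds of published ranges, dozens of FQDNs, and wide Skype4B port ranges, and the "something more specific" demand quickly collides with reality.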

Why OCM and the ‘petulant’ mentality matters

At the end of the day, Microsoft is providing you a service and guaranteeing that it will work with the published configuration.  Despite that published information, nothing in IT is stagnant, and Office365 is no exception.  Almost every month, Microsoft publishes updates to the Office365 FQDN/IP/Port page via an RSS feed that every single Office365 customer should follow, because the feed includes upcoming changes that are not yet active or published on the FQDN/IP/Port lists:

https://support.office.com/en-us/o365ip/rss

What typically happens is that Enterprises take the original lists, plug them into firewalls, IDP, IPS, HTTPS proxies, etc., and then move on to other tasks.  Until something breaks, that is; then Microsoft support gets involved and determines that the issue arose because the Enterprise didn’t keep up with changes in the Office365 service or because security was too restrictive:

  • Maybe firewall rules weren’t updated to account for new IP blocks.
  • Maybe HTTP proxies weren’t updated with new FQDNs.
  • Maybe firewall rules weren’t updated to account for new TCP/UDP ports.
  • Maybe client communication paths are using CDNs or other non-Microsoft controlled endpoints:
WARNING: IP addresses filtering alone isn’t a complete solution due to dependencies on internet based services such as Domain Name Services, Content Delivery Networks (CDNs), Certificate Revocation Lists, and other third party or dynamic services. These dependencies include dependencies on other Microsoft services such as the Azure Content Delivery Network and will result in network traces or firewall logs indicating connections to IP addresses owned by third parties or Microsoft but not listed on this page. These unlisted IP addresses, whether from third party or Microsoft owned CDN and DNS services are dynamically assigned and can change at any time.

Whatever the reason, it often boils down to a stagnant mentality by the Enterprise that change doesn’t occur, or that they don’t ‘agree’ with the change, or maybe it was just an honest mistake.  For instance, 61 IP sets were added to the Skype for Business Online service this month, and those IP addresses become effective on 12/1/2016.  You, as the Enterprise customer, simply don’t have an option on whether those IPs are used for your ‘stuff’.  If you draw a line in the sand and say “NO!  We don’t WANT it that way!”, then expect issues and egg on your face.  The better option is to keep up with the changes and implement as required so that things continue to function.
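Keeping up with those changes is, at its core, a diffing exercise.  A minimal sketch (the IP blocks are invented) comparing last month's snapshot of the published ranges against the newly announced set:

```python
# Invented IP blocks - a sketch of diffing a stored snapshot of the
# published ranges against the newly announced set from the RSS feed.
last_month = {"13.107.64.0/18", "40.96.0.0/13", "52.96.0.0/14"}
this_month = {"13.107.64.0/18", "40.96.0.0/13", "52.96.0.0/14",
              "52.112.0.0/14"}

added = sorted(this_month - last_month)    # new blocks to allow
removed = sorted(last_month - this_month)  # stale rules to clean up

print("add to firewall:", added)      # ['52.112.0.0/14']
print("retire from firewall:", removed)  # []
```

Run something like this against each published update, feed the results into your change-management process before the effective date, and the "egg on your face" scenario largely disappears.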

Bottom Line

I “get it”.  I really do.  I understand the ‘old mentality’ and the ‘castle’ mindset.  That mindset will bite you though.

Enterprises and my fellow friends in InfoSec/EntNet must adapt to the changes and realities of a shared service like Office365.  Every decision made is a trade-off between security, usability, and risk.  Microsoft isn’t perfect and neither are Enterprises.  They are, however, doing their part in alerting Enterprises that may have stricter security needs.  We all hate change, myself included, but change is a bona fide fact of life and those who don’t adapt will suffer (or fail) in their journey.  Please, please, please make sure you start thinking about OCM and how you will adapt to the dynamic structure of not only Office365 but other cloud services as well.  Your future truly does depend on it.

22Aug/16

Skype4B Online PSTN Conferencing Service Numbers

8/30/2016 – Additional information/clarification regarding outbound dialing & PSTN Consumption Billing

Microsoft is ever-expanding the availability of PSTN conferencing services within Skype4B Online, adding significant geographic footprint every six months or so.  Despite this growth, there is still rampant confusion about what is available and where.  Given how widespread the confusion is, especially amongst customers evaluating the service, this post attempts to clear up some components and paint a clearer picture of the service.

Where can you purchase it?

Your first step is to simply find out if PSTN Conferencing is available for your users.  This boils down to two separate requirements:

  1. Is PSTN Conferencing available where my Office365 tenant is located?
  2. Is PSTN Conferencing available where my end-users are physically located?

Microsoft does have an Office support article on this subject that outlines exactly which countries are available for PSTN Conferencing, but it doesn’t explain it very well.  Inclusion in the list means that if your tenant or end-user is physically located in a listed country, the user can officially be licensed to utilize PSTN Conferencing features.  Microsoft also refers to this list as the ‘sell-to’ list.

I’m not in the list – can I still use it?

This is where things get tricky and very confusing.  In most cases it boils down to these scenarios:

  1. My Office365 tenant is in the ‘sell-to’ list but my end-user is located in a country that is not
  2. My Office365 tenant and my end-user are in the ‘sell-to’ list but callers who dial into my conferences are not

Scenario #1

For each user that you enable within Office365, there is a critical attribute that is required for each user object and that attribute is the ‘UsageLocation’ attribute:

Skype4BOnline-UserAttributes-UsageLocation

There are many blog posts out there that talk about how UsageLocation is utilized, and the PSTN Conferencing feature follows suit.  If your end-user resides in Pakistan, for example, you’ll notice that assigning the license isn’t available (or becomes removed), all because the UsageLocation attribute doesn’t match up with where the service is available.  I’ve seen some customers try to work around this, say in the aforementioned scenario, by changing the UsageLocation attribute to the US, and after a short while the licensing then works.  The problem with this approach is twofold:

  • The Office365 terms of service don’t officially allow this
  • The UsageLocation attribute is used to help determine domestic VS international calling types when it comes to PSTN billing charges

Since the PSTN Conferencing Service allows dial-out as part of the functionality*, you could end up with tens of thousands of dollars of unexpected international call charges when you start altering the UsageLocation.  Take for instance the case where you change a Pakistani user’s location to the US, and the user then joins a conference and has the conference dial back to their local number…  In this case, the call is international, not domestic, because their UsageLocation is ‘US’.  In nearly every scenario this is not advisable, and customers should not change usage locations just to get PSTN Conferencing.

*Note:  Prior to September 2016, users could utilize international dial-out functionality within PSTN Conferencing with no additional charges as Microsoft covered call charges under an ‘all-you-can-eat style’ method of billing.  Starting in September 2016 you must have PSTN Consumption Billing enabled within your tenant to support outbound international dialing within PSTN Conferencing.  Domestic outbound dialing should continue to be ‘free’ and does not require PSTN Consumption Billing.
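The licensing gate described above boils down to a lookup of UsageLocation against the ‘sell-to’ list.  A purely hypothetical sketch (the country subset and user data are made up for illustration):

```python
# Made-up subset of the 'sell-to' list and some hypothetical users;
# the real list is maintained in Microsoft's Office support article.
SELL_TO = {"US", "GB", "AU", "DE", "FR"}

users = [
    {"upn": "alice@contoso.com", "UsageLocation": "US"},
    {"upn": "bilal@contoso.com", "UsageLocation": "PK"},  # not in the list
]

def can_license_pstn_conferencing(user):
    """Mirror the portal behavior: the license assignment is only
    offered when the user's UsageLocation is in the sell-to list."""
    return user["UsageLocation"] in SELL_TO

for u in users:
    print(u["upn"], can_license_pstn_conferencing(u))
```

The lookup is also why the workaround of faking a US UsageLocation "works" for licensing while silently reclassifying that user's dial-out calls for billing.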

Scenario #2

In this scenario you’ve got true access to utilize the PSTN Conferencing functionality but the callers who join your meetings may not be calling from a location that MSFT has in its ‘sell-to’ list.  This is somewhat less of a problem because Microsoft does actually have local dial-in numbers available in locations that are not included in the ‘sell-to’ list.  Callers can simply dial a number within a country closest to them and reach the conference without much thinking. If the caller can’t find a local, domestic number though, long distance and/or international call charges may apply until Microsoft adds numbers to that geography in the future.

Types of Numbers Available

When PSTN Conferencing originally was released, only shared local toll numbers were available for each geographic region.  Toll-free numbers were not available until recently, with the introduction of PSTN Consumption Billing.  Now that toll-free and toll numbers are available, the PSTN Conferencing feature set is a bit more complete and on-par with other solutions in the market.

Within the types of numbers available, there are three different configurations for those numbers:

  1. Shared Phone Numbers
  2. Dedicated Phone Numbers
  3. Service Phone Numbers

Shared Phone Numbers

These are essentially the toll numbers that have been available since the introduction of PSTN Conferencing.  They are shared across the entirety of the Office365 infrastructure and any customer can utilize them for inbound calling to their meetings.  Additionally, the language of the IVR menu system cannot be changed for shared phone numbers.  Microsoft pre-populates these numbers within Office365 and all you must do is assign a number to a user account (matching the number’s geography with the user’s location).  This list is by far the largest in terms of sheer geographic scope.

Dedicated Phone Numbers

This is new(-ish) and includes toll numbers that are specific to your organization/tenant.  These numbers are not shared with other Office365 customers.  To obtain dedicated phone numbers you have two options:

  1. Obtain a phone number directly from Microsoft
  2. Port an existing number from your on-premises PSTN provider to Microsoft

The important thing to note is that while these seem like great options, this capability has now been deprecated for use within PSTN Conferencing.  As part of this deprecation, Microsoft has begun to separate end-user phone numbers (dedicated phone numbers) from conferencing and auto-attendant phone numbers (service phone numbers).  As a result, dedicated phone numbers can no longer be utilized for PSTN Conferencing and must be used exclusively as end-user phone numbers.  The ‘Dedicated Phone Number’ functionality within PSTN Conferencing has shifted to Service Phone Numbers, even though Service Phone Numbers are still dedicated numbers; it’s more of a logical distinction related to billing and capacity support.

Service Phone Numbers

This is new and is intended to take the place of Dedicated Phone Numbers for features such as PSTN Conferencing, Call Queues, and Auto-Attendants within Office365.  Service Phone Numbers allow customers to request dedicated numbers that are specific to their organization/tenant and use those numbers for the aforementioned functionality.  These numbers include toll and toll-free numbers within a subset of the countries that are supported by the Shared Phone Number functionality.  The current countries included in support are:

Skype4BOnline-ServiceNumbers-AvailableCountries1
Skype4BOnline-ServiceNumbers-AvailableCountries2

Listed are 26 countries where dedicated Service Numbers can be obtained for organizations to utilize for PSTN Conferencing functionality.  Even better, each Country/Region not only allows toll-free, it also allows you to request numbers specific to a region within that country.  This becomes advantageous in countries of large geography, say Australia, that may bill a call from Perth to Sydney differently than an intra-Sydney call.  By obtaining numbers as local to your users as possible, organizations can reduce calling costs as much as possible.  Service numbers additionally allow you to specify the language utilized for IVR menus, unlike shared numbers.

Despite the Office Support articles saying that porting numbers is an option for Service Numbers, there is a significant limitation in that Microsoft will only allow number porting for countries where PSTN Calling Services are active:

Skype4BOnline-NumberPorting-AllowedCountries

At the current time, that means only US or UK numbers can be ported.  All other numbers cannot be ported until services are expanded to reach the other geographies.

Shared Number Availability

Given that shared numbers are, well, shared, it may help some customers and architects to see what numbers are available prior to purchasing the PSTN Conferencing service.  Why bother, you ask?  Just because a number is available in a country doesn’t mean it is a local call for everyone within that country.  A user in Melbourne or Perth calling a PSTN number in Sydney is generally billed at a different rate than someone in Sydney calling the same number (the same goes for intra-region calling in the United States or UK).  Microsoft doesn’t publish these numbers publicly, so the spreadsheet below may help architects in planning the cost structure of a PSTN Conferencing rollout within Skype4B Online:

Service Number Availability

Given that Service Numbers are very, very new, I haven’t had a chance to put together a full spreadsheet of availability for the 26 countries and the regions supported.  Stay tuned as that will be forthcoming…

Wrapping Up

Hopefully this helps clear up some confusion around PSTN Conferencing.  The service and details are always changing – literally – so this will likely be outdated in a few months.  I will endeavor to keep this updated every 3-4 months or so to reflect the latest information available from Microsoft.

16Aug/16

Using Lync Server 2013 Persistent Chat Whilst Blocking IM Capabilities

In the midst of a 2010 to 2013 migration, a requirement was proposed that was, well, one of those ‘head scratcher’ asks:

"We are upgrading from Group Chat 2010 to Persistent Chat but we don't want Persistent Chat users to be able to IM each other.  IM must be disabled for the users who utilize Persistent Chat".

I’ll openly admit I struggled with understanding why one would need to do that, but it was a business requirement set by the managers of the specific business units, so we simply had to take it as-is and move on.  One of the big advancements in Lync 2013 was that the Persistent Chat client was built into the normal Lync client application and did not require deploying a separate application like Group Chat 2010.  Given the desire to upgrade to the newer client and newer back-end, a few options are available, all with caveats and issues to consider.  So, without further ado, the options:

Option 1 – Keep Using the Group Chat 2010 Client

Given that the Group Chat 2010 client is supported against Persistent Chat, it does provide a method to restrict IM while still allowing the use of P-Chat.  It’s not exactly an elegant solution, however, especially if you want to take advantage of the built-in P-Chat functionality within the Lync 2013 (or Skype 2015/2016) client.  You’re now maintaining two different sets of applications and having to ensure they only get installed on systems that require them.  Not ideal and certainly not the easiest of solutions.

Option 2 – Deploy Client-Side Registry Keys to Disable IM

This one comes from way back in the OCS days, where you could utilize registry keys to manage client modalities and functionality.  Even with the newest clients, there are still registry keys that take precedence over what a client receives through in-band policy configuration.

reg add HKLM\Software\Policies\Microsoft\Office\15.0\Lync /v DisableIM /t REG_DWORD /d 1 /f

reg add HKLM\Software\Policies\Microsoft\Office\16.0\Lync /v DisableIM /t REG_DWORD /d 1 /f

reg add HKCU\Software\Policies\Microsoft\Office\15.0\Lync /v DisableIM /t REG_DWORD /d 1 /f

reg add HKCU\Software\Policies\Microsoft\Office\16.0\Lync /v DisableIM /t REG_DWORD /d 1 /f
  • Using Office 2013?  Make sure you’re using the ‘15.0’ key above.
  • Using Office 2016?  Make sure you’re using the ‘16.0’ key above.
  • Disabling IM for every user on a machine?  Make sure you use one of the ‘HKLM’ keys above.
  • Disabling IM for a specific user on a machine?  Make sure you use one of the ‘HKCU’ keys above.

You can add the registry key to your standard Windows Image, add it via Group Policy Preferences or add it to a batch file for usage by SCCM or login scripts.  However you accomplish it, it should look something like this when you’re done:

Skype2016-ClientRegistry-NoIM

Option 3 – Use Ethical Walling Software

This would be the equivalent of Hub Transport rules in the Exchange Server world, but in Lync Server these are applications that use MSPL and/or UCMA to examine and intercept SIP traffic.  There are several third-party solutions that could be utilized for ethical walls:

  1. Ethical Wall for Lync (Microsoft)
  2. Vantage (Actiance)
  3. Ethical Wall (MultiEx)
  4. Ethical Wall (SkypeShield)

None of these are free, however, and given the desire to remain low-cost by this client, any third-party solutions were simply not an option.

The Result?

As a result of the deliberations, Option #2 above was chosen and implemented, and the registry key was pushed out to the enterprise.  Once the registry key is in place, restart the Skype4B (or Lync 2013) application and you’ll notice that IM capabilities are no longer available…

Skype2016-ClientRegistry-NoIMResult

…while the Persistent Chat functionality is available…

Skype2016-PChatAccess-NoIM

I can chat away, all day long, within the confines of Persistent Chat but I have no ability to utilize normal Instant Messaging features within the client.  Problem solved, right?

The Limitations

Given that this is a client-side registry key, it only applies to systems the key is installed on.  Unfortunately, this leaves a large set of places where someone could still use IM:

  1. Systems that don’t have the registry key deployed
  2. Outlook Web App IM
  3. Mobility Clients
  4. Skype Basic Clients (or Lync Basic)
  5. Lync for Mac 2011 clients
  6. Skype for Business for Mac client

Generally speaking, Options 1 & 2 above are valid ways to prevent users in a well-managed desktop environment from utilizing IM.  That being said, there are still numerous ways to potentially circumvent these settings and send IMs.  Most circumventions could be managed by various policy configurations within Lync and/or Exchange, but in my opinion, you are far better off utilizing ethical walling software to limit those interactions and to provide reporting on users who may have breached those policy requirements.