Lync Top Support Topics and Troubleshooting Tools

Lync Performance Monitoring
Centralized Logging Service (CLS)
Lync Media on Wi-Fi
Lync Call Generators
Troubleshooting Tools
Once deployment is complete and the SMEs are gone, the customer is left to support an ever-changing software update cycle.
Customers are often not equipped to manage the complexity of Lync. Many proactive steps can prevent the most common scenarios that generate support calls and can leave Lync functionality crippled.
We will discuss the most common issues and how to prevent them altogether.
We will also discuss troubleshooting tools.
Lync Performance Monitoring
System Center Operations Manager (SCOM): An alerting system that provides data on server status
Performance Counters: Feed into SCOM and support general server performance monitoring. Include active connections, message processing, failures raised by the server, and latency
Event Logs: Report to SCOM on server configuration state, security policy updates, and service availability
Synthetic Transactions: Automated tests that detect outages in service features (e.g., Instant Messaging [IM], registration, presence)
Call Detail Records (CDR): CDR provides telemetry on usage patterns (e.g., call volume), call establishment (e.g., conference join)
QoE Metrics: Media, network, endpoint and connection metrics
collected on endpoint
User Facing Diagnostics (UFD): Actionable notifications displayed to the user
Network Bars: Indicator providing users with information when
network performance is causing media quality issues
[Diagram: SQL Database; Front End Server; Lync Storage Service; Data Collection; Queue DB; Unified Contacts; Archival Processing (IM, WebConf); Monitoring Processing; Replication]
In Lync 2013, improved video metrics are aligned to the new video feature set.
Reports will have both audio and video media performance analysis.
The new QoE will enable administrators to better identify problems with both audio and video.
QoE provides information on:
  Network performance and problems
  Audio performance issues
  Video usage and performance issues
QoE data assists in:
  Network planning (e.g., wired and wireless access requirements)
  Server and general infrastructure procurement decisions
Centralized Logging Service
Get-CsClsScenario -Identity "global/<ScenarioName>" |
    Select-Object -ExpandProperty Provider |
    Format-Table Name, Level, Flags -AutoSize
COMMAND     Description
start       Starts a trace session for the given scenario. Mandatory option: scenario. Other valid option: duration
stop        Stops the trace session for the given scenario. Mandatory and only valid option: scenario
query       Queries the list of scenarios being traced. Valid options: none
flush       Flushes logs and makes them available for searching immediately. Valid options: none
update      Updates the duration for which the active (non-default) scenario is traced. Mandatory and only valid option: duration
search      Searches logs. Results are returned in a text file. Valid options: starttime, endtime, components, uri, callid, phone, ip, loglevel, matchany, matchall, keepcache, correlationids
(help)      Displays command-line usage along with scenario names

OPTION      Description
scenario    Scenario name (valid scenario names were given earlier)
duration    Duration (in minutes) to trace the given scenario for. Default duration: 24 hours
matchall    Requires the search to match all criteria specified
matchany    Requires the search to match any criteria specified. This is the default.
starttime   (timestamp) Timestamp to search the log entries from
endtime     (timestamp) Timestamp to search the log entries to
loglevel    (fatal | error | warn | info | verbose | noise) The least severe log level to search on. For example, if 'warn' is specified, the search is limited to 'warn', 'error', and 'fatal'
components  Comma-separated list of component names to restrict the search scope
phone       Phone number scope for the search command. Must be an exact match
uri         URI scope for the search command. Must be an exact match
callid      Call ID scope for the search command. Must be an exact match
ip          IP address scope for the search command. Must be an exact match
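The log level rule described above (specifying 'warn' limits the search to 'warn', 'error', and 'fatal') is a threshold, not an exact match. The following is a minimal PowerShell sketch of that rule, assuming the six level names and their severity order are exactly as listed above. The function name Get-ClsLevelsIncluded is hypothetical and is not part of Lync.

```powershell
function Get-ClsLevelsIncluded {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateSet('fatal', 'error', 'warn', 'info', 'verbose', 'noise')]
        [string]$LeastSevere
    )

    # Levels ordered from most to least severe, as listed for the log level option.
    $levels = 'fatal', 'error', 'warn', 'info', 'verbose', 'noise'

    # ValidateSet is case-insensitive, so normalize before looking up the position.
    $index = [array]::IndexOf($levels, $LeastSevere.ToLowerInvariant())

    # Return the requested level plus every more severe level.
    $levels[0..$index]
}

# Hypothetical usage: which levels does a search with log level 'warn' cover?
Get-ClsLevelsIncluded -LeastSevere warn
```

With 'warn' the function returns fatal, error, and warn, which matches the example given for the log level option.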
Lync Media on Wi-Fi
Lync Call Generators
and authentication
Public Key Infrastructure (PKI) / TLS Certificates
Signaling and media
High availability / disaster recovery (HA / DR)
Lync address book
and Authentication
Lync clients have different requirements because they are limited by the platform capabilities.
Changes from the legacy client platform have necessitated a “fallback” approach to client DNS lookup.
Secure connectivity is required for passing authentication.
Certificate-based authentication requires obtaining a certificate via the web services.
Seldom will you see two deployments with identical network/infrastructure requirements.
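The “fallback” DNS lookup mentioned above can be pictured as an ordered list of names that the client tries until one resolves. The sketch below is an assumption and is not taken from this deck: it lists the order commonly documented for the Lync 2013 desktop client, and the function name and the contoso.com domain are hypothetical. Other clients (for example, mobile) use a different subset, so verify the order against your client version.

```powershell
function Get-LyncDiscoveryCandidate {
    param(
        [Parameter(Mandatory = $true)]
        [string]$SipDomain
    )

    # Assumed lookup order for the Lync 2013 desktop client (verify for your build).
    # Each entry is: record type, name prefix. The SIP domain is appended below.
    # The lyncdiscover names are often published as CNAME records instead of A records.
    $order = @(
        @('A',   'lyncdiscoverinternal'),
        @('A',   'lyncdiscover'),
        @('SRV', '_sipinternaltls._tcp'),
        @('SRV', '_sip._tls'),
        @('A',   'sipinternal'),
        @('A',   'sip'),
        @('A',   'sipexternal')
    )

    $position = 0
    foreach ($entry in $order) {
        $position++
        [pscustomobject]@{
            Order = $position
            Type  = $entry[0]
            Name  = '{0}.{1}' -f $entry[1], $SipDomain
        }
    }
}

# Hypothetical usage: list the names to check in DNS for one SIP domain.
Get-LyncDiscoveryCandidate -SipDomain 'contoso.com' | Format-Table -AutoSize
```

A list like this is useful as a checklist when sign-in fails: each name can be queried in order to see where the client would land.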
DNS Complexity
Network Infrastructure
Securing External Access
Public Key Infrastructure (PKI) / TLS Certificates
PKI is everywhere in the product.
Use certificates correctly: internal certificates for internal roles, and public certificates from well-known CAs for external users, PIC, federation, Office 365, mobility, and reverse proxy.
Certificates used for A/V encryption and authentication are non-public.
Internal namespaces on external-facing certificates are increasingly under scrutiny because of new PKI standards.
OAuth is a new way to simplify server-to-server communication between roles; it prevents trust issues between Lync and other trusted roles.
All connections in Lync use TLS or MTLS, with the exception of A/V.
Avoid wildcards in certificate names.
Wildcards are supported as a Subject Alternative Name (SAN) on Web Services (reverse proxy).
Many public CAs won’t allow a direct import of a certificate request; names are often added or certificates recycled from other modalities because of cost.
Only external services need public CA-issued certificates.
No internal namespace on public certificates.
DNS must succeed for proper trust. Edge DNS pointers to internal split domain
Scaled Edge servers share identical certificates (private
Transport Layer Security (TLS) is used not only to secure traffic but also to establish a trusted relationship between SIP proxies.
Secure Real-time Transport Protocol over User Datagram Protocol (SRTP-UDP) cannot use TLS with certificates. However, it still encrypts the packet payload.
OAuth provides a framework for authorizing components to interoperate and reduces trust model management through certificate replication.
Use wizards for certificate requests
Primary SIP domain = public namespace
No wildcard certificates
Use internal CAs for internal roles and access points
Avoid all-in-one certificates
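Two of the recommendations above (no wildcard certificates, no internal namespace on public certificates) can be checked mechanically against the list of names planned for a public certificate request. The following is a minimal sketch under stated assumptions: the function name Test-PublicCertName is hypothetical, the default list of internal suffixes is only an example, and the host names are placeholders. A flagged wildcard is a prompt to review, since a wildcard SAN can still be valid for Web Services on the reverse proxy.

```powershell
function Test-PublicCertName {
    param(
        [Parameter(Mandatory = $true)]
        [string[]]$Name,

        # Assumed examples of internal-only suffixes; replace with your own namespaces.
        [string[]]$InternalSuffix = @('.local', '.internal', '.lan')
    )

    foreach ($n in $Name) {
        $issues = @()

        # Flag wildcard entries such as *.contoso.com for review.
        if ($n.Contains('*')) { $issues += 'wildcard' }

        # Flag names that end in an internal-only suffix.
        foreach ($suffix in $InternalSuffix) {
            if ($n.EndsWith($suffix, [System.StringComparison]::OrdinalIgnoreCase)) {
                $issues += 'internal namespace'
                break
            }
        }

        [pscustomobject]@{
            Name   = $n
            Issues = ($issues -join ', ')
        }
    }
}

# Hypothetical usage with placeholder names.
Test-PublicCertName -Name 'sip.contoso.com', '*.contoso.com', 'fe01.contoso.local'
```

Names with an empty Issues value pass both checks; the function only inspects strings and does not contact a CA or read an existing certificate.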
Signaling and Media
Media Relay Authentication Service (MRAS), Interactive Connectivity Establishment (ICE), Session Description Protocol (SDP) candidates
Edge server as a functional firewall device
Media bypass, hairpinning, mediation
Bandwidth management / Call Admission Control (CAC) / Quality of Service (QoS)
Monitoring / Quality of Experience (QoE)
External registrar SIP proxy users and federation
External conference proxy (SIP signaling still traverses Access)
All audio, video, and media sharing use Real-time Transport Protocol (RTP)
Uses ICE (Session Traversal Utilities for Network Address Translation [STUN] / Traversal Using Relay NAT [TURN]), secured using MRAS (not TLS)
No user services (that’s the reverse proxy role)
HTTPS connection for mobility clients, ABS, Meeting Lobby, etc.
Media Relay Authentication Service (MRAS) - (5062) Internal via SIP
Allocate (3478) and ‘Are you there’ ping to ensure connectivity
Open ports on NAT host | Reflexive | Relay
Deep packet inspection – XOR
UDP and TCP open port ranges are largely overrated as a security
DNS Load Balancing vs. Hardware Load Balancers
TLS everywhere but media exchange.
Internal / external namespace depends on DNS pointing the right direction.
No logical sub-netting to prevent physical isolation.
Routing to Internet and internal networks should never overlap and will require manual management of the networks in most cases.
High Availability and Disaster Recovery (HA/DR)
Don’t confuse High Availability and Disaster Recovery
No limited functionality
Pool pairing
RPO/RTO: Recovery point objective / Recovery time objective
Windows Server 2012 with Lync 2013: known issues with Windows Fabric
All servers hung in “starting” state
Reset-CsPoolRegistrarState -ResetType QuorumLossRecovery -PoolFqdn <FQDN>
Reset-CsPoolRegistrarState -ResetType FullReset -PoolFqdn <FQDN>
Lync Address Book
Changes in Active Directory Properties
Pushed to the Lync Back End servers every 60 seconds
Default Setting for Address Book Service =
Get-CsClientPolicy … -AddressBookAvailability
FileDownload in Lync has all the same caveats as in R2: delayed updates, differential files, 24-hour updates, and so on.
Personal Information Manager (PIM)
Relies on Exchange Web Services (EWS) to obtain Outlook contacts and to synchronize Outlook calendar entries with presence state in the database; this is a client-side process.
Unified contact store (UCS)
Introduces a host of potential caveats with contact loss, but relies on an FE process to proxy contact storage to the user's mailbox. This is not PIM, but it gets access to Exchange using the same process.
Subscribe to presence
HA/DR real-time presence across all Front End servers and backup registrars
Privacy relationship
Trust with Office 365
Troubleshooting Tools
