Monitoring Microsoft Teams: Two Approaches
By Rob Doucette, Martello Technologies
The need for visibility into Microsoft Teams service availability and delivery quality has driven growing interest in monitoring Microsoft 365 and Teams services from the user perspective.
The IT team is generally responsible for ensuring users can access Microsoft 365 services. When there’s an issue with a user connecting to, say, Microsoft Teams, IT has very little intel to go on to determine the root cause of the issue – let alone steps to remediate the problem.
Microsoft offers service status detail and can provide some degree of visibility with native tools such as the Call Quality Dashboard (CQD), but deeper insight and alerting are needed for IT teams to respond quickly and accurately when a service quality issue arises.
Remote and hybrid workers add a layer of complexity to determining whether a Microsoft 365 service delivery issue is a Microsoft problem or just a slow home or remote Wi-Fi connection.
Visibility Starts With The User
IT is on the outside looking in, with the user on one end of the spectrum saying there’s a problem, and the Microsoft datacenter on the other. When you consider a Microsoft Teams call, multiple points along the call’s journey can affect the user experience. The problem is usually not Microsoft, but an issue at one of these points, such as the ISP or a network device.
One perspective that can provide IT with far more visibility into the source of a service delivery issue is the user. Because the user interacts with the client application, the network, authentication, and a variety of Microsoft 365 services, there’s an opportunity to gain actionable insight into what is and what isn’t working – and what to do about it.
So, if IT can somehow look at the user’s experience with all the components involved in delivering Microsoft 365 and Teams, it is possible to regain much of that lost visibility.
According to Nick Cavalancia, Microsoft Cloud & Datacenter MVP, “Two approaches have evolved over the last few years to provide organizations with visibility from the user’s experience with Microsoft 365 services. Real User Monitoring (RUM) provides insight into how an individual user interacts with a Microsoft 365 service via an agent on the user’s endpoint. This method focuses on the user’s interaction with specific services and the quality of service provided therein within the Microsoft 365 cloud. Synthetic Transaction monitoring (ST) uses robot agents installed on separate systems (per location, connection, network, or geography) to simulate Microsoft 365 user activity, continually testing workloads to help identify drops in service quality, providing detail on scope, location, and service impact.”
These two approaches, while both looking at service delivery from the user’s perspective, are distinct technologies that provide different visibility and value to an organization.
Remote Workforce Options
Because so many organizations have a material percentage of their employees working remotely, there’s a specific need to ensure these users – who rely heavily on Microsoft 365 and Teams as their virtual workspace – are productive and are having a quality experience. Both service quality monitoring approaches have a role here but offer differing value.
RUM helps the organization determine at a high level whether the individual is having a slower experience but provides no detail on why. This is because RUM isn’t aware of how the user is routing to the Microsoft 365 cloud. Are they using a VPN and going through the corporate network? Does the corporate network have traffic scanning solutions that create latency? And is the corporate network infrastructure having any issues itself?
Synthetic transactions can be implemented to provide the missing detail around the user experience for those connecting through the corporate network, and they do so proactively even when no users are connecting to Microsoft 365. However, they can only indicate an issue for a particular user by inference, and only if the user in question is taking the monitored path. Because it continuously simulates user actions, synthetic transaction monitoring can also give the IT team advance warning before a problem affects an actual user, allowing IT to be more proactive.
In short, the synthetic transaction approach will provide more detail and an advance warning of problems, but RUM still provides value by keeping IT aware of the individual user’s current state.
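To make the idea of a synthetic transaction concrete, here is a minimal sketch in Python. It is illustrative only and does not represent any vendor’s product: the names `ProbeResult`, `run_probe`, and `should_alert` are hypothetical, and the `action` callable stands in for whatever simulated user step (a sign-in, a file upload, a test call) a real robot agent would perform.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProbeResult:
    location: str      # where the robot agent runs (hypothetical label)
    workload: str      # which simulated user action was tested
    ok: bool           # True if the action completed without raising
    latency_ms: float  # how long the action took


def run_probe(location: str, workload: str, action: Callable[[], None],
              clock: Callable[[], float] = time.perf_counter) -> ProbeResult:
    """Run one simulated user action and record success and latency."""
    start = clock()
    try:
        action()
        ok = True
    except Exception:
        # Any exception from the simulated action counts as a failed probe.
        ok = False
    latency_ms = (clock() - start) * 1000.0
    return ProbeResult(location, workload, ok, latency_ms)


def should_alert(results: List[ProbeResult], max_latency_ms: float,
                 consecutive: int = 3) -> bool:
    """Alert only when the last `consecutive` probes all failed or were slow."""
    if len(results) < consecutive:
        return False
    recent = results[-consecutive:]
    return all((not r.ok) or r.latency_ms > max_latency_ms for r in recent)
```

Requiring several consecutive bad probes before alerting is one simple way to avoid paging IT over a single transient blip; the threshold and count here are assumptions to tune per environment.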
Both methods provide more visibility than native Microsoft services alone. So, is it a case of one or the other? In short, no – both methods provide IT with valuable visibility and detail around whether users are having issues with Microsoft 365 or not. The choice of which to use comes down to what your objectives are. If it’s about monitoring a specific individual where service and function granularity isn’t needed, RUM may be a better choice. But if it’s more about the organization as a whole and/or needing visibility into specific Microsoft 365 functions with some degree of proactive detection before the user experiences a problem, synthetic transactions are the right choice.
It also depends on the Microsoft service. Take the case of Teams, which technically utilizes a number of additional Microsoft services, including OneDrive for Business and SharePoint, to present itself as Teams. RUM (using either approach mentioned previously) would not provide any insight into which back-end services are having issues. An organization relying on RUM will ideally also need synthetic transaction data to provide context and color to highlight what parts of Teams are having issues.
The Microsoft Teams example makes the case that using both approaches together provides even greater visibility. Assuming solution integrations exist where data can be shared, the combination would give organizations complete visibility into whether the organization, down to the specific user, is experiencing service quality issues. It would also allow them to correlate the two sets of data to quickly determine what’s not working and who specifically is impacted.
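As a rough illustration of what correlating the two data sets could look like, the Python sketch below joins per-user RUM reports with per-location synthetic results. The record types, field names, and the `correlate` function are all hypothetical, invented for this example, and it assumes both data sources tag their records with a shared location label.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RumReport:
    user: str
    location: str   # network path or site the user connects through
    degraded: bool  # RUM agent reports a poor experience


@dataclass(frozen=True)
class SyntheticResult:
    location: str
    workload: str   # e.g. a back-end service that Teams depends on
    healthy: bool


def correlate(rum: List[RumReport],
              synthetic: List[SyntheticResult]) -> Dict[str, List[str]]:
    """Map each degraded user to the failing workloads seen at their location."""
    failing_by_location: Dict[str, List[str]] = {}
    for result in synthetic:
        if not result.healthy:
            failing_by_location.setdefault(result.location, []).append(result.workload)

    impacted: Dict[str, List[str]] = {}
    for report in rum:
        if report.degraded:
            # An empty list means no monitored workload is failing on this path.
            impacted[report.user] = sorted(failing_by_location.get(report.location, []))
    return impacted
```

A degraded user with an empty list may point to something outside the monitored path, such as a home network, while a non-empty list suggests which part of the service to investigate first.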
About The Author
Rob Doucette is Vice President, Product Management for Martello Technologies. Rob has more than 15 years of experience building market-driven solutions with a focus on real-time diagnostics, monitoring, and analytics. He has assembled multiple software development teams from scratch, managed executive-level partnerships and relationships, and helped to secure funding from Microsoft. Prior to joining Martello, Rob was the CTO of Savision, now a subsidiary of Martello, where he led software development teams and was responsible for building industry-leading products.