Build 129

September 1, 2017

Features

Website Monitoring

With this release, we are rolling out major changes to the way website monitoring (end-user / real-user monitoring) is done with Instana. Monitored websites were previously just a few of many monitored elements visible on the application map and in the application comparison tables. We think that website monitoring is important enough to deserve its own top-level entry point. This entry point, called Websites, opens a new part of Instana that is focused entirely on the experience of your users.

Website list showing all the monitored websites

Getting started with website monitoring has also been made a lot easier. There is now a wizard which guides you through the first steps. Just drop in the name of your website, place the tracking script on your website and you are good to go!

Website Monitoring Setup 1 Website Monitoring Setup 2

Once a website is monitored, all user activity is visible within a completely redesigned dashboard. This dashboard not only organizes the information into much more comprehensible packages, but also guides you from key performance indicators down to the nitty-gritty details of JavaScript stack traces and resource statistics. These details are grouped within sub pages of the dashboard. Previously, dashboards only ever had a single page. The website dashboard is the first of many to come with an improved structure.

Website summary page. Showing key performance indicators that guide you to sub pages

One of the many things we thought about is data granularity and metric aggregation. As part of this, we removed some previously existing metrics, e.g. the 75th percentile of page load time, and added more helpful replacements, e.g. the 90th percentile of page load time. Next, we looked at the charts and changed them to more readable alternatives. This also includes reduced data granularity to account for the inherent data volatility. But don’t worry, one-second granularity is still available if you need it! Just use the timeline to select a smaller time window and you are back to one-second granularity.
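To illustrate the difference between the two percentiles and the effect of reduced granularity, here is a minimal sketch in Python. It is not Instana's implementation: the sample values, the function names `percentile` and `downsample`, and the choice of simple averaging per bucket are all assumptions made for illustration.

```python
import statistics


def percentile(values, p):
    """Return the p-th percentile (1..99) of values.

    Uses statistics.quantiles with 100 buckets, which yields 99 cut points;
    cut point p-1 is the p-th percentile.
    """
    cut_points = statistics.quantiles(values, n=100, method="inclusive")
    return cut_points[p - 1]


def downsample(samples, bucket_seconds):
    """Average one-second samples into buckets of bucket_seconds each.

    Averaging is an illustrative choice; a real system may aggregate
    differently.
    """
    buckets = []
    for start in range(0, len(samples), bucket_seconds):
        chunk = samples[start:start + bucket_seconds]
        buckets.append(sum(chunk) / len(chunk))
    return buckets


# Hypothetical page load times in milliseconds, with a slow tail.
load_times_ms = [120, 135, 150, 160, 180, 210, 250, 400, 900, 2400]

p75 = percentile(load_times_ms, 75)
p90 = percentile(load_times_ms, 90)
print(f"75th percentile: {p75} ms, 90th percentile: {p90} ms")

# Six one-second samples reduced to two three-second buckets.
print(downsample([1, 2, 3, 4, 5, 6], 3))
```

With a long-tailed distribution like this one, the 90th percentile sits much closer to the slow page loads that users actually notice, which is one reason it can be the more helpful metric.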

The website speed page is all about page load speed

The new website monitoring views contain a lot more improvements and UX treats that would be too much to cover in the release notes. If you haven’t already started with Instana’s website monitoring, now is the perfect time to do so!

Python Tracing

Introducing our new Python monitoring capabilities! We’ve built our Python sensor from the ground up with hands-free tracing and runtime monitoring. Just install the package (pip install instana), set an environment variable and go. No code modifications or redeploys required!

Out of the box, we support a large number of default runtime metrics:

Python Tracing

…and the Python integration is extensive: you can now search for Python applications and traces with trace.type:python or entity.type:python.

Python Tracing 2

And of course, Python supports distributed tracing that will automatically integrate and combine tracing calls with other supported languages. Here is an example of a Go app making HTTP calls to a Django application.

Python Tracing 3

With the addition of Python distributed tracing support, Instana’s Service Quality Engine now discovers, maps and monitors Python entities, services and connections and their relationships to other components in your infrastructure.

Following in the footsteps of all our other supported languages, the Python sensor also supports OpenTracing and is open source on GitHub. Be sure to also check out our recent blog post for the official introduction.

Offline Alerting

It’s common for teams to want to be informed if some entity goes offline for some reason. We’ve always tracked online and offline events, but now this information can be used to define custom issues based on offline events. To access this new feature, open the custom issue configuration dialog and select the new “Offline” rule.

If any entity matching the applied filter query goes offline, this custom issue will fire (creating a warning event, a critical event, or an incident based on your configuration).

Offline Alerting
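Conceptually, the “Offline” rule can be thought of as a filter followed by an online check. The sketch below is only an illustration of that idea; the `Entity` class, the `offline_events` function, and the representation of the filter as a plain entity type are hypothetical and do not reflect Instana's filter query language or internal API.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    entity_type: str
    online: bool


def offline_events(entities, entity_type, severity="warning"):
    """Return one event per entity of the given type that is offline.

    severity mirrors the configurable outcome: "warning", "critical"
    or "incident".
    """
    return [
        {"entity": entity.name, "severity": severity}
        for entity in entities
        if entity.entity_type == entity_type and not entity.online
    ]


# Hypothetical inventory: only the offline Python entity matches the rule.
inventory = [
    Entity("checkout", "python", online=True),
    Entity("billing", "python", online=False),
    Entity("frontend", "nodejs", online=False),
]

events = offline_events(inventory, entity_type="python", severity="critical")
print(events)
```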

Service Mapper

Services form the foundation for useful monitoring, metrics, alerting and breakdowns within Instana. This is why we are improving the service mapper with the following new features:

  • Service extraction based on Docker labels and label value matches, e.g. use a substring of the label instead of the whole label.
  • Service extraction based on host tags. This is a flexible way to reuse existing host tags for service extraction. It can be used to create separate services for production and QA/test as well as to group services by region.
  • Fallback extraction rules under the general section, which will be used when no HTTP, batch, EJB or other technology-specific rule can be applied. Grouping by region and by standardized Docker labels becomes a lot easier thanks to this feature.

The following image shows a general service extraction rule which creates services based on the host tag zone and the Docker label component.

Service mapper employing host tag zone and Docker label component
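The idea behind such a general rule can be sketched in a few lines of Python. This is an illustration only and not how the service mapper is configured or implemented: the function `extract_service`, the modelling of host tags and Docker labels as dictionaries, the default pattern, and the resulting `zone-component` naming scheme are all assumptions.

```python
import re


def extract_service(host_tags, docker_labels, label_pattern=r"^(?P<name>[a-z]+)"):
    """Derive a service name from the host tag 'zone' and the Docker label 'component'.

    label_pattern selects a substring of the label value instead of the
    whole value. Returns None when the rule cannot be applied.
    """
    zone = host_tags.get("zone")
    component = docker_labels.get("component")
    if zone is None or component is None:
        return None
    match = re.search(label_pattern, component)
    if match is None:
        return None
    return f"{zone}-{match.group('name')}"


# Hypothetical container: only the leading word of the label is used.
service = extract_service({"zone": "eu"}, {"component": "checkout-v2"})
print(service)
```

Because the rule only fires when both the tag and the label are present, it behaves like a fallback: containers without the expected metadata simply yield no service from this rule.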

Agent Health Monitoring

The only constant in modern environments is change. Hosts come and go as they are needed, and applications appear and vanish frequently. A modern monitoring tool should be able to tell its operators which agents are reporting at a particular point in time and, for those that are not, when they last reported. With our new agent health monitoring view, you can now do just that.

There are several use cases which can be handled:

  • You want to see all agents which reported yesterday, but not at a specific time today: The timeline can now be configured to show yesterday’s time range (e.g. 5am - 8pm on August 3). The selected moment can be set to today (e.g. 5am on August 4). Instana will then list all agents which reported on August 3 but are not running at 5am on August 4.

All agents are listed between the configured time range

  • You want to see all currently running agents: The timeline can be set to live and the view will list all agents in the given time window, as well as indicate which ones are reporting at the moment.

The tooltip gives further information about the concrete period of reporting

The agent health monitoring table contains the following information:

  • Agent entity link (links to the agent management dashboard)
  • Host entity link (links to the host the agent was installed on)
  • Agent boot version
  • Mode (Infrastructure-only or Full APM)
  • JVM name and version
  • Reporting status: whether the agent is running at the selected moment
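The first use case above (agents that reported during a time range but are not running at the selected moment) boils down to an interval check. The following Python sketch shows the logic under stated assumptions: the agent names, the reporting intervals, the year, and the function names are hypothetical, and this is not how Instana stores or queries agent data.

```python
from datetime import datetime


def reported_in_range(intervals, start, end):
    """True if any reporting interval overlaps the time range [start, end]."""
    return any(begin < end and finish > start for begin, finish in intervals)


def running_at(intervals, moment):
    """True if the agent was reporting at the given moment."""
    return any(begin <= moment <= finish for begin, finish in intervals)


def stopped_agents(agents, start, end, moment):
    """Agents that reported within [start, end] but are not running at moment."""
    return sorted(
        name
        for name, intervals in agents.items()
        if reported_in_range(intervals, start, end)
        and not running_at(intervals, moment)
    )


# Hypothetical reporting intervals per agent (the year is an assumption).
agents = {
    "agent-a": [(datetime(2017, 8, 3, 0), datetime(2017, 8, 4, 12))],
    "agent-b": [(datetime(2017, 8, 3, 6), datetime(2017, 8, 3, 18))],
    "agent-c": [(datetime(2017, 8, 4, 1), datetime(2017, 8, 4, 9))],
}

result = stopped_agents(
    agents,
    start=datetime(2017, 8, 3, 5),    # 5am on August 3
    end=datetime(2017, 8, 3, 20),     # 8pm on August 3
    moment=datetime(2017, 8, 4, 5),   # selected moment: 5am on August 4
)
print(result)
```

Here only the agent that reported on August 3 and stopped before the selected moment is listed; the agent that is still running and the one that started after the time range are both excluded.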

Note: users can only access the agent health monitoring view if their role has the canConfigureAgents permission set. The property is set for owners by default.

Newly Supported Technologies

  • Java Tracing now supports Vert.x Cluster, Amazon SQS, FaunaDB and Quartz
  • Java Tracing now supports Java 9 modular applications. This feature is in beta; please let us know if it does not fully work in your configuration.
  • PHP now supports capturing custom HTTP headers and user-provided error logs
  • .NET Tracing now supports ASP.NET Core (on classic CLR, Windows only), Microsoft Message Queuing (MSMQ) and the Enyim.Caching library for Memcached in versions 2.11 and 2.16
  • Ruby now supports Redis and Sidekiq

Improvements

  • The general Instana settings are now split into “User Settings” and “Team Settings”. The user will land on the user interface settings by default.
  • Showcases are now removed from the general drop down menu. The graph showcase can now be accessed from the “About Instana” dialog.
  • The X and Y axes within charts got their first overhaul. The minimum and maximum values are now always visible on the Y axis. The X axis now always shows the date in addition to the time. Also, no more label overlaps!
  • Spark charts within Instana have always been small variants of their larger brothers. They suffered from too many data points and non-functional tooltips. These issues have been resolved.
  • For services, we now show the error rate within the calls vs. latency charts.
  • Cassandra Sensor now collects unreachable nodes and alerts on them
  • Log messages using a warning level are no longer counted as errors
  • Naming of Glassfish EJBs is now much better
  • Dropwizard Sensor now has more accurate 1s resolution metrics.
  • The agent now has an option to use the unique ID of an instance running in a cloud, rather than the MAC address of a public interface, as its unique ID. This accommodates the MAC address reuse done, for example, by Google Compute.
  • .NET Tracing spends less time in instrumenting code at startup-time
  • When instrumenting .NET assemblies, the profiler now uses its own map of Type-Defs per assembly for resolving Type-Refs, which is more reliable and results in better instrumentation of complex inheritance chains
  • .NET Tracing does not require the addition of Instana.ManagedTracing.Modules in your web.config anymore. You can safely remove it.
  • Ruby: Track and occasionally check background thread health #91

Fixes

  • A bug that prevented the 3D maps from rendering was fixed. This problem occurred with some combinations of Chrome on Linux, and sometimes with Mesa graphics drivers.
  • Httpd Sensor was alternating between two error messages when status URLs could not be reached, resulting in noise. This has been fixed.
  • A bug in the docker sensor could cause the agent to die when it encountered a zombie container. We gave it silver bullets so it survives now.
  • Submitting a single trace to the trace API no longer produces an error.
  • A bug in the .NET native profiler was fixed which could crash application pools when instrumenting methods with a specific signature
  • A bug in the .NET native profiler was fixed which led to the profiler instrumenting only one overload of a method instead of all configured overloads
  • A problem in .NET tracing has been solved which occurred when instrumenting classes in assemblies that have been loaded into the shared AppDomain and resulted in an exception related to differing permission-sets in different AppDomains
  • A bug has been fixed in the managed instrumentation for ASP.NET MVC controllers which resulted in wrong timing for async controller methods and possibly blocked threads
  • Go: Fix: Detect error logging and properly mark span as errored #38

Known issues

When running the agent on z/OS USS, many infrastructure metrics such as memory and CPU usage are unavailable. However, monitoring of databases, web servers, and runtimes like Java works normally.