Website Monitoring FAQ

Terminology

What is the difference between page views, loads and transitions?

Single-page applications and multi-page applications (or “classical websites”) work quite differently from a technical perspective. This difference is essential to judge performance and thus, user experience. For this reason, we are differentiating between page views, page loads and page transitions. With the help of these terms, we can clearly communicate how a website is used.

  • Page Load: The retrieval of the initial HTML document and all subsequent actions until the next navigation in the browser, i.e. a navigation that requires a new HTML document to be loaded. Typically, the website content is rendered as HTML on the server and then delivered to the user. A multi-page application (or “classical website”) consists almost exclusively of page loads. An example of a website implementing this architecture is Wikipedia.

  • Page Transition: Websites may change the content that users are looking at via JavaScript. In contrast to page loads, this doesn’t make use of classical browser navigation. Typically, new website content is loaded via JavaScript and then turned into HTML. This HTML is then placed into the document to enrich or replace the existing content. Single-page applications make heavy use of this technique, and they commonly have far more page transitions than page loads. An example of a website implementing this architecture is Instana’s product.

    A page transition is triggered when the page name is changed via the API.

  • Page View: To judge general activity on a website no matter what architecture is implemented, the term page view is introduced. Page views are the sum of the page loads and page transitions.

What is a beacon?

The JavaScript agent transmits small monitoring payloads to the Instana servers that model specific events occurring within the lifecycle of a page view of a website, e.g. page loading, resource retrieval and HTTP requests. The term beacon comes from the W3C Beacon specification.

What is an origin?

Origins are a central part of the web and especially of web security mechanisms, e.g. cross-origin resource sharing (CORS). The combination of the scheme (also called a protocol), hostname, and port forms an origin. Let’s take the example of the following URL:

https://shop.example.com/articles/hoverboard/ratings

This URL has the origin https://shop.example.com.

  • Scheme: https
  • Hostname: shop.example.com
  • Port: 443 (based on the default for the HTTPS protocol)

Two origins are the same when all three parts, i.e. scheme, hostname and port, are equal. This means the following four examples are all different origins.

  • https://shop.example.com
  • http://shop.example.com
  • https://example.com
  • http://example.com

The terms same-origin and cross-origin are frequently used when discussing origins, most prominently in the same-origin policy and the cross-origin resource sharing web security mechanisms. When the source origin (the origin of the HTML document) and the target origin (e.g. the origin of an API server) are different, a call is considered cross-origin. Otherwise, it is same-origin.
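The comparison described above can be sketched with the standard URL API, which is available in browsers and Node.js. The helper names originOf and isCrossOrigin, as well as the API URLs, are made up for this illustration.

```javascript
// Return the origin (scheme, hostname and port) of a URL.
// Default ports such as 443 for HTTPS are omitted from the result.
function originOf(url) {
  return new URL(url).origin;
}

// A call is cross-origin when the origin of the HTML document
// differs from the origin of the target.
function isCrossOrigin(documentUrl, targetUrl) {
  return originOf(documentUrl) !== originOf(targetUrl);
}

const page = "https://shop.example.com/articles/hoverboard/ratings";
console.log(originOf(page)); // https://shop.example.com
// Same origin: 443 is the default port for HTTPS.
console.log(isCrossOrigin(page, "https://shop.example.com:443/api/cart")); // false
// Different hostname, therefore cross-origin.
console.log(isCrossOrigin(page, "https://api.example.com/cart")); // true
```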

The concept of origins should not be confused with the concept of sites, which is similar but notably different in implementation and intention.

Metrics

Why are the recorded timings for my HTTP calls so unusually huge?

Our website monitoring records the time between the start of the request, e.g. XMLHttpRequest#send, and the success event, e.g. XMLHttpRequest#onreadystatechange with readyState === 4. The time between these two events can vary greatly and is affected by…

  • HTTP redirects
  • DNS lookup time
  • Time to establish TCP / TLS connections
  • Server response times
  • Request queueing times
  • Latency and throughput
  • Request and response sizes
  • Page throttling

While most of these items are well understood, the last one is often surprising. Browsers may decide to throttle down or even stop processing for web pages that aren’t visible. Page throttling is most likely the reason for huge timings. Refer to Google’s blog post about this subject or the respective Mozilla Developer Network Page Visibility API page to learn more.

What do the website metrics mean?

Most of the website metrics are derived from various W3C specifications.

The terminology used within these documents is adhered to as much as possible. Additionally, we are using the following metrics:

  • onLoad Time: This timing exists for each page load and models the time until a navigation is complete, i.e. the loading spinner has stopped. It is defined as loadEventEnd - fetchStart (see navigation timing). Within the user interface we always make the distinction between onLoad Time and onLoad Event Time to be explicit about the differences in terminology between the Instana user interface and the navigation timing specification.
  • DOM: A variation of the timing defined in navigation timing. This metric is defined as domContentLoadedEventStart - domLoading (see navigation timing). It is considered a more useful timing breakdown than the navigation timing’s processing time. See the picture below to understand DOM time.
  • Children: A variation of the timing defined in navigation timing. This metric is defined as loadEventEnd - domContentLoadedEventStart (see navigation timing). It is considered a more useful timing breakdown than navigation timing’s onLoad times. Also see the picture below to understand Children time.

Navigation Timing Variation
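As a rough illustration of the three definitions above, the following sketch derives the timings from an object that carries navigation timing timestamps. The function name deriveTimings and the sample values are made up. In a browser, the input could be the performance.timing object, which exposes these fields.

```javascript
// Derive onLoad Time, DOM and Children from navigation timing timestamps.
// All inputs are timestamps in milliseconds.
function deriveTimings(t) {
  return {
    onLoadTime: t.loadEventEnd - t.fetchStart,
    dom: t.domContentLoadedEventStart - t.domLoading,
    children: t.loadEventEnd - t.domContentLoadedEventStart,
  };
}

// Made-up sample values for illustration only.
const sampleTiming = {
  fetchStart: 1000,
  domLoading: 1200,
  domContentLoadedEventStart: 1900,
  loadEventEnd: 2600,
};

console.log(deriveTimings(sampleTiming));
// { onLoadTime: 1600, dom: 700, children: 700 }
```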

Data collection

How are you gathering this information?

Instana’s website monitoring is based on our open-source library called weasel. Weasel gathers the information from the Browser Navigation Timing API and transmits it in an efficient form. You can inspect its source code on GitHub to get all the information you may need.

Which browsers are supported?

The JavaScript agent supports all browsers. However, some APIs aren’t supported in older browsers. In these cases, the JavaScript agent won’t fail, but certain data points are unavailable, e.g. navigation, resource, and paint timing information could be missing.

How do you handle browsers which do not support the navigation timing API?

For browsers which do not support the navigation timing API, we provide approximate timings. These timings are not reliable, and we advise against relying on approximate page load timings for these traces. When Instana has to resort to approximations for page load times, the values are excluded from statistics. This means that approximations are not part of aggregated page load times like mean, min, and max load times.

What kind of ineum calls can we make?

Details about the global ineum function are available within the website monitoring API documentation.

Why are you blocked by Adblock or similar browser extensions?

While most of those extensions were created to block advertisements, many of them evolved into extensions that prevent website owners from tracking their users. The Instana monitoring script ended up on the block lists of many of those extensions, and there is nothing we can do to have them revert their decision. If you have some control over your users, you can ask them to allow the EUM script in their adblocker.

How does backend correlation work with AJAX calls?

We have a dedicated document that explains all our backend correlation mechanisms.

Which HTTP headers are used?

The JavaScript agent makes use of the following HTTP headers to achieve backend correlation.

  • Request Headers (to the backend):

    • X-INSTANA-T
    • X-INSTANA-S
    • X-INSTANA-L
  • Response Headers (to the frontend):

    • Server-Timing

Why is there no data for GoogleBot and others?

Due to bots commonly manipulating JavaScript APIs in order to return predictable/reproducible results, the Instana JavaScript agent and servers identify and block data collection for a variety of bots. This has a positive impact when scraping the web, but unfortunately causes problems when wanting to monitor end-user experiences. For example the GoogleBot’s JavaScript API…

  • Date.now() does not return the current time in milliseconds, but instead some seemingly fixed timestamps. As a result, durations recorded by diffing two timestamps acquired by Date.now() cannot be trusted.
  • Math.random() and crypto.getRandomValues() return values out of a pool of pre-defined values. This causes problems with client-side generated IDs, e.g. causing wrong backend correlation references.

Which HTTP endpoints are end-users making calls to (for SaaS)?

There is no requirement to configure anything for correct data transmission to Instana’s SaaS platform. The Instana user interface always presents the correct tracking snippet, which includes the necessary URLs.

  • Retrieval of the JavaScript agent (all regions)

    • DNS Name: eum.instana.io
    • Ports: tcp/80 (HTTP) and tcp/443 (HTTPS)
  • Transmission of monitoring data to the European region running inside AWS

    • DNS Name: eum-blue-saas.instana.io
    • Ports: tcp/80 (HTTP) and tcp/443 (HTTPS)
  • Transmission of monitoring data to the European region running inside Google Cloud

    • DNS Name: eum-green-saas.instana.io
    • Ports: tcp/80 (HTTP) and tcp/443 (HTTPS)
  • Transmission of monitoring data to the United States region running inside AWS

    • DNS Name: eum-red-saas.instana.io
    • Ports: tcp/80 (HTTP) and tcp/443 (HTTPS)

Is it possible to proxy the HTTP endpoints (for SaaS)?

We strongly recommend against proxying the HTTP endpoints. We will not provide support for any proxy setups nor for any issues that may arise due to the usage of a proxy. Should you still want (or have) to do this, you may find these pointers helpful:

  • Set proper Host HTTP headers.
  • Respect the difference between the eum.instana.io and eum-{region}.instana.io servers.
  • Make sure that our servers are aware of the end users’ IPs. Send an X-FORWARDED-FOR header to our servers with the end user’s IP. Alternatively, send an X-REALER-IP HTTP header (yes, deliberately not X-REAL-IP) to the Instana servers which contains the end user’s IP.
  • Pass through all the HTTP headers that the Instana servers include in the response.
  • Don’t do any caching in the proxy.

Are you collecting any data from WebSockets?

Instana’s JavaScript agent does not collect any information about WebSockets. WebSockets by themselves have no semantic model that we could effectively monitor aside from messages flowing between client and server. Because WebSocket messages have arbitrary formats (just character sequences or bytes), we cannot deduce any kind of request/response or success/failure state from these.

While it would be theoretically possible to transmit every message transmitted/received via WebSockets to Instana’s servers, this has several problems:

  • Very high probability of collection of sensitive data that should never land in a monitoring system.
  • Large amounts of data collected from every end-user which results in overhead for end-users.
  • Typically a very low signal-to-noise ratio in the collected data.
  • Lack of standardized semantic model in the data means that we cannot optimize access to the data for you.

For these reasons Instana does not automatically collect any data about WebSockets. We do however understand that customers implement request/response and subscription mechanisms on top of WebSockets. In order to gain insights into these we recommend usage of our custom event API. Through the custom event API you can even realize backend correlation for WebSocket based request/response systems.

Sensitive data

Are you collecting data which can uniquely identify users?

By default, the Instana JS agent does not include data which can uniquely identify users. Additionally, the Instana JS agent also does not apply techniques such as device / browser fingerprinting.

User specific data can be made available to Instana via the user API.

What are you doing with the user data transmitted to Instana?

The user API can be configured by customers to transmit user identifying information to Instana. This information is used only to provide the features visible within the product. Instana does not interpret this data in any other way, nor is it correlated across multiple customers.

Is it possible to delete user data after it is transmitted to Instana?

Infrequent deletion requests, e.g. to comply with GDPR, are supported. If you expect frequent or periodic deletion requests, please instead transmit anonymized data to Instana (e.g. hashed user IDs).
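As a minimal sketch of the hashed user ID idea, the following Node.js snippet replaces a user ID with a SHA-256 digest before it is handed to the page. The function name pseudonymizeUserId and the salt are assumptions made for this example and are not part of Instana’s API.

```javascript
const { createHash } = require("crypto");

// Replace a user ID with its salted SHA-256 hex digest.
// The salt is a hypothetical, deployment-specific secret. Without one,
// short or guessable IDs could be recovered by brute force.
function pseudonymizeUserId(userId, salt) {
  return createHash("sha256")
    .update(salt)
    .update(String(userId))
    .digest("hex");
}

// The same input always yields the same 64 character digest, so activity
// of one user can still be grouped without exposing the original ID.
console.log(pseudonymizeUserId("user-4711", "example-salt"));
```

The resulting digest can then be transmitted in place of the real user ID.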

Are you anonymizing IPs?

Yes, IPs are anonymized. The last octet of IPv4 addresses and the last 80 bits of IPv6 addresses are set to zeros.
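The effect of this rule can be illustrated as follows. This is a sketch for plain textual addresses, not Instana’s actual implementation. The function names are made up, and special forms such as IPv4-mapped IPv6 addresses are not handled.

```javascript
// Zero the last octet of a dotted-quad IPv4 address.
function anonymizeIPv4(ip) {
  const octets = ip.split(".");
  octets[3] = "0";
  return octets.join(".");
}

// Expand a "::" shorthand so that the address has eight 16-bit groups.
function expandIPv6(ip) {
  const [head, tail] = ip.split("::");
  const headGroups = head ? head.split(":") : [];
  if (tail === undefined) return headGroups; // no "::" present
  const tailGroups = tail ? tail.split(":") : [];
  const missing = 8 - headGroups.length - tailGroups.length;
  return [...headGroups, ...Array(missing).fill("0"), ...tailGroups];
}

// Keep the first 48 bits (three groups) and zero the last 80 bits (five groups).
// The result is written in uncompressed form.
function anonymizeIPv6(ip) {
  const groups = expandIPv6(ip);
  return [...groups.slice(0, 3), "0", "0", "0", "0", "0"].join(":");
}

console.log(anonymizeIPv4("203.0.113.77")); // 203.0.113.0
console.log(anonymizeIPv6("2001:db8:85a3:8d3:1319:8a2e:370:7348")); // 2001:db8:85a3:0:0:0:0:0
console.log(anonymizeIPv6("2001:db8::1")); // 2001:db8:0:0:0:0:0:0
```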

Impact of web security

We have a Content-Security-Policy in place, is there anything we need to do?

The Instana JS agent is asynchronously loaded from eum.instana.io and can be loaded via HTTP(S). Please ensure that loading scripts from this domain is possible.

Data is transmitted to Instana via image loads as well as HTTP GET and POST requests (via XMLHttpRequest). The origins used for data transmission can be seen in the tracking snippet.

The following Content-Security-Policy definition shows what is required for Instana’s SaaS product:

script-src *.instana.io;
img-src *.instana.io;
connect-src *.instana.io;

What is the impact of same-origin policy on website monitoring?

The same-origin policy is one of the most fundamental website security concepts. Every website is subject to it as all web browsers enforce it. As a website monitoring provider, we cannot control your websites’ or browsers’ security. Because of this, we can only work within the imposed security constraints. Unfortunately, this restricts our monitoring abilities.

  • Browsers restrict access to error messages and stack traces to scripts of the same origin.
  • Browsers restrict allowed HTTP headers for cross-origin requests. This means that backend correlation isn’t always possible.

To unlock these features when multiple origins are involved, cross-origin resource sharing (CORS) can be used. CORS is a mechanism with which small controlled holes are opened within the same-origin policy security mechanism. The following picture describes in detail what needs to be done to address the same-origin policy imposed restrictions.

You can read more about our backend correlation mechanisms and how web security is impacting these in our dedicated backend correlation documentation.

Picture explaining cross-origin capable end-user monitoring.

Why are detailed resource retrieval breakdowns not always available?

The availability of insights into network times, caching statistics, and asset sizes relies on resource timing capabilities. These capabilities are available in most modern web browsers when allowed by the same-origin policy.

To enable insights into cross-origin resources, e.g. resources served from the origin https://cdn.example.com:443 for an HTML document loaded from https://example.com:443, the Timing-Allow-Origin HTTP header can be used. The following picture shows how to set the header. Also, see the resource timing specification for more information.

Picture explaining cross-origin capable end-user monitoring.
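As a small sketch of the server side, the following helper builds response headers for an asset served from https://cdn.example.com so that documents from https://example.com may read its detailed resource timings. The function name assetHeaders is made up for this example. How headers are actually configured depends on your server or CDN.

```javascript
// Build response headers for a static asset served from a CDN origin.
// allowedOrigin is the origin of the HTML document that may read detailed
// resource timings. The value "*" would allow every origin.
function assetHeaders(contentType, allowedOrigin) {
  return {
    "Content-Type": contentType,
    "Timing-Allow-Origin": allowedOrigin,
  };
}

console.log(assetHeaders("application/javascript", "https://example.com"));
```

With Node’s built-in http module, for example, such an object can be passed to response.writeHead(200, headers).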

Cloudflare has recently deployed an update and is now setting cookies with the SameSite attribute. If you are still seeing this warning, then it is because your browser has outdated cookies stored. Within the next weeks all cookies expire automatically and the warning will be gone.

Instana makes use of Cloudflare as its content delivery network (CDN) to ensure performance and availability of the JavaScript agent for our customers and their end-users. Cloudflare sets a cookie called __cfduid when serving responses from its CDN that does not yet specify a SameSite attribute. Cloudflare is working on adding the attribute to its cookie. Until this work is finished, we kindly ask you to ignore this warning as it doesn’t have any end-user impact.

Should I use Instana for my business analytics use-cases?

Some business analytics use-cases can be addressed with the data collected by Instana. However, our focus is on delivering a best-in-class performance product. As such, Instana is not a replacement for a dedicated business analytics product.

JavaScript Stack Trace Translation

What is JavaScript Stack Trace Translation?

The JavaScript stack trace translation provides clear and more actionable stack traces within Instana.

Before:
at http://shop-demo-app.instana.io/static/js/main.b1510333.chunk.js:1:1559

After:
at ProductDetails.js 26:11

The problem with untranslated stack trace lines is that the errors aren’t actionable. Developers work with many (typically small) files. For performance reasons, these files are shipped to end-users’ browsers in a bundled and minified format, resulting in file names, lines, and column numbers in stack traces that are neither human-readable nor actionable.

With a translated stack trace, it’s clear that the error occurring at …/main.b1510333.chunk.js:1:1559 is in fact at ProductDetails.js 26:11.

How does JavaScript Stack Trace Translation work?

The translation works by utilizing source maps. Source maps enable us to translate un-actionable stack traces to actionable stack traces. More specifically, they allow us to translate references to files, names, lines, and columns to their actual source code counterparts.

To achieve this, Instana executes the following steps:

  1. The JavaScript agent reports JavaScript errors to Instana’s servers. For example, let’s assume the stack trace contains the line at http://shop-demo-app.instana.io/static/js/main.b1510333.chunk.js:1:1559.
  2. Instana’s servers attempt to download the JavaScript file to identify the source map responsible for this file.
  3. The HTTP response is parsed, and Instana looks for references to source maps.
  4. When a source map file is referenced, Instana downloads the source map file via an HTTP GET request.
  5. When the download is successful, the source map file is used to translate the file, name, line, and column references.

Let’s look at these steps for the stack trace line at http://shop-demo-app.instana.io/static/js/main.b1510333.chunk.js:1:1559.

  1. An error is reported to Instana’s servers.
  2. Instana’s servers issue an HTTP GET request to http://shop-demo-app.instana.io/static/js/main.b1510333.chunk.js.
  3. The source map reference //# sourceMappingURL=main.b1510333.chunk.js.map is located in the JavaScript file.
  4. The source map file http://shop-demo-app.instana.io/static/js/main.b1510333.chunk.js.map is downloaded via an HTTP GET request.
  5. The source map is parsed, and the stack trace is made readable.
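Steps 1 and 3 of the walkthrough above can be sketched as follows. This is an illustration under assumptions, not Instana’s server-side code: the function names are made up, the downloads of steps 2 and 4 are left out, and the final translation in step 5 would additionally require a source map parser.

```javascript
// Split a stack trace line such as "at http://host/file.js:1:1559"
// into the script URL, line and column.
function parseStackLine(stackLine) {
  const match = /^at (.+):(\d+):(\d+)$/.exec(stackLine.trim());
  if (!match) return null;
  return { url: match[1], line: Number(match[2]), column: Number(match[3]) };
}

// Find the last sourceMappingURL comment in the JavaScript source and
// resolve it against the URL of the script.
function findSourceMapUrl(scriptUrl, scriptSource) {
  const pattern = /\/\/[#@]\s*sourceMappingURL=(\S+)/g;
  let reference = null;
  let match;
  while ((match = pattern.exec(scriptSource)) !== null) {
    reference = match[1];
  }
  return reference === null ? null : new URL(reference, scriptUrl).href;
}

const frame = parseStackLine(
  "at http://shop-demo-app.instana.io/static/js/main.b1510333.chunk.js:1:1559"
);
console.log(frame.line, frame.column); // 1 1559

// Stand-in for the downloaded, minified JavaScript file.
const scriptSource = "/* minified code */\n//# sourceMappingURL=main.b1510333.chunk.js.map";
console.log(findSourceMapUrl(frame.url, scriptSource));
// http://shop-demo-app.instana.io/static/js/main.b1510333.chunk.js.map
```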

How exactly are you retrieving files from our servers?

As soon as we learn about the stack trace, Instana automatically issues HTTP requests to retrieve the JavaScript and source map files. The closest representation of the calls we make is the following:

# A browser-like User-Agent is sent to bypass simple bot blockers
curl -H 'Accept: */*' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36' \
  {{url of JavaScript or source map files}}

You can use the previous command to identify what kind of configuration Instana would need to download the JavaScript and source map files from your servers. Please note that the HTTP requests are issued from servers running inside of AWS (Amazon Web Services). Advanced bot detection mechanisms might block requests coming from AWS. Therefore, consider configuring additional headers that Instana should send to your servers to circumvent bot detection mechanisms.