Host

Sensor (Data Collection)

Supported OS

  • Linux
  • Windows
  • Mac OS/OS X
  • Solaris on SPARC
  • AIX

Tracked Configuration

  • Operating System name and version
  • CPU type, count and context switches
  • Memory
  • Hostname
  • Fully Qualified Domain Name
  • Network Interfaces

Metrics

  • CPU Usage

    • CPU Usage per Core
  • Load (when available on the Operating System)
  • Context switches (only supported on Linux hosts)
  • Memory Usage
  • Open Files Usage (open files current vs max, when available on the Operating System)
  • Filesystem per Device

    • Mount, Options and type information
    • Capacity, Free, Leaked, Inodes usage and free inodes (depending on the Filesystem type)
      • Leaked refers to space held by deleted files that are still in use; it equals capacity - used - free. On Linux you can find these files with lsof | grep deleted
  • Network traffic and errors per Interface
  • TCP Activity
  • Process Top List

    • The process top list is refreshed about every 10 seconds and contains only processes that have significant system usage
    • The history of snapshots can be searched up to one month back
    • CPU values use Linux top semantics: 100% CPU refers to a single CPU core
    • Significant usage is currently defined as:
      • more than about 10% CPU usage over the last 10 seconds
      • more than about 512MB memory usage (RSS)
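Two of the definitions above lend themselves to a short illustration: the leaked-space arithmetic (capacity - used - free) and the significance filter for the process top list. The following is only a sketch, not the agent's implementation. The names `leaked_bytes`, `ProcessSample` and `is_significant` are hypothetical, 512MB is assumed to mean 512 * 1024 * 1024 bytes, and the two significance criteria are assumed to be alternatives (either one is enough).

```python
from dataclasses import dataclass

MIB = 1024 * 1024

# Thresholds taken from the list above; the byte interpretation of
# "512MB" is an assumption.
CPU_THRESHOLD_PERCENT = 10.0
RSS_THRESHOLD_BYTES = 512 * MIB


def leaked_bytes(capacity: int, used: int, free: int) -> int:
    """Space held by deleted files that are still open: capacity - used - free."""
    return capacity - used - free


@dataclass
class ProcessSample:
    """Hypothetical snapshot of one process over the last 10 seconds."""
    pid: int
    cpu_percent: float  # Linux top semantics: 100.0 means one full core
    rss_bytes: int


def is_significant(sample: ProcessSample) -> bool:
    """Assumed rule: a process qualifies if it exceeds either threshold."""
    return (
        sample.cpu_percent > CPU_THRESHOLD_PERCENT
        or sample.rss_bytes > RSS_THRESHOLD_BYTES
    )
```

Because of the top semantics, a process that keeps two cores fully busy would be reported as 200.0 in `cpu_percent` under this sketch.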

Health Signatures

Health                  Description
CPU Steal               Too much CPU time is stolen, either between running processes or by the hypervisor/host OS (sampled in a sliding window of 60 seconds)
CPU Usage               CPU usage of user processes is too high (sampled in a sliding window of 180 seconds)
CPU Wait                CPU spends significant time waiting for input/output (sampled in a sliding window of 60 seconds)
Load                    System load is too high (sampled in a sliding window of 120 seconds)
Memory exhausted        System memory is close to being exhausted (triggered instantly)
Disk space saturation   Device has less than 1GB available and has lost more than 1MB in the last 10 seconds (sampled in a sliding window of 10 seconds)
TCP errors              Host has an unusually high number of TCP errors (sampled in a sliding window of 60 seconds)
TCP fails               Host has an unusually high number of TCP fails (sampled in a sliding window of 60 seconds)
TCP retransmission      Host has an unusually high number of TCP retransmissions (sampled in a sliding window of 60 seconds)
Open Files Usage        Processes are opening files faster than they close them (current vs max ratio exceeds threshold)
High Inode Usage        Filesystem has a low level of free inodes (current vs max ratio exceeds threshold)
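
To make the sliding-window wording concrete, here is a minimal sketch of how a rule such as Disk space saturation could be evaluated. It is an illustration under stated assumptions, not the agent's actual logic: the class `DiskSaturationCheck` and its `observe` method are hypothetical, 1GB and 1MB are assumed to be binary units, and "lost" is assumed to mean the drop in available space between the oldest sample still in the window and the newest one.

```python
from collections import deque

GB = 1024 ** 3
MB = 1024 ** 2


class DiskSaturationCheck:
    """Hypothetical sliding-window check for one device."""

    def __init__(self, window_seconds: float = 10.0):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp_seconds, available_bytes)

    def observe(self, timestamp: float, available_bytes: int) -> bool:
        """Record a sample and return True if the rule would fire."""
        self.samples.append((timestamp, available_bytes))
        # Drop samples that have fallen out of the sliding window.
        while timestamp - self.samples[0][0] > self.window_seconds:
            self.samples.popleft()
        # Space lost since the oldest sample still inside the window.
        lost = self.samples[0][1] - available_bytes
        return available_bytes < GB and lost > MB
```

A caller would feed one sample per collection interval, for example `check.observe(now, available)`, and raise the health event whenever the call returns True.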

Configuration

The configuration of the Instana Agent is explained in more detail here.