Amazon EMR

This sensor monitors Amazon Elastic MapReduce (EMR) environments and their instances.

Learn about other supported AWS services in our AWS docs.

Sensor (Data Collection)

Cluster Details

  • Cluster Id
  • Cluster Name
  • Cluster Creation Time
  • Cluster Version
  • Cluster State
  • Grouping zone (region)

Metrics

Cluster Metrics

Name                Description
Apps Running        Number of applications currently running in the cluster
Apps Pending        Number of applications pending in the cluster
Apps Failed         Number of applications that failed in the cluster
Memory Allocated    Amount of memory, in megabytes, that the cluster has allocated
Memory Reserved     Amount of memory, in megabytes, that the cluster has reserved
Memory Available    Amount of memory, in megabytes, that is still available to the cluster
Containers Running  Number of containers running in the cluster

Node Metrics

Name                  Description
Active Nodes          Number of nodes currently running MapReduce tasks within the cluster
Lost Nodes            Number of nodes allocated to MapReduce tasks with a LOST state
Unhealthy Nodes       Number of nodes allocated to MapReduce tasks with an UNHEALTHY state
Decommissioned Nodes  Number of nodes allocated to MapReduce tasks with a DECOMMISSIONED state

Input/Output Metrics

Name                 Description
Bytes Written to S3  Number of bytes written to the S3 bucket by the cluster
Bytes Read from S3   Number of bytes read from the S3 bucket by the cluster
HDFS Utilization     Percentage of HDFS storage currently in use
Total Load           Total number of concurrent data transfers

Required Permissions

  • cloudwatch:GetMetricStatistics
  • cloudwatch:GetMetricData
  • elasticmapreduce:ListClusters
  • elasticmapreduce:DescribeCluster
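One way to grant the permissions listed above is a managed IAM policy. The following is a minimal sketch, assuming IAM is managed through an AWS CloudFormation template. The logical ID InstanaEmrSensorPolicy and the description text are hypothetical names chosen for illustration, and the policy still has to be attached to whichever role or user the agent runs as.

```yaml
# Sketch only: the logical ID and description are illustrative, not taken from the Instana docs.
Resources:
  InstanaEmrSensorPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Description: Read-only access used by the EMR sensor
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - cloudwatch:GetMetricStatistics
              - cloudwatch:GetMetricData
              - elasticmapreduce:ListClusters
              - elasticmapreduce:DescribeCluster
            # "*" is used because the CloudWatch read actions do not support resource-level scoping.
            Resource: "*"
```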

Configuration

Metrics for EMR are pulled every 300 seconds by default. To change the interval, set cloudwatch_period in the agent configuration file <agent_install_dir>/etc/instana/configuration.yml:

com.instana.plugin.aws.emr:
  cloudwatch_period: 300
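As an illustration of a non-default setting, the sketch below raises the interval to 600 seconds. The value is an assumption chosen for the example, not a recommendation.

```yaml
com.instana.plugin.aws.emr:
  # Illustrative value: pull EMR metrics every 10 minutes instead of the default 5.
  cloudwatch_period: 600
```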

To disable monitoring of EMR instances, use the following configuration:

com.instana.plugin.aws.emr:
  enabled: false

Tags

Multiple tags can be defined, separated by commas. Each tag must be provided as a key-value pair separated by a colon (:). To make configuration easier, you can define which tags to include in discovery and which to exclude from discovery. If a tag is defined in both lists, the exclude list takes priority. If you do not need to filter services, do not define this configuration. You do not have to define all values to enable filtering.

The tagged-servies-poll-rate configuration property sets how often sensors poll the AWS tagged resources (default: 300 seconds). For example, to poll every 60 seconds, use the following configuration:

com.instana.plugin.aws:
  tagged-servies-poll-rate: 60 #default 300

To include services in discovery by tag, use the following configuration:

com.instana.plugin.aws.emr:
    include_tags: # Comma separated list of tags in key:value format (e.g. env:prod,env:staging)

To exclude services from discovery by tag, use the following configuration:

com.instana.plugin.aws.emr:
    exclude_tags: # Comma separated list of tags in key:value format (e.g. env:dev,env:test)
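The two lists can also be combined. The following sketch uses made-up tag values and assumes each list is written as a single comma-separated string, as the inline comments above suggest. Because env:staging appears in both lists, the exclude list takes priority for that tag.

```yaml
com.instana.plugin.aws.emr:
  # Illustrative tag values only; replace them with tags used in your account.
  include_tags: env:prod,env:staging
  # env:staging is also in include_tags, so the exclude list takes priority for it.
  exclude_tags: env:staging
```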

Instana Agent Tags

Tags are currently available only in conjunction with the dedicated AWS Instana agent, which is described in the AWS Agent Installation docs. More details on using tags are described here.