Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning workloads on single-node machines or clusters. The Moein monitoring system can monitor Spark in both modes of operation. It collects performance metrics for the master and workers, the executors running in workers, the JVM (heap and garbage collectors), and RDDs. The performance metrics collected for Apache Spark are listed below:
General Information:
- Alias Name
- Address
- Port Number
- Node Role
Master Node Information:
- Master Node Address
- Master Number Of Cores
- Master Number Of Used Cores
- Master Total Memory
- Master Used Memory
- Total Number Of Workers
- Number Of Alive Workers
- Number Of Active Applications
- Number Of Completed Applications
- Status
- Master Used Memory Percentage
- Master Core Used Percentage
- Number Of Active Drivers
- Number Of Completed Drivers
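The two percentage metrics in this list can be derived from the raw core and memory counters. The following is a minimal sketch in Python, not Moein's actual implementation. It assumes the raw values are already available in a dictionary, and the key names (`cores`, `coresused`, `memory`, `memoryused`) and sample values are illustrative only.

```python
def used_percentage(used: float, total: float) -> float:
    """Return used/total as a percentage, or 0.0 when total is not positive."""
    if total <= 0:
        return 0.0
    return round(100.0 * used / total, 2)


# Hypothetical snapshot of master counters (values are made up for illustration).
master = {"cores": 16, "coresused": 4, "memory": 32768, "memoryused": 8192}

core_used_pct = used_percentage(master["coresused"], master["cores"])
memory_used_pct = used_percentage(master["memoryused"], master["memory"])
print(core_used_pct, memory_used_pct)
```

Guarding against a zero total avoids a division error when a master has no registered workers yet.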
Workers Metrics:
- Worker ID
- Worker Host Address
- Worker Port Number
- Worker Web UI Address
- Worker Number Of Cores
- Worker Number Of Used Cores
- Worker Number Of Free Cores
- Worker Total Memory
- Worker Used Memory
- Worker Free Memory
- Elapsed Time Since Last Heartbeat
- Worker Status
- Worker Used Memory Percentage
- Worker Core Used Percentage
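Free cores, free memory, and the elapsed time since the last heartbeat follow directly from the other worker values. The sketch below shows one way to compute them in Python. It assumes a hypothetical worker record whose `lastheartbeat` field is a Unix timestamp in milliseconds; the field names and sample values are assumptions for illustration, not a confirmed Moein schema.

```python
def worker_derived(worker: dict, now_ms: int) -> dict:
    """Derive free resources and heartbeat age from raw worker counters."""
    return {
        "cores_free": worker["cores"] - worker["coresused"],
        "memory_free": worker["memory"] - worker["memoryused"],
        # Heartbeat age in seconds; clamped at 0 to tolerate clock skew.
        "heartbeat_age_s": max(0, now_ms - worker["lastheartbeat"]) / 1000.0,
    }


# Hypothetical worker record (values are made up for illustration).
worker = {
    "cores": 8,
    "coresused": 3,
    "memory": 16384,
    "memoryused": 4096,
    "lastheartbeat": 1_700_000_000_000,
}

derived = worker_derived(worker, now_ms=1_700_000_012_500)
print(derived)
```

Passing the current time as a parameter, rather than reading the clock inside the function, keeps the calculation reproducible.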
Applications Metrics:
- Application ID
- Application Name
- Application User
- Application Start Time
- Application Submit Time
- Application Number Of Allocated Cores
- Application Running Duration
- Application Status
- Application Running Status
Worker Metrics:
- Worker ID
- Master Node Address
- Master Web Service Address
- Worker Number Of Cores
- Worker Number Of Used Cores
- Worker Total Memory
- Worker Used Memory
- Worker Used Memory Percentage
- Worker Core Used Percentage
- Total Number Of Running Executors
- Total Number Of Finished Executors
Executors in Workers Metrics:
- Executor ID
- Executor Total Memory
- Executor Application ID
- Executor Application Name
- Number Of Executor Application Cores
- Executor Application User
- Executor Application Memory Per Slave
- Executor Status
Memory Metrics:
Heap and Non-Heap Memory:
- Committed Heap Memory
- Initial Heap Memory
- Maximum Heap Memory
- Used Heap Memory
- Committed Non-Heap Memory
- Initial Non-Heap Memory
- Maximum Non-Heap Memory
- Used Non-Heap Memory
- Heap Memory Used Percentage
- Non-Heap Memory Used Percentage
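The used-percentage values need care because the JVM reports a maximum of -1 when no maximum is defined, which is common for non-heap memory. The following Python sketch shows one possible way to handle that case. Falling back to committed memory as the denominator is an assumption made here for illustration, not a documented Moein behavior, and the sample values are made up.

```python
def memory_used_percentage(used: int, committed: int, maximum: int) -> float:
    """Percentage of memory in use, measured against max when it is defined."""
    # The JVM reports -1 when no maximum is defined; fall back to committed (assumption).
    limit = maximum if maximum > 0 else committed
    if limit <= 0:
        return 0.0
    return round(100.0 * used / limit, 2)


# Hypothetical values in megabytes.
heap_pct = memory_used_percentage(used=512, committed=1024, maximum=2048)
non_heap_pct = memory_used_percentage(used=96, committed=128, maximum=-1)
print(heap_pct, non_heap_pct)
```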
Memory Pools KPIs:
- Memory Pool Name
- Memory Pool Committed Memory
- Memory Pool Initial Memory
- Memory Pool Maximum Memory
- Memory Pool Used Memory
- Memory Pool Used Percentage
GC Metrics:
- Garbage Collection Count
- Garbage Collection Rate
- Garbage Collection Time
- Average Garbage Collection Time
- GC Name
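JVM garbage collectors expose cumulative collection counts and cumulative collection time, so a rate and an average have to be derived from them. The Python sketch below illustrates one plausible set of definitions: rate as collections per minute between two samples, and average time as total collection time divided by total count. These definitions, the `GcSample` type, and the sample values are assumptions for illustration; the exact formulas Moein uses are not stated here.

```python
from dataclasses import dataclass


@dataclass
class GcSample:
    timestamp_s: float  # when the sample was taken, in seconds
    count: int          # cumulative number of collections
    time_ms: int        # cumulative collection time in milliseconds


def gc_rate_per_minute(prev: GcSample, curr: GcSample) -> float:
    """Collections per minute between two samples of the same collector."""
    interval = curr.timestamp_s - prev.timestamp_s
    if interval <= 0:
        return 0.0
    return 60.0 * (curr.count - prev.count) / interval


def average_gc_time_ms(sample: GcSample) -> float:
    """Mean duration of a single collection, in milliseconds."""
    return sample.time_ms / sample.count if sample.count > 0 else 0.0


# Two hypothetical samples taken 30 seconds apart.
prev = GcSample(timestamp_s=0.0, count=10, time_ms=200)
curr = GcSample(timestamp_s=30.0, count=14, time_ms=280)

rate = gc_rate_per_minute(prev, curr)
avg = average_gc_time_ms(curr)
print(rate, avg)
```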
RDD Metrics:
- File Cache Hits
- Discovered Files
- Hive Client Calls
- Parallel Listing Job Count
- Fetched Partitions
- Compilation Mean Time
- Total Number Of Compilations
- Generated Class Size
- Generated Class Count
- Generated Method Size
- Generated Method Count
- Source Code Size
- Source Code Count
Communication Protocols: