xRay Reporting and Visualization Examples
This section covers the xRay dashboard and some of its commonly used capabilities.
Figure 4: Dashboard View
Users navigate xRay through a basic web interface, starting with a simple homepage. The primary view on the left-hand navigation menu is Applications. Figure 4 shows the dashboard for a user with two application log outputs. Each application or application group has colored icons indicating xRay’s recommendations.
As users develop longer lists of applications, they can define “App Groups” to organize them. The round graphic in the upper left shows the aggregate runtime for all of this user’s applications.
Figure 5: Application Overview—CPU Utilization View
Users click an application name to see the details of that specific Spark application. The view opens on an overview (Figure 5) that includes runtime, cluster time used, the timeline, and more.
Figure 6: Application Overview—Disk I/O View
The App Overview provides timeline graphs for CPU Utilization (Figure 5), Memory Utilization, and Disk I/O (Figure 6).
Figure 7: Stage-by-Stage View
Further down the App Overview screen, xRay shows the stage-by-stage execution timeline (example in Figure 7). The dark parts of the individual stage timelines represent periods of time when the stage was runnable, but waiting for other inputs. Green portions show actual execution.
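To make the distinction between waiting and executing time concrete, the following is a minimal sketch of how the two could be totaled for one stage. It is not xRay's implementation: the segment format, the state names "waiting" and "executing", and the function names are all assumptions made for illustration.

```python
def summarize_stage(segments):
    """Total the time a stage spent waiting versus executing.

    `segments` is a hypothetical list of (start, end, state) tuples in
    seconds, where state is "waiting" (runnable but blocked on inputs)
    or "executing". This format is an assumption, not an xRay export.
    """
    totals = {"waiting": 0.0, "executing": 0.0}
    for start, end, state in segments:
        totals[state] += end - start
    return totals


def waiting_fraction(segments):
    """Return the share of the stage's timeline spent waiting (0.0 to 1.0)."""
    totals = summarize_stage(segments)
    elapsed = totals["waiting"] + totals["executing"]
    return totals["waiting"] / elapsed if elapsed else 0.0


# Example stage: waits 4 s, executes 6 s, then waits 1 s more.
example_stage = [(0, 4, "waiting"), (4, 10, "executing"), (10, 11, "waiting")]
example_totals = summarize_stage(example_stage)
```

A stage with a high waiting fraction is a candidate for closer inspection, because its elapsed time is dominated by dependencies rather than by its own work.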
The App Overview also provides the following:
- DAG: Full interactive layout of the Directed Acyclic Graph (DAG) for the application, including the critical path
- Job metrics: Including scan time, CPU utilization, and write times
- Spark configuration
- Cluster configuration
The App Overview data and visualizations often provide clues that help troubleshoot an application, but detailed insights usually require probing into a specific stage. To view details at the stage level, click on a specific stage in the App Runtime stage-by-stage view.
Figure 8: Stage Analysis—Stage DAG
The initial view is the Stage DAG (Figure 8), highlighting the operations executed by the stage.
Figure 9: Stage Analysis—Executor Timeline
The Executor Timeline (Figure 9) presents task-level detail, visualizing delays, deserialization, shuffle read and write, and executor computing time.
Figure 10: Stage Analysis—Task Charts
Task Charts (Figure 10) provides a scatter plot view of the data, giving an alternative visual for identifying patterns and outliers. A user who has identified a single long-running stage can use this plot to find “straggler” tasks.
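The same kind of outlier can be flagged numerically. The sketch below marks a task as a straggler when its runtime exceeds a multiple of the stage's median task runtime. This is one common heuristic and is not necessarily how xRay identifies outliers; the function name, the input format, and the default factor of 1.5 are assumptions chosen for illustration.

```python
from statistics import median


def find_stragglers(durations, factor=1.5):
    """Return task IDs whose runtime exceeds `factor` times the median.

    `durations` is a hypothetical dict mapping task ID to runtime in
    seconds. The result is ordered from slowest to fastest straggler.
    """
    if not durations:
        return []
    threshold = factor * median(durations.values())
    slow = [task_id for task_id, d in durations.items() if d > threshold]
    return sorted(slow, key=lambda task_id: durations[task_id], reverse=True)


# Example: four tasks finish in about 10 s, one takes 42 s.
example_durations = {0: 10.0, 1: 11.0, 2: 9.0, 3: 42.0, 4: 10.5}
example_stragglers = find_stragglers(example_durations)
```

Using the median rather than the mean keeps the threshold stable when one or two very slow tasks would otherwise pull the average upward.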
The Task Metrics Table view gives users the data in tabular form, allowing them to sort information by dimensions such as task runtime, end time, result size, shuffle write/read time, and much more.
Figure 11: Stage Analysis—Task Histogram
Finally, the Task Histogram breaks out the number of tasks by execution time (Figure 11).
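As a rough illustration of what such a histogram computes, the sketch below groups task runtimes into fixed-width bins and counts the tasks in each. The bin width, the function name, and the input format are assumptions; the binning xRay actually uses is not described here.

```python
from collections import Counter


def task_histogram(durations, bin_width):
    """Count tasks per execution-time bin.

    `durations` is a hypothetical iterable of task runtimes in seconds.
    Returns a dict mapping (bin_start, bin_end) to the number of tasks
    whose runtime falls in [bin_start, bin_end). Empty bins are omitted.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter(int(d // bin_width) for d in durations)
    return {
        (index * bin_width, (index + 1) * bin_width): counts[index]
        for index in sorted(counts)
    }


# Example: four short tasks and one long one, in 5-second bins.
example_histogram = task_histogram([1.0, 2.5, 4.0, 4.9, 12.0], 5)
```

A long tail in the resulting counts, such as a single task in a bin far from the rest, points to the same stragglers that stand out in the scatter plot.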