This document summarizes techniques for analysing application crashes using logs, tracing tools, change analysis, and minimal reproduction cases. The emphasis is on isolating root causes and collecting evidence for remediation or reporting.
This document explains how to investigate application crashes by examining logs, enabling diagnostic logging, tracing system calls, analysing recent changes, and building minimal reproduction cases to isolate root causes.
When an application terminates unexpectedly, the investigation should start with all available evidence and proceed by systematically reducing the problem scope. Logs, tracing tools, and change history provide primary clues.
Logs are the primary source of evidence. The most relevant entries are timestamped error messages from around the crash time, so search the logs for entries near the known crash time and for messages that mention the failing component.
| Platform | Primary log locations / tools |
|---|---|
| Linux | /var/log/*, journalctl |
| macOS | Console app, dtrace |
| Windows | Event Viewer, Process Monitor |
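The log search described above can be sketched in a few lines of Python. This is a minimal illustration, not a general log parser: it assumes each line starts with a timestamp in the form `YYYY-MM-DD HH:MM:SS`, and the function name `lines_near_crash` and the sample entries are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed layout: every log line starts with a timestamp such as
# "2024-05-01 12:00:03". Adjust both constants for other formats.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_LENGTH = 19


def lines_near_crash(lines, crash_time, window_seconds=60, keyword=None):
    """Return log lines stamped within window_seconds of crash_time.

    If keyword is given, keep only lines containing it (case-insensitive).
    Lines without a parseable leading timestamp are skipped.
    """
    window = timedelta(seconds=window_seconds)
    matches = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # no timestamp at the start of this line
        if abs(stamp - crash_time) > window:
            continue
        if keyword is not None and keyword.lower() not in line.lower():
            continue
        matches.append(line.rstrip("\n"))
    return matches


# Hypothetical log excerpt used only for illustration.
sample_log = [
    "2024-05-01 11:58:10 INFO worker started",
    "2024-05-01 12:00:03 ERROR database connection refused",
    "2024-05-01 12:00:04 FATAL unhandled exception, exiting",
    "2024-05-01 12:05:00 INFO supervisor restarted worker",
]
crash = datetime(2024, 5, 1, 12, 0, 5)
for entry in lines_near_crash(sample_log, crash, window_seconds=30):
    print(entry)
```

Starting with a narrow window and widening it only if nothing relevant turns up keeps the amount of unrelated output manageable.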
When an error string is discovered, searching it online often surfaces documentation, bug reports, or community posts that clarify the meaning and possible fixes.
If logs are insufficient, enable higher verbosity (debug logging) via configuration or command-line switches to capture additional runtime details.
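For an application written in Python, a verbosity switch might look like the sketch below. The `-v`/`--verbose` flag, the level mapping, and the logger name `app.startup` are assumptions chosen for illustration; real applications document their own switches and configuration keys.

```python
import argparse
import logging


def build_parser():
    parser = argparse.ArgumentParser(description="hypothetical application")
    # Each -v raises verbosity by one step (the flag name is an assumption).
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser


def level_for(verbosity):
    """Map a -v count to a logging level: 0 -> WARNING, 1 -> INFO, 2+ -> DEBUG."""
    if verbosity >= 2:
        return logging.DEBUG
    if verbosity == 1:
        return logging.INFO
    return logging.WARNING


def configure_logging(argv):
    """Parse argv, configure the root logger, and return the chosen level."""
    args = build_parser().parse_args(argv)
    level = level_for(args.verbose)
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )
    return level


configure_logging(["-vv"])
logging.getLogger("app.startup").debug("loaded configuration")
```

Debug output is usually too noisy to leave on permanently, so it is typically enabled only while reproducing the failure and then switched off again.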
When log messages are missing or cryptic, tracing the program’s system calls or runtime interactions can reveal missing resources, permission failures, or incompatible interfaces.
| Technique | Purpose |
|---|---|
| strace (Linux) | Observe system calls: file access, network connections, and errors |
| dtrace (macOS) | Instrument kernel and user-level events |
| Process Monitor (Windows) | Trace file, registry, and process activity |
Example strace command to capture a run:

    # Run the program and record system calls to a file
    strace -o /tmp/app.strace -f /path/to/application --arg value
Traces help identify missing files, denied permissions, or unexpected environment assumptions (for example, attempts to use a GUI when running as a headless service).
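A captured trace can be long, so it may help to filter it for failed calls. The sketch below scans strace-style lines for system calls that returned `-1` with `ENOENT`, `EACCES`, or `EPERM`. It assumes the common one-call-per-line output format and does not handle calls split across `<unfinished ...>` and `resumed` lines; the function name `failed_syscalls` and the sample paths are hypothetical.

```python
import re

# Matches lines such as:
#   4121 openat(AT_FDCWD, "/etc/app/config.yaml", O_RDONLY) = -1 ENOENT (No such file or directory)
# A leading PID ("4121 " or "[pid 4121] ") appears when following child
# processes with -f, so it is treated as optional.
FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"
    r"(?P<syscall>\w+)\((?P<args>.*)\)\s+=\s+-1\s+(?P<errno>E[A-Z0-9]+)"
)
FIRST_QUOTED = re.compile(r'"([^"]*)"')


def failed_syscalls(trace_lines, errnos=("ENOENT", "EACCES", "EPERM")):
    """Return (syscall, errno, first quoted argument or None) for failed calls."""
    failures = []
    for line in trace_lines:
        match = FAILED_CALL.match(line)
        if not match or match.group("errno") not in errnos:
            continue
        quoted = FIRST_QUOTED.search(match.group("args"))
        path = quoted.group(1) if quoted else None
        failures.append((match.group("syscall"), match.group("errno"), path))
    return failures


# Hypothetical trace excerpt used only for illustration.
sample_trace = [
    '4121 openat(AT_FDCWD, "/etc/app/config.yaml", O_RDONLY) = -1 ENOENT (No such file or directory)',
    '4121 openat(AT_FDCWD, "/var/lib/app/state.db", O_RDWR) = -1 EACCES (Permission denied)',
    '4121 read(3, "data", 4096) = 4',
    "4121 +++ exited with 1 +++",
]
for syscall, errno_name, path in failed_syscalls(sample_trace):
    print(f"{syscall} failed with {errno_name}: {path}")
```

Not every failed call is a problem, since programs routinely probe for optional files. The failures closest to the end of the trace are usually the most relevant to the crash.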
If the application used to behave correctly, inspect recent changes that might have introduced a regression. Possible change vectors include new application versions, updated libraries, and configuration changes.
Where configuration is managed via version control or configuration management, review the history to identify commits or deployments that coincide with the start of failures.
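Library versions can be compared in the same spirit when snapshots exist from before and after the failures began. The sketch below assumes each snapshot is text with one `name==version` entry per line (the layout produced by `pip freeze`); the helper names and the package list are hypothetical.

```python
def parse_snapshot(text):
    """Parse 'name==version' lines into a dict, ignoring blanks and comments."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        versions[name.lower()] = version
    return versions


def diff_versions(before, after):
    """Report packages that were added, removed, or changed between snapshots."""
    changes = {"added": {}, "removed": {}, "changed": {}}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old is None:
            changes["added"][name] = new
        elif new is None:
            changes["removed"][name] = old
        elif old != new:
            changes["changed"][name] = (old, new)
    return changes


# Illustrative snapshots; names and versions are made up for the example.
known_good = parse_snapshot("requests==2.31.0\nurllib3==1.26.18\nlegacy-tool==0.9.1\n")
failing = parse_snapshot("requests==2.31.0\nurllib3==2.2.1\nnewdep==1.0.0\n")
print(diff_versions(known_good, failing))
```

Each reported difference is a candidate cause. Reverting them one at a time, or in halves, narrows down which change introduced the regression.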
A minimal reproduction case is invaluable. Construct the smallest set of steps, inputs, and environment state that reliably triggers the crash. Useful strategies:
Small, repeatable reproduction cases speed diagnosis and improve the quality of bug reports submitted to maintainers.
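One way to shrink a failing input is to remove pieces of it and keep each removal only if the crash still occurs. The sketch below is a simplified greedy reduction, loosely in the style of delta debugging rather than a full implementation of it. The `still_crashes` callback is an assumption: in practice it would run the application with the candidate input and report whether it crashed. The option names in the example are hypothetical.

```python
def minimize(items, still_crashes):
    """Greedily drop chunks of items while still_crashes(candidate) stays true.

    Tries large chunks first, then progressively smaller ones, and returns a
    reduced list that still triggers the failure.
    """
    current = list(items)
    if not still_crashes(current):
        raise ValueError("the full input does not reproduce the crash")
    chunk = max(len(current) // 2, 1)
    while chunk >= 1:
        index = 0
        while index < len(current):
            candidate = current[:index] + current[index + chunk:]
            if candidate and still_crashes(candidate):
                current = candidate  # the removed chunk was irrelevant
            else:
                index += chunk  # the chunk is needed; move past it
        chunk //= 2
    return current


# Stand-in for running the real program: this pretend crash needs two options.
def crashes_with(options):
    return "--cache=off" in options and "--threads=8" in options


full_command = ["--verbose", "--cache=off", "--threads=8", "--input=big.csv", "--retry"]
print(minimize(full_command, crashes_with))
```

The result is not guaranteed to be the smallest possible input, but it is usually small enough to include in a bug report. Re-run the reduced case several times to confirm it fails reliably.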
| Item | Action |
|---|---|
| Logs available | Search for timestamped errors near crash time |
| Debug logging | Enable increased verbosity and reproduce the failure |
| Trace data | Capture system-call or process traces (e.g., strace) |
| Recent changes | Compare versions, libraries and config history |
| Minimal repro | Build and validate smallest reproducible case |
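Remediation depends on ownership and feasibility: problems in code or configuration under your control can be fixed directly, while problems in third-party software are best reported to its maintainers together with the collected evidence.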
Effective crash analysis relies on collecting detailed logs, tracing runtime behavior, reviewing recent changes, and producing a minimal reproduction case. These steps enable targeted fixes or clear bug reports to maintainers.