<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Monitoring on Ghafoor's Personal Blog</title><link>http://ghafoorsblog.com/tags/monitoring/</link><description>Recent content in Monitoring on Ghafoor's Personal Blog</description><generator>Hugo</generator><language>en</language><managingEditor>noreply@example.com (AG Sayyed)</managingEditor><webMaster>noreply@example.com (AG Sayyed)</webMaster><copyright>Copyright © 2024-2026 AG Sayyed. All Rights Reserved.</copyright><lastBuildDate>Sat, 16 May 2026 17:42:12 +0100</lastBuildDate><atom:link href="http://ghafoorsblog.com/tags/monitoring/index.xml" rel="self" type="application/rss+xml"/><item><title>Debug With Logging Module</title><link>http://ghafoorsblog.com/courses/google/it-automation-content/it-automation-python-pcert/04-troubleshooting-debugging/04-module/005-with-logging/</link><pubDate>Thu, 13 Nov 2025 14:28:51 +0000</pubDate><author>noreply@example.com (AG Sayyed)</author><guid>http://ghafoorsblog.com/courses/google/it-automation-content/it-automation-python-pcert/04-troubleshooting-debugging/04-module/005-with-logging/</guid><description>&lt;p class="lead text-primary"&gt;
This document explores debugging Python programs with the logging module. It covers log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL), configuration options, file output, custom formatters, handlers, and best practices for production-grade logging that replaces print statements with structured, filterable log messages.
&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Imagine working on an e-commerce site where business growth brings more customers and, with them, more unexpected errors. &lt;code&gt;print()&lt;/code&gt; statements have been the go-to debugging strategy, but they now flood the console with messages, making it hard to distinguish critical issues from routine operations. A more robust solution is needed to track, categorize, and diagnose issues effectively.&lt;/p&gt;</description></item><item><title>Crashing Programs</title><link>http://ghafoorsblog.com/courses/google/it-automation-content/it-automation-python-pcert/04-troubleshooting-debugging/03-module/001-crashing-programs/</link><pubDate>Thu, 13 Nov 2025 09:59:07 +0000</pubDate><author>noreply@example.com (AG Sayyed)</author><guid>http://ghafoorsblog.com/courses/google/it-automation-content/it-automation-python-pcert/04-troubleshooting-debugging/03-module/001-crashing-programs/</guid><description>&lt;p class="lead text-primary"&gt;
This document explains how to troubleshoot and debug crashing programs, focusing on quick workarounds, monitoring strategies, and long-term fixes to prevent recurring issues.
&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;When faced with a crashing program, the first step is to find a quick workaround to restore functionality. For example, if a database server crashes due to insufficient disk space, adding an extra hard drive can resolve the issue temporarily. However, long-term solutions are essential to prevent recurrence.&lt;/p&gt;</description></item><item><title>Proactive Practices</title><link>http://ghafoorsblog.com/courses/google/it-automation-content/it-automation-python-pcert/04-troubleshooting-debugging/05-module/012-proactive-practices/</link><pubDate>Tue, 11 Nov 2025 18:21:25 +0000</pubDate><author>noreply@example.com (AG Sayyed)</author><guid>http://ghafoorsblog.com/courses/google/it-automation-content/it-automation-python-pcert/04-troubleshooting-debugging/05-module/012-proactive-practices/</guid><description>&lt;p class="lead text-primary"&gt;
This document describes proactive practices to reduce incidents and simplify troubleshooting: automated testing and CI, test environments and canary deployments, centralized logging and monitoring, ticket automation, documentation, and capacity planning.
&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="why-proactive-practices-matter"&gt;Why Proactive Practices Matter&lt;/h2&gt;
&lt;p&gt;Bugs and failures are unavoidable. Proactive practices reduce their frequency and impact by catching issues early and providing better diagnostic information when problems occur.&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Problem Area&lt;/th&gt;
 &lt;th&gt;Proactive Practice&lt;/th&gt;
 &lt;th&gt;Benefit&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Code regressions&lt;/td&gt;
 &lt;td&gt;Unit and integration tests + CI&lt;/td&gt;
 &lt;td&gt;Detects bugs before deployment&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Deployment risk&lt;/td&gt;
 &lt;td&gt;Test environments and canary releases&lt;/td&gt;
 &lt;td&gt;Limits blast radius&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Incident diagnosis&lt;/td&gt;
 &lt;td&gt;Centralized logging&lt;/td&gt;
 &lt;td&gt;Faster root-cause analysis&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Silent failures&lt;/td&gt;
 &lt;td&gt;Monitoring and alerting&lt;/td&gt;
 &lt;td&gt;Detects issues before users report them&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Repetitive requests&lt;/td&gt;
 &lt;td&gt;Ticket templates and automation&lt;/td&gt;
 &lt;td&gt;Saves triage time&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Knowledge gaps&lt;/td&gt;
 &lt;td&gt;Documentation and runbooks&lt;/td&gt;
 &lt;td&gt;Consistent on-call response&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 id="automated-testing-and-continuous-integration"&gt;Automated Testing and Continuous Integration&lt;/h2&gt;
&lt;p&gt;Automated tests serve as a safety net that catches regressions early. Continuous integration (CI) runs the tests on every change, giving fast feedback on each one.&lt;/p&gt;</description></item></channel></rss>