Blog
Insights on software engineering, cloud architecture, SRE, and building reliable systems.
-
Who Does What When AI Is on the Team? Rethinking RACI for Human-AI Collaboration
RACI was designed for humans. When multi-agent AI joins your workflow, adding an "AI" column isn't enough—it's a trap. Here's how to redesign accountability for a world where AI executes but humans must always own the outcome.
-
Layer 4: Human Oversight — Engineering the Judgment Layer
In autonomous systems, humans don't disappear—they move up. Learn how to design the judgment layer that governs automation and handles what systems cannot.
-
Layer 3: Immutable Infrastructure — Making Self-Healing Safe
Immutable infrastructure is the prerequisite for safe automated remediation. Learn how containers, IaC, and GitOps create systems that can heal themselves.
-
Layer 2: SLOs as Control Loops — From Metrics to Governance
Transform SLOs from reporting metrics into active control signals. Learn how SLO violations drive automatic decisions about deployments, scaling, and resource allocation.
-
Layer 1: Runbook Automation — From Documentation to Execution
Automate known failures with runbook automation. Learn how to design safe, reliable automated remediation that eliminates toil while maintaining human oversight.
-
Layer 0: Observability — The Foundation of Autonomous Operations
Observability is not monitoring. It's the architectural discipline that makes autonomous operations possible. Learn how to build observability with OpenTelemetry, Prometheus, ELK, and Grafana.
-
The Path to Zero-Touch Operations: An Architecture, Not a Product
Zero-touch operations isn't a tool you buy—it's an architectural destination requiring five interdependent layers. Most organizations attempt them in the wrong order.
-
The 10% Engineer: Why AIOps Makes Site Reliability Talent More Critical, Not Less
When AI handles 90% of operational events, the remaining 10% becomes exponentially more consequential—and so do the engineers responsible for it.
-
The End of the Traditional Software Engineer: The Rise of Human-AI Engineers
The traditional software engineer role is disappearing. In its place: Human-AI Engineers who co-create with AI—directing, reviewing, and architecting alongside machine intelligence.
-
From Factory Floors to AI Pipelines: What Software Can Learn from Manufacturing
Manufacturing's industrialization took two centuries. Software's AI transformation is happening in years. The parallels—and the lessons—are worth paying attention to.
-
First Principles Thinking in Software Engineering
Most software decisions are made by analogy: "other companies do it this way." First principles thinking strips away assumptions and rebuilds from fundamental truths.