Evaluating Design Conformance Through Trace Comparison
TLDR
This paper introduces a method to evaluate distributed system design conformance by comparing OpenTelemetry traces to design models, providing a quantitative metric.
Key contributions
- Addresses design-implementation divergence in distributed systems using a novel conformance checking approach.
- Leverages process mining's conformance checking by comparing OpenTelemetry application traces to design traces.
- Provides a quantitative conformance percentage metric to track implementation adherence to design over time.
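To make the metric concrete, here is a minimal sketch of one way a conformance percentage could be computed. This summary does not describe the paper's actual algorithm, so the matching rule is an assumption: a trace conforms when its set of parent-to-child call edges equals that of some design trace. The span fields `span_id`, `parent_id`, and `name` are simplified stand-ins for illustration, not the OpenTelemetry SDK's data model, and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Hypothetical, simplified span record: {"span_id", "parent_id", "name"}.
Span = Dict[str, Optional[str]]
Edge = Tuple[str, str]


def call_edges(trace: List[Span]) -> FrozenSet[Edge]:
    """Reduce a trace to its set of (parent name, child name) call edges."""
    names = {s["span_id"]: s["name"] for s in trace}
    return frozenset(
        (names[s["parent_id"]], s["name"])
        for s in trace
        # Skip root spans and spans whose parent is missing from the trace.
        if s["parent_id"] is not None and s["parent_id"] in names
    )


def conformance_percentage(
    design_traces: List[List[Span]], app_traces: List[List[Span]]
) -> float:
    """Percentage of application traces whose edges match a design trace.

    Assumed rule (not taken from the paper): exact edge-set equality.
    """
    if not app_traces:
        raise ValueError("at least one application trace is required")
    allowed = {call_edges(t) for t in design_traces}
    conforming = sum(1 for t in app_traces if call_edges(t) in allowed)
    return 100.0 * conforming / len(app_traces)


# Design: checkout calls payment and inventory directly.
design = [[
    {"span_id": "1", "parent_id": None, "name": "checkout"},
    {"span_id": "2", "parent_id": "1", "name": "payment"},
    {"span_id": "3", "parent_id": "1", "name": "inventory"},
]]

# Observed: one trace follows the design, one has payment calling inventory.
observed = [
    [
        {"span_id": "a", "parent_id": None, "name": "checkout"},
        {"span_id": "b", "parent_id": "a", "name": "payment"},
        {"span_id": "c", "parent_id": "a", "name": "inventory"},
    ],
    [
        {"span_id": "x", "parent_id": None, "name": "checkout"},
        {"span_id": "y", "parent_id": "x", "name": "payment"},
        {"span_id": "z", "parent_id": "y", "name": "inventory"},
    ],
]

print(conformance_percentage(design, observed))  # 50.0
```

Exact edge-set equality is deliberately strict. A real conformance checker would likely tolerate partial matches or weigh only the design attributes considered key, which is a choice this sketch leaves out.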
Why it matters
Implementations often drift from their original designs over time, and the system loses coherence as a result. This paper offers a practical, quantitative method, built on the industry-standard OpenTelemetry, for tracking how closely a distributed system adheres to its intended design principles.
Original Abstract
The design of a system and its implementation are two tasks often carried out by different individuals on a development team, and can occur weeks or months apart. This creates a potential for divergence between real behavior and the designed model that an implementation is intended to match. Particularly as time passes and individuals who were present for the original conception of the design leave, a system can lose coherence and drift from intended design principles. Even with a robust system design, more is needed to ensure that the key implementation details match the design and that adherence to a particular strategy is not lost over time. This paper proposes an approach to address that concern for distributed systems using conformance checking, a methodology borrowed from process mining. Distributed traces produced by instrumented applications are evaluated for conformance by comparison to design traces. The resulting conformance percentage is a quantitative metric that can be tracked over time to determine how closely a concrete implementation corresponds to the key attributes of the expected design model. This analysis is done using the dominant industry standard, OpenTelemetry, and so should apply to a wide range of distributed systems.