Software Blog

Distance-based conflict analysis

by Simon Sigl

What to Do When an AI Doesn't Learn Satisfactorily?

Why do some AI models fail in critical situations despite following best practices? And what can be done when no model provides satisfactory results?

Let's assume we want to solve a regression or classification problem using machine learning. We have followed best practices: we understand the problem well, have sufficient and clean data, relevant input features, carefully created labels, an appropriate architecture, and suitable training methods and parameters. The overall performance is good, but there are specific cases where none of the trained models correctly predict the target parameter.

In the following example, machine learning models estimate the criticality of a driving scenario. None of the color-coded model outputs match the labeled target (black); in the time range between 0.4 and 1.4 seconds, all models significantly overestimate the criticality.

At this point, machine learning expertise alone is no longer enough—we need to dive deeper into engineering. But before that, let's take a step back.

What Is This About?

From an (embedded cyber-physical) system, we expect deterministic behavior: the same inputs (x) should lead to the same outputs (y). Mathematically, a deterministic system can be modeled as a function: f(x) = y.

In many cases, we even expect strong causality: small changes in the initial state should result in only small changes in the outcome, meaning x₁ ≈ x₂ ⇒ f(x₁) ≈ f(x₂). Within the measurement accuracy and reproducibility of experiments, consistent system behavior is generally required. While we expect deterministic behavior in cyber-physical systems, the real challenge often lies in inconsistent labeling and conflicting requirements.
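Written as a quick property check, this expectation already hints at the problem. The sketch below is illustrative only: the function f, the perturbation size input_eps, and the tolerance output_tol are assumptions for demonstration, not part of any ANDATA tooling.

```python
# Minimal property check of the strong-causality expectation: for nearly
# identical inputs, a deterministic system f should produce nearly identical
# outputs. f, input_eps and output_tol are illustrative placeholders.
import numpy as np

def check_strong_causality(f, X, input_eps=1e-3, output_tol=1e-2, trials=200, seed=0):
    """Perturb random samples from X by at most input_eps per feature and
    report whether the largest observed output change stays below output_tol."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    worst = 0.0
    for _ in range(trials):
        x1 = X[rng.integers(len(X))]
        x2 = x1 + rng.uniform(-input_eps, input_eps, size=x1.shape)
        worst = max(worst, float(abs(f(x1) - f(x2))))
    return worst <= output_tol, worst

# Toy example: a smooth function easily satisfies the check.
def toy_model(x):
    return float(np.sum(x ** 2))

samples = np.random.default_rng(1).uniform(size=(50, 3))
print(check_strong_causality(toy_model, samples))
```

If the labels themselves demand clearly different outputs for inputs that are this close, no function, whether analytical or learned, can pass such a check; the contradiction sits in the requirements, not in the model.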

Distance-based conflict analysis helps to uncover these contradictions. In the training dataset, pairs of samples are identified that have different labels—meaning different model outputs are expected—yet have very small distances between their input vectors.

Such samples represent a requirement conflict: when sensor data is nearly indistinguishable, different outputs are still expected. No algorithm can fulfill this requirement, whether it is based on analytical equations or machine learning.

For example, in autonomous driving, rain can cause similar sensor readings for different road conditions. If one scenario is labeled as ‘safe’ and another as ‘critical,’ this creates a requirement conflict that needs resolution.
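Picking up that example, the detection step itself can be sketched in a few lines of Python. This is a minimal sketch, not ANDATA's implementation: the min-max feature scaling, the Euclidean metric, and the threshold eps are assumptions that have to be chosen per application, for instance based on sensor accuracy.

```python
# Sketch of distance-based conflict analysis: find sample pairs whose inputs
# are nearly indistinguishable but whose labels differ.
import numpy as np
from scipy.spatial.distance import pdist, squareform

def find_requirement_conflicts(X, y, eps=0.05):
    """Return index pairs (i, j, distance) whose scaled input vectors are
    closer than eps but whose labels differ, i.e. candidate requirement conflicts."""
    X = np.asarray(X, dtype=float)
    # Min-max scale each feature so that no single physical unit dominates the distance.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    X_scaled = (X - X.min(axis=0)) / span

    dist = squareform(pdist(X_scaled, metric="euclidean"))
    conflicts = [
        (i, j, float(dist[i, j]))
        for i in range(len(y))
        for j in range(i + 1, len(y))
        if dist[i, j] < eps and y[i] != y[j]
    ]
    # Closest pairs first: these are the hardest contradictions to explain away.
    return sorted(conflicts, key=lambda c: c[2])

# Two nearly indistinguishable sensor vectors labeled differently -> one conflict.
X = [[0.90, 1.20], [0.91, 1.20], [3.00, 0.40]]
y = ["critical", "safe", "safe"]
print(find_requirement_conflicts(X, y))  # reports the conflicting pair (0, 1)
```

Sorting by distance surfaces the most indistinguishable pairs first; for regression labels, the comparison y[i] != y[j] would be replaced by a threshold on the label difference.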

Managing Requirement Conflicts

Requirement conflicts are unsolvable within the function f itself. To resolve a conflict, a system developer has exactly three possible options:

  1. Modify or extend inputs (x): Improve parameterization or incorporate additional sensors or information so that input vectors x₁ and x₂ become distinguishable.

  2. Adjust the required output (y): Accept "incorrect" system behavior in one of the samples and redefine the corresponding labels to ensure consistency.

  3. Reduce the validity range: Exclude x₁, x₂, or both from the system’s Operational Design Domain (ODD). For example, the system could be disabled in rainy conditions or at low speeds.

Once a conflict is successfully resolved, it is advisable to continue iterating until no further requirement conflicts can be detected. The result of this process is a dataset that provides an example-based and contradiction-free representation of the system’s functional requirements.
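To illustrate this workflow rather than prescribe it, the sketch below reuses the hypothetical find_requirement_conflicts from above and loops until no conflicts remain. The resolve_conflict placeholder stands in for the reviewed engineering decision behind the three options; it only shows the data-level effect of options 2 and 3 (relabel or exclude a sample), since option 1 would extend the input vectors themselves.

```python
# Hypothetical sketch of the iterative workflow: detect conflicts, resolve the
# closest pair, repeat until the dataset is contradiction-free.

def resolve_conflict(X, y, i, j):
    """Placeholder for an engineering decision. Here: drop the second sample
    of the conflicting pair (option 3); relabeling y[i] or y[j] would be
    option 2, and extending the feature vectors would be option 1."""
    keep = [k for k in range(len(y)) if k != j]
    return [X[k] for k in keep], [y[k] for k in keep]

def iterate_until_consistent(X, y, eps=0.05, max_rounds=1000):
    for _ in range(max_rounds):
        conflicts = find_requirement_conflicts(X, y, eps)
        if not conflicts:
            return X, y            # example-based, contradiction-free requirements
        i, j, _ = conflicts[0]     # start with the closest, most severe pair
        X, y = resolve_conflict(X, y, i, j)
    raise RuntimeError("requirement conflicts remain after max_rounds iterations")
```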

Of course, each of these three options has implications—such as manufacturing costs, potential incorrect system behavior, or operational usefulness. Therefore, these decisions must be made consciously and based on evidence.

Both conflict analysis and impact assessment are integral to active requirement conflict management, which is available within ANDATA’s Scenario Management as a Service (SMaaS).

Tool Support

Conflict analysis has been a long-standing feature of ANDATA software, but it has now been completely redesigned. Users can configure conflict analysis more intuitively, run calculations significantly faster, and gain deeper insights through improved visualizations.

The updated distance-based conflict analysis will be included in the next spring release of the ANDATA tools, available in STIPULATOR and BRAINER.
