8 September 2023 – Improving quality of OR data sets

Presented by Dr. Inci Yüksel-Ergün, Zuse Institute Berlin (ZIB)

Data is ubiquitous in the age of analytics. The reliability of decisions based on OR studies depends on the quality of the underlying data. However, identifying pertinent data and assessing its quality is challenging. Modeling complex decisions requires highly connected and consistent real-world data sets. When disruptive changes render expert knowledge obsolete, we need more complex models to understand the impacts of these changes.

In projects with industry that use highly connected data, we encountered several cases where our analysis detected data errors that were too complex for humans to understand. Examples from our analyses include irreducible infeasible subsystems (IIS) of large mixed-integer programs (MIP) and bottlenecks in highly nonlinear networks. While detecting such errors is a significant achievement, removing them is extremely difficult.

In this presentation, we highlight our insights on data quality improvement. We report results obtained on data from the German high-pressure gas transport network using methods from data preprocessing and mathematical optimization.


Past and planned activities of the EURO Practitioners’ Forum are open to Forum members as well as the wider public.

Visit the website and register as a member for free to get regular updates on all activities: EPF Member registration page. Recordings and details of previous webinars are also available on this website.

Follow the Forum on Twitter and LinkedIn, and feel free to get in touch.