Debugging Inputs

Lukas Kirschner, Ezekiel Soremekun, Andreas Zeller

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


When a program fails to process an input, it need not be the program code that is at fault. It can also be that the input data is faulty, for instance as a result of data corruption. To get the data processed, one then has to debug the input data, that is, (1) identify which parts of the input data prevent processing, and (2) recover as much of the (valuable) input data as possible. In this paper, we present a general-purpose algorithm called ddmax that addresses these problems automatically. Through experiments, ddmax maximizes the subset of the input that can still be processed by the program, thus recovering and repairing as much data as possible; the difference between the original failing input and the "maximized" passing input includes all input fragments that could not be processed. To the best of our knowledge, ddmax is the first approach that fixes faults in the input data without requiring program analysis. In our evaluation, ddmax repaired about 69% of input files and recovered about 78% of data within one minute per input.
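The idea of maximizing a passing subset through repeated experiments can be illustrated with a small sketch. This is not the authors' implementation: the function name `ddmax_sketch`, the `passes` callback, and the character-level granularity are assumptions made for illustration, and the loop is a simplified reading of a maximizing delta-debugging search. It assumes the empty input is accepted by the program under test.

```python
from typing import Callable, List, Set


def ddmax_sketch(failing: str, passes: Callable[[str], bool]) -> str:
    """Grow a passing subset of `failing` by running experiments.

    `passes` is a hypothetical oracle: it returns True when the program
    under test can process the given input. The empty input is assumed
    to pass.
    """
    if passes(failing):
        return failing  # nothing to repair

    all_idx: Set[int] = set(range(len(failing)))
    kept: Set[int] = set()  # indices known to form a passing input
    n = 2                   # granularity: number of chunks per round

    def build(indices: Set[int]) -> str:
        return "".join(failing[i] for i in sorted(indices))

    while True:
        remaining: List[int] = sorted(all_idx - kept)
        if len(remaining) <= 1:
            break  # adding the last piece would rebuild the failing input
        n = min(n, len(remaining))
        size = -(-len(remaining) // n)  # ceiling division
        chunks = [remaining[k:k + size] for k in range(0, len(remaining), size)]

        progressed = False
        # Step 1: try the whole input minus one chunk (largest candidates).
        for chunk in chunks:
            candidate = all_idx - set(chunk)
            if passes(build(candidate)):
                kept, n, progressed = candidate, 2, True
                break
        # Step 2: otherwise try adding a single chunk to the passing subset.
        if not progressed:
            for chunk in chunks:
                candidate = kept | set(chunk)
                if passes(build(candidate)):
                    kept, n, progressed = candidate, max(n - 1, 2), True
                    break
        # Step 3: otherwise split the remaining input into finer chunks.
        if not progressed:
            if n < len(remaining):
                n = min(len(remaining), 2 * n)
            else:
                break  # single characters tried; no further growth possible

    return build(kept)


# Toy oracle (assumption): the "program" rejects any input containing '#'.
repaired = ddmax_sketch("ab#cd#ef", lambda s: "#" not in s)
```

In this toy run the characters dropped from the failing input (the two `#` symbols) correspond to the fragments that could not be processed, and the returned string is the recovered data. A real oracle would invoke the program under test on each candidate input.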
Original language: Undefined/Unknown
Title of host publication: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Number of pages: 12
ISBN (Print): 9781450371216
Publication status: Published - 1 Oct 2020

Publication series

Name: ICSE '20
Publisher: Association for Computing Machinery
