Shepherd - High-Precision Coverage Inference for Response-guided Blackbox Fuzzing (Registered Report)

Published in ISSTA Companion 2025 - 34th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2025

In recent years, fuzzing has gained attention as a primary means of detecting vulnerabilities early. Coverage-based greybox fuzzing uses internal coverage information to explore programs efficiently, but it is difficult to apply in restricted environments where the program cannot be instrumented, such as firmware or smartphone applications. Blackbox fuzzing, in contrast, requires no runtime information and is therefore more widely applicable, but it is less efficient because coverage cannot be measured. As a result, there is growing demand for methods that approximate coverage in blackbox environments to optimize fuzzing. One existing study proposes estimating coverage from the relationship between a program's responses and the strings embedded in its binary. However, that approach has two weaknesses: its matching algorithm is ambiguous, and a single string may be shared by multiple basic blocks, so the inferred block is not unique. Both lead to frequent misestimation. In this research, we propose Shepherd, a new coverage inference method that combines high-precision string matching with context analysis to resolve these problems. Experimental results show that Shepherd significantly improves estimation accuracy compared to the existing approach.
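
To make the non-uniqueness problem concrete, the following is a minimal sketch of response-based coverage estimation in general. It is not Shepherd's algorithm, nor the exact method of the existing study. The string table, the block addresses, and the function name `infer_coverage` are all hypothetical, and the naive substring match stands in for whatever matching a real tool would use.

```python
from typing import Dict, Set, Tuple

# Hypothetical data: string literals found in a binary, mapped to the basic
# blocks that reference them. Names and addresses are illustrative only.
STRING_REFS: Dict[str, Set[int]] = {
    "login ok": {0x401000},
    "invalid password": {0x401040},
    "error": {0x401080, 0x4010C0},  # shared by two blocks: non-unique
}


def infer_coverage(
    response: str, refs: Dict[str, Set[int]]
) -> Tuple[Set[int], Set[int]]:
    """Return (blocks matched via a uniquely referenced string,
    candidate blocks matched via a string shared by several blocks)."""
    unique: Set[int] = set()
    ambiguous: Set[int] = set()
    for literal, blocks in refs.items():
        if literal in response:  # naive substring match (assumption)
            if len(blocks) == 1:
                unique |= blocks
            else:
                ambiguous |= blocks
    return unique, ambiguous


unique, ambiguous = infer_coverage("error: invalid password", STRING_REFS)
print(sorted(hex(b) for b in unique))
print(sorted(hex(b) for b in ambiguous))
```

In this toy setting the response identifies one block through a uniquely referenced string, while the shared string "error" only narrows the result to two candidate blocks. Without additional context, there is no way to tell which of the two was actually executed.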