White-box Fuzzing RPC-based APIs with EvoMaster: An Industrial Case Study
Peer reviewed, Journal article
MetadataShow full item record
Remote Procedure Call (RPC) is a communication protocol to support client-server interactions among services over a network. RPC is widely applied in industry for building large-scale distributed systems, such as Microservices. Modern RPC frameworks include, for example, Thrift, gRPC, SOFARPC, and Dubbo. Testing such systems using RPC communications is very challenging, due to the complexity of distributed systems and various RPC frameworks the system could employ. To the best of our knowledge, there does not exist any tool or solution that could enable automated testing of modern RPC-based services. To fill this gap, in this article we propose the first approach in the literature, together with an open source tool, for fuzzing modern RPC-based APIs. The approach is in the context of white-box testing with search-based techniques. To tackle schema extraction of various RPC frameworks, we formulate a RPC schema specification along with a parser that allows the extraction from source code of any JVM RPC-based APIs. Then, with the extracted schema we employ a search to produce tests by maximizing white-box heuristics and newly defined heuristics spe- cific to the RPC domain. We built our approach as an extension to an open source fuzzer (i.e., EvoMaster), and the approach has been integrated into a real industrial pipeline that could be applied to a real industrial development process for fuzzing RPC-based APIs. To assess our novel approach, we conducted an empirical study with two artificial and four industrial web services selected by our industrial partner. In addition, to further demonstrate its effectiveness and application in industrial settings, we report results of employing our tool for fuzzing another 50 industrial APIs autonomously conducted by our industrial partner in their test- ing processes. Results show that our novel approach is capable of enabling automated test case generation for industrial RPC-based APIs (i.e., 2 artificial and 54 industrial). We also compared with a simple gray-box technique and existing manually written tests. Our white-box solution achieves significant improvements on code coverage. Regarding fault detection, by conducting a careful review with our industrial partner of the tests generated by our novel approach in the selected four industrial APIs, a total of 41 real faults were identified, which have now been fixed. Another 8,377 detected faults are currently under investigation.