Preserving Reactiveness: Understanding and Improving the Debugging Practice of Blocking-Call Bugs

Arooba Shahoor, Jooyong Yi, Dongsun Kim

Kyungpook National University, UNIST, Korea University

Understanding and Improving the Debugging Practice of Blocking-Call Bugs

Automated Program Repair (APR)

  • Most work focuses on repairing functional bugs -- e.g., fixing the bugs in Defects4J

How do we fix non-functional bugs?

Our Previous Work

LeakPair: Proactive Repairing of Memory Leaks in Single Page Web Applications (ASE'23)

  • Target software: SPAs written using Angular and React
  • Target bug type: Memory leaks
[Figure: multiple-page applications vs. single-page applications]

This Work

  • Target software: Reactive Java programs using Reactor, RxJava, and Vert.x
  • Non-reactive programs: synchronous
  • Reactive programs: event-driven, asynchronous

Reactive Programs Are Getting Popular

  • Web, mobile, and IoT applications
    • High throughput
    • Low latency

Reactive Program Example

InstrumentedPool<? extends Channel> newPool =                                                    
    PoolBuilder.from(connectionMono
      .flatMap(this::openChannel))
      .sizeBetween(1, configuration.maxChannel)
      .evictionPredicate((channel, metadata) -> {
          if (metadata.idleTime() > Duration.ofSeconds(30).toMillis()) {
             return true;
          }
          return false;
      })
      .destroyHandler(channel -> Mono.fromRunnable(Throwing.runnable(() -> {
          if (channel.isOpen()) {
             try {
                channel.close();
             } catch (ShutdownSignalException e) {}
          }
      })))
      .buildPool();

Most Common Bug Type in Reactive Programs?

  • We investigated 29 open-source GitHub projects that
    • use popular reactive libraries (Reactor, RxJava, Vert.x)
    • are well-maintained
    • have enough contributors (>= 10) and commits (>= 100)
    • have more than 10 stars, watches, or forks

Blocking-Call Bugs Are the Most Common

  • Out of the 29 open-source GitHub projects
    • 189 Reactiveness bugs (in comparison, 667 issues for NPE)
    • 103 Blocking-call bugs (103 / 189 = 54%)
  • Blocking-call bugs
    • Bugs that block the execution of the program to wait for time-consuming operations such as I/O operations to finish

Blocking-Call Bug Example

InstrumentedPool<? extends Channel> newPool =                                                                
    PoolBuilder.from(connectionMono
      .flatMap(this::openChannel))
      .sizeBetween(1, configuration.maxChannel)
      .evictionPredicate((channel, metadata) -> {
          if (metadata.idleTime() > Duration.ofSeconds(30).toMillis()) {
             return true;
          }
          return false;
      })
      .destroyHandler(channel -> Mono.fromRunnable(Throwing.runnable(() -> {
          if (channel.isOpen()) {
             try {
                channel.close(); // can block
             } catch (ShutdownSignalException e) {}
          }
      })))
      .buildPool();

Developers Often Ignore Blocking-Call Bugs

  • About 40% (42/103) of the reported blocking-call bugs remain unfixed.

Why Are Blocking-Call Bugs Ignored?

  • "I'm still not able to test it successfully."
  • "Can you provide a reproducible sample?"
  • "Can you make a memory dump at that moment?"

Insufficient information to accept the patch!

What We Learned

  1. Blocking-call bugs are common.
  2. Developers often leave blocking-call bugs unfixed.
  3. It is likely that developers need concrete evidence to accept patches for blocking-call bugs.

What We Did

  1. Collected the developer-written patches for the blocking-call bugs we found.
  2. Extracted five common patch patterns.
  3. Applied the fix patterns to unresolved blocking-call bugs.
  4. Tested the following hypothesis:
    • Developers are more likely to accept patches when they are accompanied by improvement evidence than when they are not.

Improvement Evidence

Generated by Java Flight Recorder (JFR), a code profiling tool

To include, or not to include performance results?

  1. We generated 30 patches for the unresolved blocking-call bugs across the 29 open-source projects we investigated.
  2. We randomly assigned 15 patches to include performance results and the other 15 to exclude performance results.

To include, or not to include performance results?

| | Accepted | Rejected | Ignored | Total |
|---|---|---|---|---|
| w/ perf. results | 8 (53.3%) | 2 (13.3%) | 5 (33.3%) | 15 (100%) |
| w/o perf. results | 3 (20.0%) | 4 (26.7%) | 8 (53.3%) | 15 (100%) |

  • Accepted vs. Rejected + Ignored
  • p-value < 0.01 (Barnard’s exact test)

Common Fix Patterns

  • FP1. Offloading to Separate Worker Threads
  • FP2. Lazy Method Call
  • FP3. Reactive Filtering
  • FP4. Non-blocking Chaining
  • FP5. Non-blocking Subscription

Why Not Use an LLM?

  • To crack a nut, we do not need a sledgehammer. A nutcracker would be enough.
  • To fix a specific type of bug such as blocking-call bugs, a simple pattern-based repair can be sufficiently effective.

Common Fix Patterns

FP1. Offloading to Separate Worker Threads

- 𝐸
+ 𝐸.subscribeOn(Schedulers.boundedElastic())
  • 𝜏(𝐸) <: Mono | Flux
  • 𝐸 involves a blocking operation
  • In the reactive pipeline to which 𝐸 belongs, subscribeOn is not invoked.

FP1: Offloading to Separate Worker Threads

InstrumentedPool<? extends Channel> newPool =                                                                            
    PoolBuilder.from(connectionMono
      .flatMap(this::openChannel))
      .sizeBetween(1, configuration.maxChannel)
      .evictionPredicate((channel, metadata) -> {
          if (metadata.idleTime() > Duration.ofSeconds(30).toMillis()) {
             return true;
          }
          return false;
      })
      .destroyHandler(channel -> Mono.fromRunnable(Throwing.runnable(() -> {
          if (channel.isOpen()) {
             try {
                channel.close(); // can block
             } catch (ShutdownSignalException e) {}
          }
      }))
+     .subscribeOn(Schedulers.boundedElastic())) // Added
      .buildPool();

Common Fix Patterns

FP2. Lazy Method Call

- Mono.just(𝐸) // eager evaluation of 𝐸
+ Mono.fromCallable(() -> 𝐸) // lazy evaluation of 𝐸
  • 𝐸 involves a blocking operation
  • Used in combination with FP1
- Mono.just(getSomething())
-     .subscribe(e -> doSomething(e));
+ Mono.fromCallable(() -> getSomething()) // FP2
+     .subscribeOn(Schedulers.boundedElastic()) // FP1
+     .subscribe(e -> doSomething(e));

Experimental Results

Apache James

[Plots: CPU usage, heap usage, latency, and memory usage before and after the fix]
  • No regression error found

Experimental Results

Vert.x Micrometer Metrics

[Plots: CPU usage, heap usage, latency, and memory usage before and after the fix]
  • No regression error found

Experimental Results

Vert.x Kafka Client

[Plots: CPU usage, heap usage, latency, and memory usage before and after the fix]
  • No regression error found

Takeaways

  1. Non-functional bugs, such as memory leaks and blocking-call bugs, also need to be fixed.
  2. Simple pattern-based repair can be sufficiently effective in fixing certain types of non-functional bugs.
  3. Providing concrete evidence that shows the effectiveness of a patch can increase the chance of the patch being accepted.

Takeaways

  1. Non-functional bugs also need to be fixed.
    • As a community, how much do we work on fixing non-functional bugs?
  2. Simple pattern-based repair can be sufficiently effective in fixing certain types of non-functional bugs.
    • As a community, how much do we explore the possibility of using simple yet effective repair methods?
  3. Providing concrete evidence that shows the effectiveness of a patch can increase the chance of the patch being accepted.
    • As a community, how much do we work on providing concrete evidence to support generated patches?

Before I begin, I would like to acknowledge that the main credit for this work goes to Arooba Shahoor, a former student of Dongsun Kim. Arooba has recently graduated, and Dongsun has since moved to Korea University. Alright, that was a brief update on the authors. This is Jooyong Yi from UNIST. I am going to tell you something about ...

I am aware that we are in the Debugging session. This is quite apparent when looking at who is chairing this session.

- Failing test → Passing test

But I am not going to tell you about another LLM-based APR tool evaluated on Defects4J, which contains functional bugs. Instead, I am going to ask the following question.

![bg left:40% fit](./img/spa-vs-mpa-architecture.webp)

In fact, we already posed the same question in our previous work presented at ASE 2023. The title of the paper is "LeakPair: Proactive Repairing of Memory Leaks in Single Page Web Applications." The work was well received and earned a Distinguished Paper Award. In that work, we proposed a technique to fix memory leaks in single-page web applications. In traditional multiple-page applications, the browser reloads the entire page when the webpage needs to be updated, so memory leaks are less of a concern. However, single-page applications update only the necessary parts of the page, and hence memory leaks can pile up if the program is not properly written.

In this work, we consider reactive Java programs written using reactive libraries such as Reactor, RxJava, and Vert.x. In traditional programs, let's say web applications, each user request is typically handled by a separate thread. If the user request involves time-consuming operations such as a database access, the corresponding thread is blocked until the operation completes. This can result in a waste of computing resources. For example, a new user request might not be processed, even if there are some idle threads simply waiting for their blocked operations to finish. Several reactive libraries such as Reactor and RxJava have been developed to address this issue. Essentially, those libraries support event-driven asynchronous programming. Using those libraries, developers can more easily write non-blocking applications that better utilize computing resources.

Because of such benefits, reactive programming is becoming increasingly popular in various domains such as web, mobile, and IoT applications. Those are the domains where high throughput and low latency are important.

Here is a concrete example of a reactive program taken from the James project, an open-source email server. This example is written using the Reactor library. At the end of this code, a pool of channels is built. And these lines of code specify how a channel is opened and destroyed. Here, the variable connectionMono is of type Mono<Connection>. Mono is a type in Reactor that represents a data stream containing at most a single value. Upon a triggering event, such as 'acquire', the connectionMono emits the value it holds, namely, a Connection object. Then, this emitted object is passed to the 'openChannel' method by this specification using 'flatMap'. The call of the openChannel method returns a value of type Mono<Channel> holding a 'Channel' object. This example also specifies that if a channel is idle for more than 30 seconds, it should be evicted. When a channel is evicted, the destroyHandler's callback function is called and channel.close() is executed.

So, what is the most common bug type in reactive programs? To answer this question, we investigated 29 open-source GitHub projects that meet these criteria. They use popular reactive libraries such as Reactor, RxJava, and Vert.x. They are well-maintained. They have enough contributors and commits. And they have more than 10 stars, watches, or forks.

We found that blocking-call bugs are the most common bug type in reactive programs. We looked into 189 bug reports related to reactivity and found that more than half of them are blocking-call bugs. Blocking-call bugs are those that block the execution of the program to wait for time-consuming operations such as I/O operations to finish.

Here is a concrete example of a blocking-call bug. This is the same code snippet shown earlier. Suppose we have a channel that has been idle for more than 30 seconds. Then, this channel will be closed by executing channel.close(). Let's also assume that we receive a new request to open a channel. However, the new channel may not open immediately because the close operation, which can take a long time, may block the thread from opening the new channel.

In addition to the finding that blocking-call bugs are the most common in reactive programs, we also found that developers often ignore them. We found that about 40% of the reported blocking-call bugs remain unfixed.

To see why developers often ignore blocking-call bugs, we looked into the comments of the bug reports. Developers mentioned the following.

From the bug patching perspective, developers are essentially indicating that there is insufficient information to accept the patch.

Let's now summarize what we have learned from our investigation of the 29 open-source projects.

Based on these findings, we did the following for the second part of our work.

So, what kind of improvement evidence did we use?

We generated improvement evidence using Java Flight Recorder, a code profiling tool. This tool provides information about CPU usage and heap usage over time. The generated report also shows thread activity, where the green section indicates that the thread is running, the red section indicates that the thread is blocked, and the grey section indicates that the thread is waiting. The upper and lower screenshots show the results before and after applying the fix, respectively. We ran the same test for both cases. The CPU usage at the peak dropped from 92% to 85.6%. We made a pull request with this report to the Apache James project. And two days later, the developer accepted the patch without any further questions. But, is this just a coincidence?

To include, or not to include performance results, that is the question we asked.

To answer that question, we did the following. First, we generated 30 patches across the 29 open-source projects we investigated. I will explain a bit later how we generated those patches. Once we obtained the 30 patches, we randomly assigned 15 patches to include performance results and the other 15 to exclude performance results.

This table shows the results. When performance results were included, the developers accepted the patches in more than half of the cases (8 of 15). They rejected the patches in only two cases. However, when performance results were not included, the picture was quite different. The developers accepted the patches in only three cases; they rejected four and ignored the remaining eight, which is more than half of the cases.

We performed Barnard's exact test and found that the difference between these two groups is statistically significant.
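The percentages in the table follow directly from the counts. Here is a minimal Java sketch that recomputes them, grouping rejected and ignored patches together as the test does. The class and method names are made up for illustration and are not part of our artifacts, and the sketch does not implement Barnard's test itself.

```java
import java.util.Locale;

/** Illustrative only: recomputes the acceptance rates from the counts in the table. */
final class AcceptanceRates {

    /** Formats part/total as a percentage with one decimal place. */
    static String percent(int part, int total) {
        return String.format(Locale.ROOT, "%.1f%%", 100.0 * part / total);
    }

    /** Prints the accepted vs. not-accepted split for one group of patches. */
    static void report(String label, int accepted, int rejected, int ignored) {
        int notAccepted = rejected + ignored; // the test compares accepted vs. rejected + ignored
        int total = accepted + notAccepted;
        System.out.println(label + ": accepted " + percent(accepted, total)
                + ", not accepted " + percent(notAccepted, total));
    }

    public static void main(String[] args) {
        // Counts taken from the table above.
        report("w/ perf. results", 8, 2, 5);
        report("w/o perf. results", 3, 4, 8);
    }
}
```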

We generate patches using a pattern-based approach. We extracted five common fix patterns from the collected patches.

I know what you are thinking. Why do we not use an LLM to generate patches for blocking-call bugs? An LLM might work.

However, to crack a nut, we do not need a sledgehammer. A nutcracker would be enough.

Alright. This is the first fix pattern. This pattern replaces an expression E of type Mono or Flux with this expression. Both Mono and Flux are types in Reactor that represent a data stream. The difference between Mono and Flux is that Mono contains at most one value, whereas Flux can contain multiple values. This new expression specifies that the subscription to the data stream E should be conducted on a separate worker thread maintained by the boundedElastic scheduler.

![bg right:25% fit](./img/offloaded.png)

This fix pattern matches our running example. After applying this fix pattern, the close operation is offloaded to a separate worker thread, so the main thread can open a new channel without being blocked.
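To make the effect of offloading concrete without pulling in Reactor, here is a rough JDK-only analogy. It is not the James code and not Reactor's API: a single-threaded executor stands in for the thread that serves requests, a second executor stands in for the boundedElastic scheduler, and the slow close is simulated by a latch wait. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** JDK-only analogy of FP1; the executors are stand-ins, not Reactor schedulers. */
final class OffloadSketch {

    /** Issues a slow "close" followed by an "open" and returns the order in which they were recorded. */
    static List<String> run(boolean offload) {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor(); // serves requests
        ExecutorService workers = Executors.newCachedThreadPool();       // plays boundedElastic()
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch opened = new CountDownLatch(1);
        CountDownLatch closed = new CountDownLatch(1);

        // Simulated slow close: waits until "open" has happened, or gives up after 500 ms.
        Runnable slowClose = () -> {
            try {
                opened.await(500, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            events.add("close done");
            closed.countDown();
        };
        // Before the fix the close runs on the event loop; after it, on a worker thread.
        Runnable closeRequest = offload ? () -> workers.execute(slowClose) : slowClose;

        eventLoop.execute(closeRequest);
        eventLoop.execute(() -> {
            events.add("open");
            opened.countDown();
        });

        try {
            closed.await(5, TimeUnit.SECONDS);
            eventLoop.shutdown();
            eventLoop.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            eventLoop.shutdown();
            workers.shutdown();
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println("blocking:  " + run(false));
        System.out.println("offloaded: " + run(true));
    }
}
```

In the blocking run, "open" is recorded only after the close has finished, because both share one thread. In the offloaded run, "open" is recorded first, while the close is still pending on a worker thread.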

Now, let's look at the second fix pattern. Suppose E here involves a blocking operation such as I/O tasks. The 'just' operator here returns a Mono object containing the value of E. The difference between these two expressions is whether E is evaluated eagerly or lazily. This fix pattern is often used in combination with the first fix pattern. Note here that the 'subscribeOn' operator is added by the first fix pattern.

This slide explains the difference between before and after applying the fix pattern. Before the fix, the 'just' operator eagerly executes the getSomething method, which can block the thread. After the fix, the getSomething method is lazily executed in a separate worker thread. The main thread can be used to do other tasks without being blocked.
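The eager-versus-lazy distinction can also be shown with plain JDK classes. The sketch below is only an analogy, with CompletableFuture standing in for Mono and a hypothetical getSomething method that simply reports which thread executed it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** JDK-only analogy of FP2 combined with FP1; CompletableFuture stands in for Mono. */
final class LazyCallSketch {

    /** Hypothetical stand-in for a blocking call; returns the name of the executing thread. */
    static String getSomething() {
        return Thread.currentThread().getName();
    }

    /** Eager, like Mono.just(getSomething()): the argument is evaluated by the calling thread. */
    static String eager() {
        return CompletableFuture.completedFuture(getSomething()).join();
    }

    /** Lazy and offloaded: the call is wrapped in a lambda and executed by a worker thread. */
    static String lazyOffloaded() {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
        try {
            // join() is used here only to fetch the result for printing.
            return CompletableFuture.supplyAsync(() -> getSomething(), worker).join();
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("eager ran on: " + eager());
        System.out.println("lazy ran on:  " + lazyOffloaded());
    }
}
```

The eager variant reports the caller's own thread, whereas the lazy variant reports the thread named "worker".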

Here are the experimental results. We measured CPU usage, heap usage, latency and memory usage before and after applying the fix pattern. Since performance measures can vary even when the same test is run, we ran the same test 10 times. These plots show how the performance measures change at each iteration. The pink lines indicate the performance measures before applying the fix pattern, and the purple lines show the results after applying the fix pattern. In general, fewer resources are used after applying the fix pattern. We also ran the regression test suite and found no regression error. This is the result of Apache James.

For the other subjects, similar results were observed.

So, what are the takeaways?

They all sound obvious, don't they?

Then, let me flip these takeaways into questions. I think the answers are not a definitive yes, which makes us think about the future direction of APR and also debugging research.

# Common Fix Patterns

FP3. Reactive Filtering

```diff
- 𝐸1.filter(𝐸2 -> 𝐸3)
+ 𝐸1.filterWhen(𝐸2 -> Mono.fromCallable(() -> 𝐸3))
```

- 𝐸3 involves a blocking operation

---

# Common Fix Patterns

FP4. Non-blocking Chaining

```diff
- 𝐸.block(); 𝑆
+ 𝐸.then(Mono.fromRunnable(() -> 𝑆))
```

- 𝜏(𝐸) <: Mono
- 𝜏(𝑆) <: Mono | Flux

---

# Common Fix Patterns

FP5. Non-blocking Subscription

```diff
- 𝐸.block();
+ 𝐸.subscribe();
```

- 𝜏(𝐸) <: Mono
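
The idea behind FP4, registering the follow-up step S as a continuation instead of waiting with block(), can be sketched with plain JDK classes. This is only an analogy: CompletableFuture.thenRun plays the role of then(Mono.fromRunnable(...)), and the class and variable names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** JDK-only analogy of FP4; CompletableFuture stands in for Mono. */
final class ChainingSketch {

    /** Returns the order of steps when S is chained after E instead of awaited. */
    static List<String> chained() {
        List<String> log = new ArrayList<>();
        CompletableFuture<Void> e = new CompletableFuture<>(); // E has not finished yet

        // Register S as a continuation of E; this call does not wait for E.
        CompletableFuture<Void> pipeline = e.thenRun(() -> log.add("S"));

        log.add("caller continues"); // the caller is free while E is still pending
        e.complete(null);            // E finishes, which triggers S
        pipeline.join();             // already complete at this point
        return log;
    }

    public static void main(String[] args) {
        System.out.println(chained());
    }
}
```

Had the caller waited for E first (the analogue of E.block()), it could not have done anything else until E completed.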