    public void testGcd() {
        try {
            MathUtils.gcd(Integer.MIN_VALUE, 0);
            fail("expecting ArithmeticException");
        } catch (ArithmeticException expected) { /* expected */ }
    }

↓

    public void testGcd(int i, int j) {
        // True when (i, j) is not one of the inputs for which gcd must throw.
        // Declared outside the try block so that the catch clause can use it.
        final boolean complement =
            !((i == Integer.MIN_VALUE && j == 0) || (i == 0 && j == Integer.MIN_VALUE));
        try {
            final long actual = MathUtils.gcd(i, j);
            preserveIf(complement, () -> new Long[] { actual });
        } catch (ArithmeticException expected) {
            preserveIf(!complement, () -> new String[] { expected.toString() });
        } catch (Exception e) {
            failToPreserve();
        }
    }
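
The generalized `testGcd` above relies on `preserveIf` to decide which behavior a patch must keep. The following is a minimal sketch of that idea, not Poracle's actual implementation. It assumes a simplified rule: whatever the original version records under the preservation condition, the patched version must record as well. The class `PreservationSketch`, the method `isPreserved`, and the three toy `gcd` variants are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.function.IntBinaryOperator;
import java.util.function.Supplier;

/** Hypothetical stand-in for a preservation harness; not Poracle's actual implementation. */
final class PreservationSketch {

    /** Records the outputs only when the preservation condition holds. */
    static Optional<String> preserveIf(boolean condition, Supplier<Object[]> outputs) {
        return condition ? Optional.of(Arrays.toString(outputs.get())) : Optional.empty();
    }

    /** Generalized gcd test, run against one program version. */
    static Optional<String> testGcd(IntBinaryOperator gcd, int i, int j) {
        boolean complement =
            !((i == Integer.MIN_VALUE && j == 0) || (i == 0 && j == Integer.MIN_VALUE));
        try {
            int actual = gcd.applyAsInt(i, j);
            return preserveIf(complement, () -> new Object[] { actual });
        } catch (ArithmeticException expected) {
            return preserveIf(!complement, () -> new Object[] { "ArithmeticException" });
        }
    }

    /** Assumed rule: whatever the original version records, the patched version must record too. */
    static boolean isPreserved(IntBinaryOperator original, IntBinaryOperator patched, int i, int j) {
        Optional<String> before = testGcd(original, i, j);
        return !before.isPresent() || before.equals(testGcd(patched, i, j));
    }

    /** Euclid's algorithm; throws when the result does not fit in an int. */
    static int euclid(int a, int b) {
        while (b != 0) { int t = a % b; a = b; b = t; }
        if (a == Integer.MIN_VALUE) throw new ArithmeticException("overflow");
        return Math.abs(a);
    }

    /** Toy buggy version: the zero shortcut overflows silently for Integer.MIN_VALUE. */
    static int buggyGcd(int a, int b) {
        return (a == 0 || b == 0) ? Math.abs(a) + Math.abs(b) : euclid(a, b);
    }

    /** Toy patch that rejects only the overflowing zero case. */
    static int fixedGcd(int a, int b) {
        if ((a == 0 || b == 0) && (a == Integer.MIN_VALUE || b == Integer.MIN_VALUE)) {
            throw new ArithmeticException("overflow");
        }
        return buggyGcd(a, b);
    }

    /** Toy overfitting patch: throws for every zero second argument. */
    static int overfittingGcd(int a, int b) {
        if (b == 0) throw new ArithmeticException("overflow");
        return buggyGcd(a, b);
    }

    public static void main(String[] args) {
        // The input (4, 0) lies in the complementary case, so its output must be preserved.
        System.out.println(isPreserved(PreservationSketch::buggyGcd, PreservationSketch::fixedGcd, 4, 0));       // true
        System.out.println(isPreserved(PreservationSketch::buggyGcd, PreservationSketch::overfittingGcd, 4, 0)); // false
    }
}
```

Under this assumed rule, the input `(Integer.MIN_VALUE, 0)` is left unconstrained, because the original version records nothing for it. Checking that input remains the job of the original failing test.
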
`preserveIf` and other APIs for preservation conditions.

Question | Bug ID | Pattern |
---|---|---|
Q1 | Math73 | CC (Complementary Case) |
Q2 | Math105 | EGA (Existing General Assertion) |
Q3 | Math28 | UE (Unexpected Exception) |
Q4 | Lang58 | RI (Reference Implementation) |

- Manual Group: Find a correct patch without using Poracle.
- Semi-Automated Group: Find a correct patch using Poracle.

# Food for Thought

- Ideally, incorrect patches are supposed to be identified by the test suite.

![width:1000px](./img/APR-pipeline.jpg)

---
# Things to Think About

![bg left:33% fit](./img/overfitting.jpg)

- The large gap between the plausible patch space and the correct patch space suggests that the quality of the test suite is not good enough for patch validation.

---
# Difficulty of Test Generalization

$\forall \vec{v}: T(\vec{v}) = \psi(\vec{v})$

- $\vec{v}$: inputs
- $T(\vec{v})$: output of test $T$ when $\vec{v}$ is given
- $\psi$: the oracle function

---
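
# Difficulty of Test Generalization

As a toy illustration of checking $T(\vec{v}) = \psi(\vec{v})$ over several inputs, the sketch below assumes a case where an oracle happens to be available. `programOutput` (playing the role of $T$), `oracle` (playing the role of $\psi$), and `firstCounterexample` are hypothetical names. For real programs such a $\psi$ is rarely at hand, which is the difficulty in question.

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

/** Toy check of T(v) = psi(v); all names here are hypothetical. */
final class OracleCheckSketch {

    /** T: the output under test, an absolute value computed in int arithmetic. */
    static long programOutput(int v) {
        return Math.abs(v);
    }

    /** psi: the oracle, an absolute value computed in long arithmetic, which cannot overflow. */
    static long oracle(int v) {
        return Math.abs((long) v);
    }

    /** Returns the first input on which T and psi disagree, or empty if they agree on all inputs. */
    static OptionalInt firstCounterexample(int... inputs) {
        return IntStream.of(inputs)
                .filter(v -> programOutput(v) != oracle(v))
                .findFirst();
    }

    public static void main(String[] args) {
        // The hand-picked inputs agree with the oracle.
        System.out.println(firstCounterexample(-5, 0, 7).isPresent());                 // false
        // A generalized input set exposes the overflow at Integer.MIN_VALUE.
        System.out.println(firstCounterexample(-5, Integer.MIN_VALUE).isPresent());    // true
    }
}
```

---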
# Preservation Condition Example

- Math105 bug: the existing assertion is reused as the preservation condition

```java
public void testSSENonNegative(double d1, ..., double d6) {
  try {
    double[] y = { d1, d2, d3 };
    double[] x = { d4, d5, d6 };
    SimpleRegression reg = new SimpleRegression();
    for (int i = 0; i < x.length; i++) {
      reg.addData(x[i], y[i]);
    }
    double ret = reg.getSumSquaredErrors();
    // Original: assertTrue(ret >= 0.0);
    preserveIf(ret >= 0.0, () -> new Double[] { ret });
  } catch (Exception e) {
    failToPreserve();
  }
}
```

---

# Preservation Condition Example

- Lang58 bug: exploits a reference implementation

```java
public void testLang300(int n, int m) {
  // Original body: NumberUtils.createNumber("1l");

  // Test with a generalized input
  String s = "" + ((char) n) + ((char) m) + "l";
  String actOut = "";
  try {
    actOut = "" + NumberUtils.createNumber(s).longValue();
  } catch (Exception e) {
    actOut = "Exception";
  }

  // Use Long.valueOf as a reference
  String refOut = "";
  try {
    refOut = "" + Long.valueOf(s);
  } catch (Exception e) {
    refOut = "Exception";
  }

  // A lambda may only capture effectively final locals, so copy actOut first.
  final String out = actOut;
  preserveIf(actOut.equals(refOut), () -> new String[] { out });
}
```
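---

# Preservation Condition Example

The reference-implementation pattern can be tried outside a test harness. The sketch below is a self-contained illustration, assuming a hypothetical hand-written parser `parseDecimal` as the code under test and `Long.valueOf` as the reference. `outcome` and `agreesWithReference` are names introduced only for this example.

```java
import java.util.function.Supplier;

/** Toy reference-implementation comparison; all names except Long.valueOf are hypothetical. */
final class ReferenceOracleSketch {

    /** Runs a computation and maps any runtime exception to the marker string "Exception". */
    static String outcome(Supplier<Object> computation) {
        try {
            return String.valueOf(computation.get());
        } catch (RuntimeException e) {
            return "Exception";
        }
    }

    /** Hypothetical code under test: accepts unsigned decimal digits only. */
    static long parseDecimal(String s) {
        if (s.isEmpty()) throw new NumberFormatException("empty input");
        long value = 0;
        for (char c : s.toCharArray()) {
            if (c < '0' || c > '9') throw new NumberFormatException(s);
            value = value * 10 + (c - '0');
        }
        return value;
    }

    /** The condition holds when the code under test and the reference behave alike. */
    static boolean agreesWithReference(String s) {
        return outcome(() -> parseDecimal(s)).equals(outcome(() -> Long.valueOf(s)));
    }

    public static void main(String[] args) {
        System.out.println(agreesWithReference("42")); // true: both yield 42
        System.out.println(agreesWithReference("4x")); // true: both throw
        System.out.println(agreesWithReference("-7")); // false: only the reference accepts a sign
    }
}
```

Mapping exceptions to a marker string, as the Lang58 test does, lets "both throw" count as agreement.

---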
# Patch Validation with Preservation Condition

![width:1500px](./img/workflow.jpeg)

---
- Patch reviewing cost reduction
  - The number of patches to be reviewed after filtering
# Patch Reviewing Cost Reduction

![bg left:33% fit](./img/APR-design-find-and-filter.jpg)

- JAID returns a ranked list of plausible patches.
- We applied Poracle to the obtained ranked list of plausible patches and compared the number of patches to be reviewed before and after filtering.

---

# Patch Reviewing Cost Reduction

![width:1200px](./img/cmp-cost.jpg)

---
- For each question, participants were divided into two groups.
---

# Ablation Study

![width:700px](./img/CoincidentallyRejected.jpg)

---

# Example

- Failing test for Math95 of Defects4J

```java
public void testSmallDegreesOfFreedom() {
  FDistributionImpl fd = new FDistributionImpl(1.0, 1.0);
  double p = fd.cumulativeProbability(0.975);
  double x = fd.inverseCumulativeProbability(p);
  assertEquals(/* expected output */ 0.975, x, /* delta */ 1e-5);
}
```

---

# Example

- Generalizing the failing test

```java
public void testSmallDegreesOfFreedom() {
  FDistributionImpl fd = new FDistributionImpl(1.0, 1.0);
  double p = fd.cumulativeProbability(0.975);
  double x = fd.inverseCumulativeProbability(p);
  assertEquals(/* expected output */ 0.975, x, /* delta */ 1e-5);
}
```

↓

```java
public void testSmallDegreesOfFreedom(double d1, double d2, double d3) {
  FDistributionImpl fd = new FDistributionImpl(d1, d2);
  double p = fd.cumulativeProbability(d3);
  double x = fd.inverseCumulativeProbability(p);
  assertEquals(/* expected output */ ________, x, /* delta */ 1e-5);
}
```

---

# Preservation Condition Example

- Math95 bug: an unexpected exception occurs.

```java
public void testSmallDegreesOfFreedom(double d1, double d2, double d3) {
  try {
    FDistributionImpl fd = new FDistributionImpl(d1, d2);
    double p = fd.cumulativeProbability(d3);
    double x = fd.inverseCumulativeProbability(p);
    preserveIf(/* preservation condition */ true,
               /* outputs to compare */ () -> new Double[] { x });
  } catch (Exception e) {
    failToPreserve();
  }
}
```

![width:900px](./img/preserveIf.png)

---

# Classification Performance

![width:700px](./img/cmp-bert-lr.jpg)

- BERT-LR: Haoye Tian et al., "Evaluating representation learning of code changes for predicting patch correctness in program repair", ASE 2020

---

# Developer Patches

![width:700px](./img/developer-patches.jpg)

---

# Correct Answer Ratio

| Top 50% Students | Bottom 50% Students |
|:---:|:---:|
| ![width:600px](./img/high_exp_scores.jpg) | ![width:600px](./img/low_exp_scores.jpg) |

---

# Manual Time Cost

| All students | Students who submitted correct answers |
|:---:|:---:|
| ![width:600px](./img/overall_time.jpg) | ![width:600px](./img/overall_time_only_correct.jpg) |

---

# Sentiment

| All students | Top 50% students |
|:---:|:---:|
| ![width:600px](./img/poracle_manual_experience_bar.jpg) | ![width:600px](./img/poracle_manual_experience_bar_grade.jpg) |