Programmable Property-Based Testing

Alperen Keleş · Justine Frank · Ceren Mert
Harrison Goldstein · Leonidas Lampropoulos

University of Maryland

Property-Based Testing, in one slide

1. Write a property


prop_reverse_involutive :: [Int] -> Bool
prop_reverse_involutive xs =
  reverse (reverse xs) == xs
            

2. Generate inputs


quickCheck prop_reverse_involutive
-- +++ OK, passed 100 tests.
            

3. ...or find a counterexample


prop_sort_idempotent :: [Int] -> Bool
prop_sort_idempotent xs =
  sort xs == sort (sort xs)  -- pretend `sort` is buggy
            

*** Failed! Falsified (after 7 tests and 4 shrinks):
[0, -1]
            

What's actually happening?

A property runner is a loop over a property.

generate, check, generate, check...
shrink, check, shrink, check...
print
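As a rough illustration of that loop, here is a minimal Haskell sketch. It is not QuickCheck's implementation: `Seed`, `nextSeed`, `genList`, `shrinkList`, `shrinkLoop`, and `runLoop` are hypothetical names, and a toy linear congruential generator stands in for a real random source so that only `base` is needed.

```haskell
-- Toy pseudo-random source (an assumption, chosen to avoid extra packages).
newtype Seed = Seed Int

nextSeed :: Seed -> (Int, Seed)
nextSeed (Seed s) =
  let s' = (s * 1103515245 + 12345) `mod` 2147483648
  in (s', Seed s')

-- Hypothetical generator: lists of length 0..5 with elements in -10..10.
genList :: Seed -> ([Int], Seed)
genList s0 = go len s1 []
  where
    (n, s1) = nextSeed s0
    len = (n `div` 65536) `mod` 6
    go :: Int -> Seed -> [Int] -> ([Int], Seed)
    go 0 s acc = (acc, s)
    go k s acc =
      let (x, s') = nextSeed s
      in go (k - 1) s' ((x `div` 65536) `mod` 21 - 10 : acc)

-- Shrink candidates: every list obtained by dropping one element.
shrinkList :: [a] -> [[a]]
shrinkList xs = [take i xs ++ drop (i + 1) xs | i <- [0 .. length xs - 1]]

data Result a = Passed Int | Failed Int a
  deriving (Eq, Show)

-- shrink, check, shrink, check...: greedily follow the first failing candidate.
shrinkLoop :: (a -> [a]) -> (a -> Bool) -> a -> a
shrinkLoop shr prop x =
  case filter (not . prop) (shr x) of
    (y : _) -> shrinkLoop shr prop y
    []      -> x

-- generate, check, generate, check...: stop after `fuel` passing tests
-- or at the first failure, which is shrunk before being reported.
runLoop :: Int -> (Seed -> (a, Seed)) -> (a -> [a]) -> (a -> Bool) -> Seed -> Result a
runLoop fuel gen shr prop = go 0
  where
    go passed seed
      | passed >= fuel = Passed passed
      | otherwise =
          let (x, seed') = gen seed
          in if prop x
               then go (passed + 1) seed'
               else Failed passed (shrinkLoop shr prop x)
```

For example, `runLoop 100 genList shrinkList (\xs -> reverse (reverse xs) == xs) (Seed 42)` evaluates to `Passed 100`, since the property holds for every list. Printing is left to the derived `Show` instance.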

...but there's a whole zoo of them

Simple generational

QuickCheck-style
gen → check

Integrated shrinking

Hedgehog / Hypothesis
gen → check → shrink

Coverage-guided fuzzing

AFL-style
gen → check → feedback → mutator

Targeted PBT

custom feedback
gen → check → feedback

Combinatorial

systematic coverage

Parallel

many workers
×N (gen → check)

Problem

  • Runners are hardcoded within frameworks.
  • QuickCheck does generate-and-check.
    Hypothesis does integrated shrinking.
    AFL does coverage-guided fuzzing...
  • Want to combine two? Rewrite from scratch.
  • Why? Properties are shallowly embedded: tightly coupled to how they run.

Three ways to embed a property

Shallow

status quo
forAll genList $ \xs ->
  forAll genInt  $ \n  ->
    n `elem` xs ==> lookup n xs == Just n
  • simple, fast
  • runner is hard-coded
  • no introspection

Deep

hypothetical
ForAll xs GenList $
  ForAll n GenInt $
    Implies (Elem n xs) (Eq (Lookup n xs) (Just n))
  • introspectable AST
  • painful to write
  • no host-language power

Deferred Binding Abstract Syntax

Turn the property into a data structure

Property (Rocq, DBAS-encoded):

Definition prop_InsertValid :=
  ForAll "t" gen_bst mutate_bst shrink show (
  Implies (fun '(t,_) => isBST t) (
  ForAll "k" arbitrary arbitrary shrink show (
  ForAll "v" arbitrary arbitrary shrink show (
  Check (fun '(v,(k,(t,_))) => isBST (insert k v t)))))).

Derived components: Generator, Mutator, Shrinker, Printer, Checker

With components at hand, we can write many different runners.
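To make "components at hand" concrete, here is a simplified Haskell analogue of a property-as-data encoding. It is a sketch under assumptions, not the DBAS definition from the Rocq or Racket libraries: the `Prop` type, its `full` index, and the functions `generator`, `project`, `checker`, and `printer` are hypothetical, mutators and shrinkers are omitted, and a toy `Seed` replaces a real random source.

```haskell
{-# LANGUAGE GADTs #-}

newtype Seed = Seed Int

nextSeed :: Seed -> (Int, Seed)
nextSeed (Seed s) =
  let s' = (s * 1103515245 + 12345) `mod` 2147483648
  in (s', Seed s')

genInt :: Seed -> (Int, Seed)
genInt s = let (r, s') = nextSeed s in ((r `div` 65536) `mod` 21 - 10, s')

genList :: Seed -> ([Int], Seed)
genList s0 = go (abs n `mod` 6) s1 []
  where
    (n, s1) = genInt s0
    go :: Int -> Seed -> [Int] -> ([Int], Seed)
    go 0 s acc = (acc, s)
    go k s acc = let (x, s') = genInt s in go (k - 1) s' (x : acc)

-- `env` is the tuple of variables bound so far; `full` is the tuple
-- once every ForAll has been processed.
data Prop env full where
  ForAll  :: String -> (Seed -> (a, Seed)) -> (a -> String)
          -> Prop (a, env) full -> Prop env full
  Implies :: (env -> Bool) -> Prop env full -> Prop env full
  Check   :: (env -> Bool) -> Prop env env

-- Derived component: generate all inputs without running any check.
generator :: Prop env full -> env -> Seed -> (full, Seed)
generator (ForAll _ gen _ p) env s = let (x, s') = gen s in generator p (x, env) s'
generator (Implies _ p) env s = generator p env s
generator (Check _) env s = (env, s)

-- Helper: recover the environment visible at a node from the full input.
project :: Prop env full -> full -> env
project (Check _) = id
project (Implies _ p) = project p
project (ForAll _ _ _ p) = snd . project p

-- Derived component: Nothing means the precondition discarded the input.
checker :: Prop env full -> full -> Maybe Bool
checker (Check c) full = Just (c full)
checker (Implies pre p) full
  | pre (project p full) = checker p full
  | otherwise            = Nothing
checker (ForAll _ _ _ p) full = checker p full

-- Derived component: one "name = value" line per bound variable.
printer :: Prop env full -> full -> [String]
printer (Check _) _ = []
printer (Implies _ p) full = printer p full
printer (ForAll name _ sh p) full =
  (name ++ " = " ++ sh (fst (project p full))) : printer p full

propElemReverse :: Prop () (Int, ([Int], ()))
propElemReverse =
  ForAll "xs" genList show $
  ForAll "n" genInt show $
  Implies (\(n, (xs, _)) -> n `elem` xs) $
  Check (\(n, (xs, _)) -> n `elem` reverse xs)
```

Because `generator`, `checker`, and `printer` are ordinary functions over the same value, a runner can call them in any order it likes. For instance, `printer propElemReverse (2, ([1,2,3], ()))` yields `["xs = [1,2,3]","n = 2"]`.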

Simple runLoop

Generate, check, generate, check...

Fixpoint runLoop' fuel passed discards :=
  match fuel with
  | O      => ret (mkResult discards false passed [])
  | S fuel' =>
    res <- generate_and_run cprop (size_log2 ..);;
    match res with
    | Normal seed false =>
        ret (mkResult discards true (passed+1)
              (printer cprop (shrinker cprop seed)))
    | Normal _ true   => runLoop' fuel' (passed+1) discards
    | Discard _ _     => runLoop' fuel' passed (discards+1)
    end
  end.

Coverage-guided fuzzing fuzzLoop

AFL-style: seeds that hit new branches survive and breed.

Fixpoint fuzzLoop' fuel passed discards seeds :=
  match fuel with
  | O      => ret (mkResult discards false passed [])
  | S fuel' =>
    let directive := sample seeds in
    input <- generator cprop directive ..;;
    let '(result, fb) := instrumented_runner cprop input in
    match result with
    | Some false => (* fail: shrink + report *) ..
    | Some true  =>
        let seeds' := if useful seeds fb
                      then invest (input, fb) seeds
                      else seeds in
        fuzzLoop' fuel' (passed+1) discards seeds'
    | None       => fuzzLoop' fuel' passed (discards+1) seeds
    end
  end.

Targeted targetLoop

User-defined utility(input) instead of coverage. Hill-climb toward bad inputs.

Fixpoint targetLoop' fuel passed discards seeds :=
  match fuel with
  | O      => ret (mkResult discards false passed [])
  | S fuel' =>
    let directive := sample seeds in
    input <- generator cprop directive ..;;
    match runner cprop input with
    | Some false => (* fail: shrink + report *) ..
    | Some true  =>
        let fb := feedback_function input in
        let seeds' := if useful seeds fb
                      then invest (input, fb) seeds
                      else seeds in
        targetLoop' fuel' (passed+1) discards seeds'
    | None       => targetLoop' fuel' passed (discards+1) seeds
    end
  end.
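The hill-climbing idea can be sketched in Haskell with the simplest possible pool, a single best-so-far input. This is an illustration under assumptions rather than a port of `targetLoop'`: `targetLoop`, `Outcome`, `nudge`, and the toy `Seed` are hypothetical, and "useful" is taken to mean "strictly higher utility than the current best".

```haskell
newtype Seed = Seed Int

nextSeed :: Seed -> (Int, Seed)
nextSeed (Seed s) =
  let s' = (s * 1103515245 + 12345) `mod` 2147483648
  in (s', Seed s')

data Outcome a = Counterexample a | GaveUp Int  -- GaveUp carries the pass count
  deriving (Eq, Show)

targetLoop
  :: Int                       -- fuel
  -> (a -> Seed -> (a, Seed))  -- mutator: derive a new input from the best one
  -> (a -> Int)                -- user-defined utility
  -> (a -> Maybe Bool)         -- checker; Nothing means discard
  -> a                         -- starting input
  -> Seed
  -> Outcome a
targetLoop fuel mutate utility check start = go fuel 0 start
  where
    go n passed best seed
      | n <= 0 = GaveUp passed
      | otherwise =
          let (input, seed') = mutate best seed
          in case check input of
               Just False -> Counterexample input
               Just True
                 -- keep the new input only if it scores higher
                 | utility input > utility best -> go (n - 1) (passed + 1) input seed'
                 | otherwise                    -> go (n - 1) (passed + 1) best seed'
               Nothing -> go (n - 1) passed best seed'

-- Hypothetical mutator: move an Int by a random offset in -5..10.
nudge :: Int -> Seed -> (Int, Seed)
nudge x s = let (r, s') = nextSeed s in (x + (r `div` 65536) `mod` 16 - 5, s')
```

With a deterministic mutator the behaviour is easy to follow: `targetLoop 2000 (\x s -> (x + 1, s)) id (\x -> Just (x < 1000)) 0 (Seed 1)` climbs one step at a time and returns `Counterexample 1000`. Swapping in `nudge` gives a randomized climb toward the same failing region.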

Stateful statefulLoop

The property is a sequence of steps over a state. Each step's generator sees the current state.

Fixpoint runTrial state fuel step_idx trace :=
  match fuel with
  | O       => ret (None, ..)
  | S fuel' =>
    res <- generate_and_run cprop (state, ..);;
    match res with
    | Normal inputs false =>
        let shrunk := shrink_step 10 state inputs in
        ret (Some (List.rev (info :: trace)), ..)
    | Normal inputs true  =>
        let state' := transition state inputs in
        runTrial state' fuel' (S step_idx) (info :: trace)
    | Discard _ _         => runTrial state fuel' step_idx trace
    end
  end.

Two implementations, two type disciplines

Rocq (QuickChick)

  • Dependently typed
  • Properties carry their types in the AST
  • Runners check well-formedness statically

Inductive Prop' : Type :=
| ForAll : forall {A}, G A ->
           (A -> Prop') -> Prop'
| Check  : bool -> Prop'.
            

Racket (RackCheck)

  • Dynamically typed
  • Macros + reflection make runner-as-interpreter natural
  • Rapid prototyping

(define-property prop-rev
  ([xs (gen:list gen:nat)])
  (equal? (reverse (reverse xs))
          xs))
            

Evaluation: what does DBAS unlock?

  • All six runners, swappable, on the same property
  • Integrated vs external shrinking
  • Parallel PBT
  • Novel runner experiments: seed pools, energy heuristics, ...

Shallow vs Deep


[Figures: shallow vs deep on BST, RBT, and STLC]

Seed pools: a DBAS-powered experiment

Six pool strategies, swapped without touching the property.

[Figures, one per pool: Heap, FIFO, FILO, Static singleton, Resetting singleton, Monotonic singleton]
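One way to picture swappable pools is a small interface that a runner uses without knowing which strategy sits behind it. The Haskell sketch below is an assumption-laden illustration, not the evaluated implementation: the `Pool` record, the choice that sampling removes the seed, and the sorted-list "heap" are all hypothetical, and the three singleton strategies are left out because the slides do not define them. Only the names `sample` and `invest` come from the runner code above.

```haskell
import Data.List (insertBy)
import Data.Ord (Down (..), comparing)

-- A pool is just two operations; the runner never sees the representation.
data Pool a = Pool
  { sample :: Maybe (a, Pool a)    -- next seed to try and the remaining pool
  , invest :: (a, Int) -> Pool a   -- add a seed together with its feedback score
  }

-- FIFO: oldest seed first; feedback is ignored.
fifoPool :: [a] -> Pool a
fifoPool xs = Pool
  { sample = case xs of
      []       -> Nothing
      (y : ys) -> Just (y, fifoPool ys)
  , invest = \(x, _) -> fifoPool (xs ++ [x])
  }

-- FILO: newest seed first; feedback is ignored.
filoPool :: [a] -> Pool a
filoPool xs = Pool
  { sample = case xs of
      []       -> Nothing
      (y : ys) -> Just (y, filoPool ys)
  , invest = \(x, _) -> filoPool (x : xs)
  }

-- Heap: highest feedback first (a sorted list stands in for a real heap).
heapPool :: [(Int, a)] -> Pool a
heapPool xs = Pool
  { sample = case xs of
      []            -> Nothing
      ((_, y) : ys) -> Just (y, heapPool ys)
  , invest = \(x, fb) -> heapPool (insertBy (comparing (Down . fst)) (fb, x) xs)
  }

-- Sample until the pool is empty, to observe the order a strategy imposes.
drain :: Pool a -> [a]
drain p = case sample p of
  Nothing      -> []
  Just (x, p') -> x : drain p'
```

Investing `("a",1)`, `("b",3)`, `("c",2)` in that order and then draining gives `["a","b","c"]` for FIFO, `["c","b","a"]` for FILO, and `["b","c","a"]` for the heap, while the code that invests and samples stays identical.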

Programmable PBT

Representing properties as data structures makes component extraction possible. With the components extracted, property runners can be written in user land, which gives us programmable property-based testing libraries.

Thank you. Questions?