CSE271

CSE271: Principles of Programming Languages

Garbage Collection

Jooyong Yi (UNIST)

2024 Fall
CSE271

Question

  (value-of π‘Ÿπ‘Žπ‘‘π‘œπ‘Ÿ ρ₁ σ₁) = ((proc-val (procedure π‘£π‘Žπ‘Ÿ π‘π‘œπ‘‘π‘¦ ρ₂)), Οƒβ‚‚)                                                   
  (value-of π‘Ÿπ‘Žπ‘›π‘‘ ρ₁ Οƒβ‚‚) = (π‘£π‘Žπ‘™, σ₃)
---------------------------------------------------------------------------- 𝑙 βˆ‰ dom(σ₃)
  (value-of (call-exp π‘Ÿπ‘Žπ‘‘π‘œπ‘Ÿ π‘Ÿπ‘Žπ‘›π‘‘) ρ₁ σ₁) = (value-of π‘π‘œπ‘‘π‘¦ [π‘£π‘Žπ‘Ÿ=𝑙]ρ₂ [𝑙=π‘£π‘Žπ‘™]σ₃)
  • What happens to 𝑙 ∈ σ₃ once the procedure returns?
2024 Fall
CSE271

Manual Memory Management

[Concrete Syntax]
Expression ::= ...
            | newRef (Expression)
            | deref (Expression)
            | setRef (Expression, Expression)
            | freeRef (Expression)
[Abstract Syntax]
Expression ::= ...
            |  newref-exp (exp)
            |  deref-exp (exp)
            |  setref-exp (exp1 exp2)
            |  freeref-exp (exp)
2024 Fall
CSE271

Spec of freeRef

      (value-of 𝑒π‘₯𝑝 ρ Οƒβ‚€) = (𝑙, σ₁)
--------------------------------------------------
(value-of (freeRef-exp 𝑒π‘₯𝑝) ρ Οƒβ‚€) = (__, [𝑙=βŠ₯]σ₁)
  • ([𝑙=βŠ₯]σ₁)(𝑙) = βŠ₯ (not defined)
2024 Fall
CSE271

Difficulties of Manual Memory Management

  • Memory Leak: A memory location is not freed after it is no longer needed.
  • Double Free: A memory location is freed more than once.
  • Use After Free (Dangling Pointer): A memory location is accessed after it is freed.
2024 Fall
CSE271

Automatic Memory Management: Garbage Collection

  1. Pause the program.
  2. Identify garbage (memory locations that are no longer accessible).
  3. Free the garbage.
2024 Fall
CSE271

Example

begin 
(let f = proc (x) (+ x 1)
 in let a = f 1
    in (+ a 2));
...
end    
  • ρ = [f=𝑙₁][a=𝑙₃]
  • Οƒ = [𝑙₁=(proc-val (procedure x <<(+ x 1)>> βˆ…))][𝑙₂=1][𝑙₃=2]
2024 Fall
CSE271

Example

begin 
(let f = proc (x) (+ x 1)
 in let a = f 1
    in (+ a 2));
...
end    
  • ρ = [f=𝑙₁][a=𝑙₃]
  • Οƒ = [𝑙₁=(proc-val (procedure x <<(+ x 1)>> βˆ…))][𝑙₂=1][𝑙₃=2]
2024 Fall
CSE271

Mark-and-Sweep Garbage Collection

  1. Mark: Identify all accessible memory locations.
begin 
(let f = proc (x) (+ x 1)
 in let a = f 1
    in (+ a 2));
...
end    
  • ρ = [f=𝑙₁][a=𝑙₃]
  • Οƒ = [𝑙₁=(proc-val (procedure x <<(+ x 1)>> βˆ…))][𝑙₂=1][𝑙₃=2]
2024 Fall
CSE271

Mark-and-Sweep Garbage Collection

  1. Mark: Identify all accessible memory locations.
  2. Sweep: Free all unmarked memory locations.
begin 
(let f = proc (x) (+ x 1)
 in let a = f 1
    in (+ a 2));
...
end    
  • ρ = [f=𝑙₁][a=𝑙₃]
  • Οƒ = [𝑙₁=(proc-val (procedure x <<(+ x 1)>> βˆ…))][𝑙₂=1][𝑙₃=2]
2024 Fall
CSE271

Mutable Pair

(define-datatype mutpair mutpair?
  (a-pair (left-loc reference?) (right-loc reference?)))

(define make-pair
  (lambda (val1 val2)
    (a-pair (newref val1) (newref val2))))

(define left
  (lambda (p)
    (cases mutpair p
      (a-pair (left-loc right-loc)
        (deref left-loc)))))

(define set-left 
  (lambda (p val)
    (cases mutpair p
      (a-pair (left-loc right-loc)
        (setref! left-loc val)))))
2024 Fall
CSE271

GC Example

ρ = [x = 𝑙₁, y = 𝑙₂]

Οƒ = [𝑙₁ = 1, 
     𝑙₂ = (a-pair (𝑙₄ 𝑙₅)),
     𝑙₃ = 𝑙₄,
     𝑙₄ = (proc-val (procedure x 𝐸 [z = 𝑙₆])),
     𝑙₅ = 4,
     𝑙₆ = 5,
     𝑙₇ = π‘™β‚ˆ]
2024 Fall
CSE271

GC Example

ρ = [x = 𝑙₁, y = 𝑙₂]

Οƒ = [𝑙₁ = 1, 
     𝑙₂ = (a-pair (𝑙₄ 𝑙₅)),
     𝑙₃ = 𝑙₄,
     𝑙₄ = (proc-val (procedure x 𝐸 [z = 𝑙₆])),
     𝑙₅ = 4,
     𝑙₆ = 5,
     𝑙₇ = π‘™β‚ˆ]
GC(ρ, Οƒ) = [𝑙₁ = 1, 
            𝑙₂ = (a-pair (𝑙₄ 𝑙₅)),
            𝑙₄ = (proc-val (procedure x 𝐸 [z = 𝑙₆])),
            𝑙₅ = 4,
            𝑙₆ = 5]
2024 Fall
CSE271

GC Formalization

  • : the set of locations in that are reachable from the variables in
2024 Fall
CSE271

GC Formalization

  • : the smallest set of locations that satisfies the following rules:

-------------------- x ∈ dom(ρ)
 ρ(x) ∈ reach(ρ, Οƒ)
2024 Fall
CSE271

GC Formalization

  • : the smallest set of locations that satisfies the following rules:
                                      𝑙 ∈ reach(ρ, Οƒ)   Οƒ(𝑙) = 𝑙'
-------------------- x ∈ dom(ρ)     -----------------------------
 ρ(x) ∈ reach(ρ, Οƒ)                       𝑙' ∈ reach(ρ, Οƒ)
2024 Fall
CSE271

GC Formalization

  • : the smallest set of locations that satisfies the following rules:
                                      𝑙 ∈ reach(ρ, Οƒ)   Οƒ(𝑙) = 𝑙'
-------------------- x ∈ dom(ρ)     -----------------------------
 ρ(x) ∈ reach(ρ, Οƒ)                       𝑙' ∈ reach(ρ, Οƒ)


   𝑙 ∈ reach(ρ, Οƒ)     Οƒ(𝑙) = (a-pair (𝑙₁, 𝑙₂))
----------------------------------------------
        {𝑙₁, 𝑙₂} βŠ† reach(ρ, Οƒ)
2024 Fall
CSE271

GC Formalization

  • : the smallest set of locations that satisfies the following rules:
                                      𝑙 ∈ reach(ρ, Οƒ)   Οƒ(𝑙) = 𝑙'
-------------------- x ∈ dom(ρ)     -----------------------------
 ρ(x) ∈ reach(ρ, Οƒ)                       𝑙' ∈ reach(ρ, Οƒ)


   𝑙 ∈ reach(ρ, Οƒ)     Οƒ(𝑙) = (a-pair (𝑙₁, 𝑙₂))
----------------------------------------------
        {𝑙₁, 𝑙₂} βŠ† reach(ρ, Οƒ)


    𝑙 ∈ reach(ρ, Οƒ)     Οƒ(𝑙) = (proc-val (procedure x 𝐸 ρ'))
--------------------------------------------------------------
        reach(ρ', Οƒ) βŠ† reach(ρ, Οƒ)
2024 Fall
CSE271

Undecidability of GC

begin
𝐸₁;
𝐸₂
end
  • Assume that GC is performed after evaluating 𝐸₁ and (ρ, σ₁) = Οƒβ‚‚
  • Identifying garbage precisely is impossible. Given a location 𝑙, it is undecidable whether 𝑙 is garbage or not.
2024 Fall
CSE271

Undecidability of GC

begin
𝐸₁;
𝐸₂
end
  • Assume that GC is performed after evaluating 𝐸₁ and (ρ, σ₁) = Οƒβ‚‚
  • Identifying garbage precisely is impossible. Given a location 𝑙, it is undecidable whether 𝑙 is garbage or not.
    • Peforimg GC precisely is an undecidable problem.
2024 Fall
CSE271

Proving the Undecidability of GC

  • Halting Problem: Given a program 𝑃, decides whether 𝑃 halts.
2024 Fall
CSE271

Proving the Undecidability of GC

  • Suppose that we have an algorithm that performs garbage collection precisely.
  • Then, we can construct an algorithm as follows:
    1. takes a program as input and executes the following program:
    let x = newRef(1) in P; deref(x)
    
    1. Invoke right before evaluating and obtain a set that contains all locations that will be used in the future.
    2. If contains the location of , then we know that terminates. Otherwise, does not terminate.
2024 Fall