CSE271

CSE271: Principles of Programming Languages

Pointer Analysis

Jooyong Yi (UNIST)

2024 Fall
CSE271

Tests for Type Inference

(add-exp-1 "(+ 5 2)" int)
(less-than-2 "(< 11 6)" bool)
(app-2 "((lambda ((x ?)) (+ x 1)) (3))" int)
2024 Fall
CSE271

Question

...
*x = 10;
*y = -10;
z = *x;

printf("z: %d\n", z);
2024 Fall
CSE271

Question

x = y;
*x = 10;
*y = -10;
z = *x;

printf("z: %d\n", z);
2024 Fall
CSE271

Question #2

typedef struct _S {
    int a;
    struct _S *n;
} S;

...
x->n->a = 20;
y->n = NULL;
z = x->n->a;

printf("z: %d\n", z);
2024 Fall
CSE271

Question #2

typedef struct _S {
    int a;
    struct _S *n;
} S;

y = x;
x->n->a = 20;
y->n = NULL;
z = x->n->a;

printf("z: %d\n", z);
2024 Fall
CSE271

Points-to Analysis Example

int *x = (int*) malloc(sizeof(int));  // 𝑙₁ ∈ pts(x)
int *y = (int*) malloc(sizeof(int));  // 𝑙₂ ∈ pts(y)
2024 Fall
CSE271

Points-to Analysis Example

int *x = (int*) malloc(sizeof(int));  // 𝑙₁ ∈ pts(x)
int *y = (int*) malloc(sizeof(int));  // 𝑙₂ ∈ pts(y)
int z = 0;

*x = 10;  
*y = -10;
z = *x;
  • pts(x) = ?
2024 Fall
CSE271

Points-to Analysis Example

int *x = (int*) malloc(sizeof(int));  // 𝑙₁ ∈ pts(x)
int *y = (int*) malloc(sizeof(int));  // 𝑙₂ ∈ pts(y)
int z = 0;

x = y;    // pts(x) βŠ‡ pts(y)

*x = 10;  
*y = -10;
z = *x;
2024 Fall
CSE271

Points-to Analysis Example

int *x = (int*) malloc(sizeof(int));  // 𝑙₁ ∈ pts(x)
int *y = (int*) malloc(sizeof(int));  // 𝑙₂ ∈ pts(y)
int z = 0;

x = y;    // what if pts(x) = pts(y)?

*x = 10;  
*y = -10;
z = *x;
2024 Fall
CSE271

Points-to Analysis Example

int *x = (int*) malloc(sizeof(int));  // 𝑙₁ ∈ pts(x)
int *y = (int*) malloc(sizeof(int));  // 𝑙₂ ∈ pts(y)
int z = 0;

x = y;    // pts(x) = pts(y)

x = &z;  // loc(z) ∈ pts(x)

*x = 10;  
*y = -10;
z = *x;
2024 Fall
CSE271

Points-to Analysis Example

int *x = (int*) malloc(sizeof(int));  // 𝑙₁ ∈ pts(x)
int *y = (int*) malloc(sizeof(int));  // 𝑙₂ ∈ pts(y)
int z = 0;

x = y;    // pts(x) βŠ‡ pts(y)

x = &z;  // loc(z) ∈ pts(x)

*x = 10;  
*y = -10;
z = *x;
  • pts(x): a set of locations that x may point to when the program is executed
2024 Fall
CSE271

Points-to Analysis Example

int *x = (int*) malloc(sizeof(int));  // 𝑙₁ ∈ pts(x)
int *y = (int*) malloc(sizeof(int));  // 𝑙₂ ∈ pts(y)
int z = 0;

x = y;    // pts(x) βŠ‡ pts(y)

x = &z;  // loc(z) ∈ pts(x)

*x = 10;  
*y = -10;
z = *x;
  • Two steps:
    1. Extract pointer constraints from the program
    2. Solve the constraints to compute pts(x) for each pointer variable x
2024 Fall
CSE271

Inferring Type Constraints

proc (f) proc (x) -((f 3), (f x))
  1. t_0 = t_f β†’ t_1
  2. t_1 = t_x β†’ t_2
  3. t_3 = int, where t_3 is the type of (f 3)
  4. t_4 = int, where t_4 is the type of (f x)
  5. t_2 = int
  6. t_f = int β†’ t_3
  7. t_f = t_x β†’ t_4
2024 Fall
CSE271

Andersen’s Points-To Analysis

 p = malloc();  // 𝑙₁ ∈ pts(p) where 𝑙₁ represents the location returned from malloc.
 p = &x;        // loc(x) ∈ pts(p) 
 p = q;         // pts(p) βŠ‡ pts(q)
*p = q;         // pts(*p) βŠ‡ pts(q)
 p = *q;        // pts(p) βŠ‡ pts(*q)
2024 Fall
CSE271

Example

 p = malloc();  // 𝑙₁ ∈ pts(p) where 𝑙₁ represents the location returned from malloc.
 p = &x;        // loc(x) ∈ pts(p) 
 p = q;         // pts(p) βŠ‡ pts(q)
*p = q;         // pts(*p) βŠ‡ pts(q)
 p = *q;        // pts(p) βŠ‡ pts(*q)
p = malloc();
x = y;
x = z;
*p = z;
p = q;
q = &y;
x = *p;
p = &z;
2024 Fall
CSE271

Example

 p = malloc();  // 𝑙₁ ∈ pts(p) where 𝑙₁ represents the location returned from malloc.
 p = &x;        // loc(x) ∈ pts(p) 
 p = q;         // pts(p) βŠ‡ pts(q)
*p = q;         // pts(*p) βŠ‡ pts(q)
 p = *q;        // pts(p) βŠ‡ pts(*q)
p = malloc();   // 𝑙₁ ∈ pts(p)
x = y;          // pts(x) βŠ‡ pts(y)
x = z;          // pts(x) βŠ‡ pts(z)
*p = z;         // pts(*p) βŠ‡ pts(z)
p = q;          // pts(p) βŠ‡ pts(q)
q = &y;         // loc(y) ∈ pts(q)
x = *p;         // pts(x) βŠ‡ pts(*p)
p = &z;         // loc(z) ∈ pts(p)
2024 Fall
CSE271

Constraint Propagation

pts(p) βŠ‡ pts(q)  𝑙 ∈ pts(q)                                                             
-----------------------------      
        𝑙 ∈ pts(p)               
2024 Fall
CSE271

Constraint Propagation

  pts(*p) βŠ‡ pts(q)  loc(y) ∈ pts(p)  𝑙 ∈ pts(q)     
------------------------------------------------------                                 
               𝑙 ∈ pts(y)        
2024 Fall
CSE271

Constraint Propagation

            pts(p) βŠ‡ pts(*q)  loc(y) ∈ pts(q)  𝑙 ∈ pts(y)        
         -------------------------------------------------------                                  
                         𝑙 ∈ pts(p)
2024 Fall
CSE271

Constraint Propagation

pts(p) βŠ‡ pts(q)  𝑙 ∈ pts(q)        pts(*p) βŠ‡ pts(q)  loc(y) ∈ pts(p)  𝑙 ∈ pts(q)     
-----------------------------    ------------------------------------------------------   
        𝑙 ∈ pts(p)                                   𝑙 ∈ pts(y)        


               pts(p) βŠ‡ pts(*q)  loc(y) ∈ pts(q)  𝑙 ∈ pts(y)        
            -------------------------------------------------------   
                            𝑙 ∈ pts(p)
2024 Fall
CSE271

Example

pts(p) βŠ‡ pts(q)  𝑙 ∈ pts(q)        pts(*p) βŠ‡ pts(q)  loc(y) ∈ pts(p)  𝑙 ∈ pts(q)     
-----------------------------    ------------------------------------------------------   
        𝑙 ∈ pts(p)                                   𝑙 ∈ pts(y)        


               pts(p) βŠ‡ pts(*q)  loc(y) ∈ pts(q)  𝑙 ∈ pts(y)        
            -------------------------------------------------------   
                            𝑙 ∈ pts(p)
p = malloc();   // 𝑙₁ ∈ pts(p)                      pts(p) βŠ‡ pts(q)  loc(y) ∈ pts(q)
x = y;          // pts(x) βŠ‡ pts(y)                ----------------------------------
x = z;          // pts(x) βŠ‡ pts(z)                          loc(y) ∈ pts(p)
*p = z;         // pts(*p) βŠ‡ pts(z)
p = q;          // pts(p) βŠ‡ pts(q)
q = &y;         // loc(y) ∈ pts(q)
x = *p;         // pts(x) βŠ‡ pts(*p)
p = &z;         // loc(z) ∈ pts(p)
                // loc(y) ∈ pts(p)
2024 Fall
CSE271

Example

p = malloc();   // 𝑙₁ ∈ pts(p)                      
x = y;          // pts(x) βŠ‡ pts(y)                
x = z;          // pts(x) βŠ‡ pts(z)                          
*p = z;         // pts(*p) βŠ‡ pts(z)
p = q;          // pts(p) βŠ‡ pts(q)
q = &y;         // loc(y) ∈ pts(q)
x = *p;         // pts(x) βŠ‡ pts(*p)
p = &z;         // loc(z) ∈ pts(p)
                // loc(y) ∈ pts(p)
pts(p) = {𝑙₁, loc(y), loc(z)}
pts(q) = {loc(y)}
pts(x) = pts(y) = pts(z) = βˆ…
2024 Fall
CSE271

With Fields

x = malloc();                                                                                          
x->n = malloc();
y = malloc();

y = x;
x->n->a = 20;
y->n = NULL;
z = x->n->a;
x = malloc();                                                                                            
t1 = *x;       
t2 = t1.n;
t2 = malloc();
y = malloc();

y = x;
t3 = *t2;
t3.a = 20;
t4 = *y;
t4.n = NULL;
t5 = *x;
t6 = t5.n;
t7 = *t6;
z = t7.a;
2024 Fall
CSE271

With Fields

 p = malloc();  // 𝑙₁ ∈ pts(p) where 𝑙₁ represents the location returned from malloc.                          
 p = &x;        // loc(x) ∈ pts(p) 
 p = q;         // pts(p) βŠ‡ pts(q)
*p = q;         // pts(*p) βŠ‡ pts(q)
 p = *q;        // pts(p) βŠ‡ pts(*q)
 p.f = q;       // pts(p.f) βŠ‡ pts(q)
 p = q.f;       // pts(p) βŠ‡ pts(q.f)
x = malloc();  // 𝑙₁ ∈ pts(x)                                                                                           
t1 = *x;       // pts(t1) βŠ‡ pts(*x)
t2 = t1.n;     // pts(t2) βŠ‡ pts(t1.n)
t2 = malloc(); // 𝑙₂ ∈ pts(t2)
y = malloc();  // 𝑙₃ ∈ pts(y)

y = x;         // pts(y) βŠ‡ pts(x)
t3 = *t2;      // pts(t3) βŠ‡ pts(*t2)
t3.a = 20;     // pts(t3.a) βŠ‡ βˆ…
t4 = *y;       // pts(t4) βŠ‡ pts(*y)
t4.n = NULL;   // pts(t4.n) βŠ‡ βˆ…
t5 = *x;       // pts(t5) βŠ‡ pts(*x)
t6 = t5.n;     // pts(t6) βŠ‡ pts(t5.n)
t7 = *t6;      // pts(t7) βŠ‡ pts(*t6)
z = t7.a;      // pts(z) βŠ‡ pts(t7.a)
2024 Fall
CSE271

With Fields

x = malloc();  // 𝑙₁ ∈ pts(x)                                                                                           
t1 = *x;       // pts(t1) βŠ‡ pts(*x)
t2 = t1.n;     // pts(t2) βŠ‡ pts(t1.n)
t2 = malloc(); // 𝑙₂ ∈ pts(t2)
y = malloc();  // 𝑙₃ ∈ pts(y)

y = x;         // pts(y) βŠ‡ pts(x)
t3 = *t2;      // pts(t3) βŠ‡ pts(*t2)
t3.a = 20;     // pts(t3.a) βŠ‡ βˆ…
t4 = *y;       // pts(t4) βŠ‡ pts(*y)
t4.n = NULL;   // pts(t4.n) βŠ‡ βˆ…
t5 = *x;       // pts(t5) βŠ‡ pts(*x)
t6 = t5.n;     // pts(t6) βŠ‡ pts(t5.n)
t7 = *t6;      // pts(t7) βŠ‡ pts(*t6)
z = t7.a;      // pts(z) βŠ‡ pts(t7.a)
  • pts(t4.n) ∩ pts(t7) β‰  βˆ…
2024 Fall
CSE271

Something to Think About

p = malloc();   // 𝑙₁ ∈ pts(p)                      
x = y;          // pts(x) βŠ‡ pts(y)                
x = z;          // pts(x) βŠ‡ pts(z)                          
*p = z;         // pts(*p) βŠ‡ pts(z)
p = q;          // pts(p) βŠ‡ pts(q)
q = &y;         // loc(y) ∈ pts(q)
x = *p;         // pts(x) βŠ‡ pts(*p)
p = &z;         // loc(z) ∈ pts(p)
p = malloc();   // 𝑙₁ ∈ pts(p)                      
x = y;          // pts(x) βŠ‡ pts(y)                
x = z;          // pts(x) βŠ‡ pts(z)                          
p = q;          // pts(p) βŠ‡ pts(q)
q = &y;         // loc(y) ∈ pts(q)
x = *p;         // pts(x) βŠ‡ pts(*p)
p = &z;         // loc(z) ∈ pts(p)
*p = z;         // pts(*p) βŠ‡ pts(z)
  • Would the analysis result be the same?
2024 Fall
CSE271

Flow-Insensitive Analysis

  • Andersen’s points-to analysis is flow-insensitive
2024 Fall
CSE271

Flow-Sensitive Analysis

  • Consider the order of statements
  • More expensive (slower) than flow-insensitive analysis
  • More precise than flow-insensitive analysis
2024 Fall
CSE271

Consider

z = &x;
w = &a;
a = 42;
if (a > b) {
  *z = &a;
  y = &b;
} else {
  x = &b;
  y = w;
}
2024 Fall