**Data-flow Analysis, Part I**

**Tamás Szabó**

EMBEDDED SYSTEMS

For a given program:

Does the variable x always have the same value?

Will the value of x be read in the future?

Can the pointer p be null?

Which variables can p point to?

...

Use this information to:

Write safer/faster code

Use less memory

Compile faster

...

**Motivation**

**Problem**

**Practical solution**

Rice’s theorem

states that all interesting questions about the behavior of programs are undecidable.

An

undecidable problem

is a decision problem for which it is known to be impossible to construct a single algorithm that always leads to a correct yes-or-no answer.

Instead of aiming at the exact solution, we settle for

approximations

which are often

conservative

as well.

Practical solution

Lattices

Fixed-point

computation

Control flow

graph

Data Flow

Analyses:

Ingredients:

A lattice

L

that is specific to the analysis

Every node in the CFG is assigned a variable

v

(its value comes from

L

)

Monotone function(s) to relate the value of

v

to the other variables

Based on this we can get a solution which is

Approximate

Conservative

As precise as possible (the least solution)

Liveness analysis

A

variable is live

at a program point if its current value may be

read during the remaining execution

of the program.

Liveness is a property which is considered in the

future

, thus it needs

backward analysis

.

Lattice:

S: power set over the prog. vars

Binary relation: subset

Monotone function:

For every CFG node v we introduce a constraint variable

[[v]]

denoting the set of variables that are live at the program point before that node.

Conservative, because the set may be too large.

Auxiliary definition:

Liveness

Uninitialized

read

Points to

Uninitialized read analysis

Ensures that variables are initialized before they are read.

Initialization is a property which happened in the

past

, thus it needs

forward analysis

.

Lattice:

S: power set over the prog. vars

Binary relation: subset

Monotone function:

For every CFG node v we introduce a constraint variable

[[v]]

denoting the set of variables that are initialized at the given program point.

Conservative, because the set may be too small.

Points-to analysis

Derives the set of possible targets of a given pointer.

Points-to is a property which happened in the

past

, thus it needs

forward analysis

.

Often used as additional information for further analyses.

Here we discuss Andersen's algorithm.

Lattice:

S: the set of possible mappings between prog. vars

Binary relation: subset (wrt. maps)

Monotone function:

For every CFG node v we introduce a constraint variable

[[v]]

denoting points-to mapping between the prog. vars.

Conservative, because the mapping may be too small.

http://www.itu.dk/people/brabrand/UFPE/Data-Flow-Analysis/static.pdf