Compilers: liveness analysis
by Andrew Chang-DeWitt, Wed Jun 04 2025
a variable is live at a point in the program when its current value may still be needed later, e.g.:
int f(int x) {
    // <---- x is live
    int a = x + 2;
    // <---- a & x are live
    int b = a * a;
    // <---- b & x are live
    int c = b + x;
    // <---- c is live
    return c;
}
observing this example, we see that each variable lives from the point after it is initialized to the last point where it is used.
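the observation above can be sketched in a few lines of python. this is only a rough illustration, assuming straight-line code where every variable is assigned once; the (dst, srcs) statement encoding and the name live_points are made up for this example.

```python
def live_points(params, stmts):
    """stmts is a list of (defined_var_or_None, [used_vars]).

    point i sits just before statement i; point len(stmts) is after the last one.
    assumes straight-line code with every variable assigned at most once.
    """
    defined_at = {p: -1 for p in params}  # params count as defined before statement 0
    last_use = {}
    for i, (dst, srcs) in enumerate(stmts):
        for s in srcs:
            last_use[s] = i  # later uses overwrite earlier ones
        if dst is not None:
            defined_at[dst] = i

    live = [set() for _ in range(len(stmts) + 1)]
    for v, u in last_use.items():
        # v is live from the point after its definition up to its last use
        for p in range(defined_at[v] + 1, u + 1):
            live[p].add(v)
    return live


# the body of f from the example above
f_body = [
    ("a", ["x"]),       # int a = x + 2;
    ("b", ["a", "a"]),  # int b = a * a;
    ("c", ["b", "x"]),  # int c = b + x;
    (None, ["c"]),      # return c;
]
f_live = live_points(["x"], f_body)
```

the first four entries of f_live line up with the four annotations in the example; the fifth is the (empty) set after the return.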
[NOTE!] liveness !== scope
why does this matter?
- because two variables that share a scope but are never live at the same time can share a register
- some other analyses are possible when thinking in terms of liveness, such as defining constraints of non-lexical lifetimes in rustc's borrow checker implementation
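as a minimal sketch of the register sharing idea: if we already know the set of live variables at each program point, two variables can share a register exactly when no point has both of them live. the helper name can_share_register is hypothetical, and the live sets are copied by hand from the annotations on f above.

```python
def can_share_register(live, v, w):
    """True if v and w are never live at the same program point."""
    return not any(v in point and w in point for point in live)


# live sets at each point of f, taken from the annotations in the example
live = [{"x"}, {"a", "x"}, {"b", "x"}, {"c"}, set()]

print(can_share_register(live, "a", "b"))  # True
print(can_share_register(live, "a", "x"))  # False
```

so a, b & c could all live in one register, while x needs its own until its last use.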
a simple overview
to begin, we model the program being compiled as a control flow graph (CFG), composed of individual instructions as nodes & control flow "directions" as edges:
/->[ Move ]<-\
|      |     |
|      v     |
|  [ Binop ] |
|      |     |
|      v     |
|  [ If ]----/
|      |
|      v
|  [ Unop ]
|      |
|      v
\--[ Jump ]
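one possible way to hold the graph above in memory is a successor map, with a predecessor map derived from it (handy later, since liveness walks the graph backward). this is just a sketch: using the instruction names as node ids and the helper name predecessors are assumptions for illustration.

```python
# successor map for the diagram above: node -> nodes control can flow to next
succ = {
    "Move": ["Binop"],
    "Binop": ["If"],
    "If": ["Move", "Unop"],  # loop back to Move, or fall through to Unop
    "Unop": ["Jump"],
    "Jump": ["Move"],
}


def predecessors(succ):
    """invert a successor map into a predecessor map."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)
    return pred


pred = predecessors(succ)
print(pred["Move"])  # ['If', 'Jump']
```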
liveness is then associated with edges in our graph. e.g., from a = b + 1, we can derive a CFG as:
|
| live: b
v
[ Mov a, b ]
|
| live: a
v
[ Add a, 1 ]
|
| live: a (maybe)
v
we can naively compile this to:
Move rax, rax
Add rax, 1
however, as line 1 should make painfully clear, this misses an important optimization. as the liveness annotations for a & b above show, the two variables are never live at the same time, so they can share rax and we can skip the Move rax, rax instruction entirely.
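a rough sketch of that cleanup step: once a & b are both assigned to rax, any move whose source and destination are the same register does nothing and can be dropped. the tuple encoding of instructions and both function names are hypothetical, chosen only for this example.

```python
def assign_registers(instrs, alloc):
    """replace variable names with their assigned registers (other operands pass through)."""
    return [(op, *(alloc.get(x, x) for x in args)) for op, *args in instrs]


def drop_redundant_moves(instrs):
    """remove moves whose destination and source are the same register."""
    return [i for i in instrs if not (i[0] == "Move" and i[1] == i[2])]


program = [("Move", "a", "b"), ("Add", "a", "1")]
alloc = {"a": "rax", "b": "rax"}  # a & b are never live together, so they share rax

naive = assign_registers(program, alloc)
print(naive)                        # [('Move', 'rax', 'rax'), ('Add', 'rax', '1')]
print(drop_redundant_moves(naive))  # [('Add', 'rax', '1')]
```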
use[s] & def[s]
liveness is based on uses & definitions, modeled as sets of variable names associated with a statement s.
use[s]: set of variables used (read) by s
def[s]: set of variables defined (written) by s (most instructions write only 1 variable; llvm ir, being in ssa form, has each instruction define at most one value)
e.g.:
∵ s is a = b + c
  use[s] = {b,c}
  def[s] = {a}
∵ s is a = a + 1
  use[s] = {a}
  def[s] = {a}
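the two examples above can be reproduced with a small sketch, assuming a toy statement form with one destination and a tuple of operand strings. the Assign class and the helper names are made up here (defs rather than def, since def is a python keyword), and treating "anything that looks like an identifier" as a variable is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assign:
    dst: str         # the variable being written
    operands: tuple  # operand strings: variable names or integer literals


def is_variable(operand):
    # literals like "1" are not identifiers, so they are not counted as uses
    return operand.isidentifier()


def use(s):
    """variables read by statement s."""
    return {o for o in s.operands if is_variable(o)}


def defs(s):
    """variables written by statement s."""
    return {s.dst}


s1 = Assign("a", ("b", "c"))  # a = b + c
s2 = Assign("a", ("a", "1"))  # a = a + 1
print(use(s1) == {"b", "c"}, defs(s1) == {"a"})  # True True
print(use(s2) == {"a"}, defs(s2) == {"a"})       # True True
```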
formal def of liveness
a variable v is live on edge e if there is:
- a node n in the CFG such that use[n] contains v, &&
- a directed path from e to n such that for every statement s' on the path (other than n itself), def[s'] does not contain v
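the definition can be checked directly with a forward search from the edge: follow successors, succeed on the first node that uses v, and give up on any path that hits a def of v first. this is a sketch under assumptions: the cfg is a successor map, use and defs are dicts from node to variable set, and the entry/exit pseudo-nodes exist only so the first and last edges have endpoints.

```python
def is_live_on_edge(succ, use, defs, edge, v):
    """True if some path starting at `edge` reaches a use of v before any def of v."""
    _, start = edge
    stack, seen = [start], set()
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        if v in use[n]:
            return True  # n uses v; checked before def so a = a + 1 counts as a use
        if v in defs[n]:
            continue     # v is overwritten here, so paths through n do not count
        stack.extend(succ[n])
    return False


# the a = b + 1 example: Mov a, b followed by Add a, 1
succ = {"entry": ["mov"], "mov": ["add"], "add": ["exit"], "exit": []}
use = {"entry": set(), "mov": {"b"}, "add": {"a"}, "exit": set()}
defs = {"entry": set(), "mov": {"a"}, "add": {"a"}, "exit": set()}

print(is_live_on_edge(succ, use, defs, ("entry", "mov"), "b"))  # True
print(is_live_on_edge(succ, use, defs, ("entry", "mov"), "a"))  # False
print(is_live_on_edge(succ, use, defs, ("mov", "add"), "a"))    # True
```

within this fragment nothing after the Add reads a, so a is not live on the last edge; the "(maybe)" in the diagram depends on whatever code follows.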
a naive algorithm
for every variable v:
    for every n in use[v]:
        walk CFG backward from n until:
            a def[v] is found
            OR a previously visited node has been reached
        AND mark v as live on each edge traversed
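a minimal python sketch of the pseudocode above, reading use[v] there as "the nodes that use v". it assumes the cfg is a successor map and that use and defs map every node to a set of variables; the function name and the entry/exit pseudo-nodes are hypothetical.

```python
from collections import defaultdict


def naive_liveness(succ, use, defs):
    """return a map from edge (p, n) to the set of variables live on that edge."""
    pred = defaultdict(set)
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)

    live = defaultdict(set)
    variables = set().union(*use.values(), *defs.values())
    for v in variables:
        visited = set()
        stack = [n for n in succ if v in use[n]]  # start at every use of v
        while stack:
            n = stack.pop()
            if n in visited:
                continue  # a previously visited node has been reached
            visited.add(n)
            for p in pred[n]:
                live[(p, n)].add(v)  # mark v live on each edge traversed
                if v not in defs[p]:
                    stack.append(p)  # keep walking backward until a def of v
    return dict(live)


# the a = b + 1 example: Mov a, b followed by Add a, 1
succ = {"entry": ["mov"], "mov": ["add"], "add": ["exit"], "exit": []}
use = {"entry": set(), "mov": {"b"}, "add": {"a"}, "exit": set()}
defs = {"entry": set(), "mov": {"a"}, "add": {"a"}, "exit": set()}

result = naive_liveness(succ, use, defs)
print(result == {("entry", "mov"): {"b"}, ("mov", "add"): {"a"}})  # True
```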
this works, but is inefficient (time complexity of O(num edges * |use[v]| * num of variables) ~= O(n^3)). how can we make it better?