Exploring Dataflow Analysis in the Rust Compiler

Posted on June 12, 2023

Recently I’ve been working in static analysis land, and as part of that I’ve been familiarizing myself with dataflow analysis. I look at a fair amount of MIR, so I decided to delve into the rustc_mir_dataflow crate to see how these things are handled in the Rust compiler. There is a helpful introduction to this topic in the rustc dev guide, and this post fleshes things out a bit.

Dataflow Analysis

Briefly, dataflow analysis helps one understand the way data flows throughout a program. There are intraprocedural and interprocedural approaches. Some applications of dataflow analysis include:

  1. constant propagation / folding: replacing uses of variables known to be constant, and expressions built from them, with their constant values
  2. live variable analysis: determining whether a variable may be used in the future – useful for identifying dead code
  3. available expression analysis: determining which expressions need not be recomputed
  4. reaching definitions analysis: determining which definitions may reach a given point

Dataflow analysis is commonly performed on a control flow graph (CFG). Rust’s MIR representation is basically a control flow graph structure, and so rustc_mir_dataflow employs common analysis algorithms to determine some of your program’s properties.

Dataflow analysis on CFGs is usually performed using fixed point iteration:

  1. initialize some start state associated with the entry point of the CFG
  2. for each node in the CFG, compute an input state by using a meet function to combine the incoming states of its predecessor nodes, then compute an output state by applying a transfer function to that input state
  3. once the states of all nodes are unchanged (a fixed point has been reached), terminate

Once iteration has terminated, you have a mapping from each node to its input and output states (also called facts), describing the analysis results at each point in the program.
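Here is a minimal sketch of that loop in plain Rust, assuming a forward “may” analysis where the meet is set union. The Cfg, Facts, and transfer names are illustrative, not anything from rustc; termination relies on the transfer functions being monotone over a finite domain:

use std::collections::HashSet;

// A toy CFG: node n's predecessors are preds[n]; node 0 is the entry.
struct Cfg {
    preds: Vec<Vec<usize>>,
}

type Facts = HashSet<u32>;

// Hypothetical per-node transfer function; the identity here, for illustration.
fn transfer(_node: usize, input: &Facts) -> Facts {
    input.clone()
}

// Iterate until no node's output facts change (a fixed point).
fn fixpoint(cfg: &Cfg, entry_facts: &Facts) -> Vec<Facts> {
    let n = cfg.preds.len();
    let mut output: Vec<Facts> = vec![Facts::new(); n];
    let mut changed = true;
    while changed {
        changed = false;
        for node in 0..n {
            // Meet: union the output facts of all predecessors.
            // The entry node also receives the initial facts.
            let mut input = if node == 0 { entry_facts.clone() } else { Facts::new() };
            for &p in &cfg.preds[node] {
                input.extend(output[p].iter().copied());
            }
            let new_output = transfer(node, &input);
            if new_output != output[node] {
                output[node] = new_output;
                changed = true;
            }
        }
    }
    output
}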

Some examples of states:

  1. which variables are live?
  2. which expressions are available and precomputed?
  3. which definitions have known, constant values?

Some dataflow analysis problems lend themselves to efficient transfer function representations, and a technique called gen-kill analysis may be utilized in these cases.

Whereas the general formulation above defines input and output as:

input = meet([facts[predecessor] for predecessor in node.preds])
output = transfer(input)

Gen-kill problems define input and output as:

input = union(facts[predecessor] for predecessor in node.preds)
output = union((input - kill[current_node]), gen[current_node])

where kill and gen map to the portions of state that are removed and added at a particular point in the flow. For example, in live variable analysis, gen[statement] would describe the set of variables used at statement and kill[statement] would describe the set of variables defined at statement. (Liveness is a backward analysis, so its meet actually combines successor states rather than predecessor states, but the gen-kill structure is the same.)
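As a sketch (again in plain Rust with illustrative names, not rustc’s types), the gen-kill transfer for a single statement looks like this, using the liveness example just described:

use std::collections::HashSet;

type Facts = HashSet<u32>;

// output = union(input - kill, gen)
fn gen_kill_transfer(input: &Facts, gen: &Facts, kill: &Facts) -> Facts {
    input.difference(kill).chain(gen.iter()).copied().collect()
}

fn main() {
    // Liveness at a statement `x = y + z`, with locals numbered x=0, y=1, z=2:
    // gen = {y, z} (used here), kill = {x} (defined here).
    let input: Facts = HashSet::from([0, 1]);   // live after the statement: {x, y}
    let gen_set: Facts = HashSet::from([1, 2]); // {y, z}
    let kill_set: Facts = HashSet::from([0]);   // {x}
    let output = gen_kill_transfer(&input, &gen_set, &kill_set);
    assert_eq!(output, HashSet::from([1, 2]));  // live before the statement: {y, z}
}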

Let’s take a look at what is perhaps the simplest instance of a gen-kill analysis in the Rust compiler: MaybeStorageDead. A StorageDead statement marks the point at which a local’s memory may be freed, so knowing which locals might be dead lets the compiler avoid touching memory that is no longer allocated.

First, MaybeStorageDead keeps track of the variables that are always live; this includes the return value of a function and all of its arguments:

#[derive(Clone)]
pub struct MaybeStorageDead {
	always_live_locals: BitSet<Local>,
}

impl MaybeStorageDead {
	pub fn new(always_live_locals: BitSet<Local>) -> Self {
		MaybeStorageDead { always_live_locals }
	}
}

A BitSet is used as it is a space-efficient and performant representation of the state (or domain) of the analysis problem. Bit sets and bit vectors are common ways to represent the domain in gen-kill analysis problems.
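Part of why bit sets work so well here is that the entire gen-kill transfer collapses into two bitwise operations. A rough sketch with a plain u64 mask (not rustc’s BitSet, which is a growable, word-based bit set in rustc_index, but the same idea):

// Each local is one bit; a single u64 covers bodies with up to 64 locals.
type Domain = u64;

// output = (input & !kill) | gen, i.e. union(input - kill, gen) in two ops.
fn apply_gen_kill(input: Domain, gen: Domain, kill: Domain) -> Domain {
    (input & !kill) | gen
}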

Next, MaybeStorageDead implements the AnalysisDomain trait, which defines the starting state value for the analysis problem. In this case, it’s an empty bit set (all zeros), where each bit represents one of the locals in the MIR body. The trait also requires one to define mutations to the state upon entry to the starting block. This flips to 1 the bit of every local that isn’t an argument, the return value, or otherwise always live: those locals are presumed dead by default:

impl<'tcx> crate::AnalysisDomain<'tcx> for MaybeStorageDead {
    type Domain = BitSet<Local>;

    const NAME: &'static str = "maybe_storage_dead";

    fn bottom_value(&self, body: &mir::Body<'tcx>) -> Self::Domain {
        // bottom = live
        BitSet::new_empty(body.local_decls.len())
    }

    fn initialize_start_block(
        &self, body: &mir::Body<'tcx>, on_entry: &mut Self::Domain) {
        assert_eq!(
            body.local_decls.len(),
            self.always_live_locals.domain_size()
        );
        // Do not iterate on return place and args,
        // as they are trivially always live.
        for local in body.vars_and_temps_iter() {
            if !self.always_live_locals.contains(local) {
                on_entry.insert(local);
            }
        }
    }
}

Next, it’s time to implement the necessary functions for the GenKillAnalysis trait. This involves defining the effects of particular statements, terminators, and calls. Some of these methods have default implementations.

impl<'tcx> crate::GenKillAnalysis<'tcx> for MaybeStorageDead {
    type Idx = Local;

    fn statement_effect(
        &self,
        trans: &mut impl GenKill<Self::Idx>,
        stmt: &mir::Statement<'tcx>,
        _: Location,
    ) {
        match stmt.kind {
            StatementKind::StorageLive(l) => trans.kill(l),
            StatementKind::StorageDead(l) => trans.gen(l),
            _ => (),
        }
    }

    fn terminator_effect(
        &self,
        _trans: &mut impl GenKill<Self::Idx>,
        _: &mir::Terminator<'tcx>,
        _: Location,
    ) {
        // Terminators have no effect
    }

    fn call_return_effect(
        &self,
        _trans: &mut impl GenKill<Self::Idx>,
        _block: BasicBlock,
        _return_places: CallReturnPlaces<'_, 'tcx>,
    ) {
        // Nothing to do when a call returns successfully
    }
}

Here, the only interesting part is in statement_effect. When a variable becomes live, it’s added to the kill set. When it becomes dead, it’s added to the gen set.
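Concretely, for a local _1 that starts out presumed dead, the state (the set of maybe-dead locals) evolves like this:

// state: [_1]
StorageLive(_1);   // kill(_1): state becomes []
StorageDead(_1);   // gen(_1):  state becomes [_1]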

And that’s all there is to it. Now we can take an arbitrary program, run the analysis on it, and inspect the state at each program point. Some of these comments originate from the compiler code:

// The set of locals in a MIR body that do not have
// `StorageLive`/`StorageDead` annotations.
// These locals have fixed storage for the duration of the body,
// i.e. they are always live (bit value 0 in our domain).
let always_live_locals = always_storage_live_locals(&mir);

let mut maybe_dead = MaybeStorageDead::new(always_live_locals)
    .into_engine(tcx, &mir)
    .iterate_to_fixpoint()
    .into_results_cursor(&mir);

for (bb, _block) in mir.basic_blocks.iter_enumerated() {
    maybe_dead.seek_after_primary_effect(mir.terminator_loc(bb));
    let state = maybe_dead.get();
    println!("maybe dead: {:?}", state);
}
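The cursor isn’t limited to terminator locations; it can seek to any statement in the body. A sketch continuing the snippet above (Location and START_BLOCK come from rustc_middle::mir):

// Inspect the state just after the first statement of the start block.
let loc = Location { block: START_BLOCK, statement_index: 0 };
maybe_dead.seek_after_primary_effect(loc);
println!("after the first statement of bb0: {:?}", maybe_dead.get());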

Let’s test this on a contrived program:

fn test() {
    let mut x = 2;
    let y = &mut x;
    if *y < 10 {
        *y = *y + 1;
    }
    *y = 3;
}

Let’s look at the MIR for this program block by block, along with the results of our dataflow analysis:

fn test() -> () {
    // Live (_0 is always the MIR return value, and is always live)
    let mut _0: ();

    // Presumed dead to start (via `initialize_start_block`)
    let mut _1: i32;
    let _3: ();
    let mut _4: bool;
    let mut _5: i32;
    let mut _6: i32;

    // Will have no associated `StorageDead` or `StorageLive` annotation,
    // so always presumed live
    let mut _7: (i32, bool);

    scope 1 {
        debug x => _1;
        // Presumed dead to start (via `initialize_start_block`)
        let _2: &mut i32;
        scope 2 {
            debug y => _2;
        }
    }
    // ...

This is the set of local declarations for the program. The combination of always-live locals and presumed-dead locals yields the state [_1, _2, _3, _4, _5, _6] when entering bb0.

bb0: {
	StorageLive(_1);
	_1 = const 2_i32;
	FakeRead(ForLet(None), _1);
	StorageLive(_2);
	_2 = &mut _1;
	FakeRead(ForLet(None), _2);
	StorageLive(_3);
	StorageLive(_4);
	StorageLive(_5);
	_5 = (*_2);
	_4 = Lt(move _5, const 10_i32);
	StorageDead(_5);
	switchInt(move _4) -> [0: bb2, otherwise: bb1];
}
// bb0 (output): [_5, _6]

Here, _1 through _4 each get a StorageLive statement with no matching StorageDead, so they are removed from the maybe-dead set (killed). _5 becomes live and then dead again within the block (killed, then gen’ed), and _6 is never touched, so upon leaving this block only _5 and _6 are presumed dead.
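Spelled out in the gen-kill notation from earlier (keeping in mind that within a block the effects are actually applied statement by statement, in order):

input  = [_1, _2, _3, _4, _5, _6]   # state entering bb0
kill   = [_1, _2, _3, _4, _5]       # one kill per StorageLive
gen    = [_5]                       # StorageDead(_5) follows its StorageLive
output = union(input - kill, gen) = [_5, _6]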

In one branch of the bb0 control flow (bb1), _6 becomes live, and so is part of the kill set and removed from the state:

bb1: {
	StorageLive(_6);
	_6 = (*_2);
	_7 = CheckedAdd(_6, const 1_i32);
	assert(
            !move (_7.1: bool), 
            "attempt to compute `{} + {}`, which would overflow", 
            move _6, const 1_i32) -> [success: bb3, unwind: bb6];
}
// bb1 (output): [_5]

but in the other branch (bb2), it does not become live, and so remains in the set of locals presumed dead:

bb2: {
 goto -> bb4;
}
// bb2 (output): [_5, _6]

Coming from bb1, where _6 became live, _6 now becomes dead in bb3 and so is added to the gen set and to the state leaving that block:

bb3: {
	(*_2) = move (_7.0: i32);
	StorageDead(_6);
	_3 = const ();
	goto -> bb5;
}
// bb3 (output): [_5, _6]

Coming from bb2, nothing changes in bb4 because no variables are declared live or dead:

bb4: {
	_3 = const ();
	goto -> bb5;
}
// bb4 (output): [_5, _6]

Now, at the end of the function, all of the remaining variables _1 through _4 are declared dead and so are added to the gen set and outgoing state:

bb5: {
	StorageDead(_4);
	StorageDead(_3);
	(*_2) = const 3_i32;
	_0 = const ();
	StorageDead(_2);
	StorageDead(_1);
	return;
}
// bb5 (output): [_1, _2, _3, _4, _5, _6]

Finally, if the checked addition in bb1 fails (where the output state is [_5]), bb6 is entered; no variables are declared live or dead there, so the state remains unchanged:

bb6 (cleanup): {
	resume;
}
// bb6 (output): [_5]

I’ve skipped over a bunch of the terminology. These concepts are covered in depth in compiler courses that touch on dataflow analysis; you can search for terms like bottom, lattice, semilattice, join, and transfer function. The Wikipedia page on the topic is a decent place to dig in further.

As a way of wrapping up, here is a list of other places where the Rust compiler utilizes this form of dataflow analysis:

  1. Tracking the flow of borrows
  2. Tracking whether a reference or pointer could point to a given local
  3. Liveness analysis
  4. Possibly-initialized variables
  5. Possibly-uninitialized variables
  6. Definitely-initialized variables
  7. Possibly-requiring storage

In addition to this (incomplete) list of gen-kill problems, the Rust compiler also has some analyses modeled using the more general, albeit less performant, Analysis trait. For example, as of this writing, the latest version of constant propagation, ConstAnalysis, is built on top of ValueAnalysis. (Something surprising to me is that this isn’t also set up as a gen-kill analysis, particularly the portion formulating reachability, which is a go-to use case for gen-kill analysis. Please let me know why, if you know.)

The code for the post is available here.