Using Seahorn
SeaHorn is an automated analysis framework for LLVM-based languages. For users SeaHorn provides a push-button verification tool, and for researchers its modular design provides an extensible and customizable framework for experimenting with new software verification techniques. In contrast to KLEE (see previous post) which uses symbolic execution, SeaHorn uses software model checking and abstract interpretation. This, we hope, will provide better results in some cases.
In this post we will do a walk-through example of using SeaHorn to verify a Rust program. We will:
- Compile the program to generate bitcode (LLVM IR);
- Run rvt-patch-llvm on the bitcode;
- Run SeaHorn on the patched bitcode; and
- Run SeaHorn again to produce a trace of the bug.
(This post borrows heavily from the Using KLEE post)
Note:
The recommended way to use SeaHorn is with the propverify
library and the
cargo-verify
tool as described here.
This note describes how to use the SeaHorn directly in case you wonder how
cargo-verify
works or want to add support for a different tool.
This is going to be a fairly low-level description and most people will be
happier not knowing how the sausage is made.
(Since there are so many low-level details, there is a risk that we will forget
to update this document so you may have to use the source.)
A small test program
This code is in demos/simple/seahorn/src/main.rs and the shell commands in this file are in demos/simple/seahorn/verify.sh.
use verification_annotations::prelude::*;
fn main() {
let a = u32::abstract_value();
let b = u32::abstract_value();
verifier::assume(1 <= a && a <= 1000);
verifier::assume(1 <= b && b <= 1000);
let r = a*b;
verifier::assert!(1 <= r && r < 1000000);
}
The Rust compiler, SeaHorn, and everything else we need is already installed in the Docker image provided in the RVT git repo, so start a Docker container from the image by running:
docker/run
# You should now be inside the Docker container
cd demos/simple/seahorn
All remaining commands in this file will be run in this docker container.
(It is usually easiest to run this in one terminal while using another terminal, or a separate editor, to edit the files.)
Compiling Rust for verification
The Rust compiler works in four stages: first, it compiles Rust source code to HIR (the high-level IR); HIR is converted to MIR (the mid-level IR); MIR is converted to LLVM IR (the low-level virtual machine IR); and, finally, it compiles LLVM IR down to machine code and links the result.
When using an LLVM-based tool like SeaHorn, we need to modify this behaviour
so that the LLVM IR (instead of machine code) is linked and saved to a file.
This can be done by passing extra flags to rustc
, instructing it to link the
bitcode files.
When using cargo
these flags are passed through the RUSTFLAGS
environment
variable.
export RUSTFLAGS="-Clto -Cembed-bitcode=yes --emit=llvm-bc $RUSTFLAGS"
We also have to pass some configuration flags to cargo
and rustc
to
configure the verification-annotations
crate correctly.
export RUSTFLAGS="--cfg=verify $RUSTFLAGS"
Our goal is to find bugs so we turn on some additional error checking in the compiled code to detect arithmetic overflows.
export RUSTFLAGS="-Warithmetic-overflow -Coverflow-checks=yes $RUSTFLAGS"
And, when we find a bug, we want to report it as efficiently as possible so we make sure that the program will abort if it panics.
export RUSTFLAGS="-Zpanic_abort_tests -Cpanic=abort $RUSTFLAGS"
With all those definitions, we can now run
cargo build --features=verifier-seahorn
Depending on which platform you are running on, the resulting LLVM bitcode file
may be placed placed in target/debug/deps/try_seahorn.bc
(OSX) or in a file with
a name like target/debug/deps/try_seahorn-*.bc
(Linux).
At least for simple cases, we can refer to both files as
target/debug/deps/try_seahorn*.bc
and not worry about the exact filename.
Running SeaHorn
Having built a bitcode file containing the program and all the libraries that it depends on, we now need to fix a few things in the bitcode file to make it suitable for SeaHorn:
- Delete the
main
function rustc added. In the LLVM IR code that the Rust compiler generates themain
function from main.rs is renamed to something like_ZN11try_seahorn4main17h8733453d83f64f5aE
. The entry point to the LLVM IR code is a newmain
function that does some initialisation and then calls the original main function. This initialisation is not handled very well by SeaHorn at the moment, so we just delete themain
function and tell SeaHorn the entry point is the original main function with the mangled name. Note that it is not enough to just tell SeaHorn to use the mangled name as the entry point, we also need to deletemain
, otherwise SeaHorn gets confused. - Remove the body of
_eprint
and_print
. Those two functions are the internal implementations of Rust’s std printing macros. SeaHorn can’t handle those functions at the moment. - Replace panic handling.
Instead of the regular panic handling we call
__VERIFIER_error
which SeaHorn will report as an error (if reachable).
The rvt-patch-llvm
tool can do all that for us:
rvt-patch-llvm -o try_seahorn.patch.bc --seahorn -vv target/debug/deps/try_seahorn-*.bc
The bitcode file try_seahorn.patch.bc
is now ready to be processed by
SeaHorn.
To run SeaHorn we use a configuration file that we checkout from
verify-c-common.
The Docker image is configured so that SEAHORN_VERIFY_C_COMMON_DIR
points to
that checkout.
In addition, because we deleted the main
function rustc added to the bitcode
(which does some runtime initialisation and then calls the original main
function) we have to tell SeaHorn the name of the entry function.
To find this name execute this:
llvm-nm --defined-only try_seahorn.patch.bc | grep main | cut -d ' ' -f3
The output should look something like this:
_ZN11try_seahorn4main17h8733453d83f64f5aE
(the suffix after “17h” might be different)
We can now put everything together and run SeaHorn like this:
sea yama -y ${SEAHORN_VERIFY_C_COMMON_DIR}/seahorn/sea_base.yaml bpf --entry="_ZN11try_seahorn4main17h8733453d83f64f5aE" try_seahorn.patch.bc
This command will produce a long output that should end like this:
sat
************** BRUNCH STATS *****************
[...]
************** BRUNCH STATS END *****************
The “sat” indicates that SeaHorn was able to find an execution of the program that violates the assertion (or panics). SeaHorn can produce a trace of the execution like this:
sea yama -y ${SEAHORN_VERIFY_C_COMMON_DIR}/seahorn/sea_cex_base.yaml bpf --entry="_ZN11try_seahorn4main17h8733453d83f64f5aE" try_seahorn.patch.bc
As SeaHorn is an LLVM-based tool, the trace is an LLVM IR trace. Moreover, SeaHorn does a few transformations and optimisations to the program, and the trace is over the resulting program. Fortunately, SeaHorn preserves some debug information that allows it to print source code line-numbers along the trace (in Cyan).
The end of the trace is usually code from rust/src/libcore/fmt/mod.rs that
handles the formatting for a print function that reports the assertion that
failed.
Scroll up the trace until you find the first line-number that is from main.rs.
This should be the line where the assertion that failed is.
In this case it is “[src/main.rs:10]”.
To find the value of r
that caused the assertion to fail, keep scrolling up
until you find “[src/main.rs:9]” (the line where r
is assigned its value):
%_18.0.i.i = extractvalue { i32, i1 } %_9, 0, !dbg !180
%_18.0.i.i (0xf4240:bv(32)) [src/main.rs:9]
r = _18.0.i.i (0xf4240:bv(32))
This violates the r < 1000000
part of the assertion (0xf4240 is 1000000).
If you keep scrolling up you can also find the values that were assigned to a
and b
:
[...]
%_5 = call i32 @__VERIFIER_nondet_u32() #12, !dbg !156
%_5 (1000:bv(32)) [/home/rust-verification-tools/verification-annotations/src/seahorn.rs:92]
a = _5 (1000:bv(32))
enter: abstract_value<u32>
enter: default
self = i32 0 0
%_6 = call i32 @__VERIFIER_nondet_u32() #12, !dbg !164
%_6 (1000:bv(32)) [/home/rust-verification-tools/verification-annotations/src/seahorn.rs:92]
b = _6 (1000:bv(32))
[...]
Unfortunately, SeaHorn reports a line number from the verification-annotations
crate for those assignments.
In general, you can look for calls to __VERIFIER_nondet_<type>
in the trace to
find the concrete values SeaHorn picked for
verifer_nondet
/abstract_value
/symbolic
.
Handling larger programs
The -Clto
flag we pass to the Rust compiler conflicts with some default
configurations of the compiler.
If you add the following dependency on serde
to Cargo.toml
serde = { version = "1.0", features = ["derive"] }
you will likely get a linking error
error: cannot prefer dynamic linking when performing LTO
The (incredibly obscure) workaround for this is to specify a target explicitly. (I have no idea why this helps!)
The first step is to figure out what target you are currently using. You want the “default host” reported by “rustup show”
$ rustup show | grep Default
Default host: x86_64-unknown-linux-gnu
Now that we know the target, we add --target
to the cargo build
command
cargo build --features=verifier-seahorn --target=x86_64-unknown-linux-gnu
This changes where the bitcode file is put.
Instead of
target/debug/deps/try_seahorn*.bc
it is put in
target/x86_64-unknown-linux-gnu/debug/deps/try_seahorn*.bc
.
From here the process is the same.