GitHub - romnn/microgpusim at 2f2dca407431f84bf5e83b1592a9ce4efc4fdbbb

box

Prerequisites

Install the latest CUDA 11 toolkit.

wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
# this will not attempt to also install the CUDA driver
sudo sh cuda_11.8.0_520.61.05_linux.run --toolkit --silent --override

Building

cargo build --release --workspace --all-targets
cargo build -p trace --release # single package

Note: To speed up builds across workspaces, we strongly recommend using sccache. See the sccache documentation for installation and setup instructions.

Trace an application

# using our box memory tracer
LD_PRELOAD=./target/release/libtrace.so <executable> [args]
LD_PRELOAD=./target/release/libtrace.so ./test-apps/vectoradd/vectoradd 100 32

# using the accelsim tracer
./target/release/accelsim-trace ./test-apps/vectoradd/vectoradd 100 32

See the accelsim instructions.

Profile an application

cargo build --release --workspace --all-targets
sudo ./target/release/profile <executable> [args]
sudo ./target/release/validate ./test-apps/simple_matrixmul/matrixmul 32 32

./accelsim/gtx1080/accelsim_mem_debug_trace.txt

Run simulation

cargo run -- --path test-apps/vectoradd/traces/vectoradd-100-32-trace/

Python package

python setup.py develop --force

Testing

cargo test --workspace -- --test-threads=1

Performance profiling

First, configure permissions for running perf on Linux.

See the flamegraph documentation for how to set up flamegraphs. Here is a TL;DR for x86 Linux:

sudo apt install linux-tools-common linux-tools-generic linux-tools-$(uname -r)
echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
cargo install flamegraph
cargo flamegraph --bin=gpucachesim -- --path ./results/vectorAdd/vectorAdd-10000-32/trace

cargo install cargo-criterion
cargo criterion -- vectoradd

valgrind --tool=drd --exclusive-threshold=10 ./target/release/gpucachesim --parallel --non-deterministic 2 ./results/vectorAdd/vectorAdd-dtype-32-length-100/trace/commands.json

Coverage

# install coverage tooling
rustup component add llvm-tools-preview
cargo install grcov

# collect code coverage in tests (todo)
cargo xtask coverage

cargo xtask accelsim convert-config -c ./accelsim/gtx1080/gpgpusim.config -c ./accelsim/gtx1080/gpgpusim.trace.config -o output.config

Publishing traces (used by CI)

rclone sync ./results drive:gpucachesim

Missing features and current limitations

- only traces and executes memory instructions and exit instructions
  - note: divergent control flow is still captured during the trace by the active thread mask
- currently lacks write hit handlers
- currently lacks a cycle-accurate interconnect
- currently lacks texture and constant caches (will panic on the latter instructions)
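
To illustrate the note on divergent control flow: an active thread mask can be thought of as a bitmask with one bit per thread of a warp. The following Python sketch is not taken from the tracer. The warp size of 32 and the helper names `active_mask` and `active_lanes` are assumptions made for illustration only.

```python
WARP_SIZE = 32  # assumption: 32 threads per warp


def active_mask(predicate):
    """Return a bitmask with bit i set if lane i satisfies the predicate."""
    mask = 0
    for lane in range(WARP_SIZE):
        if predicate(lane):
            mask |= 1 << lane
    return mask


def active_lanes(mask):
    """List the lane indices whose bit is set in the mask."""
    return [lane for lane in range(WARP_SIZE) if mask & (1 << lane)]


# Hypothetical divergent branch: `if (lane < 4) { load A } else { load B }`.
# Both sides are executed one after the other, each under its own mask.
full = (1 << WARP_SIZE) - 1
taken = active_mask(lambda lane: lane < 4)
not_taken = full & ~taken

print(f"{taken:#010x}")      # mask attached to the load in the if-branch
print(f"{not_taken:#010x}")  # mask attached to the load in the else-branch
print(active_lanes(taken))   # lanes that issue the load in the if-branch
```

Because each traced memory instruction carries such a mask, a trace of memory instructions alone can still tell which threads took part in each access, even though the branch instructions themselves are not traced.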

Goals

step 1: we want to count memory accesses to L1, L2, and DRAM
step 2: we want to count cache hits and misses
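
As a rough illustration of these two goals, the sketch below counts accesses, hits, and misses for a toy two-level hierarchy in Python. It is a deliberately simplified model and not the simulator's implementation: the fully associative LRU caches, the 128-byte line size, the cache sizes, and the names `LRUCache` and `simulate` are all assumptions, and writes are not modelled.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache of `num_lines` lines with LRU replacement."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, line_addr):
        """Record one access and return True on a hit."""
        if line_addr in self.lines:
            self.lines.move_to_end(line_addr)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.lines[line_addr] = None
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used line
        return False


def simulate(addresses, line_size=128, l1_lines=2, l2_lines=4):
    """Count accesses that reach L1, L2, and DRAM for a list of byte addresses."""
    l1, l2 = LRUCache(l1_lines), LRUCache(l2_lines)
    dram_accesses = 0
    for addr in addresses:
        line = addr // line_size
        if l1.access(line):
            continue  # served by L1
        if l2.access(line):
            continue  # L1 miss served by L2
        dram_accesses += 1  # missed in both caches
    return {
        "l1": {"accesses": l1.hits + l1.misses, "hits": l1.hits, "misses": l1.misses},
        "l2": {"accesses": l2.hits + l2.misses, "hits": l2.hits, "misses": l2.misses},
        "dram": {"accesses": dram_accesses},
    }


stats = simulate([0, 4, 128, 256, 0, 512])
print(stats)
```

In this toy model, an access only reaches L2 after missing in L1, and only reaches DRAM after missing in both, so the per-level access counts of step 1 follow directly from the hit and miss counts of step 2.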

