DNNMark: A Deep Neural Network Benchmark Suite for GPUs

DNNMark is a highly configurable, extensible, and flexible benchmark suite consisting of a rich set of Deep Neural Network primitives as benchmarks. It not only provides individual benchmarks of DNN layers such as convolution, pooling, and LRN, but also allows users to build their own complex models from combinations of different layers for GPU-specific benchmarking. All the DNN primitives are implemented using the cuDNN and cuBLAS libraries and were tested on an NVIDIA K40 with CUDA 8.0.

DNNMark is available HERE.

Hetero-Mark: A Benchmark Suite for OpenCL 2.0

Hetero-Mark is a benchmark suite that exploits the upcoming features of the Heterogeneous System Architecture (HSA) with OpenCL 2.0 through applications from various domains, including signal processing, cybersecurity, machine learning, and big data. Every application in the suite includes both an OpenCL 1.2 and an OpenCL 2.0 implementation; the OpenCL 1.2 implementations serve as the baseline for performance comparison. Performance was tested on an AMD A10-7850K Radeon R7 (Kaveri APU) with GPU driver version 1642.5 (VM).

More info about Hetero-Mark is available HERE.

NUPAR: A Benchmark Suite for Modern GPU Architectures

NUPAR is a benchmark suite for modern GPU architectures. The NUPAR applications are specifically designed to stress new hardware and software features, including nested parallelism, concurrent kernel execution, shared host-device memory, and new instructions for precise computation and data movement. The applications come from a range of scientific and commercial computing domains and are written in OpenCL and CUDA.

NUPAR is available HERE.

Multi2Sim: A CPU-GPU Simulator for Heterogeneous Computing

Multi2Sim is a simulation framework for CPU-GPU heterogeneous computing written in C. It includes models for superscalar, multithreaded, and multicore CPUs, as well as GPU architectures.

Please visit the Multi2Sim website to download the latest version of the simulator, user guide, and benchmark packages.

clSurf: OpenCL implementation of the Speeded Up Robust Features (SURF) algorithm

This project provides an OpenCL implementation of the Speeded Up Robust Features (SURF) algorithm, developed by the NUCAR group at Northeastern University and AMD. A high-level overview of the implementation is given in the paper "Analyzing program flow within a many-kernel OpenCL application."

clSurf is available HERE.

PANDA: Platform for Architecture-Neutral Dynamic Analysis

PANDA is the Platform for Architecture-Neutral Dynamic Analysis. It is a platform based on QEMU 1.0.1 and LLVM 3.3 for performing dynamic software analysis, abstracting architecture-level details away with a clean plugin interface. It is currently being developed in collaboration with MIT Lincoln Laboratory, Georgia Tech, and Northeastern University.

PANDA is available HERE.