Getting Started With Open|SpeedShop™ 2.0.2
LA-UR-07-7012
Welcome To OpenSpeedShop Live
Welcome to OpenSpeedShop Live, a live Linux distribution with a
pre-configured OpenSpeedShop installation.
Test applications:
- mpi: smg2000
- sequential programs: smg2000, forever, mutatee, threads, and
matmul.
- multi-process: omp_stress
- hybrid: bt database file
These test programs are located in /home/openssuser/sequential
and /home/openssuser/mpi.
Open|SpeedShop Tutorials
Open|SpeedShop Additional Information
Brief Open|SpeedShop Overview
Open|SpeedShop is a community effort by The Krell Institute with current direct
funding from DOE’s NNSA and Office of Science. It builds on a broad
set of community infrastructures, most notably Dyninst and MRNet
from the Universities of Wisconsin and Maryland, libmonitor from
Rice University, and PAPI from the University of Tennessee at
Knoxville. Open|SpeedShop is an open source, multi-platform Linux
performance tool initially targeted at performance analysis of
applications running on both single-node and large-scale IA64, IA32,
EM64T, AMD64, and IBM Power PC platforms. Support for the Cray XT
platform was added in release 1.9.3.4, and support for the IBM Blue
Gene platforms was added in the 2.0.0 release. Further updates to
the Blue Gene and Cray platforms were delivered in the 2.0.1
release, which also includes support for shared/dynamic executables
on the Cray-XE platforms.
Open|SpeedShop is explicitly designed with usability in mind and
is intended for application developers and computer scientists. The
base functionality includes:
- Sampling Experiments
- Support for Callstack Analysis
- Hardware Performance Counters
- MPI Profiling and Tracing
- I/O Profiling and Tracing
- Floating Point Exception Analysis
In addition, Open|SpeedShop is designed to be modular and
extensible. It supports several levels of plug-ins which allow
users to add their own performance experiments.
Open|SpeedShop development is hosted by the Krell Institute. The
infrastructure and base components of Open|SpeedShop are released
as open source code primarily under LGPL.
Highlights
- Comprehensive performance analysis for sequential,
multithreaded, and MPI applications
- No need to recompile the user’s application
- Supports both first analysis steps and deeper analysis
options for performance experts
- Easy-to-use GUI, fully scriptable through a command line
interface and Python
- Supports Linux Systems and Clusters with Intel and AMD
processors
- Extensible through new performance analysis plugins ensuring
consistent look and feel
- In production use on all major cluster platforms at LANL,
LLNL, and SNL
Features
- Four user interface options: batch, command line interface,
graphical user interface and Python scripting API.
- Supports multi-platform single system image (SSI) and
traditional clusters.
- Scales to large numbers of processes, threads, and ranks.
- Ability to automatically create and attach to both sequential
and parallel jobs from within Open|SpeedShop.
- View performance data using multiple customizable views.
- View intermediate performance measurement data while the
experiment is running.
- Save and restore performance experiment data and symbol
information for post-experiment performance analysis.
- View performance data for the application’s entire lifetime or
for smaller time slices.
- Compare performance results between processes, threads, or
ranks, or between a previous experiment and the current experiment.
- GUI Wizard facility and context sensitive help.
- Interactive CLI help facility which lists the CLI commands,
syntax, and typical usage.
- Python Scripting API accesses Open|SpeedShop functionality
corresponding to CLI commands.
- Option to automatically group like-performing processes,
threads, or ranks.
- Create traces in OTF (Open Trace Format).
- Comprehensive installation scripts.
- Support for hybrid MPI and OpenMP applications.
View the performance of the OpenMP threads that are running
inside the MPI ranks.
The Open|SpeedShop web site is located at: www.openspeedshop.org.
Downloads are available from the Open|SpeedShop SourceForge web
site:
www.sourceforge.net/projects/openss
About Getting Started
This document introduces many of Open|SpeedShop's key features. In
particular, it covers:
- Common Terminology
- Available Experiments
- Open|SpeedShop Invocation
- Using Open|SpeedShop's Interactive Command Line Interface
(CLI)
- Using Open|SpeedShop's Graphical User Interface (GUI)
- Getting More Information
Common Terminology
Technical terms can have multiple or context-sensitive meanings, so
this section explains the terms as they are used in this document.
- Experiment: A set of collectors and executables bound
together to generate performance metrics.
- Focused Experiment: The experiment that commands currently
operate on. The user may run or view multiple experiments
simultaneously; unless a particular experiment is specified
directly, the focused experiment will be used. Each experiment is
given a numeric identifier (expId) for identification.
- Component(s): A somewhat self-contained section of the
Open|SpeedShop performance tool. This section of code does a set
of specific related tasks for the tool. For example, the GUI
component does all the tasks related to displaying
Open|SpeedShop wizards, experiment creation, and results using a
graphical user interface. The CLI component does similar
functions but uses the interactive command line delivery method.
- Collector: The portion of the tool containing the logic
responsible for gathering the performance metric. The collector
code is included in the experiment plugin.
- Metric: The entity that the collector/experiment is
gathering: a time, occurrence counter, or other quantity that
reflects in some way on the application's performance and is
gathered by a performance experiment (by the collector).
- Param: Each collector allows the user to set certain
values that control the way a collector behaves. The parameter
or param may cause the collector to perform various operations
at certain time intervals or it may cause a collector to measure
certain types of data. Although Open|SpeedShop provides a
standard way to set a parameter, it is up to the individual
collector to decide what to do with that information. Detailed
documentation about the available parameters is part of the
collector's documentation.
- Framework: The set of API functions that allows the
user interface to manage the creation and viewing of performance
experiments. It is the interface between the user interface and
the cluster support and dynamic instrumentation components.
- Plugin: A portion of the performance tool that can be
loaded and included in the tool at tool start-up time.
Development of the plugin uses a tool specific interface (API)
so that the plugin, and the tool it is to be included in, know
how to interact with each other. Plugins are normally placed in
a specific directory so that the tool knows where to find the
plugins.
- Target: The application, or part of the application, that
the experiment is run on. To fine-tune what is being targeted,
Open|SpeedShop provides target options that describe file names,
host names, thread identifiers, rank identifiers, and process
identifiers.
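The relationship between experiments, collectors, params, targets, and the focused experiment can be sketched as a small data model. This is only an illustration of the terminology above, not Open|SpeedShop's API: the class names, the `sampling_rate` param, and the rule that a newly created experiment takes the focus are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Experiment:
    """Hypothetical model: collectors and targets bound together under an expId."""
    exp_id: int
    collectors: List[str]
    targets: List[str]
    params: Dict[str, object] = field(default_factory=dict)


class Session:
    """Hypothetical container that tracks experiments and the focused one."""

    def __init__(self) -> None:
        self._experiments: Dict[int, Experiment] = {}
        self._next_id = 1
        self.focused_id: Optional[int] = None

    def create(self, collectors, targets, **params) -> Experiment:
        exp = Experiment(self._next_id, list(collectors), list(targets), dict(params))
        self._experiments[exp.exp_id] = exp
        # Assumption for this sketch: the newest experiment becomes the focus.
        self.focused_id = exp.exp_id
        self._next_id += 1
        return exp

    def resolve(self, exp_id: Optional[int] = None) -> Experiment:
        """Return the named experiment, or the focused one if no expId is given."""
        chosen = self.focused_id if exp_id is None else exp_id
        if chosen not in self._experiments:
            raise KeyError(f"no such experiment: {chosen!r}")
        return self._experiments[chosen]


session = Session()
session.create(["pcsamp"], ["./smg2000"], sampling_rate=100)  # hypothetical param
session.create(["usertime"], ["./matmul"])
print(session.resolve().collectors)   # focused experiment (no expId given)
print(session.resolve(1).collectors)  # experiment selected directly by expId
```

The point of `resolve` is the fallback: a command that names no experiment acts on the focused one, while an explicit expId always wins.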
Available Experiments
Table: Summary of Experiments
- fpe: Collects all floating-point exceptions with the
call stack and the time of exception.
- hwc: Counts of various hardware events at the source line,
machine instruction, and function levels.
- hwctime: Similar to hwc, except that callstack
sampling is used.
- hwcsamp: Similar to pcsamp, except that up to 6 hardware
counter registers are read in addition to the program counter
sampling.
- io: Times I/O system calls. The time reported is wall
clock time.
- iot: Traces and times I/O system calls. The time
reported is wall clock time.
- mpi: Times calls to various MPI routines. The time
reported is wall clock time.
- mpiotf: Traces and times calls to various MPI routines,
writing the trace in OTF format. The time reported is wall clock
time.
- mpit: Traces and times calls to various MPI routines.
The time reported is wall clock time.
- pcsamp: Actual CPU time at the source line, machine
instruction, and function level.
- usertime: Inclusive and exclusive CPU time for each
function.
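For quick reference, the table above can be captured as a simple lookup. The sketch below transcribes the experiment names and paraphrases their summaries; the dictionary and helper names are illustrative and are not part of Open|SpeedShop.

```python
# Experiment names and short summaries, transcribed from the table above.
EXPERIMENTS = {
    "fpe": "floating-point exceptions with call stack and time of exception",
    "hwc": "hardware event counts at source line, instruction, and function level",
    "hwctime": "like hwc, but using callstack sampling",
    "hwcsamp": "like pcsamp, plus up to 6 hardware counter registers",
    "io": "times I/O system calls (wall clock time)",
    "iot": "traces and times I/O system calls (wall clock time)",
    "mpi": "times calls to MPI routines (wall clock time)",
    "mpiotf": "traces and times MPI routines in OTF format (wall clock time)",
    "mpit": "traces and times calls to MPI routines (wall clock time)",
    "pcsamp": "CPU time at source line, instruction, and function level",
    "usertime": "inclusive and exclusive CPU time per function",
}


def wall_clock_experiments():
    """Names of the experiments whose summary says they report wall clock time."""
    return sorted(name for name, summary in EXPERIMENTS.items()
                  if "wall clock" in summary)


for name in wall_clock_experiments():
    print(f"{name}: {EXPERIMENTS[name]}")
```

Filtering on the summary text is a convenience for this sketch only; it simply mirrors which table entries mention wall clock time.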
Open|SpeedShop Invocation
Open|SpeedShop Invocation Details
- openss [-gui]: Start Open|SpeedShop's graphical
interface.
- openss -cli: Start Open|SpeedShop's interactive command
line interface.
- openss -offline: Start an offline (link in collectors)
performance experiment (e.g. openss -offline -f "mpirun -np
1024 ./smg2000 -n 40 40 40" pcsamp). Offline is the default
mode, so "-offline" is optional.
- The osspcsamp, ossusertime, osshwc, osshwctime, etc.
convenience scripts are also available.
- Example syntax: osspcsamp "mpirun -np 1024 ./smg2000 -n 40
40 40"
- openss -online: Start an online (dynamic) performance
experiment. (e.g. openss -online -f "mpirun -np 1024 ./smg2000
-n 40 40 40" pcsamp)
- openss -batch: Start a performance experiment, specified
by additional arguments, directly without user interaction.
- oss<experiment-type>: Convenience scripts, one per
experiment, that hide the openss syntax and give easy access to
additional options.
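The invocation forms above all pass the application's full launch command as a single quoted argument. A minimal Python sketch of assembling those command lines is shown below; it only builds strings and does not run Open|SpeedShop. The helper names are hypothetical, and the flags used are limited to the ones listed in this section.

```python
import shlex


def offline_command(app_command, experiment, explicit_flag=False):
    """Build: openss [-offline] -f "<app command>" <experiment>."""
    argv = ["openss"]
    if explicit_flag:
        argv.append("-offline")  # optional, since offline is the default mode
    argv += ["-f", app_command, experiment]
    return shlex.join(argv)


def online_command(app_command, experiment):
    """Build: openss -online -f "<app command>" <experiment>."""
    return shlex.join(["openss", "-online", "-f", app_command, experiment])


def convenience_command(app_command, experiment):
    """Build: oss<experiment-type> "<app command>"."""
    return shlex.join(["oss" + experiment, app_command])


# The launch command used in the examples above.
APP = "mpirun -np 1024 ./smg2000 -n 40 40 40"

print(offline_command(APP, "pcsamp", explicit_flag=True))
print(online_command(APP, "pcsamp"))
print(convenience_command(APP, "pcsamp"))
```

`shlex.join` (Python 3.8 and later) quotes the launch command with single quotes rather than the double quotes shown above; either way the shell passes it to openss as one argument.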