perf-bench(1)
=============

NAME
----
perf-bench - General framework for benchmark suites

SYNOPSIS
--------
[verse]
'perf bench' [<common options>] <subsystem> <suite> [<options>]

DESCRIPTION
-----------
The 'perf bench' command is a general framework for benchmark suites.

COMMON OPTIONS
--------------
-r::
--repeat=::
Specify the number of times to repeat the run (default: 10).

-f::
--format=::
Specify the format style.
The currently available format styles are:

'default'::
Default style, intended mainly for human readers.
---------------------
% perf bench sched pipe                      # with no style specified
(executing 1000000 pipe operations between two tasks)
        Total time:5.855 sec
                5.855061 usecs/op
                170792 ops/sec
---------------------

'simple'::
Simple style, suitable for automated
processing by scripts.
---------------------
% perf bench --format=simple sched pipe      # specified simple
5.988
---------------------
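
As a rough illustration of script-side processing, the following Python
sketch parses 'simple' style output such as the `5.988` shown above.
It assumes the output is a single numeric value on its own line; the
function name `parse_simple_output` is hypothetical and not part of perf.

```python
def parse_simple_output(text: str) -> float:
    """Parse the value printed by 'perf bench --format=simple'.

    Assumption: the output holds exactly one number on its own line,
    as in the 'simple' example above.
    """
    # Ignore blank lines so a trailing newline does not matter.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != 1:
        raise ValueError(f"expected exactly one value, got {len(lines)} lines")
    return float(lines[0])


if __name__ == "__main__":
    # Sample text copied from the 'simple' example above.
    print(parse_simple_output("5.988\n"))
```

In a real script the text would come from the captured standard output
of the 'perf bench' invocation rather than from a literal string.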

SUBSYSTEM
---------

'sched'::
	Scheduler and IPC mechanisms.

'mem'::
	Memory access performance.

'numa'::
	NUMA scheduling and MM benchmarks.

'futex'::
	Futex stressing benchmarks.

'all'::
	All benchmark subsystems.

SUITES FOR 'sched'
~~~~~~~~~~~~~~~~~~
*messaging*::
Suite for evaluating performance of scheduler and IPC mechanisms.
Based on hackbench by Rusty Russell.

Options of *messaging*
^^^^^^^^^^^^^^^^^^^^^^
-p::
--pipe::
Use pipe() instead of socketpair().

-t::
--thread::
Use multiple threads instead of multiple processes.

-g::
--group=::
Specify the number of groups.

-l::
--nr_loops=::
Specify the number of loops.

Example of *messaging*
^^^^^^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched messaging                 # run with default options
(20 sender and receiver processes per group)
(10 groups == 400 processes run)

      Total time:0.308 sec

% perf bench sched messaging -t -g 20        # use threads, with 20 groups
(20 sender and receiver threads per group)
(20 groups == 800 threads run)

      Total time:0.582 sec
---------------------
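
The task counts reported in the example follow from the group count.
The Python sketch below reproduces that arithmetic, assuming each group
consists of 20 senders plus 20 receivers, as the example output suggests.
The names `TASKS_PER_GROUP` and `total_tasks` are hypothetical.

```python
# Assumption taken from the example output: every group has
# 20 sender tasks and 20 receiver tasks.
SENDERS_PER_GROUP = 20
RECEIVERS_PER_GROUP = 20
TASKS_PER_GROUP = SENDERS_PER_GROUP + RECEIVERS_PER_GROUP


def total_tasks(groups: int) -> int:
    """Return the number of processes (or threads with -t) that are run."""
    if groups < 1:
        raise ValueError("groups must be at least 1")
    return groups * TASKS_PER_GROUP


if __name__ == "__main__":
    for groups in (10, 20):
        print(f"{groups} groups == {total_tasks(groups)} tasks run")
```

With 10 groups this gives the 400 processes of the first run, and with
`-g 20` the 800 threads of the second.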

*pipe*::
Suite for the pipe() system call.
Based on pipe-test-1m.c by Ingo Molnar.

Options of *pipe*
^^^^^^^^^^^^^^^^^
-l::
--loop=::
Specify number of loops.

Example of *pipe*
^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched pipe
(executing 1000000 pipe operations between two tasks)

        Total time:8.091 sec
                8.091833 usecs/op
                123581 ops/sec

% perf bench sched pipe -l 1000              # loop 1000
(executing 1000 pipe operations between two tasks)

        Total time:0.016 sec
                16.948000 usecs/op
                59004 ops/sec
---------------------
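
The three figures in the example output are related by simple
arithmetic. The Python sketch below shows that relationship under two
assumptions: the elapsed time is known to microsecond precision, and the
ops/sec figure is truncated to an integer. The function name
`pipe_metrics` is hypothetical and is not part of perf.

```python
from typing import Tuple


def pipe_metrics(total_sec: float, loops: int) -> Tuple[float, int]:
    """Derive (usecs/op, ops/sec) from the elapsed time and loop count.

    Assumption: ops/sec is truncated, not rounded, to an integer.
    """
    if total_sec <= 0 or loops <= 0:
        raise ValueError("total_sec and loops must be positive")
    usecs_per_op = total_sec * 1_000_000 / loops
    ops_per_sec = int(loops / total_sec)
    return usecs_per_op, ops_per_sec


if __name__ == "__main__":
    # Elapsed times inferred from the usecs/op values in the example above.
    for total_sec, loops in ((8.091833, 1_000_000), (0.016948, 1000)):
        usecs, ops = pipe_metrics(total_sec, loops)
        print(f"{usecs:.6f} usecs/op, {ops} ops/sec")
```

The short run with `-l 1000` reports a much higher cost per operation
than the default run, which is one reason to be careful when comparing
results obtained with different loop counts.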

SUITES FOR 'mem'
~~~~~~~~~~~~~~~~
*memcpy*::
Suite for evaluating performance of simple memory copy in various ways.

Options of *memcpy*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify size of memory to copy (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify function to copy (default: default).
The available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported.

-l::
--nr_loops::
Repeat memcpy invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of the gettimeofday() syscall.

*memset*::
Suite for evaluating performance of simple memory set in various ways.

Options of *memset*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify size of memory to set (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify function to set (default: default).
The available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported.

-l::
--nr_loops::
Repeat memset invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of the gettimeofday() syscall.
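
To make the `--size` unit suffixes of *memcpy* and *memset* concrete,
the Python sketch below converts a size string into a byte count. It
assumes binary multipliers (1KB = 1024 bytes) and requires an explicit
unit; perf's own parser may behave differently. The name `parse_size`
is hypothetical.

```python
import re

# Assumption: units are powers of 1024. Matching is case insensitive,
# as stated in the option description.
_UNITS = {
    "B": 1,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}


def parse_size(text: str) -> int:
    """Convert a string such as '1MB' or '4kb' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"malformed size: {text!r}")
    number, unit = match.groups()
    try:
        multiplier = _UNITS[unit.upper()]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None
    return int(number) * multiplier


if __name__ == "__main__":
    for size in ("1MB", "4kb", "512B"):
        print(size, "=", parse_size(size), "bytes")
```

Under these assumptions the default of 1MB corresponds to 1048576 bytes.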

SUITES FOR 'numa'
~~~~~~~~~~~~~~~~~
*mem*::
Suite for evaluating NUMA workloads.

SUITES FOR 'futex'
~~~~~~~~~~~~~~~~~~
*hash*::
Suite for evaluating hash tables.

*wake*::
Suite for evaluating wake calls.

*wake-parallel*::
Suite for evaluating parallel wake calls.

*requeue*::
Suite for evaluating requeue calls.

*lock-pi*::
Suite for evaluating futex lock_pi calls.


SEE ALSO
--------
linkperf:perf[1]