Overhead calculation
--------------------
When perf collects callchains, the overhead can be shown in two
columns, 'Children' and 'Self'. The 'self' overhead of an entry
(usually a function, i.e. a symbol) is calculated by adding up the
period values of all samples taken in that entry. This is the value
that perf has traditionally shown, and the sum of all the 'self'
overhead values should be 100%.

The 'children' overhead of an entry is calculated by adding the period
values of its child functions to its own 'self' overhead, so that it
shows the total overhead of higher level functions even if they don't
directly execute much. 'Children' here means functions that are called
from another (parent) function.

It might be confusing that the sum of all the 'children' overhead
values exceeds 100%. This happens because each value is already an
accumulation of the 'self' overhead of its child functions, so the
same samples are counted once for every function in the callchain.
With this enabled, however, users can find which function has the most
overhead even if the samples are spread over its children.

Consider the following example, which has three functions:

-----------------------
void foo(void) {
    /* do something */
}

void bar(void) {
    /* do something */
    foo();
}

int main(void) {
    bar();
    return 0;
}
-----------------------

In this case 'foo' is a child of 'bar', and 'bar' is an immediate
child of 'main', so 'foo' is also a child of 'main'. In other words,
'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'.

Suppose all samples are recorded in 'foo' and 'bar' only. When the
program is recorded with callchains, the usual (self-overhead-only)
output of perf report will look something like this:

----------------------------------
Overhead  Symbol
........  .....................
  60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main

  40.00%  bar
          |
          --- bar
              main
              __libc_start_main
----------------------------------

When the --children option is enabled, the 'self' overhead values of
child functions (i.e. 'foo' and 'bar') are added to those of their
parents to calculate the 'children' overhead. In this case the report
could be displayed as:

-------------------------------------------
Children      Self  Symbol
........  ........  ....................
 100.00%     0.00%  __libc_start_main
          |
          --- __libc_start_main

 100.00%     0.00%  main
          |
          --- main
              __libc_start_main

 100.00%    40.00%  bar
          |
          --- bar
              main
              __libc_start_main

  60.00%    60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main
-------------------------------------------

In the above output, the 'self' overhead of 'foo' (60%) was added to
the 'children' overhead of 'bar', 'main' and '\_\_libc_start_main'.
Likewise, the 'self' overhead of 'bar' (40%) was added to the
'children' overhead of 'main' and '\_\_libc_start_main'.

So '\_\_libc_start_main' and 'main' are shown first since they have
the same (100%) 'children' overhead (even though they have zero 'self'
overhead) and they are the parents of 'foo' and 'bar'.

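The accumulation described above can be sketched in C. This is only an
illustration of the arithmetic, not perf's actual implementation: the
names 'struct sample', 'struct entry', 'accumulate' and 'print_report'
are hypothetical, the two samples and their periods (60 and 40) are
chosen to match the example, and recursion (a function appearing more
than once in a callchain) is ignored for simplicity.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 8	/* longest callchain this sketch accepts */
#define MAX_SYMS  16	/* table capacity assumed by the example */

/* One recorded sample: frames[0] is the function that was running. */
struct sample {
	const char *frames[MAX_DEPTH];	/* leaf first, NULL-terminated */
	unsigned long period;
};

struct entry {
	const char *sym;
	unsigned long self;	/* periods sampled directly in sym */
	unsigned long children;	/* self plus periods of its callees */
};

/* Hypothetical data matching the example: 60 in foo, 40 in bar. */
const struct sample example[] = {
	{ { "foo", "bar", "main", "__libc_start_main" }, 60 },
	{ { "bar", "main", "__libc_start_main" }, 40 },
};

/* Find sym in tab, appending a zeroed entry if it is not there yet. */
static struct entry *lookup(struct entry *tab, int *n, const char *sym)
{
	for (int i = 0; i < *n; i++)
		if (strcmp(tab[i].sym, sym) == 0)
			return &tab[i];
	tab[*n] = (struct entry){ .sym = sym };
	return &tab[(*n)++];
}

/*
 * Fill tab with per-symbol totals and return the total period.
 * tab must have room for every distinct symbol in the samples.
 */
unsigned long accumulate(const struct sample *s, int ns,
			 struct entry *tab, int *n)
{
	unsigned long total = 0;

	*n = 0;
	for (int i = 0; i < ns; i++) {
		total += s[i].period;
		/* 'self': only the function that was actually running */
		lookup(tab, n, s[i].frames[0])->self += s[i].period;
		/* 'children': every function in the callchain */
		for (int d = 0; d < MAX_DEPTH && s[i].frames[d]; d++)
			lookup(tab, n, s[i].frames[d])->children += s[i].period;
	}
	return total;
}

/* Print the entries as percentages of total, in table order (unsorted). */
void print_report(const struct entry *tab, int n, unsigned long total)
{
	printf("Children      Self  Symbol\n");
	for (int i = 0; i < n; i++)
		printf("%7.2f%%  %7.2f%%  %s\n",
		       100.0 * tab[i].children / total,
		       100.0 * tab[i].self / total, tab[i].sym);
}
```

Calling 'accumulate(example, 2, tab, &n)' with a 'tab' of 'MAX_SYMS'
entries gives a total period of 100, with 'foo' at self 60 and
children 60, 'bar' at self 40 and children 100, and both 'main' and
'__libc_start_main' at self 0 and children 100. The 'self' values add
up to the total, while the 'children' values add up to more, because
each sample is counted once per function in its callchain.
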
Since v3.16 the 'children' overhead is shown by default and the output
is sorted by its values. The 'children' overhead can be disabled by
specifying the --no-children option on the command line or by adding
'report.children = false' or 'top.children = false' to the perf config
file.
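
As a rough illustration, the two settings could be written in a perf
config file as follows. This assumes the usual section/key layout, in
which 'report.children' is the 'children' key in the '[report]'
section, and a per-user config file such as ~/.perfconfig.

-----------------------
[report]
	children = false
[top]
	children = false
-----------------------

The same effect for a single run can be had with 'perf report
--no-children', without touching the config file.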