LuaJIT has an integrated statistical profiler with very low overhead. It allows sampling the currently executing stack and other parameters at regular intervals.
The integrated profiler can be accessed at three levels:
- The bundled high-level profiler, invoked by the -jp command line option.
- A low-level Lua API to control the profiler.
- A low-level C API to control the profiler.
High-Level Profiler
The bundled high-level profiler offers basic profiling functionality. It generates simple textual summaries or source code annotations. It can be accessed with the -jp command line option or from Lua code by loading the underlying jit.p module.
To cut to the chase, run this to get a CPU usage profile by function name:
luajit -jp myapp.lua
It's not a stated goal of the bundled profiler to add every possible option or to cater for special profiling needs. The low-level profiler APIs are documented below. They may be used by third-party authors to implement advanced functionality, e.g. IDE integration or graphical profilers.
Note: Sampling works for both interpreted and JIT-compiled code. The results for JIT-compiled code may sometimes be surprising. LuaJIT heavily optimizes and inlines Lua code, so there's no simple one-to-one correspondence between source code lines and the sampled machine code.
-jp=[options[,output]]
The -jp command line option starts the high-level profiler. When the application run by the command line terminates, the profiler stops and writes the results to stdout or to the specified output file.
The options argument specifies how the profiling is to be performed:
- f — Stack dump: function name, otherwise module:line. This is the default mode.
- F — Stack dump: ditto, but dump module:name.
- l — Stack dump: module:line.
- <number> — Stack dump depth (callee ← caller). Default: 1.
- -<number> — Inverse stack dump depth (caller → callee).
- s — Split stack dump after first stack level. Implies depth ≥ 2 or depth ≤ -2.
- p — Show full path for module names.
- v — Show VM states.
- z — Show zones.
- r — Show raw sample counts. Default: show percentages.
- a — Annotate excerpts from source code files.
- A — Annotate complete source code files.
- G — Produce raw output suitable for graphical tools.
- m<number> — Minimum sample percentage to be shown. Default: 3%.
- i<number> — Sampling interval in milliseconds. Default: 10ms. Note: The actual sampling precision is OS-dependent.
The default output for -jp is a list of the most CPU consuming spots in the application. Increasing the stack dump depth with (say) -jp=2 may help to point out the main callers or callees of hotspots. But sample aggregation is still flat per unique stack dump.
To get a two-level view (split view) of callers/callees, use -jp=s or -jp=-s. The percentages shown for the second level are relative to the first level.
To see how much time is spent in each line relative to a function, use -jp=fl.
To see how much time is spent in different VM states or zones, use -jp=v or -jp=z.
Combinations of v/z with f/F/l produce two-level views, e.g. -jp=vf or -jp=fv. This shows the time spent in a VM state or zone vs. hotspots. This can be used to answer questions like "Which time consuming functions are only interpreted?" or "What's the garbage collector overhead for a specific function?".
Multiple options can be combined, but not all combinations make sense (see above). E.g. -jp=3si4m1 samples three stack levels deep in 4ms intervals and shows a split view of the CPU consuming functions and their callers with a 1% threshold.
Source code annotations produced by -jp=a or -jp=A are always flat and at the line level. Obviously, the source code files need to be readable by the profiler script.
The high-level profiler can also be started and stopped from Lua code with:
require("jit.p").start(options, output)
...
require("jit.p").stop()
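As a minimal sketch of profiling only part of a program, the two calls can be wrapped around a region of interest. The workload function and the output file name profile.txt are placeholders invented for this example; the fl option string is the "lines relative to a function" mode described above.

```lua
local prof = require("jit.p")

-- Hypothetical workload standing in for the code of interest.
local function workload()
  local sum = 0
  for i = 1, 5e7 do
    sum = sum + i % 7
  end
  return sum
end

-- Same option string and output argument as for -jp=fl,profile.txt.
prof.start("fl", "profile.txt")
local result = workload()
prof.stop()  -- stops sampling and writes the report
```

Omitting the output argument should send the report to stdout, matching the behavior of the command line option.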
jit.zone — Zones
Zones can be used to provide information about different parts of an application to the high-level profiler. E.g. a game could make use of an "AI" zone, a "PHYS" zone, etc. Zones are hierarchical, organized as a stack.
The jit.zone module needs to be loaded explicitly:
local zone = require("jit.zone")
- zone("name") pushes a named zone to the zone stack.
- zone() pops the current zone from the zone stack and returns its name.
- zone:get() returns the current zone name or nil.
- zone:flush() flushes the zone stack.
To show the time spent in each zone use -jp=z. To show the time spent relative to hotspots use e.g. -jp=zf or -jp=fz.
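The following sketch shows how the zone calls might be placed in the game scenario mentioned above. The frame, update_ai, update_physics and pathfinding functions and the "PATH" zone name are hypothetical placeholders; only the jit.zone calls come from the API described here.

```lua
local zone = require("jit.zone")

-- Placeholder subsystems; real code would do actual work here.
local function pathfinding() end
local function update_physics() end

local function update_ai()
  zone("PATH")   -- nested zone: the stack is now AI, PATH
  pathfinding()
  zone()         -- pop PATH, back to AI
end

local function frame()
  zone("AI")
  update_ai()
  zone()         -- pop AI

  zone("PHYS")
  update_physics()
  zone()         -- pop PHYS
end

frame()
assert(zone:get() == nil)  -- every push was matched by a pop
```

Every push needs a matching pop. If an error can escape between the two, the stack is presumably left unbalanced, and zone:flush() can be used to reset it.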
Low-level Lua API
The jit.profile module gives access to the low-level API of the profiler from Lua code. This module needs to be loaded explicitly:
local profile = require("jit.profile")
This module can be used to implement your own higher-level profiler. A typical profiling run starts the profiler, captures stack dumps in the profiler callback, adds them to a hash table to aggregate the number of samples, stops the profiler and then analyzes all of the captured stack dumps. Other parameters can be sampled in the profiler callback, too. But it's important not to spend too much time in the callback, since this may skew the statistics.
profile.start(mode, cb) — Start profiler
This function starts the profiler. The mode argument is a string holding options:
- f — Profile with precision down to the function level.
- l — Profile with precision down to the line level.
- i<number> — Sampling interval in milliseconds (default 10ms). Note: The actual sampling precision is OS-dependent.
The cb argument is a callback function which is called with three arguments: (thread, samples, vmstate). The callback is called on a separate coroutine; the thread argument is the state that holds the stack to sample for profiling. Note: do not modify the stack of that state or call functions on it.
samples gives the number of accumulated samples since the last callback (usually 1).
vmstate holds the VM state at the time the profiling timer triggered. This may or may not correspond to the state of the VM when the profiling callback is called. The state is one of: 'N' native (compiled) code, 'I' interpreted code, 'C' C code, 'G' the garbage collector, or 'J' the JIT compiler.
profile.stop() — Stop profiler
This function stops the profiler.
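As a rough illustration of start and stop together, the sketch below counts samples per VM state and ignores the stack entirely. The counts table, the on_sample callback name and the busy loop are assumptions made for this example.

```lua
local profile = require("jit.profile")

local counts = {}  -- vmstate character -> accumulated samples

-- Keep the callback cheap: one table update per invocation.
local function on_sample(thread, samples, vmstate)
  counts[vmstate] = (counts[vmstate] or 0) + samples
end

profile.start("f", on_sample)  -- function-level precision, default interval

-- Hypothetical workload to give the profiler something to sample.
local x = 0
for i = 1, 5e7 do
  x = x + i % 3
end

profile.stop()

for state, n in pairs(counts) do
  print(state, n)
end
```

The number of samples collected depends on the run time of the workload and on the timer precision of the OS, so the printed counts will vary between runs.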
dump = profile.dumpstack([thread,] fmt, depth) — Dump stack
This function allows taking stack dumps in an efficient manner. It returns a string with a stack dump for the thread (coroutine), formatted according to the fmt argument:
- p — Preserve the full path for module names. Otherwise only the file name is used.
- f — Dump the function name if it can be derived. Otherwise use module:line.
- F — Ditto, but dump module:name.
- l — Dump module:line.
- Z — Zap the following characters for the last dumped frame.
- All other characters are added verbatim to the output string.
The depth argument gives the number of frames to dump, starting at the topmost frame of the thread. A negative number dumps the frames in inverse order.
The first example prints a list of the current module names and line numbers of up to 10 frames in separate lines. The second example prints semicolon-separated module:line entries for all frames (up to 100) in inverse order:
print(profile.dumpstack(thread, "l\n", 10))
print(profile.dumpstack(thread, "lZ;", -100))
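The typical profiling run described at the start of this section could look roughly like the sketch below: stack dumps are used as hash table keys, and the table is analyzed after the profiler has stopped. The busy function, the table names and the depth of 10 frames are choices made for this example, not part of the API.

```lua
local profile = require("jit.profile")

local counts = {}  -- stack dump string -> accumulated samples
local total = 0

profile.start("f", function(thread, samples, vmstate)
  -- Function names, caller first, joined by ";" with no trailing ";".
  local key = profile.dumpstack(thread, "fZ;", -10)
  counts[key] = (counts[key] or 0) + samples
  total = total + samples
end)

-- Hypothetical workload.
local function busy(n)
  local s = 0
  for i = 1, n do s = s + i % 5 end
  return s
end
busy(5e7)

profile.stop()

-- Analysis happens after stopping, outside the callback.
local stacks = {}
for key in pairs(counts) do stacks[#stacks + 1] = key end
table.sort(stacks, function(a, b) return counts[a] > counts[b] end)

for i = 1, math.min(5, #stacks) do
  local key = stacks[i]
  print(string.format("%5.1f%%  %s", 100 * counts[key] / total, key))
end
```

The callback only builds a key and updates two counters; sorting and formatting are deferred until after profile.stop() so that they do not skew the statistics.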
Low-level C API
The profiler can be controlled directly from C code, e.g. for use by IDEs. The declarations are in "luajit.h" (see Lua/C API extensions).
luaJIT_profile_start(L, mode, cb, data) — Start profiler
This function starts the profiler. See above for a description of the mode argument.
The cb argument is a callback function with the following declaration:
typedef void (*luaJIT_profile_callback)(void *data, lua_State *L,
                                        int samples, int vmstate);
data is available for use by the callback. L is the state that holds the stack to sample for profiling. Note: do not modify this stack or call functions on this stack; use a separate coroutine for this purpose. See above for a description of samples and vmstate.
luaJIT_profile_stop(L) — Stop profiler
This function stops the profiler.
p = luaJIT_profile_dumpstack(L, fmt, depth, len) — Dump stack
This function allows taking stack dumps in an efficient manner. See above for a description of fmt and depth.
This function returns a const char * pointing to a private string buffer of the profiler. The int *len argument returns the length of the output string. The buffer is overwritten on the next call and deallocated when the profiler stops. You either need to consume the content immediately or copy it for later use.