lua-users wiki: Optimisation Coding Tips
Lua 5.1 Notes
- Locals are faster than globals. See OptimisingUsingLocalVariables.
- Memory allocation from the heap--e.g. repeatedly creating tables or closures--can slow things down.
- Short inline expressions can be faster than function calls. t[#t+1] = 0 is faster than table.insert(t, 0).
- Constant folding: x * (1/3) is just as fast as x * 0.33333333333333333 and is generally faster than x/3 on most CPUs (see multiplication note below). 1 + 2 + x, which is the same as (1+2) + x, should be just as fast as 3 + x or x + (1 + 2), but faster than x + 1 + 2, which is the same as (x + 1) + 2 and is not necessarily equivalent to the former. Note that addition of numbers on computers is generally not associative when overflow occurs, and the compiler doesn't even know whether x is a number or some other type with a non-associative __add metamethod. See LuaList:2006-03/msg00363.html. It's been reported that Roberto is seriously thinking about removing constant folding from Lua 5.2, since constant folding has been a source of bugs in Lua (though some of us really like constant folding -- DavidManura).
- Multiplication x*0.5 is faster than division x/2.
- x*x is faster than x^2.
- Factoring expressions: x*y+x*z+y*z --> x*(y+z) + y*z. Lua will not do this for you, particularly since it can't assume distributive and other common algebraic properties hold during numerical overflow.
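These tips are easiest to trust after measuring them on your own interpreter. The following is a minimal timing sketch, assuming Lua 5.1 or later; the helper name time_it and the iteration count N are illustrative and not part of the original page. Absolute timings will vary by machine and Lua version, so treat the printed numbers as indicative only.

```lua
-- time_it is a hypothetical helper: it runs fn() once and returns the
-- elapsed CPU time in seconds together with fn's result.
local function time_it(fn)
  local start = os.clock()
  local result = fn()
  return os.clock() - start, result
end

local N = 100000  -- illustrative iteration count

-- Append using the inline length-operator form.
local function append_inline()
  local t = {}
  for i = 1, N do t[#t + 1] = 0 end
  return t
end

-- Append using the library function call.
local function append_insert()
  local t = {}
  for i = 1, N do table.insert(t, 0) end
  return t
end

local t_inline, a = time_it(append_inline)
local t_insert, b = time_it(append_insert)
print(string.format("t[#t+1]: %.3fs   table.insert: %.3fs", t_inline, t_insert))

-- The arithmetic rewrites give the same value for this small input.
local x = 7
print(x * 0.5 == x / 2, x * x == x ^ 2)
```

Both append functions build the same list, so the only difference being timed is the call overhead of table.insert.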
Note that Roberto Ierusalimschy's article Lua Performance Tips from the excellent [Lua Programming Gems] book is [available online].
Lua 4 Notes
The following information concerns optimization of Lua 4 and is kept here for historical reference.
General tips on coding
(Joshua Jensen) These are some optimization strategies I use (off the top of my head):
- Local variables are very quick, since they are accessed by index. If possible, make global variables local (weird, eh?). Seriously, it works great and indexed access is always going to be faster than a hash lookup. If a variable, say GameState, needs global scope for access from C, make a secondary variable that looks like 'local GSLocal = GameState' and use GSLocal within the module. This technique can also be used for functions that are called repetitively. (See OptimisingUsingLocalVariables.)
- for loops are quite a bit faster than while loops, since they have specialized virtual machine instructions.
- In your C callback functions, use lua_rawcall() to call other functions. The overhead of a setjmp() call for exceptions (and a few other things) is avoided. I would not recommend using lua_rawcall() outside of a callback in case something goes wrong during execution. Without the setjmp() call, the error handler that exits the application is called.
- If possible, in your C functions, try to use lua_rawget() and lua_rawgeti() for table access, since they avoid the tag method checks. Be sure to use lua_rawgeti() for indexed access. It's still a hash lookup, but it's probably the fastest way to get there by index.
- In C, use lua_ref() wherever possible. lua_ref() behaves similarly to a local variable in terms of speed.
- Know that C strings passed into a Lua function (such as lua_getglobal()) from C are translated to a Lua string on entry. If a string is to be reused across multiple frames of the game, do a lua_ref() operation on it, too.
This information was written for Lua, pre v4.0 -- Nick Trout
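The local-alias tip can be sketched as follows. This is a minimal illustration written in Lua 5.1-or-later syntax rather than Lua 4; GameState and GSLocal come from the text above, while the score field and the add_points function are hypothetical names introduced only for this example.

```lua
-- GameState stays global so that C code can still reach it by name.
GameState = { score = 0 }

-- Cache the global in a local. GSLocal refers to the same table, so
-- writes through it remain visible through the global name.
local GSLocal = GameState

-- A frequently called library function can be cached the same way.
local floor = math.floor

-- add_points is a hypothetical function used only for this illustration.
local function add_points(points, times)
  for i = 1, times do
    GSLocal.score = GSLocal.score + floor(points)
  end
  return GSLocal.score
end

print(add_points(2.7, 10))  --> 20
```

One thing to watch: if GameState is later reassigned to a different table, GSLocal keeps pointing at the old one, so the alias is only safe when the global is assigned once.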
Assertions
Using the standard assert function with a non-trivial message expression will negatively impact script performance. The reason is that the message expression is evaluated even when the assertion is true. For example in
assert(x <= x_max, "exceeded maximum ("..x_max..")")
regardless of the condition (which usually will be true), a float to string conversion and two concatenations will be performed. The following replacement uses printf-style message formatting and does not generate the message unless it is used:
    function fast_assert(condition, ...)
        if not condition then
            if getn(arg) > 0 then
                assert(condition, call(format, arg))
            else
                assert(condition)
            end
        end
    end
Now the example becomes:
fast_assert(x <= x_max, "exceeded maximum (%d)", x_max)
This is the VM code generated:
    assert(x <= x_max, "exceeded maximum ("..x_max..")")
            GETGLOBAL   0   ; assert
            GETGLOBAL   1   ; x
            GETGLOBAL   2   ; x_max
            JMPLE       1   ; to 6
            PUSHNILJMP
            PUSHINT     1
            PUSHSTRING  3   ; "exceeded maximum ("
            GETGLOBAL   2   ; x_max
            PUSHSTRING  4   ; ")"
            CONCAT      3
            CALL        0 0
    fast_assert(x <= x_max, "exceeded maximum (%d)", x_max)
            GETGLOBAL   5   ; fast_assert
            GETGLOBAL   1   ; x
            GETGLOBAL   2   ; x_max
            JMPLE       1   ; to 17
            PUSHNILJMP
            PUSHINT     1
            PUSHSTRING  6   ; "exceeded maximum (%d)"
            GETGLOBAL   2   ; x_max
            CALL        0 0
Edit (April 23, 2012, by Sirmabus): The code above will not actually work with 5.1. The version below also adds some enhancements: it points back to the actual assert line number, and it falls through to a plain failure message if the assertion message arguments are wrong (using pcall()).
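    function fast_assert(condition, ...)
        if not condition then
            -- Only try to format a message if arguments were supplied.
            if next({...}) then
                -- pcall guards against bad format arguments.
                local s, r = pcall(function (...) return (string.format(...)) end, ...)
                if s then
                    -- Level 2 reports the caller's line, not this function's.
                    error("assertion failed!: " .. r, 2)
                end
            end
            error("assertion failed!", 2)
        end
    end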
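The lazy-formatting idea can also be written without building a temporary table or closure on the failure path. The following is a sketch under assumptions, for Lua 5.1 or later: lazy_assert is a hypothetical name, not a function from the original page, and it treats its second argument as the format string.

```lua
-- lazy_assert is a hypothetical variant of fast_assert. The message is
-- formatted only when the condition is false, so a passing assertion
-- costs one function call and one test.
local function lazy_assert(condition, fmt, ...)
  if not condition then
    -- pcall guards against a missing or mismatched format string.
    local ok, msg = pcall(string.format, fmt, ...)
    if ok then
      error("assertion failed!: " .. msg, 2)
    end
    error("assertion failed!", 2)
  end
  return condition
end

local x, x_max = 5, 10

-- Passing case: string.format is never called.
lazy_assert(x <= x_max, "exceeded maximum (%d)", x_max)

-- Failing case: capture the error instead of aborting the script.
local ok, err = pcall(lazy_assert, 11 <= x_max, "exceeded maximum (%d)", x_max)
print(ok, err)
```

Passing string.format directly to pcall avoids creating a new closure on each failure, which is a small saving but keeps the helper allocation-free apart from the message itself.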
Fast Unordered List Iteration
Frequently in Lua we build a table of elements such as:
table = { "harold", "victoria", "margaret", "guthrie" }
The "proper" way to iterate over this table is as follows:
    for i=1, getn(table) do
        -- do something with table[i]
    end
However if we aren't concerned about element order, the above iteration is slow. The first problem is that it calls getn(), which has order O(n) assuming as above that the "n" field has not been set. The second problem is that bytecode must be executed and a table lookup performed to access each element (that is, "table[i]").
A solution is to use a table iterator instead:
    for x, element in pairs(table) do
        -- do something with element
    end
The getn() call is eliminated as is the table lookup. The "x" is a dummy variable as the element index is normally not used in this case.
There is a caveat with this solution. If library functions tinsert() or tremove() are used on the table they will set the "n" field which would show up in our iteration.
An alternative is to employ the list iteration patch listed in LuaPowerPatches.
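For comparison, the two iteration styles look like this in Lua 5.1 or later, where the # operator replaces getn(). This is only a sketch: the function names are hypothetical, the list is called names rather than table to avoid shadowing the standard table library, and which loop is faster depends on the Lua version, so measure before relying on either.

```lua
local names = { "harold", "victoria", "margaret", "guthrie" }

-- Ordered iteration: index 1..#list, one table lookup per element.
local function total_length_indexed(list)
  local total = 0
  for i = 1, #list do
    total = total + #list[i]
  end
  return total
end

-- Unordered iteration: pairs() hands back each element directly.
-- The key is unused, so it is named "_" by convention.
local function total_length_unordered(list)
  local total = 0
  for _, element in pairs(list) do
    total = total + #element
  end
  return total
end

print(total_length_indexed(names), total_length_unordered(names))  --> 29 29
```

Because pairs() visits every key, any non-list field stored in the table (such as the "n" field mentioned above) would be visited too, so the unordered form suits tables that hold list elements only.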
Table Access
Question: It's not the performance of creating the tables that I'm worried about, but rather all the accesses to the table contents.
(lhf) Tables are the central data structure in Lua. You shouldn't have to worry about table performance. A lot of effort is spent trying to make tables fast. For instance, there is a special opcode for a.x. See the difference between a.x and a[x] ... but, like you said, the difference here is essentially an extra GETGLOBAL.
    a,c = {},"x"
            CREATETABLE 0
            PUSHSTRING  2   ; "x"
            SETGLOBAL   1   ; c
            SETGLOBAL   0   ; a
    b=a.x
            GETGLOBAL   0   ; a
            GETDOTTED   2   ; x
            SETGLOBAL   3   ; b
    b=a["x"]
            GETGLOBAL   0   ; a
            GETDOTTED   2   ; x
            SETGLOBAL   3   ; b
    b=a[c]
            GETGLOBAL   0   ; a
            GETGLOBAL   1   ; c
            GETTABLE
            SETGLOBAL   3   ; b
            END
See also: VmMerge (used to format the merged Lua source and VM code), OptimisationTips, OptimisingUsingLocalVariables
Last edited April 24, 2012 1:42 am GMT