mpmxyz

The Debugging Guide

Content

  • Introduction
  • Reproducing an Error
  • Error Messages
  • Debug Output
  • Appendix
    • List of Error Messages
    • Using a Debugger
    • Ad alert: cbrowse

Introduction
Writing code is one thing; making it work is a completely different story. More often than not something does not work as intended, and sometimes your program just crashes the instant you run it.
This can be fun and inspiring but most often it is just frustrating.
This guide is meant to introduce you to several useful debugging techniques with the primary focus being OpenComputers and Lua.

I could use your help to improve this guide. If you spot an error or find that something is missing, please tell me!

Reproducing an Error
Even with maxed out debugging skills you are going to have a hard time fixing a bug if you can't see it.
So step one in debugging is finding a way to reproduce the bug. This is something that does not necessarily require programming.

You can even do that as a user!

It is good manners to add step-by-step instructions to your bug report:
Write down every step of what you did to reach the bug. Test these instructions yourself and try to find ways to make them as short as possible.
  Maybe the work you did before opening *this* file did not have anything to do with your bug.
Also try some variations to find out if this bug needs a specific setup or if it happens in a more general use case.
  If the program had no trouble on a T3 screen but crashes on a T1 screen there might be something involving the resolution or color depth. Does it work with a T2 screen?
This makes it very easy to see the bug in action. Even if it only shows up 1 in 10 times, it helps.
  Beware of the Heisenbug which only appears when you are not watching it! ;-)

Error Messages

 


This is a program that should print 4 numbers:


function draw(v, ...)
  if v then
    print(("%i"):format(v))
    draw(...)
  end
end
draw(1, 2, 3, {4.5})

It gives you normal output...


1
2
3

...and this error message:


/test.lua:3: bad argument #1 to 'format' (number expected, got table):
stack traceback:
	[C]: in function 'format'
	/test.lua:3: in function 'draw'
	/test.lua:4: in function 'draw'
	/test.lua:4: in function 'draw'
	/test.lua:4: in function 'draw'
	/test.lua:7: in main chunk
	(...tail calls...)
	[C]: in function 'xpcall'
	machine:751: in function 'xpcall'
	/lib/process.lua:84: in function </lib/process.lua:80>

The first line is enough for most bugs.
It contains the location where the error message comes from:
  /test.lua:3 - file /test.lua, third line
It also contains the error reason:
  bad argument #1 to 'format' (number expected, got table) - You called the function 'format' but the type of its first argument - 'v' - was 'table' and not 'number' as the function expected.
  v shouldn't be a table. Where does it come from?
To understand that it might be good to read the stack traceback.
The stack is the part of memory Lua uses to remember where to jump back to after returning from a function. (It also stores local variables.)
A traceback is a human readable printout of this data. *You need to read it from the bottom to the top.*


[C]: in function 'format'
	The value is prepared within the C function 'format'. This function throws the error because it detected the wrong type of its parameters. *You can continue to read left->right/top->bottom.*
/test.lua:3: in function 'draw'
	This is the line where the (originally) fourth value is supposed to be printed. *move up*
/test.lua:4: in function 'draw'
	It is a very powerful concept in programming. *move up*
/test.lua:4: in function 'draw'
	This process of solving part of the problem and calling itself to solve the remaining problem is called 'recursion'. *move up*
/test.lua:4: in function 'draw'
	'draw' which displays its first argument and calls itself with the remaining arguments. *move up*
/test.lua:7: in main chunk
	This is the point where our file has been executed. In line 7 we called function... *move up*
(...tail calls...)
	Lua optimizes so-called 'tail calls'. You don't get additional information about them. *move up*
[C]: in function 'xpcall'
	it called a function also called 'xpcall'. But this one is written in C and not in Lua. (Therefore we don't get much info.) Of course this function also calls other functions. *move up*
machine:751: in function 'xpcall'
	It called the named function 'xpcall' defined in the builtin file 'machine'. In line 751.... *move up*
/lib/process.lua:84: in function </lib/process.lua:80>
	This is where it all started. The process library of OpenOS called a function without name created in file /lib/process.lua, line 80. In line 84 it called another function. *move up*

It is possible to get a traceback without an error by running debug.traceback() within Lua.
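
Here is a small sketch of that. The function names whereAmI and broken are made up for this example, and I am assuming that your environment exposes debug.traceback and xpcall the way standard Lua does. debug.traceback() returns the traceback as a string. In standard Lua it can also be handed to xpcall as the message handler, so an error you catch yourself still comes with a traceback.

```lua
-- 1. A traceback without any error: just ask for it.
local function whereAmI()
  local trace = debug.traceback("checkpoint reached")
  return trace
end

local trace = whereAmI()
print(trace)

-- 2. A traceback for an error you catch yourself.
-- 'broken' repeats the mistake from the example: a table is given to "%i".
local function broken()
  local text = ("%i"):format({4.5})
  return text
end

-- xpcall calls debug.traceback with the error message while the stack
-- of the failing call still exists.
local ok, report = xpcall(broken, debug.traceback)
print(ok)
print(report)
```

The second report should start with the same kind of "bad argument #1 to 'format'" line as the error message above, followed by the stack.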

 

Debug Output

 


Staring at the code and error message didn't solve the problem. We still don't know where this value really came from.
The next step is adding debug output. For simple things you can use print. For more complicated projects it might be a good idea to write to a file or even create a whole library around it.
This is the updated program:


function draw(v, ...)
  print(v) --debug output
  if v then
    print(("%i"):format(v)) --normal output
    draw(...)
  end
end
draw(1, 2, 3, {4.5})

This is the output:


1					<-debug output
1					<-normal output
2
2
3
3
table: 0x689d10

v indeed is a table on the fourth call before it is used in function 'format'.

For complete information you should print the other function arguments, too.
With so many numbers being thrown around it's best if you add the name of each value and describe what is happening:


function draw(v, ...)
  print("begin draw")
  print("v=", v)    --debug output
  print("...=", ...) --debug output
  print("#...=", select("#", ...))
  if v then
    print(("%i"):format(v))
    draw(...)
  end
  print("end draw")
end
print("before draw")
draw(1, 2, 3, {4.5})
print("after draw")

The output:


before draw
begin draw
v=	1
...=	2	3	table: 0xab54d0
#...=	3
1
begin draw
v=	2
...=	3	table: 0xab54d0
#...=	2
2
begin draw
v=	3
...=	table: 0xab54d0
#...=	1
3
begin draw
v=	table: 0xab54d0
...=
#...=	0

Here we can see that the table is already part of the four original arguments. We had better check the line where 'draw' is initially called:


draw(1, 2, 3, {4.5})

That is easily fixed:


draw(1, 2, 3, 4.5)

Now the output is:


before draw
begin draw
v=	1
...=	2	3	4.5
#...=	3
1
begin draw
v=	2
...=	3	4.5
#...=	2
2
begin draw
v=	3
...=	4.5
#...=	1
3
begin draw
v=	4.5
...=
#...=	0
4
begin draw
v=	nil
...=
#...=	0
end draw
end draw
end draw
end draw
end draw
after draw

Looks good, doesn't it? Let's remove the debug output again:


function draw(v, ...)
  if v then
    print(("%i"):format(v))
    draw(...)
  end
end
draw(1, 2, 3, 4.5)

Output:


1
2
3
4

There is another bug. Can you spot it?

 

4.5 is printed as 4.
In the previously removed debug output we could see that the value of v is 4.5 but on the screen it is 4.
There is either a formatting error or print() is not printing everything.
assert() can help here: When the first argument is anything other than false or nil, it does nothing; otherwise it throws an error with its second argument as the message.
So we can split line 3 into two parts and add a test in the middle which tells us if the formatting code works as intended.



function draw(v, ...)
  if v then
    local text = ("%i"):format(v)
    assert(tonumber(text) == v, "Formatting didn't work!")
    print(text)
    draw(...)
  end
end
draw(1, 2, 3, 4.5)

Output + Error Message:



1
2
3
/test.lua:4: Formatting didn't work!

It is a problem with formatting. The format "%i" only outputs integer numbers. One could use "%f" instead to allow for floats.
Now everything works. You can remove the assert.
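
Which replacement format fits best depends on the output you want. The following sketch compares a few standard string.format specifiers on the four values from the example; the helper name show is made up. "%i" is left out on purpose: as far as I know, newer Lua versions (5.3 and up) raise an error for 4.5 instead of silently cutting off the fraction.

```lua
local values = {1, 2, 3, 4.5}

-- Formats every value with the given format string and joins the results.
local function show(formatString)
  local parts = {}
  for i, v in ipairs(values) do
    parts[i] = formatString:format(v)
  end
  return table.concat(parts, " ")
end

local fixed = show("%f")      -- always six digits after the decimal point
local oneDigit = show("%.1f") -- exactly one digit after the decimal point
local compact = show("%g")    -- shortest form, no trailing zeros

print(fixed)    --> 1.000000 2.000000 3.000000 4.500000
print(oneDigit) --> 1.0 2.0 3.0 4.5
print(compact)  --> 1 2 3 4.5
```

"%g" is probably the closest to what the original program wanted: whole numbers stay short and 4.5 keeps its fraction.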

It is good practice to write modular code and to avoid overly long lines that do everything at once.
That way you can always squeeze in some debugging code.
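
Writing debug output to a file was mentioned above as an option for bigger projects. This is a minimal sketch of such a helper, not a finished library. The name debugLog and the file name debug.log are made up, and it assumes the standard io library is available and the location is writable.

```lua
local LOG_PATH = "debug.log" -- hypothetical file name, pick any writable path

-- Appends all arguments as one tab separated line to the log file.
-- Works like print(), but the screen stays clean.
local function debugLog(...)
  local parts = {}
  for i = 1, select("#", ...) do
    -- select("#", ...) also counts nil values, so they don't get lost
    parts[i] = tostring((select(i, ...)))
  end
  local file, reason = io.open(LOG_PATH, "a")
  if not file then
    return nil, reason
  end
  file:write(table.concat(parts, "\t"), "\n")
  file:close()
  return true
end

debugLog("v=", 4.5)
debugLog("v=", nil, "after nil")
```

Opening and closing the file for every line is slow, but it means the log is complete even if the program crashes right after the call.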

It isn't always practical to write print() in every corner of your program.
If your bug originates from a single error you can use the binary search algorithm:

  1. The error could be anywhere from the beginning of the program to the point where you noticed it. (e.g. crash, invalid output, wrong redstone signal etc.)
  2. Add a test in the middle of the area of code that could contain the error.
  3. If the test is passed the error is in the following half. Otherwise it is in the half before the test.
  4. Unless you found the code that caused the error, you continue with step 2 but with a smaller area than before.

Here is an example:
 



function a(x)
  return x * 2
end
function b(x)
  return x * 3
end
function c(x)
  return x-x
end
function d(x)
  return -x
end

x = 15
x = a(x)
x = b(x)
x = d(x)
x = a(x)
x = c(x)
x = a(x)
x = d(x)

assert(x~=0, "x is never allowed to be 0")
print(1 / x)

The search area covers these lines:



x = 15   --x It could be any line marked with an x.
x = a(x) --x
x = b(x) --x
x = d(x) --x
x = a(x) --x
x = c(x) --x
x = a(x) --x
x = d(x) --x

Insert a test in the middle:



x = 15
x = a(x)
x = b(x)
x = d(x)
assert(x~=0, "It is the previous half.") --passed
x = a(x) --x We halved the search area.
x = c(x) --x
x = a(x) --x
x = d(x) --x

And again:



x = 15
x = a(x)
x = b(x)
x = d(x)
x = a(x) --x We halved the search area again.
x = c(x) --x
assert(x~=0, "It is the previous half.") --failed
x = a(x)
x = d(x)

And one last time:



x = 15
x = a(x)
x = b(x)
x = d(x)
x = a(x)
assert(x~=0, "It is the previous half.") --passed
x = c(x) --x This is the only line remaining. What is c doing?
x = a(x)
x = d(x)

If c were a more complex function you could do another binary search over the whole function.
In this example it is quite obvious that x-x results in a value of 0.
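
The manual search above can also be written down as code. This is only a sketch under some assumptions: the steps are put into a table (steps), the test is a function (isValid), and once x has become invalid it stays invalid. All of these names are made up for the example.

```lua
local function a(x) return x * 2 end
local function b(x) return x * 3 end
local function c(x) return x - x end
local function d(x) return -x end

-- the same sequence of calls as in the example program
local steps = {a, b, d, a, c, a, d}

-- Runs only the first 'count' steps and returns the intermediate x.
local function runSteps(count)
  local x = 15
  for i = 1, count do
    x = steps[i](x)
  end
  return x
end

-- the test from the example: x is never allowed to be 0
local function isValid(x)
  return x ~= 0
end

-- Binary search for the first step after which the test fails.
-- Precondition: x is valid after 0 steps and invalid after all steps.
local function findFirstBadStep()
  local good, bad = 0, #steps
  while bad - good > 1 do
    local middle = math.floor((good + bad) / 2)
    if isValid(runSteps(middle)) then
      good = middle -- error is in the following half
    else
      bad = middle  -- error is in the half before the test
    end
  end
  return bad
end

local badStep = findFirstBadStep()
print(badStep) --> 5
```

Step 5 is the call to c, the same line the manual search ended up with.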

 

 

Appendix

List of Error Messages

  • Syntax Error: detected when loading code
    • syntax error near 'x'
    • 'x' expected near 'y'
    • 'a' expected (to close 'b' at line c) near 'd'
    • Most of the time it's just a missing 'end' or bracket. That's easy to see if you use indentation.
      Sometimes it is a bit harder to find, but b()=2 clearly is not correct.
    • <eof> means the end of file
  • Runtime Error: detected when the code is executed
    • Runtime errors aren't as easy. They can hide in rarely used code and there is a variety of them:
    • bad argument #1 to 'func' ('type' expected, got 'wrongType')
      • You called a function but one of the arguments you passed has the wrong type.
        You should check the function's documentation and compare it to how you use the function in your program.
    • attempt to index local/global/field 'name' (a 'type' value)
    • attempt to index a 'type' value
      • You tried to access a table field, but the value you indexed was not a table (but nil, a number, a function etc.).
        Check the return value/variable you are using and work backwards to find out where a check has failed, or where the value has been overwritten or was never written.
    • table index is NaN/nil
      • You tried to write to a table using an invalid index. Everything is allowed to be an index except nil and NaN. (Reading with such an index does not raise this error; it just returns nil.)
    • attempt to call local/global/field 'name' (a 'type' value)
    • attempt to call a 'type' value
      • The return value/variable you tried to call as a function was neither a function nor a value with a __call metamethod.

Using a Debugger
In other environments it is possible to use an external program to help with debugging, a 'debugger'.
It displays and modifies the values of variables and allows you to stop the program at previously defined breakpoints.
After stopping it is possible to step through the program line by line to check what it is doing.
Just to be clear: This is possible in normal Lua, but OpenComputers already uses the required feature (the debug hooks) itself to limit execution time.
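
To give an idea of what debuggers build on, here is a sketch for plain Lua outside of OpenComputers. It assumes the standard debug.sethook function is available, which is not the case inside OpenComputers. The names lineHook, visitedLines and work are made up. A line hook is called before every new line of code is executed, which is the basis for stepping and breakpoints.

```lua
local visitedLines = {}

-- Called by Lua before each new line is executed.
local function lineHook(event, line)
  visitedLines[#visitedLines + 1] = line
end

local function work()
  local x = 1
  x = x + 1
  return x
end

debug.sethook(lineHook, "l") -- "l" = call the hook on every line
local result = work()
debug.sethook()              -- no arguments: remove the hook again

print(result, #visitedLines)
```

A real debugger would compare the line number with its list of breakpoints inside the hook and wait for user input when one is hit.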

Ad Alert: cbrowse
You can easily add advanced debug in/output by installing my program cbrowse.

The debug interface works by calling this:

local a, b, c = 1, "A", {test = true}
require"cbrowse".view(a, b, c)

Tables can be manipulated. If you want to change the value of a variable you have to temporarily use a table and manually export the result back to the local variable.

local a = 1
local _t = {a = a}
require"cbrowse".view(_t)
a = _t.a

But there is one drawback: None of the events happening while cbrowse runs reach your event.pull in your program. Event listeners work though.
