How the Python Tutor visualizer can help students in your C or C++ courses
Summary: This article is meant for professors who teach C or C++ programming courses. Despite its name, Python Tutor is a widely used web-based visualizer for C and C++. It is meant to help students in introductory and intermediate-level C/C++ courses. Notably, it uses Valgrind to perform memory-safe run-time traversal of data structures, which lets it display data more accurately than gdb or printf debugging. For instance, it can precisely visualize critical concepts such as pointers, uninitialized memory, out-of-bounds errors, nested arrays/structs/unions, type punning, and bit manipulation. Both the C visualizer and C++ visualizer will always remain free to use.
Python Tutor is a free tool that has been used by tens of millions of people since 2010 to visualize and debug code step-by-step, mostly for introductory courses (e.g., CS1/CS2). Despite its name, it also visualizes C and C++ code (in addition to Java and JavaScript) to help students understand critical concepts and debug homework assignments.
This article shows instructors how Python Tutor can illustrate key concepts from a wide range of C and C++ courses. If you think this tool may be helpful for your staff or students, please share these links in relevant course materials, chat groups, mailing lists, discussion forums, or social media:
- C visualizer: https://pythontutor.com/c.html
- C++ visualizer: https://pythontutor.com/cpp.html
(Also, if you teach in Java, check out what the Java visualizer can do as well.)
Memory-Accurate Representations of Data
One of the most distinctive yet challenging aspects of learning C or C++ (rather than, say, Python or Java) is that we actually care about where data resides in memory.
This example visualization shows data in the globals area, stack, heap, and read-only memory regions (denoted by the red "this is in read-only storage" label). You can step back and forth using the slider or buttons under the code:
The visualizer renders C strings as null-terminated char arrays (with a '\0' at the end of each). Note how s1 is a pointer to a string in read-only memory (because it's a string literal), whereas s3 is a pointer to a string on the heap due to strdup(). And s2 is an inline char array within the stack frame of main. The printf line prints all three as strings, so without this visualization it's hard to tell where each resides in memory.
Now toggle the "C/C++ details" selector at the bottom-right corner (under the stack frame of main) to "show memory addresses." The visualization now shows the memory address where each global/stack/heap value resides. Notice how s1 has the pointer value 0x400644 and s3 has the value of 0x5402040; these are the memory addresses of the char arrays they each point to.
For more detail, choose "byte-level view of data" to see the contents of each raw byte of data in both hex and binary. This is useful when teaching low-level memory operations such as bit shifting or masking. See "Binary-Level View of Data" for another example.
Accurately Showing Uninitialized Memory
If you use gdb or print statements to display run-time data, you will see garbage values if a block of memory is uninitialized (i.e., not assigned a value yet). This can be misleading to novices who may think those are real values. In contrast, the visualizer uses Valgrind to track exactly which bytes are uninitialized so garbage values aren't shown.
Using the same example as above, if you rewind back to Step 2, you'll see a bunch of ? representing uninitialized values on the stack when main() first starts running:
Then if you step forward by clicking "Next >", each ? will progressively fill in with the data initialized at each execution step.
Arrays and Pointers
Here's an example showing a stack array and three pointers into the middle of it:
To see the exact pointer values, toggle the "C/C++ details" selector at the bottom right to "show memory addresses."
Structs, Unions, and C++ Objects
We've already seen arrays in the above example. Structs and unions render similarly, and can themselves contain nested arrays/structs/unions. For instance, here is a pointer to a heap-allocated array of 3 Person structs, each containing an inline character array (firstName) and a nested struct (birthday):
For unions, the visualizer shows how all members share the same memory address. Here's an example contrasting structs and unions:
Now click "Next >" once to run line 18. Since all union fields share the same memory address (0xFFF000BD4), when line 18 is run, all those fields get initialized at once. In contrast, note how each struct field has its own separate address, so initializing one does not automatically initialize all the others.
And as always, toggle "C/C++ details" to "byte-level view of data" to see more details about what is going on at the binary level.
Here's an example of C++ classes showing two Rectangle objects (one global and one on the stack), a call to the copy constructor (triggered by rect2 = rect1 in Step 2), and the this pointer in the display() member function in Step 10:
Different Kinds of Function Parameter Passing
Novices may struggle to understand the different ways that parameters can be passed to functions. To clarify, here is a visualization of passing parameters by value, by pointer, and by C++ reference:
Note how each function call is visualized as a stack frame underneath main, and it shows whether the x parameter is a copy of or pointer to myNumber.
Accurately Showing Array Out-of-Bounds Errors
The visualizer uses Valgrind to detect and report out-of-bounds errors. For instance, let's say you're walking a pointer along a heap array of ints. What happens when you overflow or underflow? The pointer ends up pointing to a skull emoji 💀 next to the array since the address is in unallocated memory. Step through this example to see:
Here's the exact same example, except now I've toggled the "C/C++ details" selector to "show memory addresses." This lets you see the memory address of each heap array element above it in gray (e.g., the first element is at 0x5402040) ...
... and when you take each step to do pointer arithmetic, you can see exactly what value the pointer p holds and when it under/overflows the array.
Now what happens when p points to a global array instead of a heap-allocated one? Sometimes when you do pointer arithmetic, it can overflow into the spot where a neighboring global variable resides! Step through this example to see how:
The first overflow shows a skull emoji 💀 at address 0x601044, but then p++ advances the pointer to 0x601048, which happens to be the start of the arr2 global array in memory. This example can show students both the raw power and danger of using pointers.
Type Punning and Misaligned Memory Accesses
Here's a more advanced example that shows the level of detail that the visualizer captures. Here arr is initialized to a heap array with values 65, 66, and 67 (which correspond to the ASCII values for the characters A, B, C, respectively). Now we perform type punning by assigning char* s to arr so that both point to the same block of heap memory:
Now you can view this block of memory as either an array of int (through the arr pointer) or an array of char (through the s pointer). Click the "type punning: [switch views]" link right below the array to switch between the two views.
If you're in the int array view and step through s++ repeatedly, you'll see a bunch of skulls since s is often pointing into the middle of an int element. But if you switch to the char array view, then each s++ is always properly aligned with the char-sized array elements.
Here's the same example as above, except with "C/C++ details" toggled to "byte-level view of data." Now if you step through s++, note how it points into the middle of each int array element since s++ advances 1 byte at a time whereas each int is 4 bytes wide:
Binary-Level View of Data
To illustrate low-level concepts such as bit shifting, masking, and integer over/underflows, you can toggle "C/C++ details" to "byte-level view of data." This will show all bytes in memory as both hex and binary. Here is an example that shows an unsigned int wrapping around from UINT_MAX (4294967295) back to 0 when you run x++:
This binary-level view makes it clear what's happening under the hood. There's a lot displayed on-screen in this view, so here's a brief guide:
- The memory address where x resides in the stack is 0xFFF000BDC.
- Since an unsigned int is 4 bytes on this machine, each of the four bytes is displayed at the bottom, with labels 0xFFF000BDC:, 0xFFF000BDD:, 0xFFF000BDE:, and 0xFFF000BDF:, which are four contiguous bytes starting at 0xFFF000BDC.
- At Step 2, nothing has been initialized yet, so the data in all bytes are labeled as ?.
- At Step 3, x gets assigned the value of UINT_MAX, which fills up all bytes with 1's. The hex value for each byte is 0xFF and the binary is 11111111.
- At Step 5, x++ forces the bytes to wrap around from all 1's to all 0's, which is why you see 0x00 00000000 for all bytes (again hex + binary).
- At Step 7, x++ increments the value by 1, so the lowest byte now has the value 0x01 in hex and 00000001 in binary.
BONUS: Pointer Inception!
Pointers can point to other pointers, even through multiple levels of function calls. How many levels deep can we go?
Please Help Spread The Word!
The C and C++ visualizers in Python Tutor can help your students understand and debug a variety of code that they encounter in introductory or intermediate-level courses.
Feel free to share these links in relevant course materials, chat groups, mailing lists, discussion forums, social media, or anywhere else:
- C visualizer: https://pythontutor.com/c.html
- C++ visualizer: https://pythontutor.com/cpp.html