What is your favorite debugging anecdote?
My favorites are from people who do not understand the operating system they are programming for. Two incidents from my consulting days stand out.
A customer needed to make decisions about machine operation based on the state of digital I/O (DIO), running on a priority-based real-time operating system. They wrote a program that checked the DIO at 25 ms intervals and then “signaled” another program when particular DIO values changed (the “signal” was actually an OS mechanism implemented as a counter and received as a message). Since the programmer knew the initial state, any signal received was interpreted as a change in the actual value, and a decision was made. The problem: the decision was based on the signal being received rather than on checking the current state of the DIO. The customer had not adjusted the priority of their program properly, so it would (occasionally) be interrupted for seconds at a time. During that time, several changes could occur on the same DIO port, and multiple “signals” would queue up indicating a change of port state. However, when the program woke up it received only the first signal and made a decision based on what the programmer THOUGHT the current value of the DIO was rather than the actual value at that time. This resulted in some “unusual” (and often damaging) behavior of the equipment. It was a bad design in the first place (check the actual state rather than assuming you never missed a state-change signal, since a missed signal leaves your own internal record of the state wrong), but they also failed to understand how the underlying OS signalling mechanism worked (plus the scheduling mechanism).
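To make the failure mode concrete, here is a minimal sketch in Python. It is not the customer's code or the actual OS API: `DioPort`, `CoalescingSignal`, and the helper functions are hypothetical stand-ins, and the model assumes that several raised signals are delivered to the sleeping program as a single wake-up.

```python
class DioPort:
    """Hypothetical stand-in for one bit of a digital I/O port."""

    def __init__(self, value=0):
        self.value = value

    def read(self):
        return self.value


class CoalescingSignal:
    """Hypothetical OS signal: raises are counted, but delivered as one wake-up."""

    def __init__(self):
        self.pending = 0

    def raise_signal(self):
        self.pending += 1

    def receive(self):
        """Return True if at least one raise was pending, then clear the count."""
        had_signal = self.pending > 0
        self.pending = 0
        return had_signal


def assumed_state_after_wakeup(last_known, signal):
    """Flawed approach: treat one received signal as exactly one toggle."""
    if signal.receive():
        return 1 - last_known
    return last_known


def actual_state_after_wakeup(port, signal):
    """Safer approach: the signal is only a hint to re-read the port."""
    signal.receive()
    return port.read()


def simulate(changes):
    """Apply port changes while the consumer is stalled, then wake it up.

    Returns (state the flawed consumer assumes, state actually read).
    """
    port = DioPort(0)
    flawed_sig = CoalescingSignal()
    safe_sig = CoalescingSignal()
    for new_value in changes:
        if new_value != port.value:  # the poller signals only on a change
            port.value = new_value
            flawed_sig.raise_signal()
            safe_sig.raise_signal()
    return (assumed_state_after_wakeup(0, flawed_sig),
            actual_state_after_wakeup(port, safe_sig))
```

With `simulate([1, 0])` the port toggles twice while the consumer is stalled. The flawed consumer concludes the port is now 1, while re-reading the port gives 0. Treating the signal as "something changed, go look" rather than "the value flipped exactly once" removes the dependence on never missing a signal.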
Another customer wrote a program that controlled a motor. Again, the customer was running on a real-time, priority-driven OS, but one that defaulted to decay-based scheduling. The customer did not realize this, so the motor would work fine for 5 minutes, then shut down for 5 minutes as the control program dropped in priority, then work again, and so on. Again, a failure to understand the underlying OS and how it worked (although it is a valid argument that the OS should not have defaulted to decay-based scheduling, it was well documented that it did, along with recommendations to change the scheduling for any program needing real-time scheduling).
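The OS in this story is not named, so the following is only an illustration of the general idea of opting out of a default scheduling policy. It assumes a Linux-like system where Python exposes the POSIX scheduling calls; `request_fixed_priority` and the priority value 10 are hypothetical choices, and the call normally needs elevated privileges.

```python
import os


def request_fixed_priority(priority=10):
    """Try to switch the calling process to fixed-priority FIFO scheduling.

    Returns True if the process ends up under SCHED_FIFO, and False if the
    platform lacks the call or the request is refused (for example, for
    lack of privileges).
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # platform does not expose POSIX scheduling calls
    try:
        # pid 0 means "the calling process"
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        return False  # includes PermissionError when not privileged
    return os.sched_getscheduler(0) == os.SCHED_FIFO
```

A control program would call something like this once at startup and refuse to drive the hardware if it returns False, rather than silently running under whatever policy the OS happens to default to.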