Image courtesy of Jonathan Lidbeck

Self-Explaining Code

What Just Happened?

Ever see that classic scene in a movie where the main character is trying to impress someone at a restaurant, and the waitress comes back to the table only to make a big show of cutting up his credit card? When asked why she cut it up, the waitress responds, "I don't know. The machine only said to cut up the card." That scene signals either the start of a downward spiral for the character or a very poignant event during it. It signifies the character's loss of control and the journey to the bottom so they can overcome things later.

Now imagine that feeling happening to your users every time they try to use your software. When they are confronted with something that doesn't match their expectations, they feel a loss of control over the situation. By the time they call you for support, they are upset and dealing with some uncomfortable feelings. And the situation starts all over when the customer service rep tries to use the internal system to view the customer's info and feels a similar loss of control when they can't explain why the info doesn't match what is expected.

The systems we build impact people and their emotional state constantly. As a software developer you are occasionally brought in to help explain the discrepancies. This often leads to users placing their frustration on us as we act as a proxy for the system. Maybe instead of spending time on production support we can help the software grow up and answer for itself.

When I Was A Child

The default mode for current software systems is that of a toddler. You can ask it a question or tell it to do something, but when you ask it why it gave the answer it did, it doesn't know. Or it answers "because".

Let's look at a concrete example of something everyone loves. Below is a screen for a loan repayment website.

Loan Repayment UI

Spot the problem yet?

The outstanding loan amount is still $950 despite the user having made three $25 payments. The user would obviously be upset and might call our customer support to find out why they still owe $950 instead of $925. If you don't think this is a problem, then add zeros to the end of all the numbers until you get uncomfortable.

Here is the code for that screen.
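The original listing isn't reproduced in this copy of the post, so what follows is only a minimal sketch of what such code might look like, written in Python. The `Payment` class, the `outstanding_balance` function, the $1000 starting principal, and the first two payment dates are all assumptions made for illustration. Only the three $25 payments, the $950 balance, and the unposted 4/14 transaction come from the post itself.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class Payment:
    """Hypothetical payment record; the field names are assumptions."""
    paid_on: date
    amount: Decimal
    status: str  # "POSTED" or "PENDING"


def outstanding_balance(principal, payments):
    """Subtract every posted payment from the principal."""
    balance = principal
    for payment in payments:
        # Pending payments are skipped silently. Nothing records why.
        if payment.status == "POSTED":
            balance -= payment.amount
    return balance


# Assumed data: a $1000 loan with three $25 payments, one still pending.
payments = [
    Payment(date(2017, 4, 1), Decimal("25"), "POSTED"),
    Payment(date(2017, 4, 7), Decimal("25"), "POSTED"),
    Payment(date(2017, 4, 14), Decimal("25"), "PENDING"),
]

print(outstanding_balance(Decimal("1000"), payments))  # prints 950
```

The answer is correct, but the reason for it lives only in the `if` statement.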

Now this example is purposely over-simplified but for the most part this is standard code. Control flow and looping were the developer's main concerns and making it work was the definition of done.

When brought in to explain what the code is doing the developer might wave his hands and talk about the posted status flag or some other technical view of the business domain. And that's great that we got the answer for the user but that whole process was expensive and provided a terrible customer service experience.

I Put Away Childish Things

The root of the problem is that the software has no concept of what it is doing. It is acting purely as programmed and provides no explanation when things go into unexpected territory. As the demands on systems have grown we still find ourselves building them out of the same control flow and logic statements. Let's take a look at an extremely simple example of having the system grow up a bit.

Now when the query string why=True is sent to the page, the code that provides an explanation of what it is doing will execute. The explanations are then attached to the result and, in this case, displayed on the page along with the loan data.
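A minimal sketch of that idea in Python might look like the following. It is not the original listing: the function names, the tuple layout of a payment, the wording of the explanations, the $1000 principal, and the first two dates are assumptions for illustration. The `why=True` switch and the unposted 4/14 payment come from the post.

```python
from datetime import date
from decimal import Decimal
from urllib.parse import parse_qs


def outstanding_balance(principal, payments, why=False):
    """Return (balance, explanations). Explanations stay empty unless why is set."""
    balance = principal
    explanations = []
    for paid_on, amount, status in payments:
        day = f"{paid_on.month}/{paid_on.day}"
        if status == "POSTED":
            balance -= amount
            if why:
                explanations.append(
                    f"Payment of ${amount} on {day} has posted, so it reduces the balance."
                )
        elif why:
            explanations.append(
                f"Payment of ${amount} on {day} is {status}, so it does not affect the balance yet."
            )
    return balance, explanations


def handle_request(query_string, principal, payments):
    """Hypothetical page handler: attach explanations only when why=True is sent."""
    params = parse_qs(query_string)
    why = params.get("why", ["False"])[0] == "True"
    balance, explanations = outstanding_balance(principal, payments, why)
    return {"balance": balance, "explanations": explanations}


# Assumed data: a $1000 loan with three $25 payments, one still pending.
PAYMENTS = [
    (date(2017, 4, 1), Decimal("25"), "POSTED"),
    (date(2017, 4, 7), Decimal("25"), "POSTED"),
    (date(2017, 4, 14), Decimal("25"), "PENDING"),
]

result = handle_request("why=True", Decimal("1000"), PAYMENTS)
for line in result["explanations"]:
    print(line)
```

Without `why=True` the handler returns the same balance and an empty list, so the extra work is only done when someone asks for it.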

The UI now looks like:

Loan Repayment UI

Now we can see that the transaction on 4/14 hasn't posted, so it won't affect the balance. You might say we should have just added the transaction status to the interface so the user could see that for themselves, but that's not the point. It's not always possible or useful for the UI to display all the data involved in a decision. There's a reason that your bank account balance is one field on the page and not a list of transactions that you have to add up for yourself. This example is a bit contrived, but I bet you can think of at least one time when you wished you could ask a service what the heck it was doing instead of trying to play detective with the code on one monitor and the bug report screenshot on the other.

Isn't This Just Good Logging?

Yes. This example was built with a few constraints in mind. First, it required no changes to the language or runtime. Second, it had to be lightweight and somewhat performant. Third, previous experiences influence future ideas, so yes, it looks like familiar logging code. The difference here is that instead of being tucked into some obscure file that you have to correlate by time or user id, the explanation is returned with the call you make. This makes troubleshooting much simpler, since you don't need a way to get a file off a production server or entries out of a database. It also saves you from complicated correlation.

And it doesn't just help in production. Imagine if a failing test printed out the explanation for the request instead of a cryptic "975 does not equal 950". Imagine if the compiler could raise an error that there was no explanation for PENDING transactions and that the system would not be able to explain how it handles them. If this concept were built into the language and runtime itself, then we would start to get closer to Leslie Lamport's specifications and reasoning about systems.
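The failing-test idea can be approximated today without any language support. The sketch below is in Python and is only an illustration: `outstanding_balance`, `check_balance`, the message format, and the sample data are all assumptions, not code from the original example.

```python
from decimal import Decimal


def outstanding_balance(principal, payments):
    """Return the balance together with an explanation of each step."""
    balance = principal
    explanations = []
    for label, amount, status in payments:
        if status == "POSTED":
            balance -= amount
            explanations.append(f"{label}: posted, subtracted ${amount}")
        else:
            explanations.append(f"{label}: {status}, not subtracted")
    return balance, explanations


def check_balance(expected, principal, payments):
    """Return None on success, or a failure message that carries the explanation."""
    balance, explanations = outstanding_balance(principal, payments)
    if balance == expected:
        return None
    return "\n".join([f"expected {expected}, got {balance}", *explanations])


# Assumed data: a $1000 loan with three $25 payments, one still pending.
payments = [
    ("4/1", Decimal("25"), "POSTED"),
    ("4/7", Decimal("25"), "POSTED"),
    ("4/14", Decimal("25"), "PENDING"),
]

# A test that wrongly expects all three payments to count.
print(check_balance(Decimal("925"), Decimal("1000"), payments))
```

The failure message now ends with the line about the 4/14 payment, which is the part the bare numbers leave out.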

Just The Beginning

The example above is really just a start. It's a poke at the current state of systems that do what they are told but tell us nothing about why they do it. It happened to be a web UI in this case, but the same idea could apply to troubleshooting a misbehaving car brake or a closed-source tractor's software. Software needs to be more transparent and self-explaining. There need to be more options than calling it a black box or telling people to read the source if they want answers.

Let's help software grow up.




Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© 2017 Frank Meola
