# Problems & [Meta] Problem Solving

My sparse thoughts on problem solving.

## Safe to Fail

Solving a problem where failure is both likely and costly is a bad position to be in: the solution needs to be right first time, and there is no way to be sure it will be.

Needing to be right in one go is much slower than rapid, low-cost iteration.

Finding ways to reduce or remove that risk creates a feedback loop, and an appetite to try.

### [Example] Making failure safe: Upgrading a database client library

What is a database client library? For the purposes of this example, a piece of software critical to the application’s function.

For a long time a team I worked with had not been able to upgrade their database client library, leaving them several versions behind with known vulnerabilities and reliability concerns.

It was critical for them to upgrade. But every time they tried, it went badly: memory shot through the roof and the application crashed.

Here’s the approach they were taking: swap in the new library for the old, everywhere at once, deploy, and when memory blew up, revert the entire change.

There are a few evident issues here: every attempt puts customers at risk, and every revert discards all the work of that attempt.

The wasted work comes from running only one version or the other at a time, and is therefore inherent to the big-bang approach. Big-bang approaches are always risky.

I’m not an expert on database client libraries. I knew less about them, and less about the specifics of this problem, than anyone else in the team. But what I did know is how to break this problem into something safer and more manageable.

I refactored the way the current library was used, to encapsulate it into a single place. This abstracted which library was being used from the rest of the application. (This is, besides all else, good design practice.)

Next I added a “client manager” which could, query by query, container by container, user by user, decide which library to use. This could be controlled by a feature flag and updated instantly.
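The “client manager” pattern above can be sketched in a few lines. This is a hypothetical Python sketch, not the team’s actual code: `OldClient`, `NewClient`, and the flag lookup are invented stand-ins for the two library versions and a feature-flag service.

```python
# Hypothetical sketch: both library versions sit behind one interface,
# and a feature flag decides, user by user, which one handles each query.

class OldClient:
    """Stand-in for the old database client library."""
    def query(self, sql):
        return f"old:{sql}"

class NewClient:
    """Stand-in for the upgraded database client library."""
    def query(self, sql):
        return f"new:{sql}"

class ClientManager:
    """Single choke point for database access; the rest of the app sees only this."""

    def __init__(self, flag_enabled):
        # flag_enabled: callable(user_id) -> bool, e.g. backed by a feature-flag service
        self._old = OldClient()
        self._new = NewClient()
        self._flag_enabled = flag_enabled

    def query(self, user_id, sql):
        client = self._new if self._flag_enabled(user_id) else self._old
        return client.query(sql)

# Roll out to a single test user first; flip the flag off instantly if memory spikes.
manager = ClientManager(flag_enabled=lambda user_id: user_id == "test-user")
print(manager.query("test-user", "SELECT 1"))    # routed to the new library
print(manager.query("customer-42", "SELECT 1"))  # still on the old library
```

Because the flag is evaluated per user per query, the blast radius of a bad rollout is one user, and recovery is a flag flip rather than a revert and redeploy.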

I pushed this change, which had no impact whatsoever on the customer or the application. I had not solved the problem of upgrading the library. I hadn’t done any investigation into the upgrade at all.

I showed the most junior engineer on the team the feature flag, the client manager, and how to use a memory profiler. Then I left them to it.

Within a week the engineer had plugged in the new library alongside the old, rolled it out for their own user one query at a time, and used the profiler to identify the memory leak.

They shipped the new client library within another 2 days to all customers, rolling out gradually, with no further incidents. The reliability issues they had been battling for more than 2 years were gone.

## Tight Feedback Loops

For learning, tight feedback loops are more effective than knowing the theory.

Neural networks are proving this today: agents with no knowledge of a game’s rules, given fast reward and cost feedback, are able to excel at the game.

Cut the length of your feedback loop in half and you’ll progress rapidly.

### [Example] Running things locally

As a software engineer, being able to run your program immediately on your machine means you can get feedback rapidly.

Reading a doc on how the system works, or reasoning at length about whether your change will work, is hundreds or thousands of times slower at developing understanding than this feedback loop.

If this feedback loop slows down, that soon becomes untrue. If you have to push your changes to a shared server, with slow CI checks and a deploy queue, the price of each change goes up, and suddenly it makes sense to be more methodical.

### [Example] Playing tennis

A tennis coach who helps you understand your mistakes as you’re playing is many times more useful than a coach who takes notes to share at the end.

Every adjustment you can make in the game is a new experiment in getting better.

### [Takeaway] Invest in feedback speed

With a good feedback loop things become self-documenting. You don’t need to share knowledge. The system itself is the teacher.

You don’t need to invest as much time in making this shot, or this code change, perfect. Adapt based on what you observed and go again.

## High Fidelity Feedback

High fidelity feedback contains all the information you need to correct yourself.

Low fidelity feedback contains only a vague idea of the direction to travel.

### [Example] Running form correction

Low fidelity feedback would be someone telling you your form is off.

Decent fidelity would be an in-depth description of what to change.

High fidelity feedback would be a resistance band around your thighs as you run, so that you can feel your knees folding inward.

### [Example] Debugging a program

If your program doesn’t do what you want, that’s low fidelity.

A log of the error is better.

A breakpoint with full context to inspect is high fidelity.
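To make the three fidelity levels concrete, here is a minimal Python sketch; the failing `average` function is invented for illustration.

```python
# Three levels of feedback fidelity when a program misbehaves.
import logging

logging.basicConfig(level=logging.ERROR)

def average(values):
    return sum(values) / len(values)  # fails on empty input

# Low fidelity: the program simply crashes; you know only that something broke.
# Better: catch and log the error, so you at least see what failed and where.
try:
    average([])
except ZeroDivisionError:
    logging.error("average() failed on empty input")

# High fidelity: pause at the failure with full context to inspect.
# Uncomment the next line (or run `python -m pdb script.py`) to drop into
# the debugger, where you can examine `values`, locals, and the call stack.
# import pdb; pdb.set_trace()
```

The log tells you the direction to look; the breakpoint hands you everything needed to correct course in one step.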

### [Takeaway] A better compass

Feedback exists to correct your course. The more information it carries, the more precise that correction can be.