Spherical Cows in Software Development

Jun 27, 2018

The Joke, as recounted in this blog:

Milk production at a dairy farm was low, so a farmer wrote to the local university to ask for help. A multidisciplinary team of professors was assembled, headed by a theoretical physicist, and two weeks of intensive on-site investigation took place. The scholars then returned to the university, notebooks crammed with data, where the task of writing the report was left to the team leader. Shortly thereafter the physicist returned to the farm and advised the farmer, “I have the solution, but it only works in the case of spherical cows in a vacuum.”

I was reminded of the old joke by a recent article by John Regehr, Professor of Computer Science at the University of Utah, entitled “Closing the Loop: The Importance of External Engagement in Computer Science Research.” He observes a loop that begins with researchers dropping “irrelevant details” and investigating an idea in the “abstract problem domain.” Their results are then “adapt[ed] to engineering context” so they can be applied in the “concrete problem domain.” Academics observe things happening in the world and feed those observations back into the loop, removing “irrelevant details.”

Feedback Loop Between Academia and the Field

Dr. Regehr observes the loop from the perspective of an academic. I had observed the same loop from the perspective of a practitioner, years ago. And yet, the loop appears to be an ideal rather than a reality; at least, it seems to operate only at a fraction of its potential value.

When I took an interest in this loop, I quickly ran afoul of academics. In 2012, I wrote a blog post entitled “Delivering provably-correct code.” In it, I suggested several approaches developers could use to gain high confidence that their code would be suitable for release.

You probably spotted my error immediately. My informal use of the term “provably-correct” triggered a number of responses from readers. None of the responses was positive. The most courteous of them was along the lines of, “Why do you want to redefine the word proof?” Some of the less-courteous examples ran to several times the length of the original post, and if you’ve read any of my writing you know that can be rather long. I don’t have a PhD, but if I did it would likely be in the field of Windbaggery.

Is This a Rant?

No. I’ve long been interested in the potential value of the feedback loop between academia and practitioners. For just as long, I’ve been frustrated by the fact it seems barely to function. I’m still exploring the idea.

It occurs to me that the two communities have very different mindsets, assumptions, vocabularies, and communication styles. When they talk about the “correctness” of software, they’re talking about two different things. When they get into a discussion, their aims are different.

In my experience, when practitioners get into a discussion they are usually trying to find a workable solution to an immediate problem. They aren’t thinking about developing a general model for a class of problems. They throw out ideas and try things without worrying too much about whose idea ended up working, provided they get something working.

To contrast that with what academics do, I can only go by what I see and hear from them, as I’m not an academic myself. When they discuss something, academics seem to be competing with one another to sound smarter than their peers. This may only be an impression caused by their customary manner of interacting, but whether real or perceived, it’s a turn-off for practitioners.

So, there’s a natural communication barrier that makes it difficult for the two communities to share knowledge.

Steadfastly Missing the Point

The first reply to the post about delivering correct code came from the same John Regehr whose article rekindled my memory of the experience. His comments at that time provide a good summary of the fundamental divide between academic thinking and practical thinking:

Regehr: “…one problem is that nobody has ever (that I know of) written a formal specification for something like a web service. […] …I think you will notice a large gap between your example specifications and what the specs for a serious real-world API would look like.”

Me: “For Web Services specifically, the specification is the WSDL.”

Regehr: “WSDL has no mathematical meaning that I know of. Until it does, we can’t do proofs about it.”

He’s right. WSDL has no mathematical meaning. And from the practical side of the world: So what?

That’s the divide, right there.

When a customer asks me to write a service, they aren’t asking for a mathematical model; they’re asking for working software. Is a WSDL a formal specification? Yes, by definition, it is exactly that, and it’s perfectly adequate for working with real cows in the field.

APIs are pretty common these days. A “serious, real-world API” is nothing at all like a mathematical proof. An API’s purpose is not to be a perfect model, but to define the intended interactions between a requester and a provider of a service. What about unintended interactions? Out of scope. To a practitioner, a “real-world API” describes (a) what to pass to a service and (b) what to expect the service to return. That’s all.
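To make the practitioner’s notion of a contract concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the service, the names `QuoteRequest`, `QuoteResponse`, and `get_quote`, and the prices. It is not a WSDL and not taken from any real API. It only shows the two things described above: (a) what to pass and (b) what to expect back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuoteRequest:
    """(a) What the requester passes to the service."""
    symbol: str
    quantity: int


@dataclass(frozen=True)
class QuoteResponse:
    """(b) What the requester can expect the service to return."""
    symbol: str
    total_price_cents: int


# Hypothetical price table standing in for a real back end.
_PRICES_CENTS = {"MILK": 250, "HAY": 1200}


def get_quote(request: QuoteRequest) -> QuoteResponse:
    """Serve the intended interaction; reject anything outside it."""
    if request.quantity <= 0:
        raise ValueError("quantity must be positive")
    if request.symbol not in _PRICES_CENTS:
        raise ValueError(f"unknown symbol: {request.symbol}")
    unit_price = _PRICES_CENTS[request.symbol]
    return QuoteResponse(request.symbol, unit_price * request.quantity)
```

A caller needs nothing beyond those two shapes: `get_quote(QuoteRequest("MILK", 4))` returns a `QuoteResponse` for that symbol. Nothing here is a mathematical model of the service, and for the purpose of building a client, it doesn’t need to be.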

Is that a complete model? No. Is it sufficient to enable us to provide value to customers? Yes.

That’s only one small example of the difficulty of communication between the two communities. It often seems as if every single word is understood differently and requires extended conversation about definitions. I suspect few people have the time or patience to indulge in that more than a couple of times before they give up.

Reality Emerges from Research…Or Does It?

The academic view appears to be (and I may be reading too much into this) that a thing doesn’t exist at all unless and until it has been published in an academic study.

The impression was reinforced strongly when I attempted to bridge the divide by participating in the academic tracks of agile and technical conferences in the 2007–2008 time frame. I wanted to see how we could tie the academic and practical worlds together in a useful way. I was surprised to discover many academics are uninterested in any practical results.

A report given at Agile 2007 on a study of Test-Driven Development, which I described in a blog post entitled “All evidence is anecdotal,” found that TDD can be useful for controlling cyclomatic complexity.

In an informal discussion after the presentation, I learned the researchers were extremely pleased with themselves, although their results seemed obvious to me. They believed they had “discovered” something that no one had ever noticed before. I suggested it wasn’t true that no one had ever noticed this correlation. I told them that practitioners wouldn’t bother using TDD if they hadn’t experienced benefits like that one. They said, well, no one has published it before.

I think that is not a useful way to view academic work. It inherently separates the theoretical from the practical at a fundamental level that is difficult to overcome when we want to share research findings or practical experiences between the two communities. A healthier approach might have been for them to ask themselves, “Why do practitioners favor this technique? Does it have benefits we can measure?” Simply to assume nothing exists unless an academic has published it seems foolish to me.

Not All Research is Equal

As I also noted in that blog post, many of the studies reported at conferences are carried out by students who have (a) no experience developing software, (b) no experience conducting controlled studies, and (c) no experience making public presentations. They carry out their studies as a way to practice the techniques of setting up and running controlled studies, writing up their findings, and speaking in public.

All those skills are useful and it’s great for students to practice them. The problem is the results themselves are often nonsense, and yet many people out in the world trust such studies to tell them what will “work” or “not work” for real projects. The student researchers lack deep understanding of what they are observing. As a consequence, they often correlate observations that don’t actually correlate, evidently in order to beef up the number of data points they can input into their analysis tools.

In a conversation with the presenters of a study of pair programming given at XP 2008, I learned that the things they had observed under the heading of “pair programming” were random behaviors that didn’t resemble pair programming much at all. And how could they have known better? They weren’t professional software developers, and they were pretty new at the whole process of doing research, as well.

Anecdotal Evidence is code for Unproven Bullshit

There’s a general attitude that “anecdotal” reports are not to be trusted. Somehow, our society has come to a place where people distrust their own eyes and believe the latest “study” to come along. Last week, coffee was bad for you. This week, it’s good for you. Next week?

Be wary of those who disparage real experience in favor of mathematical models of spherical cows. They use the word “anecdotal” as a passive-aggressive pejorative term. “Anecdotal” sounds like “someone heard from a friend of a friend’s uncle’s ex-wife’s cousin’s bartender.” It isn’t that. It’s experience delivering value. Real experience delivering value that customers pay for. In the dirty, smelly field. Where the real cows live.

Practitioners Ask for Studies but Don’t Read Them

Sometimes when I’m coaching teams on contemporary techniques like TDD and pairing, developers will demand to see a study that “proves” the technique “works.” They insist they will not change their habits until they see such a study. Why should they listen to advice from some stranger? They’re doing just fine.

I ask them to show me the study that convinced them to work in whatever way they currently do. After calling me a few names, they admit that they did not demand to see any studies when they were learning their current methods of work. They didn’t rely on a study to teach them how to build software. They listened to advice from some stranger. And they’re doing just fine.

They’re asking for a study now as a way to avoid learning something new. Even if I showed them a study, they would find a reason to reject its conclusions, because they have already made up their minds.

How Good is the Research?

Laurent Bossavit, a highly respected developer who also has a firm footing in the academic world, responded to the post about anecdotal evidence. He wrote:

“It’s not that the anecdotal evidence is good, but more that the research is bad. This is not an isolated case, and most of what passes for research in software engineering is horrible.

“Research in this field has two huge problems, relevance and validity. On relevance, let me just quote the words of one academic who rejected a colleague’s proposal for a workshop on TDD: ‘TDD is of great interest to the software engineering community, but it seems a topic too close to industrial practice to be interesting to a research conference’. (Yeah, I WTFed at that too.)”

Yeah.

Statistical Information is Useful, but is Not “Truth”

If we observe 100,000 developers over a period of 10 years, we can compile a good deal of raw data about how they built software and how valuable the results were. We can feel comforted by the sheer volume of information, just as a thick coverlet with lots of soft stuffing warms us on a cold winter’s night.

But the averages and means and what-not that we extract from that data won’t describe any single real-world situation in a way that we can use on the ground. Statistical analysis provides useful clues, but also a false sense of security about what “works” in practice.

Why? Because the differences among situations are significant enough that just about any technique we might employ has to be tweaked and customized to “work” in any one context. We can try and try to distill out the commonalities, but by the time we’ve done so we’ve abstracted things to such an extent that the resulting abstraction is meaningless.

It’s a spherical cow in a vacuum. Unmilkable.
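Here is a toy illustration of the point about averages. The numbers are invented, the five teams are hypothetical, and the tolerance is an arbitrary choice of mine; this is a sketch of the arithmetic, not data from any study.

```python
from statistics import mean

# Invented figures: percentage change in defect rate after five hypothetical
# teams adopted the same practice. Negative means fewer defects.
change_by_team = {
    "team_a": -40.0,
    "team_b": -35.0,
    "team_c": 30.0,
    "team_d": -20.0,
    "team_e": 40.0,
}

average_change = mean(change_by_team.values())

# Which teams actually experienced something close to the average?
TOLERANCE = 5.0  # arbitrary threshold, in percentage points
near_average = [
    team
    for team, change in change_by_team.items()
    if abs(change - average_change) <= TOLERANCE
]

print(f"average change: {average_change:.1f}%")
print(f"teams within {TOLERANCE} points of the average: {near_average}")
```

With these made-up figures the average is a 5% improvement, yet no team’s result lands within five points of it. The average is a true statement about the group and a description of nobody in it.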

A more practical outcome of research would be general guidelines for how to tweak practices to fit broad categories of situations. Ultimately, practitioners on the ground must use their judgment and creativity to make things “work.”

If we don’t trust in research at all, our methods will stagnate. If we trust in research too much, we’ll never deliver anything because we’ll wait forever for the perfect model to emerge.

Can We Bridge the Divide?

Getting back to the loop Dr. Regehr describes: If academics are uninterested in the needs of industry, then where’s the loop? How can feedback flow between the two communities? As things stand today, academics tend to ignore anything practitioners come up with because it isn’t based on a mathematical model, and practitioners tend to ignore anything academics publish because it isn’t grounded in practical reality.

Can we bridge the gap between academics and practitioners in a way that benefits both communities? I have a few ideas:

Practitioners: Help your academic friends understand why you do things the way you do. What problems led you to do things that way? How do the methods you use solve or mitigate those problems? I think you’ll find that considering how you would explain your work to someone who wants to study it will help you understand why you do things the way you do. You might discover you’re doing things out of habit without really understanding why. Pondering this can be healthy, even if you never talk to an academic about it.
Academics: Don’t close your mind the moment you see a word you don’t like. It’s just possible the practitioner who misspoke has something of value to offer, but doesn’t speak your language. Disparaging their solid, real-world experience as “merely anecdotal” doesn’t help anyone, and harping on the misused term at great length doesn’t make you sound clever. You do not have and will not find a better source of information about what “works” than the real-world experiences of practitioners.
Normal people: Don’t assume a software development practice has to be “proven” by an academic study before it’s useful. Academics can only study what they observe. They can only observe what practitioners do. Practitioners do things that help them get work done. When they try something that doesn’t help them, they stop doing it. Therefore, the question, “Does this work?” is not valid. If enough people are doing a thing for academics to have noticed it, then it’s already working, at least in some context. That’s the reason people are doing it, and it’s the reason the thing is worthy of study. Questions like “How does this help?” or “Under what conditions can this help?” or “What are effective ways to do this?” would be more useful.

If you have more ideas on how to bridge the gap, please share them.

About the Author:

Dave Nicolette has been an IT professional since 1977. He has served in a variety of technical and managerial roles. He has worked mainly as a consultant since 1984, keeping one foot in the technical camp and one in the management camp…

Originally published at www.leadingagile.com on June 27, 2018.