Lukas Püttmann

Do the programming languages we use influence how we think?

This article says that which programming languages people or organizations use might influence their thinking and culture.

I’m reminded of the discussion in linguistics on the idea that language structure might shape how we think. This idea is controversial and Steven Pinker, who’s skeptical of it, writes:

And supposedly there is a scientific basis for these assumptions: the famous Sapir-Whorf hypothesis of linguistic determinism, stating that people’s thoughts are determined by the categories made available by their language, and its weaker version, linguistic relativity, stating that differences among languages cause differences in the thoughts of their speakers. […]

But it is wrong, all wrong. The idea that thought is the same thing as language is an example of what can be called a conventional absurdity: a statement that goes against all common sense but that everyone believes because they dimly recall having heard it somewhere and because it is so pregnant with implications. (p57, “The Language Instinct”)

This idea has made inroads into economic research. Keith Chen argues, for example, that how languages encode references to the future has implications for how people think about the future. So a German speaker, who can say “Ich gehe morgen in die Kirche” (“I go tomorrow to church”) without adding any explicit grammatical marker for the future tense, would be more patient and save more for the future.

No matter if the Sapir-Whorf hypothesis makes sense for natural languages or not, might there be something in it for programming languages? In the article we read:

If you want to know why Facebook looks and works the way it does and what kinds of things it can do for and to us next, you need to know something about PHP, the programming language Mark Zuckerberg built it with.

Among programmers, PHP is perhaps the least respected of all programming languages.


You wouldn’t have built Google in PHP, because Google, to become Google, needed to do exactly one thing very well – it needed search to be spare and fast and meticulously well engineered. It was made with more refined and powerful languages, such as Java and C++. Facebook, by contrast, is a bazaar of small experiments, a smorgasbord of buttons, feeds, and gizmos trying to capture your attention. PHP is made for cooking up features quickly.

However, people and organizations get to choose which language to use, unlike natural languages that you have little say over.

But people don’t switch immediately when there’s a better language available. There’s a lot of sluggishness in how organizations change their systems. And if everybody else in your area of work uses some language, then you probably stick to it as well. And the structure and capabilities of that language might then shape how people think about problems.

This idea is actually mentioned in the good Wikipedia article on this topic, which also refers to this Paul Graham essay in which he describes Lisp as his secret weapon.

But there’s also another viewpoint on this. The original article goes on to introduce OCaml, an exotic functional programming language used by the trading firm Jane Street. Such a programming language demands more of the people who use it, but it makes it easier to ensure that programs are correct.

The culture of competitive intelligence and the use of a fancy programming language seem to go hand in hand.

So what if it’s mostly about signaling? Like those tough interview questions in consulting interviews or intelligence tests during applications for investment banks that might hold little predictive value of how good somebody will be at their job, but instead serve as a marketing tool to convince new hires that one must be really clever to work at this firm.

Maybe the reason some people are so much more productive is not the programming language they use, and they’d be similarly productive in other environments.

Do as I did

When I was finishing school, I went to two career events by the local Lions Club and Rotary Club in my town. In both, a group of stately men sat at tables scattered around the room and we went from table to table and asked them about their professions.

One of them was a former manager and had just started an executive search firm. One was an economist and senior member of the Bundesbank. One was a lawyer working as a lobbyist at the European institutions in Brussels. One was a computer scientist working as a management consultant. Several others were business executives.

And they all said variations of the same thing: “Do as I did.”

  • “Computer science is the best way to learn how to think in a procedural way.”

  • “If you want a good career with a 100,000 euro starting salary, you have to study law.”

  • “Only studying economics can teach you where phenomena like inflation come from.”

  • “I was the president of the student organization in Berkeley and that was very important in my career. These extra-curricular activities are very valuable.”

The only types that didn’t say this were the people who had studied business and management. Instead they said:

  • “You could also study something like aeronautic engineering.”

  • “You could backpack around Asia.”

A lot of the advice is good, but the most important thing I learned was this: There’s a limit on the breadth of career advice somebody is able to give, as most people can only really pass judgment on the decisions they themselves made. They post-rationalize their choices and try to get you to follow the same path.

Keeping records

I was thinking that few of us actually keep records of our written conversations. But then I remembered Stephen Wolfram’s “The Personal Analytics of My Life”:

I actually assumed lots of other people were [collecting personal data] too, but apparently they were not. And so now I have what is probably one of the world’s largest collections of personal data.

I have a complete archive of all my email going back to 1989.

Check out the figures.

Collected links

  1. Nature on the IPython notebook.

  2. FRED Adds 1,993 Banking and Monetary Statistics Series, 1914-1941 that is.

  3. Good post by Ricardo Hausmann on group identity.

  4. Ben Bernanke: “How do people really feel about the economy?”:

    In summary, the University of Michigan’s survey of consumer attitudes has shown a normal cyclical pattern of improvement in recent years, both in how people feel about their own economic prospects and in their expectations for the economy as a whole. In contrast, measures of the national “mood,” like Gallup’s “way things are going” question or questions about the “direction of the country,” show a high level of dissatisfaction.

    To an increasing extent, Americans are self-selecting into non-overlapping communities (real and virtual) of differing demographics and ideologies, served by a fragmented and partisan media.

  5. Corpus-based judicial opinions.

  6. Paul Krugman reviews the book by the former Bank of England Governor Mervyn King [source: The Browser]:

    In fact, King not-so-subtly mocks the authors of such books, which “share the same invisible subtitle: ‘how I saved the world.’”

    […] it is mainly an extended meditation on monetary theory and the methodology of economics.

    The more or less standard account of the 2008 crisis, which King shares, is that the combination of stability-fostered complacency and deregulation led to an accumulation of financial vulnerabilities. Private debt was on a steady upward trend before the crisis, […].

    People cope with this uncertainty by settling on “narratives” that are conventionally accepted at any given moment, but can suddenly change.

"What Is Code?", by Paul Ford

One of my favorite long-reads last year was “What Is Code?” by Paul Ford (emphasis added):

Your diligent decentralized team frequently writes new code that runs on the servers. So here’s a problem: What’s the best way to get that code onto those 50 computers? Click and drag with your mouse? God, no. What are you, an animal?

And that’s why everyone gets excited about GitHub. You should go to GitHub, you really should.

How Do You Pick a Programming Language? […] These are different problems. What do we need to do, how many times do we need to do it, and what existing code can we use to help us do it that many times? Ask those questions.

This is why the choice [of a programming language] is so hard. Everything can do everything, and people will tell you that you should use everything to do everything.


The Economist on GDP measurement and progress

Great article: “Measuring economies: The trouble with GDP”. (If it’s behind a paywall, try googling the title.)

Hans-Joachim Voth also discussed this.

I guess in the end, we need a certain fatalism. There are many ways we can try to adapt our estimates, but as Angus Deaton writes, the further apart two countries are in time or structure (say Germany vs. Switzerland or Thailand vs. Kenya), the harder it becomes to compare the two in terms of production and prices in any meaningful way. It does not mean that we should stop trying to do better, but some fundamental gaps might simply not be bridged.

When testing isn't enough

Scientific code should be held to higher standards than other software. So it would help to write test cases that check if the outputs of our programs look plausible. But for some people, that’s not enough.
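As a rough illustration of such a plausibility check, here is a minimal Python sketch. The function names and the bounds of the "plausible" range are my own assumptions for the example, not something taken from a particular project.

```python
import math


def inflation_rate(price_now, price_before):
    """Percentage change between two price index levels (hypothetical helper)."""
    if price_now <= 0 or price_before <= 0:
        raise ValueError("price index levels must be positive")
    return 100.0 * (price_now / price_before - 1.0)


def looks_plausible(rate, lower=-50.0, upper=1000.0):
    """Return True if the rate is finite and inside an assumed plausible range.

    The default bounds are arbitrary illustrative choices.
    """
    return math.isfinite(rate) and lower <= rate <= upper


# A price index that rises from 100 to 102 implies roughly 2 percent inflation.
rate = inflation_rate(102.0, 100.0)
assert looks_plausible(rate)
```

A check like this only catches outputs that are obviously off, such as a missing value or a misplaced decimal point. It says nothing about inputs nobody thought of.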

Yaron Minsky, who introduced the exotic programming language OCaml at the financial trading firm Jane Street, explains how they go about writing their sensitive systems:

We do an enormous amount of trading. There’s billions of dollars of nominal value kind of sloshing back and forth in the systems that we build. And what this means is, we are very nervous about technological risk. Because there is no faster way to light yourself on fire than to write a piece of software that makes the same bad decision over and over in a tight loop. (link)

He argues that on such a scale normal software testing isn’t enough, because even the very unlikely strange cases – that you haven’t thought about and written test cases for – might plausibly happen. So you have to understand the code really well and make it readable, so that other people can check that it functions correctly.

Only one inflation rate for the rich and the poor?

The story of GDP since 1940 is also the story of macroeconomics. (p20)

This is by Diane Coyle in her book “GDP: A Brief but Affectionate History”. Ever since the first Gross National Product (GNP) accounts were published for the United States in 1942, a great range of assumptions on what to include were necessary: Should we count services, the public sector or the financial sector? Ultimately these accounts are a social construct, so we need to decide which activities are worthwhile.

In macroeconomics, researchers have tried to get away from the model of the representative household by introducing heterogeneity among households. And similarly to these theoretical developments, new ideas for national accounts have been put forth: Thomas Piketty, Emmanuel Saez and Gabriel Zucman propose (pdf) to start using “distributional national accounts”. Previously we could answer what the aggregate economy produces and consumes, but these new accounts promise to tell us: How much has income grown for somebody at a particular place in the income distribution?

There already exist some indicators for this question, such as the top 1% income share estimated from tax returns. But what’s new is to provide accounts that are consistent with the macro data.
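To make the idea of consistency with the macro data a bit more concrete, here is a toy Python sketch. It computes a top income share from a list of individual incomes and then rescales those incomes so that they add up to a given national income. The proportional rescaling is a crude assumption of mine for illustration; the actual methodology of the distributional national accounts is far more involved, and all numbers are made up.

```python
def top_share(incomes, fraction=0.01):
    """Share of total income going to the top `fraction` of earners."""
    if not incomes:
        raise ValueError("need at least one income")
    ordered = sorted(incomes, reverse=True)
    n_top = max(1, round(len(ordered) * fraction))
    return sum(ordered[:n_top]) / sum(ordered)


def rescale_to_macro(incomes, national_income):
    """Scale all incomes by one factor so they sum to the macro aggregate.

    Assumption: income missing from the micro data is spread proportionally.
    """
    factor = national_income / sum(incomes)
    return [x * factor for x in incomes]


# Made-up example: 99 people earn 10 each, one person earns 1010.
micro_incomes = [10.0] * 99 + [1010.0]
share_before = top_share(micro_incomes)

# Suppose the national accounts report a total income of 4000.
macro_consistent = rescale_to_macro(micro_incomes, 4000.0)
share_after = top_share(macro_consistent)
```

With proportional rescaling the top share stays the same. The interesting cases are those where the income missing from tax data is not spread evenly across the distribution.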

It’s interesting to ponder over the question of how to convert nominal to real values for income groups. Should we use one inflation rate for everyone or a different inflation rate for every income bracket?

Richer people spend less on food and other items relative to their total income than poorer people do. The German statistical office, for example, offers this tool (in German) to calculate personalized inflation rates. But there are good reasons for and against using a single inflation rate and our choice should depend on how we want to think about income:

  • Income as consumption. More income means you can buy and consume more goods now or in the future. Normally, this is what economists think of when they hear “income”.
  • Income as economic power. Being rich also comes with more influence, so income might be a good indicator for who’s powerful in society.

The first concept is probably better suited for international or intertemporal comparisons. We might ask: “How much better off is somebody in Switzerland relative to somebody in Kenya?” or “How much better off is somebody in Germany now compared to 1950?” And for both questions we probably want to take into account that prices differ between countries and have been different in the past.

But within a single country at one point in time, the second concept is likely more useful. If both rich and poor people generally live nearby, compete for the same resources and participate through the same political entity, then we should probably use the same price indicator for both groups.
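To see what is at stake numerically, here is a minimal Python sketch of a group-specific inflation rate, computed as an expenditure-share-weighted average of price changes. The categories, shares and price changes are invented for illustration and the function name is hypothetical.

```python
def group_inflation(shares, price_changes):
    """Expenditure-share-weighted average of category price changes (percent)."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("expenditure shares must sum to a positive number")
    return sum(shares[c] * price_changes[c] for c in shares) / total


# Made-up price changes by category, in percent per year.
price_changes = {"food": 4.0, "housing": 2.0, "other": 1.0}

# Made-up expenditure shares: the poorer group spends more on food.
poor_shares = {"food": 0.4, "housing": 0.4, "other": 0.2}
rich_shares = {"food": 0.1, "housing": 0.3, "other": 0.6}

poor_rate = group_inflation(poor_shares, price_changes)
rich_rate = group_inflation(rich_shares, price_changes)
```

In this toy example the poorer group faces an inflation rate of about 2.6 percent and the richer group about 1.6 percent. The same nominal income growth would therefore translate into different real growth, depending on which deflator we pick.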

So it seems to make sense to just use one inflation rate in the distributional national accounts. But how large is the dispersion in prices that people actually pay?

Greg Kaplan and Sam Schulhofer-Wohl (pdf) look at scanner data for the prices of sales transactions by households [source: MR]. They find great variation among the prices that people pay for similar goods and this effect even dominates the movements of the aggregate price level:

[…] almost all of the variability in a household’s inflation rate over time comes from variability in household-level prices relative to average prices for the same goods, not from variability in the aggregate inflation rate.

And even similar households pay different prices for the same goods:

Households with low incomes, more household members, or older household heads experience higher inflation on average, […], but these effects are small relative to the variance of the distribution, and observable household characteristics have little power overall to predict household inflation rates.

So something else, apart from income, dominates individual inflation rates.

This is based on 500 million transactions by 50,000 U.S. households between 2004 and 2013. Coyle also argues in her book for using “user-generated statistics” (p138) to improve our understanding of economic activity. But it’s a pity that the time dimension for this kind of data is relatively short.

It’s previously been found that relevant economic actors (managers of firms in New Zealand) know remarkably little about the aggregate inflation rate. Kaplan and Schulhofer-Wohl offer the intriguing explanation that the aggregate inflation rate might simply matter little to individuals as they face different prices anyway. This probably also holds implications about how central banks should think about the transmission of monetary policy.

However, Kaplan and Schulhofer-Wohl say it’s important to know whether people can forecast their own personal inflation rate. If they cannot, then people might keep looking at the aggregate inflation rate as the best predictor of where also their personal price level will be in the future.

Coyle argues in her book that though GDP has many imperfections, it’s still the best way to measure economic activity and that instead of replacing it we should use a “dashboard of indicators” (p118):

The U.S. Commerce Department called GDP one of the greatest inventions of the twentieth century, and so it was. There is no replacement for it on the horizon. (p138)

So no replacement, but Piketty, Saez and Zucman have a good point that we should add the cross-sectional dimension to it. Let’s hope that statistical agencies will take over this task by maintaining and publishing these distributional national accounts.