Do We Need a Better Medium for Representing Code?

Unpopular opinion: text is actually a really bad medium for storing source code

Text source code should be a view of the truth, not the source of truth. I should be able to freely switch between whatever views I want.
— Hillel (@hillelogram) May 20, 2019

After all these years of writing code as text, I completely agree that text is not a good medium for storing code. But I'm afraid that text won't go away anytime soon. In this clipping, I will explain why text is a bad format, show several attempts to fix the problem, and finally argue why text is going to stay for a long time.

Why is text bad?

After all, the sole purpose of writing code is to express our ideas, and there are many ways to express ideas. Besides text, we can express them by drawing, illustrating, speaking, performing, and so on.

The text format became so popular and came to dominate almost all programming languages because it's the easiest and cheapest way for machines to understand our ideas. Decades ago, we didn't have any sophisticated technologies for machines to recognize our voices, pictures, facial expressions, or body language. (And even now, these technologies are arguably not precise enough to replace text for programming.)

Even though text is so popular, I still feel the pain of losing so much information when translating ideas into plain-text code. Most of the time, the code we write is only the executable part of the idea: how a feature should work. Other, often more important information, like why the feature works this way, is lost in translation.

When running the code, this information loss doesn't cause any trouble. As long as the code expresses the logic correctly in a way that machines can understand, machines will do exactly what they are told to do. The problem occurs when we try to understand the idea behind the code and need to update that idea.

Here are some typical cases I run into every day:

Go to definitions

Maybe it's because I don't use the most powerful IDE or the most advanced plugins, but I still rely on text pattern matchers like grep and rg to find where modules and variables are defined. And every once in a while, my attempt to jump to a definition fails because there are modules with similar names or something like that.

Every occurrence of a class name or variable name is actually saying "we are going to use the idea represented by this name here." And since the runtime can resolve these names easily, our tooling should be able to do so as well.
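
To make the difference concrete, here is a minimal sketch in Python (my choice for illustration; the same idea applies to any language with a parser). The sample source and the helper names find_definitions and grep_lines are made up for this example. It compares a grep-style search with a lookup on the parsed syntax tree.

```python
import ast

# Hypothetical sample source, used only for this illustration.
SOURCE = '''
class User:
    pass

class UserGroup:
    pass

# Every User belongs to a UserGroup.
admin = User()
'''


def find_definitions(source: str, name: str) -> list[tuple[str, int]]:
    """Return (kind, line) for every class or function definition called `name`."""
    kinds = {
        ast.ClassDef: "class",
        ast.FunctionDef: "function",
        ast.AsyncFunctionDef: "function",
    }
    found = []
    for node in ast.walk(ast.parse(source)):
        kind = kinds.get(type(node))
        if kind and node.name == name:
            found.append((kind, node.lineno))
    return found


def grep_lines(source: str, pattern: str) -> list[int]:
    """Naive text search: the number of every line that contains the pattern."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if pattern in line
    ]


print(grep_lines(SOURCE, "User"))
print(find_definitions(SOURCE, "User"))
```

On this sample, the text search reports four lines (both class definitions, the comment, and the call), while the syntax-tree lookup reports only line 2, where User is actually defined.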

Change variable/class names

The other way around, I should be able to change a class name without changing any existing behaviour of my application. But currently, in most dynamic languages like Ruby, I still need to rely on text search and replace every occurrence manually.
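
As a rough sketch of what a rename that understands names could look like, here is a small Python example (Python is my choice for illustration, not Ruby). The sample source, the RenameClass transformer, and rename_class are all hypothetical names invented for this example, and it is far from a real refactoring tool.

```python
import ast

# Hypothetical sample source, used only for this illustration.
SOURCE = '''
class User:
    pass

class UserGroup:
    pass

label = "User"
admin = User()
'''


class RenameClass(ast.NodeTransformer):
    """Rename a class definition and every plain-name reference to it."""

    def __init__(self, old: str, new: str) -> None:
        self.old, self.new = old, new

    def visit_ClassDef(self, node: ast.ClassDef) -> ast.AST:
        if node.name == self.old:
            node.name = self.new
        # Keep walking so references inside the class body are renamed too.
        return self.generic_visit(node)

    def visit_Name(self, node: ast.Name) -> ast.AST:
        if node.id == self.old:
            node.id = self.new
        return node


def rename_class(source: str, old: str, new: str) -> str:
    """Rename on the syntax tree, then turn the tree back into text."""
    tree = RenameClass(old, new).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


renamed = rename_class(SOURCE, "User", "Account")
naive = SOURCE.replace("User", "Account")
print(renamed)
print(naive)
```

Here the naive replacement also produces AccountGroup and changes the string "User", while the tree-based rename leaves both alone. The sketch ignores scoping and attribute access, and ast.unparse drops comments and the original formatting, so a real tool has to do considerably more.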

Documentation & tests

To me, documentation and tests are also important aspects of our ideas. They explain why the code works the way it does and give examples of how to use it.

But in most languages, documentation is not tightly attached to the code itself, so it drifts out of date and much of it ends up telling lies. Tests are often placed in a separate file in a separate directory. It's a hassle to switch between the test file and the implementation file, not to mention understanding them as a whole.
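
Some languages do let documentation and tests live next to the code. As one illustration, Python's standard doctest module runs the examples written inside a docstring. The slugify function and the check_docs helper below are hypothetical names made up for this sketch.

```python
import doctest


def slugify(title: str) -> str:
    """Turn a post title into a URL slug.

    The examples below are documentation and tests at the same time:

    >>> slugify("Do We Need a Better Medium")
    'do-we-need-a-better-medium'
    >>> slugify("  Text   is  fine ")
    'text-is-fine'
    """
    return "-".join(title.lower().split())


def check_docs() -> doctest.TestResults:
    """Run every example found in slugify's docstring and report the totals."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(slugify, name="slugify", globs={"slugify": slugify}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = check_docs()
print(results.failed, results.attempted)
```

If the implementation changes and an example in the docstring stops being true, the run reports a failure, so this kind of documentation cannot quietly start lying.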

What does this "better" look like?

The ideal situation is for our machines and tools to understand our ideas better and, in turn, support our work as programmers better. As the tweet above says, we should be able to switch freely between different views or aspects of an idea. Then it becomes natural to jump to definitions as we wish and to change names without breaking the application in any way.

Attempts to fix this problem

This is not a new idea. There are already tons of solutions out there trying to help machines understand our thoughts better.

Type systems

By specifying type information explicitly, we help the machine to understand more aspects of our program.

But writing type information can become cumbersome. (Just think about languages like Java.)

Type annotations and inference can help a lot. See TypeScript, Dialyzer for Erlang and Elixir, and Sorbet for Ruby.
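
The same idea exists in Python as optional type hints, which I use here only as a small illustration. Post and tag_count are hypothetical names, and the external checker mentioned in the comments (for example mypy) is my assumption rather than something the tools above require.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Post:
    title: str
    tags: list[str]


def tag_count(post: Post) -> int:
    # Only the function boundary is annotated. A checker such as mypy can infer
    # that `count` is an int, so the local variable needs no annotation.
    count = len(post.tags)
    return count


# The annotations are structured data that tools can read,
# not just text sitting inside the source file.
hints = get_type_hints(tag_count)
print(hints)
print(tag_count(Post("Better medium", ["code", "text"])))
```

With the annotations in place, a call like tag_count("hello") can be flagged by a checker before the program ever runs.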

IDEs & LSP

IDEs like IntelliJ by JetBrains, and editors that speak Microsoft's Language Server Protocol (LSP), leverage parsers and compilers to help us modify our code more efficiently.
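
To give a feeling for what an editor actually sends to a language server when we ask to jump to a definition, here is a minimal sketch that builds a textDocument/definition request. The method name and the Content-Length framing come from the LSP specification; the definition_request helper and the file URI are hypothetical and exist only for this example.

```python
import json


def definition_request(request_id: int, uri: str, line: int, character: int) -> bytes:
    """Build a framed textDocument/definition request (positions are zero-based)."""
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/definition",
        "params": {
            "textDocument": {"uri": uri},
            "position": {"line": line, "character": character},
        },
    }).encode("utf-8")
    # Each message is preceded by a header giving the body length in bytes.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# Hypothetical file URI; line 9, character 8 is the cursor position.
message = definition_request(1, "file:///project/app/user.rb", 9, 8)
print(message.decode("utf-8"))
```

The editor only needs to know the cursor position. Working out what the name under the cursor means is left to the server, which has a parser for the language.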

Visual Programming Language

Pure visual languages like Prograph are attempts to program in formats other than text. It's a pity that they are not popular enough.

Integrated Language Design

Languages like Smalltalk were designed with the development experience in mind, and they provide a complete toolchain to support that experience.

I'm excited to see new languages like Dark, which promises to follow this direction as well:
- The defining principle behind Dark is that we want to remove all accidental complexity from coding.
- The core problem here is that the tools we have been building — as an industry — are incremental.
- Decreasing complexity
  - Infrastructure complexity
  - Deployment complexity
  - API complexity
  - Code-as-text complexity
-- from "What is Dark?", Darklang, Medium

There are also similar attempts to add this kind of integration to existing languages like JavaScript:

This is awesome! We're just finishing up the editor for @darklang, and it works almost exactly like this!
— Paul Biggar (@paulbiggar) June 4, 2019

Do we need a better medium?

That being said, will text eventually go away from the programming world? Will some multimedia format replace text completely? I really doubt it.

After all, we've been using text as a communication medium for a very long time. Ever since the first humans started carving characters on oracle bones, text has been the best choice for asynchronous communication. And as I mentioned above, writing code is communicating our ideas. So text is, and will remain, our best choice.

But there are still many things we can do out there. The goal is for machines to understand our ideas better without too much extra help from programmers.

And eventually, we will build new abstractions on top of our existing text format. New languages like Dark will appear and become more and more popular, just like what high-level languages did to assembly and machine code.

I can't wait to see more possibilities like this.
