Language Design: Machine or Human?

2021-02-11

When designing a computer language, be it for programming, data exchange, document markup or configuration, the first thing you should do is decide whether your language is meant for use by humans or by machines. Well, obviously, every language will be used by both, but still, who is your main target audience?

This question is important because, depending on your answer, there are a few specific and conflicting design principles that you should follow.

Easy to Parse vs. Easy to Read

It should be easy to write parsers for a machine-focused language. The syntax should be very explicit and the grammar simple, without unnecessary syntactic sugar. A good example is JSON: there are null, booleans, numbers, strings, lists and dicts, each with a clearly distinguishable syntax and no exceptions. That’s it. I can write a parser for it in very little time. That makes JSON well-suited as a data interchange language for APIs or in other places where humans will never have to touch it directly.
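To give a rough idea of how little it takes, here is a minimal sketch of a recursive-descent JSON parser in Python. It is an illustration under simplifying assumptions, not a conforming implementation: string escapes such as `\n` are not handled, truncated input is not reported gracefully, and the function names (`parse`, `parse_value`, `parse_members`, `skip_ws`) are made up for this example.

```python
import re

NUMBER = re.compile(r"-?\d+(\.\d+)?([eE][+-]?\d+)?")
LITERALS = {"true": True, "false": False, "null": None}


def skip_ws(s, i):
    """Advance past whitespace and return the new index."""
    while i < len(s) and s[i] in " \t\r\n":
        i += 1
    return i


def parse_members(s, i, close, pairs):
    """Parse comma-separated items up to the closing bracket."""
    items = []
    i = skip_ws(s, i)
    if s[i] == close:  # empty list or dict
        return ({} if pairs else []), i + 1
    while True:
        item, i = parse_value(s, i)
        if pairs:  # dict entries are "key": value
            i = skip_ws(s, i)
            if not isinstance(item, str) or s[i] != ":":
                raise ValueError(f"expected string key and ':' at position {i}")
            value, i = parse_value(s, i + 1)
            item = (item, value)
        items.append(item)
        i = skip_ws(s, i)
        if s[i] == close:
            return (dict(items) if pairs else items), i + 1
        if s[i] != ",":
            raise ValueError(f"expected ',' at position {i}")
        i += 1


def parse_value(s, i):
    """Parse one value starting at index i; return (value, next index)."""
    i = skip_ws(s, i)
    if i >= len(s):
        raise ValueError("unexpected end of input")
    c = s[i]
    if c == "{":
        return parse_members(s, i + 1, "}", pairs=True)
    if c == "[":
        return parse_members(s, i + 1, "]", pairs=False)
    if c == '"':
        end = s.index('"', i + 1)  # simplification: no escape handling
        return s[i + 1:end], end + 1
    for word, value in LITERALS.items():
        if s.startswith(word, i):
            return value, i + len(word)
    m = NUMBER.match(s, i)
    if not m:
        raise ValueError(f"unexpected input at position {i}")
    text = m.group()
    is_float = any(ch in text for ch in ".eE")
    return (float(text) if is_float else int(text)), m.end()


def parse(s):
    """Parse a complete JSON document."""
    value, i = parse_value(s, 0)
    if skip_ws(s, i) != len(s):
        raise ValueError(f"trailing data at position {i}")
    return value
```

The first character of a value is enough to decide what kind of value follows, which is why the whole thing fits into a handful of small functions.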

Another example is the syntax of Scheme and other Lisps: the exhaustive bracketing makes it easy to parse¹, because there is, among other things, no operator precedence to consider.
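As a small illustration, a reader for bracketed expressions can be sketched in a few lines of Python. This is a toy under stated assumptions: atoms are only integers and symbols, there are no string literals or comments, and the names `tokenize`, `read` and `parse_sexpr` are hypothetical.

```python
def tokenize(source):
    # Brackets are the only structural tokens, so padding them with
    # spaces and splitting on whitespace is all the lexing we need.
    return source.replace("(", " ( ").replace(")", " ) ").split()


def read(tokens):
    """Consume tokens from the front and return one expression as nested lists."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens and tokens[0] != ")":
            expr.append(read(tokens))
        if not tokens:
            raise SyntaxError("missing )")
        tokens.pop(0)  # drop the closing ")"
        return expr
    if token == ")":
        raise SyntaxError("unexpected )")
    try:
        return int(token)
    except ValueError:
        return token  # anything that is not an integer is a symbol


def parse_sexpr(source):
    return read(tokenize(source))


tree = parse_sexpr("(+ 1 (* 2 3))")
# tree is ["+", 1, ["*", 2, 3]]
```

The nesting of the result comes straight from the brackets. An infix expression like `1 + 2 * 3` would need a precedence table or a layered grammar to arrive at the same tree.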

However, a language that is easy for a machine to parse is not necessarily easy for a human to read. JSON is easy to parse, but YAML² is easier to read, as it has pretty syntactic sugar (e.g. you can often leave out the quotes), less syntactic noise and enforces line breaks. These things make it harder to parse, but easier to read.
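To see why optional quotes make parsing harder, consider what a reader has to do with an unquoted value: it has to guess the type. The following Python sketch is a toy resolver loosely modelled on YAML 1.1-style implicit typing. It is a simplification for illustration, not the actual rules from the YAML specification, and `resolve_scalar` is a hypothetical name.

```python
import re

# Assumption: a reduced word list in the spirit of YAML 1.1 booleans.
TRUE_WORDS = {"true", "yes", "on"}
FALSE_WORDS = {"false", "no", "off"}
NULL_WORDS = {"null", "~", ""}


def resolve_scalar(text):
    """Guess the type of an unquoted scalar, the way a sugar-friendly parser must."""
    lowered = text.lower()
    if lowered in TRUE_WORDS:
        return True
    if lowered in FALSE_WORDS:
        return False
    if lowered in NULL_WORDS:
        return None
    if re.fullmatch(r"[-+]?\d+", text):
        return int(text)
    if re.fullmatch(r"[-+]?\d+\.\d*", text):
        return float(text)
    return text  # everything else stays a string
```

With rules like these, the country code `no` turns into `False` and the version number `1.10` turns into the float `1.1`. In JSON, the mandatory quotes settle the question before the parser has to guess.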

Easy to Generate vs. Easy to Write

Here, the same principles as above apply. A machine-focused language should be minimal, without syntactic sugar. A machine doesn’t care about having to write a lot of quotes, for example. A human, on the other hand, does; JSON and YAML are again good examples.
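Generating a minimal language is just as straightforward as parsing it. Here is a sketch of a JSON emitter in Python, with one rule per type and no decisions about when quotes could be dropped. It makes simplifying assumptions: control characters in strings are not escaped, special floats such as NaN are not rejected, and `emit` is a hypothetical name.

```python
def emit(value):
    """Serialize nested dicts, lists, strings, numbers, booleans and None to JSON text."""
    if value is None:
        return "null"
    if value is True:
        return "true"
    if value is False:
        return "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # Always quote; only backslashes and quotes are escaped in this sketch.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    if isinstance(value, list):
        return "[" + ",".join(emit(v) for v in value) + "]"
    if isinstance(value, dict):
        members = (emit(str(k)) + ":" + emit(v) for k, v in value.items())
        return "{" + ",".join(members) + "}"
    raise TypeError(f"cannot serialize {type(value).__name__}")
```

A generator for a human-friendly format would additionally have to decide, for every string, whether it can be written without quotes and still be read back as a string.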

Low-level vs. High-Level

A machine-focused language should be simple and low-level. The obvious example would be assembly languages, which directly model how the machine works. Human-focused languages, however, should not focus on how the machine works, but rather on how their human users think. This is why we have high-level programming languages that keep you from having to think about memory allocation, pointers and pushing bytes around.

Conclusion

There are of course trade-offs to be made because, as mentioned, computer languages are always an interface between a machine and a human. But to design a good language, you should decide who your language’s main audience is and design it with them in mind.

¹ Take a look at Write Yourself a Scheme in 48 Hours!
² Note that YAML is not a good language. It has a lot of problems, but in its basic form it’s at least easy to read. Take a look at strictyaml if you want something better.

Thank you for reading!

If you like, follow me on the ⁂ Fediverse / Mastodon! I'm @eisfunke@inductive.space.

If you have any comments, feedback or questions about this post, or if you just want to say hi, you can ping me there.