The Philosophies of Software Languages, from Go to Elixir

Published in Coder stories

Apr 9, 2019



The creation of each of the languages examined in this, the last part of the series, was driven, at least in part, by performance demands: speed of execution, reliability of running code, ease of maintenance, or all three.

Hitting the wall
Between 1975 and 1985, central processing unit (CPU) clock speeds increased by an order of magnitude, from 10⁶ Hz to just under 10⁷ Hz. Over the next 10 years, they increased again to just under 10⁸ Hz. Then, from 1995 to 2005, they increased by two orders of magnitude to just under 10¹⁰ Hz. But that was not the big news. What happened after that would change the shape of future programming languages: CPU clock speeds barely increased at all. Fourteen years later, in 2019, speeds have only edged up and still sit just under 10¹⁰ Hz. Even as early as 2006, it was pretty clear that 15 years of exponential clock-rate growth had ended. The answer to this problem? If you can’t make the processors go faster, add more processors.

Programming for multicore processing
To take full advantage of multicore processors, a programming language needs to provide mechanisms for concurrency and parallelization. None of the languages we have examined so far in this series were designed with concurrency in mind… except, of course, Smalltalk, whose conceptual model of code objects as “networked computers” and strong encapsulation was inherently concurrency-friendly, with processes and semaphores as built-in primitives. All of the languages in this chapter—Go, Rust, Kotlin, Clojure and Elixir—were specifically designed to address this need for concurrency. They vary in their support of parallelization.

The elephant in the room
Another thing to consider is that the dynamic scripting languages that had become so popular in the previous few decades, such as Python, JavaScript, and Ruby, all shared a mostly well-deserved reputation for performance problems. This was commonly attributed to them being “interpreted,” rather than compiled. There are several arguments to be made about why this is a misconception, but one of the key issues was the impact of memory access patterns, and the way in which these earlier languages tend to create “a scattered ‘mosaic’ of little objects scattered all over the place.” The first two languages we look at here (Go and Rust) were designed to remedy this problem, although some people feel that Go does not go far enough, since it does not offer generics for concurrent data structures.
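To make the memory-layout point a little more concrete, here is a minimal sketch in Go. It contrasts a slice that stores its elements back to back with a slice of pointers to separately allocated objects, which is roughly the "mosaic" described above. The `Point` type and both function names are invented for this illustration, and the sketch only shows the two layouts; it does not measure anything.

```go
package main

import "fmt"

// Point is a small value type used only for this illustration.
type Point struct{ X, Y int64 }

// sumValues walks a slice whose elements are stored back to back in one
// contiguous block of memory.
func sumValues(ps []Point) int64 {
	var total int64
	for _, p := range ps {
		total += p.X + p.Y
	}
	return total
}

// sumPointers walks a slice of pointers. Each Point was allocated on its
// own, so reading an element means following a pointer to wherever it lives.
func sumPointers(ps []*Point) int64 {
	var total int64
	for _, p := range ps {
		total += p.X + p.Y
	}
	return total
}

func main() {
	const n = 4
	values := make([]Point, n)
	pointers := make([]*Point, n)
	for i := 0; i < n; i++ {
		values[i] = Point{X: int64(i), Y: 1}
		pointers[i] = &Point{X: int64(i), Y: 1}
	}
	fmt.Println(sumValues(values), sumPointers(pointers)) // prints 10 10
}
```

Both functions compute the same result; the difference is only in where the data sits in memory, which is what tends to matter for cache behavior on large collections.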

Go

Rob Pike and Ken Thompson know a thing or two about programming. Thompson co-created the first version of Unix, and together at Bell Labs they worked on later versions of it, helped create a little-known but rather remarkable operating system called Plan 9 (first released in 1992), and invented the UTF-8 encoding scheme, which allows for multilingual web pages and was first presented at the USENIX conference in San Diego in January 1993.

Another thing they did together, while working with Robert Griesemer at Google in the late 2000s, was create a programming language called Go, which was unveiled in November 2009.

As already mentioned, by 2006, it had become clear to informed observers that future hardware performance gains would be achieved through the use of multicore CPUs. Google was also struggling with two problems of its own: extremely long compile times and complicated dependency management. Build times were between 30 and 45 minutes, meaning that programmers could only test their code a few times a day (testing 5 times could use up half of an eight-hour day!). The other problem was the way dependencies are managed in C++, Java, and Python (the main languages used at Google at that time): every module declares its own dependencies (imports). This leads, at best, to redundancies:

“In 1984, a compilation of ps.c, the source to the Unix ps command, was observed to #include <sys/stat.h> 37 times by the time all the preprocessing had been done. Even though the contents are discarded 36 times while doing so, most C implementations would open the file, read it, and scan it all 37 times.”
ibid

At worst, it leads to circular references or ambiguous orders of precedence:

“Consider a Go program with three packages and this dependency graph: package A imports package B; package B imports package C; package A does not import package C.
This means that package A uses C only transitively through its use of B; that is, no identifiers from C are mentioned in the source code to A, even if some of the items A is using from B do mention C. For instance, package A might reference a struct type defined in B that has a field with a type defined in C but that A does not reference itself. As a motivating example, imagine that A imports a formatted I/O package B that uses a buffered I/O implementation provided by C, but that A does not itself invoke buffered I/O.
To build this program, first, C is compiled; dependent packages must be built before the packages that depend on them. Then B is compiled; finally A is compiled, and then the program can be linked.”
ibid
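The build order described in that quote can be sketched as a depth-first walk over the import graph. This is only an illustration of the idea, not how the Go toolchain is implemented; the `buildOrder` function and the map-based graph are assumptions made for the example, and the sketch assumes there are no import cycles.

```go
package main

import "fmt"

// buildOrder returns the packages reachable from root in an order where
// every package appears after all of the packages it imports (a depth-first
// post-order walk). It assumes the import graph has no cycles.
func buildOrder(imports map[string][]string, root string) []string {
	var order []string
	done := make(map[string]bool)

	var visit func(pkg string)
	visit = func(pkg string) {
		if done[pkg] {
			return
		}
		done[pkg] = true
		// Dependencies must be built before the package that imports them.
		for _, dep := range imports[pkg] {
			visit(dep)
		}
		order = append(order, pkg)
	}
	visit(root)
	return order
}

func main() {
	// The graph from the quote: A imports B, B imports C.
	imports := map[string][]string{
		"A": {"B"},
		"B": {"C"},
		"C": {},
	}
	fmt.Println(buildOrder(imports, "A")) // prints [C B A]
}
```

Each package is visited once, however many other packages import it, which is the opposite of the repeated `#include` scanning described earlier.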

In designing their new language, Griesemer, Thompson, and Pike made the conscious decision to attack all three issues mentioned above: Speed of execution, reliability of running code, and ease of maintenance.

The first goal was to ensure speed of execution through built-in support of high-performance networking and multiprocessing, inlining functions, and efficient (compact) memory layout.

Reliability of running code was addressed with static typing and garbage collection.

Finally, the ease of maintenance was addressed by attempting to design the syntax of the language to be readable and usable in the same way that Python and JavaScript are.

In their own words: “We felt it should be possible to have the efficiency, the safety, and the fluidity in a single language.”
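As a rough illustration of the built-in concurrency support mentioned above, the following sketch spreads a small computation across several goroutines and gathers the partial results over channels. The function name `sumOfSquares` and the worker count are arbitrary choices for the example, and it assumes at least one worker. Whether the goroutines actually run in parallel depends on the runtime and the number of cores available.

```go
package main

import (
	"fmt"
	"sync"
)

// sumOfSquares splits the work across a fixed number of goroutines and
// collects the partial results over a channel. It assumes workers >= 1.
func sumOfSquares(nums []int, workers int) int {
	jobs := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			partial := 0
			for n := range jobs {
				partial += n * n
			}
			results <- partial
		}()
	}

	// Feed the jobs, then close the channel so the workers' loops end.
	go func() {
		for _, n := range nums {
			jobs <- n
		}
		close(jobs)
	}()

	// Close results once every worker has reported its partial sum.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for partial := range results {
		total += partial
	}
	return total
}

func main() {
	fmt.Println(sumOfSquares([]int{1, 2, 3, 4}, 2)) // prints 30
}
```

No locks appear in the code: the goroutines share nothing except the channels, which is the style of concurrency that Go's designers encourage.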

Docker and Kubernetes are written in Go, and it is used by Heroku, Cloudflare, SoundCloud, and Bitly, among others.

Rust

In January 1998, Netscape announced that it would “open-source” the code for its browser (originally code-named Mozilla), and the Mozilla project, later backed by the nonprofit Mozilla Foundation, formed around it. Eight years later, in 2006, Graydon Hoare, a programmer at Mozilla, made some of the same observations as Griesemer and co., and started work on his language, dubbed “Rust.”

Hoare was definitely focused on two out of the three above goals: Speed of execution and reliability of running code. As for ease of maintenance, it is not entirely clear whether he considered C++ syntax to be the most readable and usable, or whether he simply considered it to be the most familiar. What is clear is that he devoted even more attention to the efficient use of memory than the Go authors did, forcing developers to consciously manage memory using resource acquisition is initialization (RAII) instead of the more popular automated garbage collection, while still guaranteeing memory safety.

Although, at a more superficial level, it might appear that Hoare was less concerned with fluidity, or ease of programming, he designed Rust with features such as higher-order functions (closures) and polymorphism through a user-definable type system (à la Haskell), which significantly enhance programmer productivity (when used by master programmers).

Rust was introduced in 2010 and is used by Amazon, Atlassian, Dropbox, Facebook, Google, Microsoft, npm, Red Hat, Reddit, and Twitter.

“Basically I’ve an anxious, pessimist personality; most systems I try to build are a reflection of how terrifying software-as-it-is-made feels to me. I’m seeking peace and security amid a nightmare of chaos. I want to help programmers sleep well, worry less.”
Graydon Hoare

Kotlin

Some readers will question why this chapter doesn’t include Scala. Although it’s a language that has great support for concurrency, its original guiding philosophy was more centered in functional programming. Kotlin, however, which was inspired by Scala in a way, does have a specific focus on concurrency.
JetBrains is a software publisher that provides integrated development environments (IDEs) for a wide variety of programming languages. Because of this, it knows a lot about language design.

In 2010, designers at JetBrains were feeling the same needs that the authors of Go had been feeling a few years earlier. Go had not gained widespread adoption, so JetBrains decided the market was still open to another language solution.

Kotlin’s design, like Go’s, is optimized for rapid compilation, type safety, and readability. Its philosophy is to fulfill the very specific requirements that “industrial”-scale programming imposes: Compile times, dependency management, type safety and strict compilation, inlining functions, and not least, clear, concise, readable code.

Also, like Go, Kotlin has a choice of compilation targets: the Java virtual machine, JavaScript, and native.
With Google’s adoption of Kotlin as an officially supported language for Android development, the language has reached a tremendous level of acceptance in the mobile-phone world, but in the sphere of large-enterprise systems, Java still rules… for now.

Clojure

Rich Hickey really liked Lisp, and felt it should be made more modern. It needed to support concurrency/parallelism, and he wanted to take advantage of the extremely optimized Java Virtual Machine (JVM).

Clojure provided a completely new conceptual programming model to handle concurrency: that of identities and abstracted states. In this model, an object is abstracted from its states, which are immutable snapshots of its current values, and the values themselves, which are also immutable. Someone who is used to relational databases and entity/relationship modeling could think of it this way: Clojure considers identities as one entity, values as another entity, and states as a “linking table” that creates a many-to-many relationship between the two. Doing this allows for more efficient memory management and, more critically, allows different pieces of code to operate concurrently, because what they refer to will not suddenly change as a result of some other piece of code. What replaces mutability is the creation of new states.
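The identity/state split can be approximated outside Clojure. The sketch below is a loose analogy written in Go, not Clojure's implementation: an `Identity` holds a pointer to the current snapshot, readers take whatever snapshot is current, and writers install a brand-new snapshot instead of changing the old one. The type and method names are invented for the example, it relies on `atomic.Pointer` (available from Go 1.19), and immutability here is a convention that Go itself does not enforce.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// State is a snapshot. By convention in this sketch it is never modified
// after it has been published; Go does not enforce that.
type State struct {
	Count int
}

// Identity is a stable reference that points at one State snapshot at a time.
type Identity struct {
	current atomic.Pointer[State]
}

// NewIdentity creates an identity whose first state is a copy of initial.
func NewIdentity(initial State) *Identity {
	id := &Identity{}
	id.current.Store(&initial)
	return id
}

// Deref returns the snapshot that is current right now.
func (id *Identity) Deref() *State { return id.current.Load() }

// Swap builds a new snapshot from the current one and installs it, retrying
// if another goroutine installed a different snapshot in the meantime.
func (id *Identity) Swap(f func(State) State) {
	for {
		old := id.current.Load()
		next := f(*old)
		if id.current.CompareAndSwap(old, &next) {
			return
		}
	}
}

func main() {
	counter := NewIdentity(State{Count: 0})
	before := counter.Deref() // a snapshot taken before any updates

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.Swap(func(s State) State { return State{Count: s.Count + 1} })
		}()
	}
	wg.Wait()

	// The old snapshot is untouched; the identity now refers to a new state.
	fmt.Println(before.Count, counter.Deref().Count) // prints 0 100
}
```

Code that grabbed `before` keeps seeing a consistent value no matter how many updates happen afterwards, which is the property the paragraph above describes.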

Of all of the languages in this article, Clojure was perhaps the one most specifically designed for concurrency.

Clojure also supports asynchronous message passing through agents, which resemble actors, although Clojure’s designers distinguish the two. This series of articles has looked at two previous languages that made extensive use of message passing: Simula and Smalltalk. The actor model is a mathematical model of concurrent computation whose principal proponent is Carl Hewitt. It is implemented in Scala and in Elixir, the next language we look at.

Elixir

The last language we’re covering also has a connection with an earlier one that we haven’t looked at before in this series. Elixir is a language that runs on the Erlang virtual machine (VM), and was created by José Valim in 2011 to solve problems of concurrency.

Erlang is a unique language, in that it has more in common with an obscure IBM platform than it does with any well-known popular language. Originally designed to create high-availability (extremely fault-tolerant) software for Ericsson telecommunication switches, Erlang has several distinctive properties. One difference is its very low latency—humans are extraordinarily sensitive to delays or interruptions in speech signals—but the really remarkable thing about Erlang is that, instead of trying to prevent errors (for example, with very strict compilation), it assumes errors will happen, and focuses instead on extremely rapid detection and retry. This is a philosophical holdover from its telecom roots.
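The "assume errors will happen, detect them, and retry" idea can be sketched in a few lines. The following Go code is a simplified stand-in and not Erlang's actual supervision machinery: `runOnce` and `supervise` are hypothetical names, the worker runs synchronously, and the restart limit is an arbitrary choice.

```go
package main

import (
	"errors"
	"fmt"
)

// runOnce runs the task and converts a panic into an ordinary error, so a
// crash in the worker is detected instead of taking the program down.
func runOnce(task func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("worker crashed: %v", r)
		}
	}()
	return task()
}

// supervise restarts the task until it succeeds or maxRestarts is used up.
// It returns the number of attempts made and the last error, if any.
func supervise(task func() error, maxRestarts int) (attempts int, err error) {
	for attempts = 1; ; attempts++ {
		if err = runOnce(task); err == nil {
			return attempts, nil
		}
		if attempts > maxRestarts {
			return attempts, err
		}
	}
}

func main() {
	// A hypothetical flaky worker: it crashes, then fails, then succeeds.
	calls := 0
	flaky := func() error {
		calls++
		switch calls {
		case 1:
			panic("boom")
		case 2:
			return errors.New("transient failure")
		}
		return nil
	}

	attempts, err := supervise(flaky, 5)
	fmt.Println(attempts, err) // prints 3 <nil>
}
```

The worker contains no defensive code at all; the recovery policy lives entirely in the supervisor, which is the division of labor Erlang encourages.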

IBM released the Synchronous Data Link Control (SDLC) communications protocol in 1975, and it was a triumph of creative thinking in engineering. For years, its research engineers had struggled to invent a communications protocol that could send data over telephone lines without errors. It was a lost cause: No matter how good the protocol got, it was never perfect and data was lost. Eventually, the lab came up with a different idea: Let errors happen. Just detect them as soon as possible and resend the information quickly. This created a revolution in computer networking, paving the way for error- (and collision-) detection-based high-speed modems, Ethernet, and Wi-Fi.
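The detect-and-resend idea is easy to sketch. The Go code below is a toy model and not SDLC itself: the `Frame` type, the `sendReliably` function, and the simulated noisy line are all invented for the illustration, and the CRC-32 checksum is used simply because it ships with Go's standard library.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// Frame carries a payload plus a checksum computed by the sender.
type Frame struct {
	Payload  []byte
	Checksum uint32
}

func newFrame(payload []byte) Frame {
	return Frame{Payload: payload, Checksum: crc32.ChecksumIEEE(payload)}
}

// intact reports whether the payload still matches its checksum.
func (f Frame) intact() bool {
	return crc32.ChecksumIEEE(f.Payload) == f.Checksum
}

// sendReliably pushes the frame through an unreliable line until a copy
// arrives intact or maxTries is exhausted. It returns the received payload
// and the number of transmissions used.
func sendReliably(payload []byte, line func(Frame) Frame, maxTries int) ([]byte, int, error) {
	for try := 1; try <= maxTries; try++ {
		received := line(newFrame(payload))
		if received.intact() {
			return received.Payload, try, nil
		}
		// Damage detected: fall through and resend.
	}
	return nil, maxTries, fmt.Errorf("gave up after %d tries", maxTries)
}

func main() {
	// A hypothetical noisy line that corrupts the first two transmissions.
	sent := 0
	noisy := func(f Frame) Frame {
		sent++
		if sent <= 2 {
			damaged := append([]byte(nil), f.Payload...)
			damaged[0] ^= 0xFF // flip every bit of the first byte
			return Frame{Payload: damaged, Checksum: f.Checksum}
		}
		return f
	}

	data, tries, err := sendReliably([]byte("hello"), noisy, 5)
	fmt.Println(string(data), tries, err) // prints hello 3 <nil>
}
```

Nothing in the sketch tries to make the line itself more reliable; all of the effort goes into noticing damage quickly and sending again.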

IBM built a real-time (low-latency), high-availability operating system for handling airline reservations called Transaction Processing Facility (TPF). Very few programmers know about TPF, which is a shame: It’s so robust that you could literally unplug the electricity and it would resume processing with no data loss when you plug it back in (all data is disk-based as opposed to in-memory). Over time, other industries, such as credit-card processors, started to use it.

Erlang has a lot of the same design approaches as TPF but seems to have been created independently. TPF was/is very obscure, and there is no indication that the Ericsson team, led by Joe Armstrong, had any knowledge of it when they were working on Erlang.

Elixir leverages Erlang’s VM and abstractions for fault tolerance and distributed systems, but it also aims to provide a friendlier syntax and high-level conventions adopted from more recent languages (Erlang was designed in 1986; a lot has happened since then). It has a Ruby-like syntax and a smattering of concepts from other languages, such as comprehensions from Python and lazy evaluation as found in Haskell. It has inherited a robust concurrency model from Erlang (using an actor model) and has built-in tooling for managing dependencies, code compilation, testing, and so on, which makes industrial-scale programming more feasible.

Elixir has been growing in popularity, no doubt aided by its number three position in the best-paying languages, as published by Stack Overflow in 2017. It is used by companies such as Pinterest, while WhatsApp is a well-known user of the underlying Erlang platform.

This batch of languages is due for a surge in popularity, as it is very common for computer languages to only really take hold about a decade after they have been introduced.
This is the end of this series… for now. New languages come into being on a continual basis. As each generation of languages solves the challenges or deficiencies of the previous generation, new weaknesses and deficiencies become visible, which drives the development of the next generation.
Future horizons might include better cross-compilation and interoperability, using industrial methods in software assembly, and languages to exploit the particular features of quantum computers.
The miracle of software is that there is no limit other than the imagination. We can all expect to be amazed at what the future has in store for us.

Did you like this article? Check out the rest of the series from Plankalkül to C, from Smalltalk to Perl and from Java to JavaScript!

This article is part of Behind the Code, the media for developers, by developers. Discover more articles and videos by visiting Behind the Code!


Illustration by Victoria Roussel
