(defblog exordium) Emacs and Lisp musings

Lisp is the red pill

This is the first article in a series about Lisp.

Lisp is the second-oldest high-level programming language still in use: it appeared in 1958, right after Fortran. John McCarthy, one of the founders of the discipline of Artificial Intelligence, created it.

Lisp was very popular during the AI boom; it even had its own hardware (the Lisp machines), which I had the privilege to work with. Lisp invented or popularized many, if not most, of the concepts and programming paradigms used in modern languages, including homoiconicity, first-class functions, garbage collection, aspect-oriented programming, you name it [1]. Several language creators have said that Lisp was a major source of inspiration for them (hello Java, Ruby and JavaScript). A colleague of mine once said, and I am paraphrasing, that every programming language ever invented tries to be either a better Fortran or a degraded version of Lisp, i.e. some kind of Lisp for the masses. But don’t think that Lisp is history; modern Lisps like Clojure are state-of-the-art programming languages.

Lisp is a programmable programming language. What does that mean? It means that you can change Lisp (dynamically, even) to be what you want. So for example if you are writing a text editor, you can turn Lisp into a language for writing text editors. You will never find yourself wishing the language supported some feature that would make your life easier; you can just add the feature yourself.

Lisp achieves that by putting data and code at the same level: data can be used (evaluated, compiled) as code, and vice versa. Whereas C provides naive text-substitution macros and C++ provides brain-dead templates and template meta-programming weirdness, Lisp macros give you full access to the power of Lisp at compilation time. Basically you can tell the compiler “execute this code, and use the result as the code to be included in the program”.
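To make this a bit more concrete, here is a minimal sketch in Emacs Lisp. The names my-form and my-square are made up for this illustration, and the macro is deliberately naive.

```elisp
;; Code is data: build a plain list, then evaluate it as code.
(defvar my-form (list '+ 1 2))   ; the list (+ 1 2)
(eval my-form)                   ; => 3

;; A macro receives its arguments unevaluated and returns new code.
;; Note: this naive version would evaluate X twice if X were an expression.
(defmacro my-square (x)
  "Expand into code that multiplies X by itself."
  (list '* x x))

(macroexpand '(my-square 4))     ; => (* 4 4)
(my-square 4)                    ; => 16
```

The macro body is ordinary Lisp code that runs when the form is expanded; what it returns is the code that actually ends up in the program.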

Developing in Lisp is easy: you can use a REPL (read-eval-print loop) to play with your code as you write it. The story of the Deep Space 1 probe is an interesting anecdote about how useful a REPL can be: “Debugging a program running on a $100M piece of hardware that is 100 million miles away is an interesting experience”. Lisp can also be very fast, close to C-level performance according to some benchmarks. It had to be because it is so old (think about the kind of hardware they had in the sixties).

Dialects

There have been countless dialects of Lisp over the years. The ones that are relevant today are:

  • Common Lisp: the ANSI standard, which has many implementations such as SBCL (Steel Bank Common Lisp, a high-performance native compiler). Common Lisp includes one of the best object-oriented languages I’ve ever seen: CLOS (Common Lisp Object System).
  • Scheme: a Lisp with a minimalist design philosophy. It is the programming language used in the textbook SICP. Guile and Racket are popular implementations.
  • Clojure: a very modern language that runs on the JVM or a JavaScript runtime. Clojure brings in a modern Lisp syntax, pure functions, software transactional memory, and many other cool things.
  • Emacs Lisp: unfortunately the worst Lisp out there. It uses dynamic binding [2] by default, it is single-threaded, and it is super slow. The only good thing you can say about it is “well at least it’s a Lisp”. Finding a replacement is a hot topic today in the Emacs community.

Getting started

The simplest way to get a taste of Lisp is just to fire up Emacs. There are two ways to interact with ELisp:

  • M-x ielm (Inferior Emacs Lisp Mode) gives you a REPL similar to, say, irb or python. Use C-UP and C-DOWN to recall commands you typed earlier. For example, try evaluating (), which returns nil.
  • You can use the scratch buffer. There, C-j evaluates the Lisp expression before the cursor and inserts the result where the cursor is. In any ELisp buffer, M-C-x evaluates the current top-level form and prints the result in the mini-buffer (also in the *Messages* buffer). You can also use commands like M-x eval-region.

Give it a try: open the scratch buffer and type "hello world". Then evaluate with M-C-x (meta control x): the mini-buffer should display the string. This works because strings, like numbers, are objects that evaluate to themselves.

If you want a real hello world, type this and evaluate again [3]:

(print "hello world")

This is what Lisp calls a symbolic expression (s-expression or sexp). It calls the function print, passing a string as parameter (you can also use format, which is similar to C’s printf). Lisp uses prefix (Polish) notation for function calls; for example (1 + 2 + 3) * 5 is written in Lisp as (* (+ 1 2 3) 5).
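A few forms you can evaluate to get a feel for prefix notation and string formatting. This is just a small sketch using standard Emacs Lisp functions:

```elisp
;; (1 + 2 + 3) * 5 in prefix notation
(* (+ 1 2 3) 5)                      ; => 30

;; format builds a string, much like sprintf in C
(format "%d + %d = %d" 1 2 (+ 1 2))  ; => "1 + 2 = 3"

;; message formats a string and shows it in the echo area
(message "hello %s" "world")         ; => "hello world"
```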

The mini-buffer should display the string “hello world” twice: one is the printed text and the other is the returned value, which is also the printed text. Every function returns a value, which is normally the value of the last form that was evaluated.

If you wanted the code above to return nil (which in ELisp and Common Lisp means void, false, and the empty list) you could do this:

(progn
  (print "hello world")
  nil)

A progn is a block (a list of sexps) which evaluates each form in sequence and returns the value of the last form. It is named like that because of another form, prog1, which does the same thing but returns the value of the first form. Run this code again with M-C-x: the mini-buffer should display the string that was printed and the return value nil.
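Here is a small sketch of the difference between the two. The variable my-counter is a name invented for this example, and setq (which assigns a variable) is only used to show a side effect.

```elisp
(progn 1 2 3)        ; => 3, the value of the last form
(prog1 1 2 3)        ; => 1, the value of the first form

;; Typical use of prog1: return a value, then do something else.
(defvar my-counter 0)
(prog1 my-counter
  (setq my-counter (1+ my-counter)))  ; => 0, and my-counter is now 1
```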

That’s nice but our hello world should really be a function. So let’s define one:

(defun hello-world ()
  "Prints hello world and returns nil"
  (print "hello world")
  nil)

;; Call it:
(hello-world)

defun is followed by the name of the function, the list of arguments (an empty list here), an optional documentation string, and the forms. The body of a function is an implicit progn. Note that comments begin with a semicolon.
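For a function that actually takes arguments, a minimal sketch could look like this. my-greet is a hypothetical function, and &optional marks the arguments that may be omitted (they are nil when not supplied).

```elisp
;; my-greet is a hypothetical function, for illustration only.
(defun my-greet (name &optional greeting)
  "Return a greeting for NAME.
GREETING defaults to \"hello\"."
  (format "%s %s" (or greeting "hello") name))

(my-greet "world")            ; => "hello world"
(my-greet "world" "goodbye")  ; => "goodbye world"
```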

Working with Lists

The basic data structure in Lisp is the singly linked list. The syntax of a list is exactly the same as that of a sexp. For example, let’s declare a variable l containing a list of integers:

(defvar l '(1 2 3))

defvar is followed by a variable name and a value (and an optional documentation string). Notice the quote character before the value: this is syntactic sugar for (quote (1 2 3)) which means “don’t evaluate this”. The quote is needed because otherwise Lisp would try to call a function named “1” with parameters 2 and 3.

If you wanted to use the result of a function call as value instead, you could use the list function, which creates a list containing its arguments:

(defvar l (list 1 2 3))

Here we don’t use the quote because we want the list form to be evaluated. There is no need to quote the numbers because a number evaluates to itself.
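The difference between quoting and calling list shows up as soon as the elements are not plain numbers. A small sketch you can evaluate form by form:

```elisp
'(1 2 3)                          ; => (1 2 3)
(equal '(1 2 3) (quote (1 2 3)))  ; => t, both spellings mean the same

;; A quoted list is not evaluated at all, not even its elements:
'(1 (+ 1 1) 3)                    ; => (1 (+ 1 1) 3)

;; list is a normal function, so its arguments are evaluated first:
(list 1 (+ 1 1) 3)                ; => (1 2 3)
```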

l is a symbol, which is an object in memory with a unique name. A symbol has a name, and possibly a value, a function definition and a property list. Our function hello-world above is also a symbol; it has no value but it has a function definition. There is a special kind of symbol called a keyword, which has just a name and evaluates to itself; keywords start with a colon like :foo (they are like interned strings).
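You can inspect the different parts of a symbol with a few standard functions. In this sketch, my-var and my-fn are names invented for the example:

```elisp
;; my-var and my-fn are names invented for this example.
(defvar my-var '(1 2 3))
(defun my-fn () "Return 42." 42)

(symbol-name 'my-var)      ; => "my-var"
(symbol-value 'my-var)     ; => (1 2 3)
(boundp 'my-fn)            ; => nil, no value...
(fboundp 'my-fn)           ; => t, ...but a function definition
(put 'my-var 'color 'red)  ; store something in the property list
(get 'my-var 'color)       ; => red

(keywordp :foo)            ; => t
(eq :foo ':foo)            ; => t, a keyword evaluates to itself
```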

The value of l is a list containing 3 cells or cons (for construct), each made of 2 pointers: a pointer to the value and a pointer to the next cell (or nil).

[Figure: the list drawn as a chain of cons cells]

Function car returns the value of the first pointer, and cdr the value of the second pointer (they are named like that for historical reasons [4]):

(car l)       ; => 1
(cdr l)       ; => (2 3)
(car (cdr l)) ; => 2
(cadr l)      ; => same as above, it's a shortcut
(cddr l)      ; => (3)
(caddr l)     ; => 3
(cdddr l)     ; => nil, i.e. ()

You can create a cons using the function that has the same name; its parameters are the car and the cdr.

(defvar l2 (cons 0 l)) ; => (0 1 2 3)
l                      ; => still (1 2 3)

The call to cons returns a new list starting with 0 and pointing to the first cons of l. You can verify that l and l2 share the same tail with function eq, which returns t (true) if its arguments are the same Lisp object:

(eq l l2)       ; => nil
(eq l (cdr l2)) ; => t
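Keep in mind that eq compares identity, not contents. To compare the contents of two lists, the standard function is equal. A short sketch, where my-a and my-b are names invented for the example:

```elisp
;; my-a and my-b are names invented for this example.
(defvar my-a (list 1 2 3))
(defvar my-b (list 1 2 3))

(eq my-a my-b)       ; => nil, two distinct lists in memory
(equal my-a my-b)    ; => t, same structure and contents
(eq my-a my-a)       ; => t
(eq 'foo 'foo)       ; => t, a symbol is a unique object
```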

Of course Lisp has plenty of functions to manipulate lists. Here is how to reverse our list:

(setq l (reverse l)) ; => (3 2 1)

setq sets a variable to a new value. The name stands for “set quoted”: (setq l ...) is the same as (set (quote l) ...).
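The reason for the assignment is that reverse does not modify its argument; it returns a new list. A small sketch, using a variable my-nums invented for the example:

```elisp
(defvar my-nums (list 1 2 3))

(reverse my-nums)                  ; => (3 2 1)
my-nums                            ; => still (1 2 3)

(setq my-nums (reverse my-nums))   ; => (3 2 1)
my-nums                            ; => now (3 2 1)

;; set is the function underneath; it needs the symbol quoted explicitly:
(set 'my-nums (list 4 5 6))        ; same as (setq my-nums (list 4 5 6))
my-nums                            ; => (4 5 6)
```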


That’s it for today. Lots more to come. Stay tuned!

  1. And almost OOP. Alan Kay, who invented Smalltalk and coined the term “object-oriented programming”, said that “Lisp is the greatest single programming language ever designed”.

  2. As opposed to lexical binding. If you define a local variable x in function foo and then call function bar, bar will see the value of x even if you don’t pass it as parameter. A long time ago people thought it was a good idea for performance reasons.

  3. If you make a mistake and end up in the debugger, just press q to exit.

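  4. CAR and CDR come from the IBM 704, the machine on which Lisp was first implemented. A list cell was stored in one machine word, and the names stand for “contents of the address part of register” and “contents of the decrement part of register”, the two fields of that word which held the two pointers. In Common Lisp you can also use FIRST and REST if you prefer (cl-first and cl-rest in Emacs Lisp).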
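The dynamic binding described in footnote 2 can be observed with a few lines. This is a sketch with invented names (my-x, my-bar, my-foo); it relies on defvar, which makes a variable dynamically bound whether or not the file uses lexical binding.

```elisp
;; my-x, my-bar and my-foo are names invented for this example.
(defvar my-x 'global)        ; defvar makes my-x dynamically bound

(defun my-bar ()
  "Return the current value of my-x."
  my-x)

(defun my-foo ()
  "Rebind my-x locally, then call my-bar."
  (let ((my-x 'local))
    (my-bar)))

(my-foo)   ; => local, my-bar sees the binding made by my-foo
(my-bar)   ; => global, the let binding is gone once my-foo returns
```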

Org mode part 1

This is the first article in a series about Org mode.

Org mode is a killer feature of Emacs. Some people use Emacs just for that mode. It can do many things including organizing notes, project planning, web publishing and literate programming. You can even write your Emacs configuration in Org mode and publish it: here is an example (to make it work, you only need a tiny init.el that loads the Org file and runs the embedded Lisp code).

The only bad thing about Org mode is that it is not universal, because it is closely tied to Emacs. There are plugins for Vim and Sublime Text, for instance, but they only cover a fraction of the features that the real thing provides. This is the reason why Markdown is more popular than Org while being objectively inferior, although more and more sites understand Org files (GitHub certainly does).

Let’s get started.

Outlines

An Org file is a plain text file with headlines, text, and some additional information such as tags and timestamps. A headline starts with a series of asterisks. The more asterisks there are, the deeper the headline is.

For example, you can create a file with extension “.org” and with this content:

* Top-level headline
Some text under that headline.
** Second-level headline (child of the top-level headline)
More text.
*** Third-level headline (child of the second-level headline)
** Another second-level headline

You can make Emacs render this nicely with the org-bullets extension, which hides the asterisks and displays Unicode bullets instead:

[Screenshot: the outline rendered with Unicode bullets]

When typing the text above, use M-RET (meta + return) to create a new headline at the same level as the one above it, or a first-level headline if the document does not have headlines yet. Use M-LEFT and M-RIGHT to promote or demote a headline, i.e. change its level. You can also move a headline and all the text under it up and down using M-UP and M-DOWN.

Finally the TAB key collapses or expands headlines. When a headline is collapsed, its content is replaced with an ellipsis like so:

[Screenshot: a collapsed headline ending with an ellipsis]

S-TAB (shift + tab) collapses or expands everything.

Lists and checkboxes

If you prefer, you can also create hierarchies using lists. For example:

[Screenshot: a hierarchy written as a nested list]

The same keys work with lists, e.g. use M-RET to create a new list item. You can also change the style of your list using S-LEFT and S-RIGHT. For example, change the list to use numbers:

[Screenshot: the same list with numbered items]

Notice that if you move an item up and down with M-UP / M-DOWN, the numbers are automatically updated.

To create an item with a checkbox, use S-M-RET (shift meta return). Toggle a checkbox using C-c C-c.

[Screenshot: list items with checkboxes]

TODOs

An alternative to checkboxes is TODO items in headlines:

[Screenshot: headlines with TODO items]

Type S-M-RET (shift meta return) to create a new headline that starts with a TODO. Change the state of a TODO into DONE or vice versa using S-LEFT and S-RIGHT.

You can add more states to TODO and DONE. Exordium uses this code to add the WORK and WAIT states:

(setq org-todo-keywords
      '((sequence "TODO" "WORK" "WAIT" "DONE")))

You can also specify the states on a per-file basis by adding a line like this at the beginning of the file (save and reopen to make it work):

#+TODO: TODO WAIT | DONE CANCELED

The vertical bar separates the TODO keywords (states that need action) from the DONE states (which need no further action). They are displayed with different colors.
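The same separator can be used in the global setting. As a sketch, and assuming you want the per-file states above for all your Org files rather than the Exordium defaults, the configuration could look like this:

```elisp
;; Assumed global equivalent of the #+TODO line above.
;; The "|" entry separates the states that need action from the done states.
(setq org-todo-keywords
      '((sequence "TODO" "WAIT" "|" "DONE" "CANCELED")))
```

Without a "|" entry, Org treats only the last keyword of the sequence as a done state.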

Markup

Org’s markup syntax is more intuitive than Markdown’s (IMO):

- *Bold* word
- /Italic/ word
- _Underlined_ word
- ~code~ word
- URL: http://gnu.org or [[http://www.gnu.org/software/emacs/][GNU Emacs]]
- Images: [[/Users/pgrenet/Pictures/tux.jpg]]

You can make the images display inline using this code in your configuration (reopen the Org file to make it work):

(setq org-startup-with-inline-images t)

[Screenshot: an image displayed inline in an Org buffer]

Tables

Finally, the pièce de résistance. Type this text:

| Name | Phone | Age |
|--

Then hit TAB and see what happens. Voilà! The table automatically resizes itself as you use TAB and S-TAB to move between cells.

Fast cursor movement

I’ve used Eclipse for many years. What always bothered me about it is that it forces you to use the mouse all the time, even for things like switching between buffers [1], which is a very common operation: add an argument to the definition of a function, switch to the file where it is called, and change the function call. In fact, my Emacs configuration sets up a very easy key for switching between the two most recently used buffers.
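A minimal sketch of such a key, not necessarily the exact code from that configuration: the command name my-switch-to-previous-buffer and the C-TAB binding are assumptions made for this example.

```elisp
;; Hypothetical command and key; adjust both to taste.
(defun my-switch-to-previous-buffer ()
  "Switch to the most recently used buffer other than the current one."
  (interactive)
  (switch-to-buffer (other-buffer (current-buffer))))

(global-set-key (kbd "C-<tab>") #'my-switch-to-previous-buffer)
```

Pressing the key repeatedly toggles between the two buffers, because each switch makes the buffer you just left the most recent “other” buffer.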

One of the many great things about Emacs (and Vim as well) is that you can do everything you need without ever using the mouse, and in fact without even requiring a GUI. This is a killer feature compared to most IDEs because menus and mice are slow. If your hands don’t have to leave the home row, you can change text almost as fast as you think.

You can get more productive if you know how to move the cursor quickly within a buffer. There are several clever extensions for that, but here we’ll review a few built-in keys. Note that I’m using the arrow keys because they are easy to remember [2].

Beginning and end

These four keys are a must:

Key binding   Description
C-a           Go to the beginning of the line.
C-e           Go to the end of the line.
M-<           Go to the beginning of the buffer.
M->           Go to the end of the buffer.

Move by words

Key binding   Description
C-right       Move forward one word (right-word).
C-left        Move backward one word (left-word).

Any major mode may have its own definition of what a word is (it is defined in the mode’s syntax table).

Unfortunately these keys are not symmetrical: moving right then left does not necessarily bring you back where you started. For programming, I found it useful to define these extra keys for moving by semantic units rather than words:

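;; forward-same-syntax is defined in thingatpt.el
(require 'thingatpt)

;; M-right: move forward over a run of characters of the same syntax class
(define-key global-map [(meta right)]
  (lambda (arg)
    (interactive "p")
    (forward-same-syntax arg)))

;; M-left: same thing, backward
(define-key global-map [(meta left)]
  (lambda (arg)
    (interactive "p")
    (forward-same-syntax (- arg))))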

You can pass a numeric argument to these commands to move by more than one unit at a time. The universal argument prefix lets you pass a number to a command: type C-u, then the number, then the command. For example, C-u 3 C-right moves forward 3 words.
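Your own commands can receive that number too: the "p" code in interactive passes the numeric prefix argument to the function. A minimal sketch, where my-insert-stars is a hypothetical command invented for the example:

```elisp
;; my-insert-stars is a hypothetical command, for illustration only.
(defun my-insert-stars (n)
  "Insert N asterisks at point; N is the numeric prefix argument."
  (interactive "p")
  (insert (make-string n ?*)))

;; C-u 3 M-x my-insert-stars inserts "***".
;; Without a prefix argument, N is 1 and a single "*" is inserted.
```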

Move by paragraphs

I use these all the time:

Key binding   Description
C-up          Move up one paragraph (backward-paragraph).
C-down        Move down one paragraph (forward-paragraph).

Move by defuns

You can move to the beginning or end of a class or function almost the same way you move to the beginning or end of the line, except that the prefix is M-C-:

Key binding   Description
M-C-a         Go to the beginning of a class or function.
M-C-e         Go to the end of a class or function.

Repeat to go to the next or previous class/function.

Move by s-expression

This is very handy for Lisp. In Lisp, an s-expression (symbolic expression or sexp) is an atom or a list. For other programming languages, Emacs also considers strings and blocks between curly braces or square brackets. Moving by sexp is similar to moving by word, only the prefix is M-C-:

Key binding   Description
M-C-right     Move forward one sexp (forward-sexp).
M-C-left      Move backward one sexp (backward-sexp).
M-C-d         Move down a sexp.
M-C-u         Move up a sexp.
M-C-n         Move to the next sexp at the same nesting level.
M-C-p         Move to the previous sexp at the same nesting level.

Give it a try, it is more useful than you think.
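The commands behind these keys are ordinary functions, so you can also watch them work from Lisp. Here is a small sketch that moves through a sample text in a temporary buffer and records where point ends up; my-sexp-demo is a hypothetical helper written for this post.

```elisp
;; my-sexp-demo is a hypothetical helper, for illustration only.
(defun my-sexp-demo ()
  "Return the positions visited while moving by sexp in a sample text."
  (with-temp-buffer
    (emacs-lisp-mode)
    (insert "(foo (bar baz)) qux")
    (goto-char (point-min))            ; point 1, before the first paren
    (let (positions)
      (forward-sexp)                   ; over the whole list
      (push (point) positions)
      (backward-sexp)                  ; back to where we started
      (push (point) positions)
      (down-list)                      ; into the list, just after the paren
      (push (point) positions)
      (forward-sexp)                   ; over the symbol foo
      (push (point) positions)
      (nreverse positions))))

(my-sexp-demo)   ; => (16 1 2 5)
```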

One more thing

You can use M-x view-lossage to assess your productivity with Emacs: this function displays the last 300 keys you have pressed. If it shows the same key repeated many times, you are probably doing it wrong.

  1. Eclipse has a shortcut key that displays a menu of the open files, but it is slow and cumbersome.

  2. Touch-type purists prefer to use other keys like C-f and C-b.

A new blog!

Well. Finally got around to putting this website together. What’s neat is that it is written in Markdown using Emacs and built with Jekyll. All it takes to publish it is to push the git repo.

This will be a blog about Emacs. It will contain concise posts about how to improve your productivity and how to program Emacs to make it your perfect editor.