ELisp crash course
This is the second article in a series about Lisp; it assumes you read the first one.
If you use Emacs but don’t know Lisp, you are missing a lot: Emacs is almost endlessly customizable with Emacs Lisp. This post is an introduction to ELisp, hopefully giving you enough of the basics to write useful functions. Today we will mostly focus on the language itself, as opposed to the gazillion Emacs-specific APIs for editing text.
My goal is not to review every function of the language: it would take a book to do so. My goal instead is to give a good high-level overview of Elisp. If you find yourself looking for a function or variable, you can browse the Emacs elisp site or you can use M-x apropos, which displays anything that matches a given string.
Let’s start with…
The Basic Types
Strings are double quoted and can contain newlines. Use backslash to escape double quotes:
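As a quick sketch of both cases (the variable names are made up for illustration):

```elisp
;; Illustrative names only; the my- prefix just avoids clashing with real variables.
(setq my-two-lines "first line
second line")                       ; a literal newline inside the string

(setq my-quoted "She said \"hello\"") ; escaped double quotes

(length my-quoted) ; => 16, the backslashes are not part of the string
```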
A string is a sequence of characters. The syntax for a character is ?x: a question mark followed by the character. Some need to be escaped, like for example ?\(, ?\) and ?\\.
There are many functions operating on strings, like for example:
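A few built-in string functions, as a small sample rather than an exhaustive list:

```elisp
(concat "foo" "bar")             ; => "foobar"
(substring "hello world" 0 5)    ; => "hello"
(upcase "hello")                 ; => "HELLO"
(length "hello")                 ; => 5
(string-to-number "42")          ; => 42
(split-string "a b c")           ; => ("a" "b" "c")
(format "%s is %d" "answer" 42)  ; => "answer is 42"
```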
Note that none of these functions has any side effect, as is the case with most functions in Lisp: they are pure functions. They create a new object and return it.
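Integer precision depends on the Emacs version and platform, because a few bits of each machine word are reserved for type tags: old 32-bit builds had 29-bit integers, while on current 64-bit builds fixnums have 62 bits and, since Emacs 27, larger values automatically become arbitrary-precision integers. Floating-point numbers are 64-bit doubles. Binary literals start with “#b”, octal with “#o” and hexadecimal with “#x”.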
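For instance, here is the same kind of literal in each base, plus a look at the largest fixnum (its exact value depends on your build, so no value is shown for it):

```elisp
#b101    ; => 5   (binary)
#o17     ; => 15  (octal)
#xff     ; => 255 (hexadecimal)

most-positive-fixnum             ; largest fixnum on this build
(floatp 1.5)                     ; => t
```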
The most useful data structure in Lisp is a list, but the language also has arrays, hash tables, and objects. An array is called a vector, and you can create one like so: ["the" "answer" "is" 42]. Like lists, vectors can contain objects of various types. You use spaces to separate the values; commas are part of the Lisp syntax but they are used for something else as we will soon see.
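A short sketch of working with a vector (my-vec is an illustrative name):

```elisp
(setq my-vec ["the" "answer" "is" 42])

(aref my-vec 3)      ; => 42, indices start at 0
(length my-vec)      ; => 4
(vector 1 (+ 1 1))   ; => [1 2], vector evaluates its arguments
```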
Quote
The quote is a special character in the Lisp syntax that prevents an expression from being evaluated. For instance:
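A minimal two-line illustration, with a symbol on the first line and a list on the second:

```elisp
'a           ; => a, the symbol itself, not the value of a variable named a
'(foo 1 2)   ; => (foo 1 2), a list of three elements, not a call to foo
```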
The quote prevents the evaluation of the symbol “a” on the first line, and the list on the second line, otherwise they would be considered as a variable and a function call respectively.
The backquote is like a quote, except that any element preceded by a comma is evaluated. The backquote is very handy for defining macros, i.e. functions that generate code. For example:
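Here is a small sketch; my-n and my-inc are hypothetical names used only for this illustration:

```elisp
(setq my-n 3)

`(1 2 ,my-n)          ; => (1 2 3), only the element after the comma is evaluated
`(sum is ,(+ 1 2))    ; => (sum is 3)

;; A tiny macro: the backquoted list is the code that gets generated.
(defmacro my-inc (var)
  "Expand to code that adds 1 to the variable VAR."
  `(setq ,var (+ ,var 1)))

(my-inc my-n)         ; expands to (setq my-n (+ my-n 1)), so my-n is now 4
```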
Variables
Lisp is a dynamically-typed language, like Ruby or Python and unlike Java or C++. You don’t need to declare the type of a variable, and a variable can hold objects of different types over time.
We already saw in the previous post how to declare a global variable with defvar and set it with setq. Another way to use variables is function parameters:
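A minimal version of such a function and a call to it could look like this:

```elisp
(defun add (x y)
  "Return the sum of X and Y."
  (+ x y))

(message "%d" (add 1 2))   ; displays 3
```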
Here we define a function add with 2 arguments, which returns the sum of its arguments. Then we call it. message is an Emacs function similar to C’s printf: it prints a message in the echo area (where the minibuffer is displayed) and in the *Messages* buffer.
Every time you call add, Lisp creates new bindings to hold the values of x and y within the scope of the function call. A single variable can have multiple bindings at the same time; for example the parameters of a recursive function are rebound for each call of the function.
The let form declares local variables. The syntax is (let (variable*) body) where each variable is either a variable name, or a list (variable-name value). Variables declared with no value are bound to nil. For example:
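A short sketch with two initialized variables and one left without a value:

```elisp
(let ((x 1)
      (y 2)
      z)              ; no value given, so z is bound to nil
  (list x y z))       ; => (1 2 nil)
```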
The scope of the variable bindings is the body of the let form. After the let, the variables refer to whatever, if anything, they referred to before the call to let. You can bind the same variable multiple times:
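For instance, an inner let can shadow x and the outer binding comes back afterwards:

```elisp
(let ((x 1))
  (list x                  ; outer binding: 1
        (let ((x 2)) x)    ; inner binding shadows it: 2
        x))                ; outer binding is visible again: 1
;; => (1 2 1)
```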
Note that let binds variables in parallel and not sequentially. That means that you cannot declare a variable whose value depends on another variable declared in the same let. For example this is wrong:
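The sketch below shows the mistake. It assumes there is no global variable named x, and it wraps the form in condition-case (Elisp's error handler) only so that the error is reported as a string instead of interrupting evaluation:

```elisp
(condition-case err
    (let ((x 1)
          (y (* x 10)))    ; this x is NOT the x bound on the line above
      (list x y))
  ;; Reached because x has no value outside the let.
  (void-variable (format "error: %S" err)))
;; => "error: (void-variable x)"
```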
There are two ways to fix the code above: you could use a second let within the first, or you could replace let with let*, which binds variables sequentially, one after the other. The key to understanding that is to remember that the origin of Lisp is the Lambda Calculus, where everything is a function call. The first let form above is equivalent to calling an anonymous function like this:
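As a sketch, take a let that binds x to 1 and y to 2 (values chosen here for illustration) and compare it with the equivalent lambda call:

```elisp
;; The let form...
(let ((x 1) (y 2))
  (+ x y))                       ; => 3

;; ...behaves like calling an anonymous function with the two values.
((lambda (x y) (+ x y)) 1 2)     ; => 3
```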
Here we define a lambda (anonymous) function with 2 arguments, and we call it with the values of the arguments. The syntax of a lambda is (lambda (arguments*) body), and we call it like any other function by putting it in a second pair of parentheses with the arguments.
The equivalent of a let* requires multiple function calls:
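A possible rendering, with the let* form first and its nested-lambda equivalent second:

```elisp
(let* ((x 1)
       (y (* x 10)))     ; fine here: x is already bound
  (list x y))            ; => (1 10)

;; One lambda per variable, each call nested inside the previous one.
((lambda (x)
   ((lambda (y) (list x y))
    (* x 10)))
 1)                      ; => (1 10)
```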
The first lambda binds x to 1 and the second lambda binds y to x * 10.
Conditions
In ELisp and Common Lisp, nil is used to mean false, and everything that is not nil is true, including the constant t which means true. Therefore a symbol is true, a string is true and a number is true (even 0). nil is the same as () and it is considered good taste to use the former when you mean false (or void) and the latter when you mean empty list. Note that Clojure and Scheme treat boolean logic differently: for them the empty list and false are different things.
Let’s start with simple boolean functions. not returns the negation of its argument, so that (not t) returns nil and vice versa. Like many functions in Lisp, both and and or can take any number of arguments. and returns nil as soon as it finds an argument that is not true; otherwise it returns the value of its last argument. or returns the value of the first argument that is true, or nil if none of them is true. For example:
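A few evaluations that illustrate these rules:

```elisp
(not t)           ; => nil
(not nil)         ; => t
(not 0)           ; => nil, because 0 counts as true

(and 1 2 3)       ; => 3, all arguments are true, so the last one is returned
(and 1 nil 3)     ; => nil, evaluation stops at the first nil

(or nil 2 3)      ; => 2, the first true argument
(or nil nil)      ; => nil
```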
You can compare for equality using = for numbers, string= for strings or eq for the same object in memory. There is also a generic equal function that tests whether two objects have the same type and contents, so that’s the one to remember if you only remember one. One caveat: equal does not convert between number types, so (equal 1 1.0) is nil while (= 1 1.0) is t.
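The differences between these predicates show up in a few quick evaluations:

```elisp
(= 1 1.0)                        ; => t, numeric comparison
(string= "abc" "abc")            ; => t
(eq 'a 'a)                       ; => t, a symbol is a unique object

(eq (list 1 2) (list 1 2))       ; => nil, two distinct lists in memory
(equal (list 1 2) (list 1 2))    ; => t, same structure and contents
(equal 1 1.0)                    ; => nil, an integer and a float differ in type
```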
(if condition then else*) is a special form that is roughly equivalent to C’s ternary operator ?:. It must have exactly one then form, and it may have zero or more else forms. It returns the value of the then form or the value of the last else form (nil if there is none). For example:
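Three small cases, covering one else form, several else forms, and none:

```elisp
(if (> 3 2) "yes" "no")              ; => "yes"
(if nil "then" "else-1" "else-2")    ; => "else-2", value of the last else form
(if nil "then")                      ; => nil, there is no else form
```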
If you just want a then or an else, it is better to use when and unless because they can have multiple then or else forms. They return the value of the last form or nil. Here is an example:
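A minimal sketch of both forms:

```elisp
(when (> 3 2)
  (message "three is bigger")   ; first form, evaluated for its side effect
  "done")                       ; => "done", the value of the last form

(unless (> 3 2)
  "never evaluated")            ; => nil, because the condition is true
```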
Finally cond is like a super-charged version of C’s switch/case: it chooses between an arbitrary number of alternatives. Its arguments are a collection of clauses, each of them being a list. The car of the clause is the condition, and the cdr is the body to be executed if the condition is true (the body can have as many forms as you like). cond executes the body of the first clause for which the condition is true. For example:
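One way to write such an example is a function that names the type of its argument (my-type-name is a hypothetical name chosen for this sketch):

```elisp
(defun my-type-name (x)
  "Return a string naming the type of X."
  (cond ((numberp x) "number")
        ((stringp x) "string")
        ((listp x)   "list")
        ((symbolp x) "symbol")
        (t           "something else")))   ; the catch-all clause

(my-type-name 42)                 ; => "number"
(my-type-name "hi")               ; => "string"
(my-type-name '(1 2))             ; => "list"
(my-type-name 'foo)               ; => "symbol"
(my-type-name (current-buffer))   ; => "something else"
```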
The code above uses predicates like numberp which returns t if the argument is a number. The function current-buffer returns a buffer object which is neither a number, string, list nor symbol (buffers are a primitive type of their own). Notice the last clause: the condition is t which is obviously always true. This is the “otherwise” clause guaranteed to fire if everything else above has failed.
Loops
The simplest loop is a while:
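For instance, summing the integers from 0 to 4:

```elisp
(let ((i 0)
      (total 0))
  (while (< i 5)              ; loop as long as the condition is true
    (setq total (+ total i))
    (setq i (1+ i)))
  total)                      ; => 10
```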
dotimes takes a variable and a count, and sets the variable from 0 to count - 1:
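A short sketch that collects the squares of 0 through 3:

```elisp
(let ((squares '()))
  (dotimes (i 4)                 ; i takes the values 0, 1, 2, 3
    (push (* i i) squares))
  (nreverse squares))            ; => (0 1 4 9)
```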
dolist takes a variable and a list, and sets the variable to each item in the list:
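For example, adding up the elements of a list:

```elisp
(let ((total 0))
  (dolist (n '(1 2 3 4))         ; n is bound to each element in turn
    (setq total (+ total n)))
  total)                         ; => 10
```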
If you need anything more complicated, take a look at the documentation of the loop macro, which recent Emacs versions provide as cl-loop in the cl-lib library. This is a very powerful macro with lots of options that takes an (almost) English sentence as argument and generates what you mean. For example, a C “for” loop can be expressed like so:
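A sketch using cl-loop from cl-lib, which is the spelling to use on current Emacs versions:

```elisp
(require 'cl-lib)

;; Roughly: for (i = 0; i < 5; i++) collect i * 2
(cl-loop for i from 0 below 5
         collect (* i 2))        ; => (0 2 4 6 8)
```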
Another example is the following code which iterates over a “plist” (property list), which is a collection like (key1 value1 key2 value2), using cddr to move by 2 items at a time and skipping the properties where the key is an even number:
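One way to express this with cl-loop; the sample plist and the choice to collect (key . value) pairs are assumptions made for this sketch:

```elisp
(require 'cl-lib)

(cl-loop for (key value) on '(1 "one" 2 "two" 3 "three") by #'cddr
         unless (cl-evenp key)          ; skip properties with an even key
         collect (cons key value))      ; => ((1 . "one") (3 . "three"))
```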
Elisp also has exceptions (signal and condition-case), an equivalent of finally (unwind-protect), non-local exits (catch and throw) and most other things you would expect.
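As a brief sketch of how these pieces fit together, the code below divides by zero, records that the cleanup form ran, and then handles the error:

```elisp
(let ((log '())
      (zero 0))
  (condition-case err
      (unwind-protect
          (/ 1 zero)                    ; signals arith-error
        (push 'cleanup log))            ; always runs, like "finally"
    (arith-error (push (car err) log))) ; the handler, like "catch"
  (nreverse log))                       ; => (cleanup arith-error)
```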
Functions
Lisp uses several keywords for declaring arguments within a defun.
&optional introduces optional arguments, which if not specified are bound to nil. For example (defun foo (a b &optional c d) ...) makes c and d optional.
&rest takes all remaining arguments and collects them into a list. For example the signature of the list function is simply (&rest objects).
&key introduces a keyword argument, that is an optional argument specified by a keyword with a default value of your choice. In Emacs Lisp, &key is only understood by cl-defun from the cl-lib library, not by plain defun. For example:
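A sketch of all three keywords; the function names are hypothetical, and the &key version relies on cl-defun from cl-lib:

```elisp
(require 'cl-lib)

(defun my-opt (a &optional b)
  (list a b))
(my-opt 1)            ; => (1 nil)

(defun my-rest (a &rest others)
  (list a others))
(my-rest 1 2 3)       ; => (1 (2 3))

(cl-defun my-key (a &key (b 10) c)   ; b defaults to 10, c to nil
  (list a b c))
(my-key 1 :c 3)       ; => (1 10 3)
```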
Functions are first class objects in Lisp. You can store them in a variable and call them later. For example:
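A minimal sketch, storing first a named function and then an anonymous one in the variable f:

```elisp
(setq f #'list)
(funcall f 1 2 3)                ; => (1 2 3)

(setq f (lambda (x) (* x x)))
(funcall f 4)                    ; => 16
```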
The syntax #'foo is sugar for (function foo). For a named function it behaves like a quote, returning the symbol foo whose function definition will be called, but it also tells the byte compiler that foo is meant to be a function. You can think of it as a pointer to the code. funcall calls the function with the given arguments. Note that Emacs is very tolerant and (setq f 'list) (i.e. setting f to the symbol “list”) will also work.
apply works like funcall but it applies the function to a list of arguments:
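For instance:

```elisp
(apply #'+ '(1 2 3))       ; => 6, same as (+ 1 2 3)
(apply #'max '(3 9 2))     ; => 9
(apply #'+ 1 2 '(3 4))     ; => 10, only the last argument has to be a list
```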
A function often used together with apply is mapcar, which applies a function to each element of a list and returns a list of the results:
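A few illustrative calls, the last one combining mapcar with apply:

```elisp
(mapcar #'1+ '(1 2 3))                          ; => (2 3 4)
(mapcar (lambda (s) (upcase s)) '("a" "b"))     ; => ("A" "B")

;; Total length of several strings: map to lengths, then sum them.
(apply #'+ (mapcar #'length '("ab" "cde")))     ; => 5
```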
Interactive functions
Let’s use our fresh knowledge to do something useful.
Sometimes I want to include a separator in a comment, e.g. a sequence of dashes or tildes that fills up the rest of the line until the 80 character column (the fill-column variable defines that limit). For example, if I type “// Begin of test” I want a magic key to do this:
Elisp functions must be declared “interactive” if you want to call them using M-x or bind them to a key. You do this declaration by calling the interactive special form (it’s not a function) as the first form in the body of your function.
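A minimal sketch of such a command follows. The function name my-insert-separator and the key C-c s are assumptions made for this illustration, so pick whatever suits your setup:

```elisp
;; Before:  // Begin of test
;; After:   // Begin of test~~~~~ ... (tildes up to column `fill-column')

(defun my-insert-separator ()
  "Fill the rest of the current line with tildes up to `fill-column'."
  (interactive)
  (end-of-line)
  (let ((num-chars (- fill-column (current-column))))
    ;; Insert one character at a time; does nothing if the line is already long.
    (dotimes (_ num-chars)
      (insert ?~))))

;; Hypothetical key binding.
(global-set-key (kbd "C-c s") #'my-insert-separator)
```

After evaluating this, typing C-c s at the end of a comment line (or M-x my-insert-separator) pads the line with tildes.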
end-of-line moves the cursor to the end of the line, as you probably guessed. The let form calculates the number of characters to insert before reaching the fill column, using the variable fill-column (which should be set to 79) and the current-column function, which returns the cursor’s column number. The insert function inserts a character or string at the position of the cursor. Finally global-set-key binds the function to a key chord. Note that this is a simple implementation; it might be more efficient to create a string with n characters using (make-string num-chars ?~).
Let’s write another one. Suppose you work in an organization that has created its own code style, and suppose that said code style proclaims that lines longer than 80 characters are a cardinal sin. Believe me, such code styles do exist. So let’s write an interactive function that will find the next “long” line in the current buffer, starting from the position of the cursor. It could look like this:
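Here is one possible sketch of such a command. The message texts, the return value, and the choice to jump back to the top of the buffer when nothing is found are assumptions of this sketch:

```elisp
(defun goto-long-line (len)
  "Move point to the next line longer than LEN characters.
LEN defaults to `fill-column'.  Return non-nil if such a line was found."
  (interactive "P")
  ;; Raw prefix argument: nil without C-u, a number after C-u 7 9.
  (setq len (if len (prefix-numeric-value len) fill-column))
  (let ((found nil))
    (while (and (not found) (not (eobp)))
      (forward-line 1)
      ;; Line length = position of the line's end minus position of its start.
      (let ((start (point)))
        (end-of-line)
        (when (> (- (point) start) len)
          (setq found t))))
    (if found
        (when (called-interactively-p 'interactive)
          (message "Line %d is longer than %d characters"
                   (line-number-at-pos) len))
      ;; Assumption: when nothing is found, return to the top of the buffer.
      (goto-char (point-min))
      (when (called-interactively-p 'interactive)
        (message "No line longer than %d characters" len)))
    found))
```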
This interactive function takes a numeric argument which is the max length of lines. The “P” string in the call to interactive specifies that we use a prefix argument (in raw form; see the documentation of interactive for details). Either the user invokes this function with M-x goto-long-line, in which case the argument len is set to nil, or she invokes the function with C-u 7 9 M-x goto-long-line, in which case the argument len is set to 79 (for instance). The first setq line is used to give len a default value: either it is the number that the user specified or it is the value of the variable fill-column.
Without going into too much detail, the rest of the code is a while loop that runs until we have found a long line or reached the end of the buffer (predicate eobp). At each step we go down one line (forward-line) and we check the length of the line. Note that the Emacs function point returns the position of the cursor as an offset into the buffer (the current character number if you will). Our function is designed to be called both interactively and within a program, so it tests how it was called using the predicate called-interactively-p before deciding whether to print a message. point-min returns the position of the first character in the buffer (should be 1) and goto-char goes to a given character position.
Note that the byte compiler sometimes complains when your code calls a function that is designed to be used interactively (these functions are marked as such using a property). Usually the warning suggests another function, supposedly more efficient because it performs fewer checks.
That’s it for today. Lots more to come. Stay tuned!