How to Write an Emacs Major Mode for Syntax Coloring

By Xah Lee.

This page shows you how to write an Emacs major mode that syntax colors your own language.

[Screenshot: emacs mymath major mode, syntax coloring your own language.]

Problem

You are writing a major mode for a new language. You want the language's keywords syntax colored.

Suppose your language source code looks like this:

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

You want the words “Sin”, “Cos”, and “Sum” colored as functions, and “Pi” and “Infinity” colored as constants.

Solution

Save the following in a file.

;; a simple major mode, mymath-mode

(setq mymath-highlights
      '(("Sin\\|Cos\\|Sum" . font-lock-function-name-face)
        ("Pi\\|Infinity" . font-lock-constant-face)))

(define-derived-mode mymath-mode fundamental-mode "mymath"
  "major mode for editing mymath language code."
  (setq font-lock-defaults '(mymath-highlights)))

Open that file (or copy and paste the above code into a buffer), then call eval-buffer.

Next, type the following code into a new buffer:

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

Then call M-x mymath-mode. Emacs will syntax color the buffer's text.
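If you want to check the coloring without looking at the screen, here is a minimal sketch. It repeats the mode definition so it runs on its own (using defvar instead of setq), and assumes Emacs 25 or later for font-lock-ensure. The helper name my-mymath-face-of is made up for this example.

```elisp
;; same rules as above, repeated so this snippet runs on its own
(defvar mymath-highlights
  '(("Sin\\|Cos\\|Sum" . font-lock-function-name-face)
    ("Pi\\|Infinity" . font-lock-constant-face)))

(define-derived-mode mymath-mode fundamental-mode "mymath"
  "major mode for editing mymath language code."
  (setq font-lock-defaults '(mymath-highlights)))

;; hypothetical helper, for checking only
(defun my-mymath-face-of (text word)
  "Return the face that `mymath-mode' gives to the first WORD in TEXT."
  (with-temp-buffer
    (insert text)
    (mymath-mode)
    ;; force fontification now, because a temp buffer is never displayed
    (font-lock-ensure)
    (goto-char (point-min))
    (search-forward word)
    (get-text-property (match-beginning 0) 'face)))

(my-mymath-face-of "Sin[x]^2 + Cos[y]^2 == 1" "Cos")
(my-mymath-face-of "Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]" "Infinity")
```

The first call should return the symbol font-lock-function-name-face, and the second font-lock-constant-face.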

How Does it Work?

The string "Sin\\|Cos\\|Sum" is a regex. font-lock-function-name-face is a predefined variable whose value is the face (the font and color spec) used by default for function names.
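One thing to be aware of: a plain alternation also matches inside longer words. The sketch below shows this, and one possible way to restrict the match to whole words with \\< and \\>. The variable names are made up for illustration and are not part of the mode above.

```elisp
;; hypothetical variables, for illustration only
(defvar my-demo-constants-loose "Pi\\|Infinity")
(defvar my-demo-constants-strict "\\<\\(?:Pi\\|Infinity\\)\\>")

;; the loose regex matches the "Pi" inside "Pixel"
(string-match-p my-demo-constants-loose "Pixel")   ; 0, a match at index 0

;; the strict regex only matches whole words
(string-match-p my-demo-constants-strict "Pixel")  ; nil
(string-match-p my-demo-constants-strict "Pi^2/6") ; 0
```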

The define-derived-mode form defines your mode, named “mymath-mode”, based on fundamental-mode, which is the most basic mode.

The line (setq font-lock-defaults '(mymath-highlights)) sets up the syntax highlighting for your mode.

Here's another simple example: Emacs Lisp: html6-mode.

Writing a Mode for a Language that Has Hundreds of Keywords

Typically, a language has hundreds of keywords, too many to write the regex by hand. Elisp has a function, regexp-opt, that generates a regex from a list of your keywords.
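A minimal sketch of that step, using the built-in function regexp-opt with the words from the mymath example. The variable names my-demo-words and my-demo-regexp are made up for illustration. The exact regex string that regexp-opt returns can differ between Emacs versions, so the sketch only checks what it matches.

```elisp
;; hypothetical keyword list, for illustration only
(defvar my-demo-words '("Sin" "Cos" "Sum"))

;; regexp-opt returns one regex that matches any string in the list.
;; the second argument 'words wraps it in \\< and \\>, so only whole words match.
(defvar my-demo-regexp (regexp-opt my-demo-words 'words))

(string-match-p my-demo-regexp "Sum[1/x^2]") ; 0, a match at index 0
(string-match-p my-demo-regexp "Summary")    ; nil, not a whole word
```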

Suppose you are writing a mode for the Linden Scripting Language (LSL). LSL has about 553 keywords. First, here's a sample of LSL source code so you get some idea of how we want it colored.

// sample LSL file

// Examples of variable declaration and assignment:
integer score = 0;
string mySay = "i ♥ you";
vector v = <3,4,5>;
list myList= [2,4,7,3];

// Example of defining a function.
// built-in function's names start with “ll” (Linden Library).
integer sum(integer a, integer b)
{
    integer result = a + b;
    return result;
}

default
{
    state_entry()
    {
        llSay(0, mySay);
    }

    touch_start(integer total_number)
    {
        if (score == 1) {
            llSay(0, mySay);
        } else {
            llWhisper(0, "Ouch!");
        }
    }
}

Each type of keyword uses a different color.

Here's the code.

;;; mylsl-mode.el --- sample major mode for editing LSL.

;; Copyright © 2015, by you

;; Author: your name ( your email )
;; Version: 2.0.13
;; Created: 26 Jun 2015
;; Keywords: languages
;; Homepage: http://ergoemacs.org/emacs/elisp_syntax_coloring.html

;; This file is not part of GNU Emacs.

;;; License:

;; You can redistribute this program and/or modify it under the terms of the GNU General Public License version 2.

;;; Commentary:

;; short description here

;; full doc on how to use here


;;; Code:

;; define several categories of keywords
(setq mylsl-keywords '("break" "default" "do" "else" "for" "if" "return" "state" "while"))
(setq mylsl-types '("float" "integer" "key" "list" "rotation" "string" "vector"))
(setq mylsl-constants '("ACTIVE" "AGENT" "ALL_SIDES" "ATTACH_BACK"))
(setq mylsl-events '("at_rot_target" "at_target" "attach"))
(setq mylsl-functions '("llAbs" "llAcos" "llAddToLandBanList" "llAddToLandPassList"))

;; generate regex string for each category of keywords
(setq mylsl-keywords-regexp (regexp-opt mylsl-keywords 'words))
(setq mylsl-type-regexp (regexp-opt mylsl-types 'words))
(setq mylsl-constant-regexp (regexp-opt mylsl-constants 'words))
(setq mylsl-event-regexp (regexp-opt mylsl-events 'words))
(setq mylsl-functions-regexp (regexp-opt mylsl-functions 'words))

;; create the list for font-lock.
;; each category of keyword is given a particular face
(setq mylsl-font-lock-keywords
      `(
        (,mylsl-type-regexp . font-lock-type-face)
        (,mylsl-constant-regexp . font-lock-constant-face)
        (,mylsl-event-regexp . font-lock-builtin-face)
        (,mylsl-functions-regexp . font-lock-function-name-face)
        (,mylsl-keywords-regexp . font-lock-keyword-face)
        ;; note: order above matters, because once colored, that part won't change.
        ;; in general, longer words first
        ))

;;;###autoload
(define-derived-mode mylsl-mode c-mode "lsl mode"
  "Major mode for editing LSL (Linden Scripting Language)…"

  ;; code for syntax highlighting
  (setq font-lock-defaults '((mylsl-font-lock-keywords))))

;; clear memory. no longer needed
(setq mylsl-keywords nil)
(setq mylsl-types nil)
(setq mylsl-constants nil)
(setq mylsl-events nil)
(setq mylsl-functions nil)

;; clear memory. no longer needed
;; (names must match the variables set above)
(setq mylsl-keywords-regexp nil)
(setq mylsl-type-regexp nil)
(setq mylsl-constant-regexp nil)
(setq mylsl-event-regexp nil)
(setq mylsl-functions-regexp nil)

;; add the mode to the `features' list
(provide 'mylsl-mode)

;; Local Variables:
;; coding: utf-8
;; End:

;;; mylsl-mode.el ends here

Note that the highlighting rules in font-lock-defaults are applied on a first-come, first-served basis: once a piece of text has been colored, later rules won't change it. So, the order of your list is important. In general, put longer keywords first. (This won't fix every case where one keyword matches part of another. If your language has many such keywords, you need to use other forms of font-lock rules to solve this problem. See (info "(elisp) Search-based Fontification").)
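Here is a small sketch of that ordering effect. It uses two rules without word boundaries so that one keyword matches part of the other. The names my-demo-keywords and my-demo-faces are made up for this example, and font-lock-ensure assumes Emacs 25 or later.

```elisp
;; hypothetical variable that holds the rules under test
(defvar my-demo-keywords nil)

;; hypothetical helper, for checking only
(defun my-demo-faces (rules text)
  "Fontify TEXT using font-lock RULES. Return the face of each character."
  (let ((my-demo-keywords rules))
    (with-temp-buffer
      (insert text)
      (setq-local font-lock-defaults '(my-demo-keywords))
      ;; force fontification now, because a temp buffer is never displayed
      (font-lock-ensure)
      (mapcar (lambda (pos) (get-text-property pos 'face))
              (number-sequence (point-min) (1- (point-max)))))))

;; longer keyword first
(my-demo-faces '(("llSay" . font-lock-function-name-face)
                 ("Say" . font-lock-keyword-face))
               "llSay")

;; shorter keyword first
(my-demo-faces '(("Say" . font-lock-keyword-face)
                 ("llSay" . font-lock-function-name-face))
               "llSay")
```

In the first call, every character of llSay gets the function face. In the second call, the short rule colors Say first, so the llSay rule can no longer color it as a function.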

The `( ,a ,b …) form is lisp's backquote syntax. It builds a list like quote does, except that elements preceded by a comma are evaluated and their values are put into the list.
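A short sketch of the backquote syntax, with a made-up regex string and made-up variable names (my-demo-type-regexp, my-demo-rules), next to the same list built with explicit function calls.

```elisp
;; hypothetical regex string, for illustration only
(defvar my-demo-type-regexp "\\<\\(?:float\\|integer\\)\\>")

;; backquote: the element after the comma is evaluated, the rest is kept as is
(defvar my-demo-rules
  `((,my-demo-type-regexp . font-lock-type-face)))

;; the same list built with explicit function calls
(equal my-demo-rules
       (list (cons my-demo-type-regexp 'font-lock-type-face))) ; t
```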

In the above, we based our mode on c-mode, because the syntax is similar. Basing your mode on a similar language's mode will save you time in coding many features, such as handling comments and indentation.

The line:

(provide 'mylsl-mode)

adds the symbol mylsl-mode to the list stored in the variable features. 〔➤see What's Emacs Lisp feature?〕
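A quick sketch of what provide does, using a made-up feature name, my-demo-feature, so it does not interfere with the real mode.

```elisp
(provide 'my-demo-feature)       ; hypothetical feature name
(featurep 'my-demo-feature)      ; t
(memq 'my-demo-feature features) ; non-nil, the symbol is now in the list
(require 'my-demo-feature)       ; already provided, so no file is loaded
```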

Now, to run the code, call eval-buffer. 〔➤see How to Evaluate Emacs Lisp Code〕

Open the LSL language sample file given above, then call mylsl-mode. Here's the result:

[Screenshot: sample mylsl-mode syntax highlighting result.]

How to Name Your Major Mode

Emacs Lisp: How to Name Your Major Mode


Continue to:

  1. Emacs Lisp: How to Color Comment in Major Mode
  2. Emacs Lisp: How to Write Comment Command in Major Mode
  3. Emacs Lisp: How to Write Keyword Completion Command
  4. Emacs Lisp: How to Create Keymap for Major Mode
  5. Emacs: Lookup Google, Dictionary, Documentation

(info "(elisp) Major Mode Conventions")
