Colouring arbitrary shell output | Julius Kibunjia's Blog

Highlighting Text Without Writing A Custom Lexer

Pygments is a cool project, and lexers already exist for pretty much every language you may need. I hardly ever need to highlight source files though since, despite how I’ve made it seem, opening an editor is almost always the appropriate action. What I do need is highlighting for any command that produces a large amount of text, which in my case is typically the output of cower -s.

cower is a tool that allows you to search for and download packages from the Arch User Repository. It doesn’t have as many features as some of the other tools but works well and stays out of your way. Here’s how the output typically looks:

cower uncoloured output

As you can see, it can be hard to find exactly what you need in all the text produced, so I started thinking of a way to use Pygments without first having to define a custom lexer.

The Script

The Pygments website has a tutorial on writing a lexer, which was very useful in writing this script. I will omit most of the set-up code, which you can see in the gist with the full version of the script, and only include the main function.

def main():
    global groups
    parser = argparse.ArgumentParser()
    # Require at least one pattern; without -p, args.patterns would be None.
    parser.add_argument('-p', '--pattern', dest='patterns', nargs='+',
                        required=True)
    args = parser.parse_args()

The first part of the main function declares the global variable groups, a list of the custom Pygments tokens I created for this script. Next comes the argument parser, which takes one or more regular expressions to use for highlighting (nargs='+') and puts them in a list.
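To see what nargs='+' does in isolation, here is a minimal sketch that parses a hard-coded argument list instead of the real command line. The two regular expressions are made-up examples, not patterns from the actual script.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-p', '--pattern', dest='patterns', nargs='+')

# Parse an explicit list rather than sys.argv; both regexes are illustrative.
args = parser.parse_args(['-p', r'aur/\S+', r'\d+\.\d+'])
print(args.patterns)

# When -p is left out entirely, argparse stores None, not an empty list.
missing = parser.parse_args([]).patterns
print(missing)
```

Every value after -p ends up in one list, in the order given. Since a missing -p yields None, a script that iterates over the patterns needs either required=True or a default value.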

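    # Build a lexer whose 'root' state pairs each user-supplied regex with
    # a token type, reusing token types when patterns outnumber them.
    class CustomLexer(RegexLexer):
        name = 'rcolor'
        tokens = {
            'root': list(zip(args.patterns, itertools.cycle(groups))),
        }

    # Read the text to colour from stdin so the script works in a pipeline.
    text = sys.stdin.read()
    result = pygments.highlight(text, CustomLexer(),
                                Terminal256Formatter(style=RegexStyle))
    print(result)

    return 0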

As per the instructions on how to create a lexer, we create a sub-class of RegexLexer from the regular expressions passed in by the user. This is done simply by setting the 'root' entry of the tokens dictionary to a list of tuples of the form (regex_string, token_group). The regular expressions come from the command line and are stored in the list args.patterns.
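Here is a small sketch of that tuple format on its own, assuming Pygments is installed. The patterns and the token names Group1 and Group2 are hypothetical stand-ins for args.patterns and groups; a whitespace rule is added only so the sample input is fully matched.

```python
from pygments.lexer import RegexLexer
from pygments.token import Token

# Hypothetical patterns and token types, stand-ins for args.patterns and groups.
patterns = [r'[a-z]+', r'\d+', r'\s+']
token_types = [Token.Group1, Token.Group2, Token.Text]


class SketchLexer(RegexLexer):
    name = 'sketch'
    # Each rule in the 'root' state is a (regex_string, token_type) tuple.
    tokens = {'root': list(zip(patterns, token_types))}


# get_tokens() yields (token_type, matched_text) pairs in input order.
lexed = list(SketchLexer().get_tokens('abc 123\n'))
for token_type, value in lexed:
    print(token_type, repr(value))
```

Accessing an attribute such as Token.Group1 creates that token type on the fly, which is one way custom tokens like the ones in groups could be defined.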

itertools.cycle ensures that every regular expression is assigned a group for colouring, even if the groups have to be reused. Finally, since this script is meant to colour the output of any given command, we read input from stdin.
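The wrap-around is easiest to see with plain strings. In this sketch the three patterns and two group names are invented for illustration; the real groups list holds Pygments tokens.

```python
import itertools

# Hypothetical values: three patterns but only two token groups.
patterns = [r'^aur/', r'\d+', r'\(.*\)']
groups = ['Group1', 'Group2']

# zip stops at the shorter input; cycle makes groups effectively endless,
# so every pattern gets a partner and the groups wrap around.
rules = list(zip(patterns, itertools.cycle(groups)))
print(rules)
```

The third pattern is paired with the first group again, so two patterns can end up sharing a colour when there are more patterns than groups.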

All that’s left is to highlight the text with our custom lexer and ensure the output produced is tailored to the terminal. Here’s what the end result looks like:

cower coloured output

Looks a lot nicer, doesn’t it?
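For anyone who wants to try the idea without the gist, here is a self-contained sketch of the whole pipeline, assuming Pygments is installed. It is not the original script: the token names, the SketchStyle colours, the colourise helper and the sample line are all assumptions made up for this example.

```python
import itertools

import pygments
from pygments.formatters import Terminal256Formatter
from pygments.lexer import RegexLexer
from pygments.style import Style
from pygments.token import Token

# Assumed stand-ins for the omitted set-up: two custom tokens and a style.
groups = [Token.Group1, Token.Group2]


class SketchStyle(Style):
    styles = {
        Token.Group1: '#ff5f5f',
        Token.Group2: '#5fafff',
    }


def colourise(text, patterns):
    """Return text with each pattern's matches wrapped in colour codes."""
    class SketchLexer(RegexLexer):
        name = 'sketch'
        tokens = {'root': list(zip(patterns, itertools.cycle(groups)))}

    return pygments.highlight(text, SketchLexer(),
                              Terminal256Formatter(style=SketchStyle))


# An invented line, only loosely shaped like a package search result.
sample = 'aur/cower 17-2 (120)\n'
coloured = colourise(sample, [r'aur/\S+', r'\d+'])
print(coloured, end='')
```

Text that none of the patterns match is passed through by RegexLexer as error tokens, which this style leaves uncoloured, so only the matches stand out.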

Going Further

Before starting this article I did not know of the existence of several programs that do essentially the same thing. A list can be found on the Arch wiki.