Introduction to CSS Escape Sequences

Regardless of where they appear, string values in CSS behave in a similar way. The most important thing to remember about them is that they are not HTML. This means, for instance, that inserting literal angle brackets without escaping them as HTML entity references (&lt; and &gt;) is perfectly legal. In other words, the rule:

#example:before { content: "3 < 5"; }

would result in a pseudo-element whose contents are the five characters (including spaces) 3 < 5 and not a broken HTML start tag. Similarly, this rule:

#example:before { content: "&lt;"; }

results in a pseudo-element whose contents are the four characters &lt; and not an HTML-escaped less-than glyph. This tells us that the < and & characters are not treated specially by CSS string parsers, even though they are characters with special meaning in SGML-derived languages like HTML and XML.

Within CSS strings, the one character with special meaning (apart from the quotation mark that closes the string) is the backslash (\). This character marks the beginning of an escape sequence, a sequence of characters used to collectively represent a different character, in much the same way as the ampersand (&) does in HTML code.

Escape sequences are useful because they allow style sheet authors to represent characters that would normally be ignored or interpreted differently by traditional CSS parsing rules. The most obvious example of this is representing a literal backslash in a CSS string. At first, you might think that the following CSS rule would produce a backslash at the start of every paragraph, but you’d be mistaken.

p:before { content: "\"; }

When a CSS parser reads the declaration in this rule, it treats the backslash as the start of an escape sequence. The next character is the straight double quote, so the parser reads the pair \" as an escaped, literal quotation mark rather than as the end of the string. The string value is therefore never closed where we intended: it runs on through the semicolon and the closing brace until the parser reaches the end of the line. At that point the unterminated string is a parse error and the declaration is discarded, so no backslash is rendered.

To get the backslash to appear, we therefore need to escape it, or “undo” its special meaning. This is simple enough. We merely prepend the backslash with another one, like this:

p:before { content: "\\"; }

This time when a CSS parser reads the declaration in the rule, it finds the first backslash, switches into its “escape sequence mode,” finds a literal backslash character as part of the string value it is parsing, and then finds the straight quotation mark that ends the value. The result is what we were originally after: the value that the CSS parser returns to the renderer is a single backslash (\). Note that CSS makes no distinction between single-quoted and double-quoted strings, so in either case two backslashes are needed in code to output one.

A similar situation exists if you wish to produce a literal double-quote within a double-quoted string. Instead of writing """; you would write "\""; to tell the CSS parser to treat the second quote as part of a value instead of the end-of-value delimiter. Alternatively, you could use single quotes as the string delimiter (content: '"';).
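
As a small sketch of the options just described, the first two rules below should each generate a single double-quote character, and the third shows the mirror-image case for a single quote. The .quote-a, .quote-b, and .quote-c class names are made up for illustration; only the string values matter.

```css
/* Escaped double quote inside a double-quoted string */
.quote-a:before { content: "\""; }

/* Unescaped double quote inside a single-quoted string */
.quote-b:before { content: '"'; }

/* Mirror case: escaped single quote inside a single-quoted string */
.quote-c:before { content: '\''; }
```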

Other than escaping a single literal character in this way, an escape sequence consists of the backslash followed by one to six hexadecimal digits (the numerals 0 through 9 and the English letters A through F, in upper or lower case). In such escape sequences, these digits always reference Unicode code points regardless of the character encoding used by the style sheet itself. As a result, it’s possible to uniformly represent characters that cannot be embedded directly in the style sheet.

Accented characters (like the “é” in résumé or café) are one example of a class of characters that would need to be escaped in a CSS string if the style sheet were encoded in plain ASCII instead of, say, UTF-8.
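
For instance, assuming an ASCII-encoded style sheet, the word “résumé” could be written with the code point for é (U+00E9) escaped, as in the sketch below. The .cv-link selector is hypothetical, and the escape is written with all six hexadecimal digits so there is no doubt about where it ends.

```css
/* \0000E9 is U+00E9 (é), written in full with six hexadecimal digits */
.cv-link:before { content: "r\0000E9sum\0000E9"; }
```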

One useful application of this is embedding line breaks in generated content. The Unicode code point for the newline character is U+000A. In a CSS string, this can be written in full as \00000A. Much as a hex triplet for color values can be shortened, escape sequences can also be shortened by dropping any leading zeros from the code point, so another way to write a newline is \A. Here’s a CSS rule that separates the two words “Hello” and “world” with a newline, placing each on its own line.

#example:before { content: "Hello\Aworld."; }

Something to be careful of when using escape sequences in CSS strings is ending the escape sequence where you intend to. Observe what happens if our “Hello world” text changed to “Hello boy.”

#example:before { content: "Hello\Aboy."; }

Now, instead of a newline (code point \A), our escape sequence is a left-pointing double angle quotation mark, or « (code point \AB). Our generated content now reads “Hello«oy.” This happens because the “b” in “boy” is interpreted as a hexadecimal digit. The escape sequence terminates at the next character, the “o,” because that letter is not a hexadecimal digit.

You can explicitly conclude an escape sequence in one of two ways. First, you can specify the sequence in full using all six hexadecimal digits (including leading zeros, if there are any). Second, you can append a space. The following two CSS rules are therefore equivalent:

#example:before { content: "Hello\00000Aboy."; }
#example:before { content: "Hello\A boy."; }
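
One consequence worth illustrating: the space that terminates an escape sequence is consumed along with it, so it does not show up in the output. The sketch below uses the copyright sign (U+00A9) and hypothetical .copy-a and .copy-b class names. If a visible space is wanted after the escaped character, a second space is needed.

```css
/* The single space only ends the escape: generates "©2011" */
.copy-a:before { content: "\A9 2011"; }

/* First space ends the escape, second is kept: generates "© 2011" */
.copy-b:before { content: "\A9  2011"; }
```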

Knowing this, we can now split our earlier image caption example across two lines just where we want to. Pay close attention to the addition of the white-space: pre; declaration. Since we’re generating whitespace characters and in most situations all whitespace in HTML gets collapsed to a single space, the white-space declaration is needed to interpret the newline literally (as though all the generated content were inside a <pre> element).

img[title]:before {
    content: attr(title) "\A Image retrieved from " attr(src);
    white-space: pre;
    display: block;
}
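
The same pattern carries over to other elements. As a rough sketch (the a[href] selector and the parentheses are illustrative choices, not part of the original example), this rule would place each link’s URL on its own line after the link text:

```css
/* Hypothetical variation: show each link's URL on a new line */
a[href]:after {
    /* "(" is not a hexadecimal digit, so it ends the \A escape */
    content: "\A(" attr(href) ")";
    /* keep the generated newline instead of collapsing it */
    white-space: pre;
}
```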

There have been 13 comments

Siegfried

Nice. But the real useful thing seems still impossible. I’d like to append a capital letter “D” to an image, and this D linking to the url given by the longdesc attribute. And this automatically via CSS for all images having this attribute.

The D is no problem. But the link…

1

Geert De Deckere

Thorough explanation. It makes me wonder, though, why one would still want to encode a stylesheet as ASCII. Just serve it all as UTF-8 and you don’t need those hard-to-read escapes.

Remember the @charset "UTF-8"; rule. Useful for offline copies of a website (or other situations where the correct Content-Type HTTP headers are missing).

2

Karl

@Siegfried: This would be easy with JavaScript. CSS only helps display things.

Check the JavaScript (https://github.com/karlcow/QuoteLink/blob/master/includes/injected.js) I made for the QuoteLink extension (https://addons.opera.com/addons/extensions/details/quotelink/), which displays and activates the link attribute in blockquote and q.

3

Aankhen

Good post. One technical problem, though—there seems to be an extra layer of unescaping going on, so a lot of backslashes are being swallowed up. The third code snippet, for example, is showing up like this for me:

p:before { content: “”; }

(i.e. there’s no backslash in there.)

4

Alex

Interesting read!

Not sure if anyone else had this problem, but a lot of your backslashes are escaping themselves out of the resulting article.

5

Jonathan

You appear to be missing a few backslashes from your examples!

6

Stephan

It appears that the first backslash is missing from each of the examples:

“To get the backslash to appear, we therefore need to escape it, or “undo” its special meaning. This is simple enough. We merely prepend the backslash with another one, like this:”

p:before { content: “\”; }

That should be

p:before { content: “\\”; }

right?

7

Matthew Dorey

A very nice article!

However, you may want to look at how your code text is being presented.

I am using Chrome and I cannot see what is presumably a backslash that has gone missing. For example, I see

#example:before { content: “HelloAworld.”; }

whereas I am pretty sure (from your text if nothing else) that it is meant to be

#example:before { content: “Hello\Aworld.”; }

Maybe you have been tripped up by the very thing you were writing about?!

8

Toonix

Hi, just a quick note that the backslashes in the red code examples are not displayed correctly – there’s always one missing.

Cheers!

9

Markus

Nice read, but it seems like the (first) backslash is always missing in your examples:

#example:before { content: “HelloAworld.”; }

Should read:

#example:before { content: “Hello\Aworld.”; }

10

Joss

Just a heads up – I can’t see any of the backslashes in the examples, I think because they were stripped either on save or on display. You might need to do double slash in order to show single slash, if you catch my drift!

Great article.

11

Mert TOL

TRUE! I don’t know why but after last edit all backslashes are gone!

I just saw the comments and updated the post. Thank you all.

12

Kevin Sweeney

I recommend using the ASCII character code (05C) to render a backslash rather than using “\\”. I detail why here:

https://coderwall.com/p/ok-0ta

13
