The internal format for strings is always UTF-16; it is not tied to the page encoding.
Let’s recall the kinds of quotes.
Strings can be enclosed within either single quotes, double quotes or backticks:
let single = 'single-quoted'; let double = "double-quoted"; let backticks = `backticks`;
Single and double quotes are essentially the same. Backticks, however, allow us to embed any expression into the string, by wrapping it in ${…}.
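A minimal sketch of embedding an expression with backticks. The helper sum is a hypothetical function used only for illustration, and console.log is used for output:

```javascript
// sum is a hypothetical helper, used only for this illustration
function sum(a, b) {
  return a + b;
}

// The expression inside ${…} is evaluated and its result becomes part of the string
let text = `1 + 2 = ${sum(1, 2)}.`;
console.log(text); // 1 + 2 = 3.
```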
Another advantage of using backticks is that they allow a string to span multiple lines:
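For instance, a guest list could be written like this (the names are sample data chosen for illustration):

```javascript
// A backtick string that spans several lines
let guestList = `Guests:
 * John
 * Pete
 * Mary
`;

console.log(guestList); // prints the list of guests on multiple lines
```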
Looks natural, right? But single or double quotes do not work this way.
If we use them and try to use multiple lines, there’ll be an error:
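A sketch that demonstrates the error without breaking the script itself. As an assumption made for this illustration, the faulty code is stored in a backtick string and handed to new Function, which parses it and throws:

```javascript
// The broken code is kept in a backtick string, so this file itself stays valid
let brokenSource = `let guestList = "Guests:
  * John";`;

let caught = null;
try {
  new Function(brokenSource); // parsing the double-quoted multiline string fails
} catch (err) {
  caught = err;
}

console.log(caught instanceof SyntaxError); // true
```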
Single and double quotes come from ancient times of language creation, when the need for multiline strings was not taken into account. Backticks appeared much later and thus are more versatile.
Backticks also allow us to specify a “template function” before the first backtick. The syntax is func`string`. The function func is called automatically, receives the string and embedded expressions, and can process them. This feature is called “tagged templates”. It is rarely seen, but you can read about it on MDN: Template literals.
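To give a rough idea of how a tag function receives its input, here is a small sketch. The function highlight is hypothetical, not a built-in; it wraps every embedded value in square brackets:

```javascript
// highlight is a hypothetical tag function, not part of the language
// strings: the literal parts of the template; values: the embedded expressions
function highlight(strings, ...values) {
  let result = "";
  strings.forEach((chunk, i) => {
    result += chunk;
    if (i < values.length) result += `[${values[i]}]`;
  });
  return result;
}

let name = "Ann";
let greeting = highlight`Hello, ${name}!`;
console.log(greeting); // Hello, [Ann]!
```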
It is still possible to create multiline strings with single and double quotes by using a so-called “newline character”, written as
\n, which denotes a line break:
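A short sketch with sample guest names:

```javascript
// Each \n starts a new line in the resulting string
let guestList = "Guests:\n * John\n * Pete\n * Mary";

console.log(guestList); // a multiline list of guests
```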
As a simpler example, these two lines are equal, just written differently:
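The variable names str1 and str2 below are chosen only for this illustration:

```javascript
let str1 = "Hello\nWorld"; // two lines using the newline character

// two lines using a normal line break and backticks
let str2 = `Hello
World`;

console.log(str1 == str2); // true
```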
There are other, less common special characters:
| Character | Description |
|---|---|
| \r | In Windows text files a combination of two characters \r\n represents a line break. |
| \b, \f, \v | Backspace, Form Feed, Vertical Tab – mentioned for completeness, coming from old times, not used nowadays (you can forget them right now). |
As you can see, all special characters start with a backslash character
\. It is also called an “escape character”.
Because it’s so special, if we need to show an actual backslash
\ within the string, we need to double it:
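For example:

```javascript
// Two backslashes in the source produce one backslash in the string
let text = "The backslash: \\";

console.log(text); // The backslash: \
console.log(text.length); // 16
```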
So-called “escaped” quotes \', \", \` are used to insert a quote into the same-quoted string.
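For instance, with a sample phrase in single quotes:

```javascript
// The inner single quote is escaped so that it does not end the string
let walrus = 'I\'m the Walrus!';

console.log(walrus); // I'm the Walrus!
```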
As you can see, we have to prepend the inner quote by the backslash
\', because otherwise it would indicate the string end.
Of course, only the quotes that are the same as the enclosing ones need to be escaped. So, as a more elegant solution, we could switch to double quotes or backticks instead:
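The same sample phrase, this time without any escaping:

```javascript
// A single quote needs no escaping inside double quotes or backticks
let withDouble = "I'm the Walrus!";
let withBackticks = `I'm the Walrus!`;

console.log(withDouble); // I'm the Walrus!
console.log(withDouble === withBackticks); // true
```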
Besides these special characters, there’s also a special notation for Unicode codes
\u…, it’s rarely used and is covered in the optional chapter about Unicode.
The length property has the string length:
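A tiny sketch with a sample string that ends with a line break:

```javascript
let text = `My\n`;

// M, y and the line break: three characters
console.log(text.length); // 3
```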
Note that \n is a single “special” character, so it counts as just one character in the length.
length is a property
People with a background in some other languages sometimes mistype by calling str.length() instead of just str.length. That doesn’t work.
Please note that str.length is a numeric property, not a function. There is no need to add parentheses after it. Not .length(), but .length.
To get a character at position
pos, use square brackets
[pos] or call the method str.at(pos). The first character starts from the zero position:
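A short sketch comparing both ways of reading a character (the sample string is arbitrary):

```javascript
let str = `Hello`;

// the first character
console.log(str[0]); // H
console.log(str.at(0)); // H

// the last character
console.log(str[str.length - 1]); // o
console.log(str.at(-1)); // o
```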
As you can see, the
.at(pos) method has a benefit of allowing negative position. If
pos is negative, then it’s counted from the end of the string.
.at(-1) means the last character, and
.at(-2) is the one before it, etc.
The square brackets always return
undefined for negative indexes, for instance:
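With the same sample string:

```javascript
let str = `Hello`;

console.log(str[-2]); // undefined
console.log(str.at(-2)); // l
```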
We can also iterate over characters using for..of:
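In this sketch the characters are collected into an array (named chars just for illustration) so that the result is easy to inspect:

```javascript
let chars = [];

for (let char of "Hello") {
  chars.push(char); // char is "H", then "e", then "l", and so on
}

console.log(chars.join(",")); // H,e,l,l,o
```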
Strings can’t be changed in JavaScript: it is impossible to replace a single character in place. Let’s try it to show that it doesn’t work:
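A sketch of such an attempt. The try..catch is an assumption added so that the example behaves the same in both modes: the assignment is silently ignored in non-strict code and throws a TypeError in strict mode.

```javascript
let str = 'Hi';

try {
  str[0] = 'h'; // ignored in non-strict code, TypeError in strict mode
} catch (err) {
  // strict mode ends up here; the string is unchanged either way
}

console.log(str[0]); // H
```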
The usual workaround is to create a whole new string and assign it to
str instead of the old one.
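For instance:

```javascript
let str = 'Hi';

str = 'h' + str[1]; // build a new string and replace the old one

console.log(str); // hi
```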
In the following sections we’ll see more examples of this.
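The methods toLowerCase() and toUpperCase() change the case of the whole string. Or, if we want a single character lowercased, we can call the method on that character alone: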
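A brief sketch of the case-changing methods toUpperCase and toLowerCase, applied to a whole sample word and to its first character only (the variable names are illustrative):

```javascript
let word = 'Interface';

let upper = word.toUpperCase();
let lower = word.toLowerCase();
let firstLower = word[0].toLowerCase(); // only the first character

console.log(upper); // INTERFACE
console.log(lower); // interface
console.log(firstLower); // i
```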
There are multiple ways to look for a substring within a string.
The first method is str.indexOf(substr, pos).
It looks for substr in str, starting from the given position pos, and returns the position where the match was found, or -1 if nothing can be found.
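A short sketch with a sample string:

```javascript
let str = 'Widget with id';

console.log(str.indexOf('Widget')); // 0, because 'Widget' is found at the beginning
console.log(str.indexOf('widget')); // -1, not found, the search is case-sensitive
console.log(str.indexOf('id')); // 1, "id" is found at position 1 (W-id-get)
```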
The optional second parameter allows us to start searching from a given position.
For instance, the first occurrence of "id" is at position 1. To look for the next occurrence, let’s start the search from position 2:
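A sketch using a sample string in which "id" occurs twice:

```javascript
let str = 'Widget with id';

// start searching at position 2, skipping the first match at position 1
let next = str.indexOf('id', 2);

console.log(next); // 12
```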
If we’re interested in all occurrences, we can run
indexOf in a loop. Every new call is made with the position after the previous match:
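A sketch of such a loop. The sample sentence, the target and the array found are illustrative choices:

```javascript
let str = 'As sly as a fox, as strong as an ox';
let target = 'as'; // let's look for it

let found = [];
let pos = 0;
while (true) {
  let foundPos = str.indexOf(target, pos);
  if (foundPos == -1) break;

  found.push(foundPos);
  pos = foundPos + 1; // continue the search from the next position
}

console.log(found.join(", ")); // 7, 17, 27
```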
The same algorithm can be laid out shorter:
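One possible compact form, where the assignment happens inside the loop condition (same sample data as an assumption):

```javascript
let str = "As sly as a fox, as strong as an ox";
let target = "as";

let found = [];
let pos = -1;
// assign the next match position to pos and stop once it is -1
while ((pos = str.indexOf(target, pos + 1)) != -1) {
  found.push(pos);
}

console.log(found.join(", ")); // 7, 17, 27
```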
There is also a similar method str.lastIndexOf(substr, position) that searches from the end of a string to its beginning.
It would list the occurrences in the reverse order.
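A small sketch with a sample string. The second argument limits how far to the right a match may start:

```javascript
let str = 'Widget with id';

let last = str.lastIndexOf('id'); // the search goes from the end
let previous = str.lastIndexOf('id', 11); // only matches starting at position 11 or earlier

console.log(last); // 12
console.log(previous); // 1
```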
There is a slight inconvenience with
indexOf in the
if test. We can’t put it in the
if like this:
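A sketch of the pitfall. The flag shown is a hypothetical variable added only to make the outcome visible:

```javascript
let str = "Widget with id";
let shown = false;

if (str.indexOf("Widget")) {
  shown = true;
  console.log("We found it"); // doesn't run!
}

console.log(shown); // false
```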
The message in the example above doesn’t show because indexOf returns 0 (meaning that it found the match at the starting position). Right, but if considers 0 to be false.
So, we should actually check for
-1, like this:
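The corrected check, again with a hypothetical flag shown to make the outcome visible:

```javascript
let str = "Widget with id";
let shown = false;

if (str.indexOf("Widget") != -1) {
  shown = true;
  console.log("We found it"); // works now
}

console.log(shown); // true
```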
The more modern method str.includes(substr, pos) returns true/false depending on whether str contains substr.
It’s the right choice if we need to test for the match, but don’t need its position:
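For example, with sample strings:

```javascript
let hasWidget = "Widget with id".includes("Widget");
let hasBye = "Hello".includes("Bye");

console.log(hasWidget); // true
console.log(hasBye); // false
```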
The optional second argument of
str.includes is the position to start searching from:
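A short sketch with a sample word:

```javascript
let fromStart = "Widget".includes("id");
let fromThree = "Widget".includes("id", 3); // searching from position 3

console.log(fromStart); // true
console.log(fromThree); // false, there is no "id" from position 3 on
```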
str.slice(start [, end])
Returns the part of the string from start to (but not including) end.
If there is no second argument, then slice goes till the end of the string:
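A sketch of slice with and without the second argument, using a sample word:

```javascript
let str = "stringify";

console.log(str.slice(0, 5)); // strin, the substring from 0 to 5 (not including 5)
console.log(str.slice(0, 1)); // s, from 0 to 1, but not including 1
console.log(str.slice(2)); // ringify, from position 2 till the end
```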
Negative values for start/end are also possible. They mean the position is counted from the string end:
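For instance:

```javascript
let str = "stringify";

// start at the 4th position from the right, end at the 1st from the right
let part = str.slice(-4, -1);

console.log(part); // gif
```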
str.substring(start [, end])
Returns the part of the string between start and end (not including end).
This is almost the same as slice, but it allows start to be greater than end (in this case it simply swaps the start and end values).
Negative arguments are (unlike slice) not supported; they are treated as 0.
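A sketch comparing substring with slice on a sample word:

```javascript
let str = "stringify";

// these are the same for substring
console.log(str.substring(2, 6)); // ring
console.log(str.substring(6, 2)); // ring (start and end are swapped)

// ...but not for slice
console.log(str.slice(2, 6)); // ring
console.log(str.slice(6, 2) === ""); // true (an empty string)

// a negative argument is treated as 0
console.log(str.substring(-3, 2)); // st
```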
str.substr(start [, length])
Returns the part of the string from start, with the given length.
In contrast with the previous methods, this one allows us to specify the length instead of the ending position:
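For instance, with a sample word:

```javascript
let str = "stringify";

// from position 2, get 4 characters
let part = str.substr(2, 4);

console.log(part); // ring
```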
The first argument may be negative, to count from the end:
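A short sketch:

```javascript
let str = "stringify";

// start at the 4th position from the end, get 2 characters
let part = str.substr(-4, 2);

console.log(part); // gi
```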
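Let’s recap these methods to avoid any confusion:
| method | selects… | negatives |
|---|---|---|
| slice(start, end) | from start to end (not including end) | allows negatives |
| substring(start, end) | between start and end (not including end) | negative values mean 0 |
| substr(start, length) | from start, get length characters | allows negative start |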
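All of them can do the job. Formally, substr has a minor drawback: it is described not in the core JavaScript specification, but in Annex B, which covers browser-only features that exist mainly for historical reasons.
Of the other two variants, slice is a little bit more flexible: it allows negative arguments and is shorter to write.
So, for practical use it’s enough to remember only slice.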
As we know from the chapter Comparisons, strings are compared character-by-character in alphabetical order.
Although, there are some oddities.
A lowercase letter is always greater than the uppercase:
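For instance:

```javascript
let lowerIsGreater = 'a' > 'Z';

console.log(lowerIsGreater); // true
```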
Letters with diacritical marks are “out of order”:
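For instance, with two country names:

```javascript
let outOfOrder = 'Österreich' > 'Zealand';

console.log(outOfOrder); // true
```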
This may lead to strange results if we sort these country names. Usually people would expect Zealand to come after Österreich in the list.
There are special methods that allow us to get the character for the code and back:
str.codePointAt(pos)
Returns a decimal number representing the code for the character at position pos.
String.fromCodePoint(code)
Creates a character by its numeric code.
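A sketch of converting between characters and codes, assuming the standard methods codePointAt and String.fromCodePoint:

```javascript
// different case letters have different codes
console.log("Z".codePointAt(0)); // 90
console.log("z".codePointAt(0)); // 122
console.log("z".codePointAt(0).toString(16)); // 7a (the same code in hexadecimal)

// and back, from the code to the character
console.log(String.fromCodePoint(90)); // Z
```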
Now let’s see the characters with codes
65..220 (the latin alphabet and a little bit extra) by making a string of them:
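One way to build such a string, assuming String.fromCodePoint is used to turn each code into a character:

```javascript
let str = '';

for (let i = 65; i <= 220; i++) {
  str += String.fromCodePoint(i);
}

// The output starts with ABCDEFGHIJKLMNOPQRSTUVWXYZ and also contains control characters
console.log(str);
```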
See? Capital characters go first, then a few special ones, then lowercase characters, and
Ö near the end of the output.
Now it becomes obvious why
a > Z.
The characters are compared by their numeric code. The greater code means that the character is greater. The code for a (97) is greater than the code for Z (90).
- All lowercase letters go after uppercase letters because their codes are greater.
- Some letters like Ö stand apart from the main alphabet. Here, its code is greater than anything from a to z.
The “right” algorithm to do string comparisons is more complex than it may seem, because alphabets are different for different languages.
So, the browser needs to know the language to compare.
Luckily, modern browsers support the internationalization standard ECMA-402.
It provides a special method to compare strings in different languages, following their rules.
The call str.localeCompare(str2) returns an integer indicating whether
str is less, equal or greater than
str2 according to the language rules:
- Returns a negative number if str is less than str2.
- Returns a positive number if str is greater than str2.
- Returns 0 if they are equivalent.
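A small sketch of such comparisons. localeCompare also accepts an optional locale and an options object; as an assumption made for this example, they are passed explicitly ("de", "en", sensitivity) so that the results don’t depend on the environment’s language:

```javascript
// German rules: Ö is sorted next to O, so Österreich comes before Zealand
let order = 'Österreich'.localeCompare('Zealand', 'de');
console.log(order < 0); // true

// with sensitivity "base", accents and case are ignored
let sameBase = 'a'.localeCompare('á', 'en', { sensitivity: 'base' });
console.log(sameBase); // 0
```

The exact negative or positive value is not guaranteed, so it is safer to check only the sign.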
This method actually has two additional arguments specified in the documentation. They allow us to specify the language (by default it is taken from the environment, and the letter order depends on the language) and to set up additional rules, such as case sensitivity or whether "a" and "á" should be treated as the same.
- There are 3 types of quotes. Backticks allow a string to span multiple lines and embed expressions ${…}.
- We can use special characters, such as a line break \n.
- To get a character, use [] or the at method.
- To get a substring, use slice or substring.
- To lowercase/uppercase a string, use toLowerCase/toUpperCase.
- To look for a substring, use indexOf, or includes/startsWith/endsWith for simple checks.
- To compare strings according to the language, use localeCompare; otherwise they are compared by character codes.
There are several other helpful methods in strings:
- str.trim() – removes (“trims”) spaces from the beginning and end of the string.
- str.repeat(n) – repeats the string n times.
- …and more to be found in the manual.
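A quick sketch of trim and repeat, with sample strings and illustrative variable names:

```javascript
let trimmed = "  Hello  ".trim(); // spaces at both ends are removed
let repeated = "ab".repeat(3); // the string is repeated 3 times

console.log(trimmed); // Hello
console.log(repeated); // ababab
```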
Strings also have methods for doing search/replace with regular expressions. But that’s a big topic, so it’s explained in a separate tutorial section, Regular expressions.
Also, as of now it’s important to know that strings are based on Unicode encoding, and hence there’re issues with comparisons. There’s more about Unicode in the chapter Unicode, String internals.