ICU Syntax | FormatJS

If you are translating text, you'll need a way for your translators to express the subtleties of spelling, grammar, and conjugation inherent in each language. We use the ICU Message syntax, which is also used in Java and PHP.

The intl-messageformat library takes the message and input data and creates an appropriately formatted string. This feature is included with all of the integrations we provide.

The following sections describe the ICU Message syntax and show how to use the features provided by the FormatJS libraries.

Basic Principles#

The simplest transform for the message is a literal string.

Hello everyone

All other transforms are done using replacements called "arguments". They are enclosed in curly braces ({ and }) and refer to a value in the input data.

Simple Argument#

You can use a {key} argument for placing a value into the message. The key is looked up in the input data, and the string is interpolated with its value.
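
The lookup described above can be sketched in a few lines. This is only an illustration of the idea, not the intl-messageformat parser: `interpolate` is a hypothetical helper that handles simple `{key}` arguments and ignores types, nesting, and escaping.

```javascript
// Hypothetical helper: replaces each {key} with the matching value from `values`.
// It only understands simple arguments such as {name}.
function interpolate(message, values) {
  return message.replace(/\{\s*(\w+)\s*\}/g, (placeholder, key) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error(`Missing value for "${key}"`);
    }
    return String(values[key]);
  });
}

const greeting = interpolate('Hello, {name}!', {name: 'Ada'});
console.log(greeting); // "Hello, Ada!"
```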

Formatted Argument#

Values can also be formatted based on their type. You use a {key, type, format} argument to do that.

The elements of the argument are:

key is where in the input data to find the data
type is optional, and is how to interpret the value (see below)
format is optional, and is a further refinement on how to display that type of data

`number` Type#

This type is used to format numbers in a way that is sensitive to the locale. The optional format element of the argument controls how the number is displayed.

Internally it uses the Intl.NumberFormat API. You can define custom values for the format element, which are passed to the Intl.NumberFormat constructor.
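
Because the formatting is delegated to Intl.NumberFormat, calling that API directly gives a rough idea of what a `number` argument produces. The locales and values below are arbitrary choices for illustration.

```javascript
const value = 1234567.891;

// The same number, formatted for two different locales.
const enNumber = new Intl.NumberFormat('en-US').format(value);
const deNumber = new Intl.NumberFormat('de-DE').format(value);

// Options passed to the constructor change the presentation, here to a percentage.
const enPercent = new Intl.NumberFormat('en-US', {style: 'percent'}).format(0.2);

console.log(enNumber);  // "1,234,567.891"
console.log(deNumber);  // "1.234.567,891"
console.log(enPercent); // "20%"
```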

Embedding the number format in the message itself can give translators useful context. We also support ICU Number Skeletons using the same syntax. See the ICU Number Skeletons documentation for more details.

For fine control over decimal precision, you can use the Fraction Precision # and 0 symbols, which specify the number of decimal places to display.

Note that the # symbol doesn't render trailing zeroes. To render trailing zeroes, use the 0 symbol.

For more details, see Fraction Precision.
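
One way to picture the difference between # and 0 is to translate a fraction-precision token into Intl.NumberFormat options. The helper names below are hypothetical and the sketch covers only tokens made of a dot followed by 0 and # symbols; it is not the library's skeleton parser.

```javascript
// Hypothetical helper: each 0 is a required fraction digit, each # an optional one.
function fractionPrecisionToOptions(token) {
  const match = /^\.(0*)(#*)$/.exec(token);
  if (!match) {
    throw new Error(`Unsupported fraction precision token: ${token}`);
  }
  const required = match[1].length;
  const optional = match[2].length;
  return {
    minimumFractionDigits: required,
    maximumFractionDigits: required + optional,
  };
}

function formatWithPrecision(value, token) {
  const options = fractionPrecisionToOptions(token);
  return new Intl.NumberFormat('en-US', options).format(value);
}

console.log(formatWithPrecision(1.5, '.##')); // "1.5"  (no trailing zero)
console.log(formatWithPrecision(1.5, '.00')); // "1.50" (trailing zero kept)
```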

`date` Type#

This type is used to format dates in a way that is sensitive to the locale. It understands the following values for the optional format element of the argument:

short is used to format dates in the shortest possible way
medium is used to format dates with short textual representation of the month
long is used to format dates with long textual representation of the month
full is used to format dates with the most detail

Internally it uses the Intl.DateTimeFormat API. You can define custom values for the format element, which are passed to the Intl.DateTimeFormat constructor.
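
The four named formats can be approximated with Intl.DateTimeFormat options. The option sets below are assumptions chosen to mirror the descriptions above (they may differ from the library's actual defaults), and `formatDate` is a hypothetical helper.

```javascript
// Assumed option sets that mirror the descriptions of short, medium, long, and full.
const dateFormats = {
  short: {month: 'numeric', day: 'numeric', year: '2-digit'},
  medium: {month: 'short', day: 'numeric', year: 'numeric'},
  long: {month: 'long', day: 'numeric', year: 'numeric'},
  full: {weekday: 'long', month: 'long', day: 'numeric', year: 'numeric'},
};

function formatDate(date, style, locale = 'en-US') {
  // A fixed time zone keeps the output independent of the machine's settings.
  const options = {...dateFormats[style], timeZone: 'UTC'};
  return new Intl.DateTimeFormat(locale, options).format(date);
}

const sampleDate = new Date(Date.UTC(2024, 0, 15));
console.log(formatDate(sampleDate, 'short'));  // "1/15/24"
console.log(formatDate(sampleDate, 'medium')); // "Jan 15, 2024"
console.log(formatDate(sampleDate, 'long'));   // "January 15, 2024"
console.log(formatDate(sampleDate, 'full'));   // "Monday, January 15, 2024"
```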

`time` Type#

This type is used to format times in a way that is sensitive to the locale. It understands the following values for the optional format element of the argument:

short is used to format times with hours and minutes
medium is used to format times with hours, minutes, and seconds
long is used to format times with hours, minutes, seconds, and timezone
full is the same as long

Internally it uses the Intl.DateTimeFormat API. You can define custom values for the format element, which are passed to the Intl.DateTimeFormat constructor.
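
The named time formats can be sketched the same way. As before, the option sets are assumptions based on the descriptions above and `formatTime` is a hypothetical helper. The exact whitespace before AM/PM depends on the ICU version of the runtime, so the comments show approximate output only.

```javascript
// Assumed option sets that mirror the descriptions of short, medium, long, and full.
const timeFormats = {
  short: {hour: 'numeric', minute: 'numeric'},
  medium: {hour: 'numeric', minute: 'numeric', second: 'numeric'},
  long: {hour: 'numeric', minute: 'numeric', second: 'numeric', timeZoneName: 'short'},
};
// full is described as the same as long.
timeFormats.full = timeFormats.long;

function formatTime(date, style, locale = 'en-US') {
  // A fixed time zone keeps the output independent of the machine's settings.
  const options = {...timeFormats[style], timeZone: 'UTC'};
  return new Intl.DateTimeFormat(locale, options).format(date);
}

const sampleTime = new Date(Date.UTC(2024, 0, 15, 13, 5, 9));
console.log(formatTime(sampleTime, 'short'));  // roughly "1:05 PM"
console.log(formatTime(sampleTime, 'medium')); // roughly "1:05:09 PM"
console.log(formatTime(sampleTime, 'long'));   // roughly "1:05:09 PM UTC"
```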

Supported DateTime Skeleton#

Similar to the number type, we also support ICU DateTime skeletons. ICU provides a wide array of patterns to customize date and time formats. However, not all of them are available via ECMA-402's Intl API, so we only support the following patterns:

Symbol	Meaning	Notes
G	Era designator
y	year
M	month in year
L	stand-alone month in year
d	day in month
E	day of week
e	local day of week	`e..eee` is not supported
c	stand-alone local day of week	`c..ccc` is not supported
a	AM/PM marker
h	Hour [1-12]
H	Hour [0-23]
K	Hour [0-11]
k	Hour [1-24]
m	Minute
s	Second
z	Time Zone
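
To make the table more concrete, here is a partial sketch of how skeleton symbols could be translated into Intl.DateTimeFormat options. `skeletonToOptions` is a hypothetical helper that covers only a handful of symbols, and the length-to-option mapping is an assumption rather than the library's exact behavior.

```javascript
// Hypothetical, partial mapping from skeleton symbols to Intl.DateTimeFormat options.
function skeletonToOptions(skeleton) {
  const options = {};
  const numericOrTwoDigit = (len) => (len === 2 ? '2-digit' : 'numeric');
  // Split the skeleton into runs of the same letter, e.g. "yMMMd" -> ["y", "MMM", "d"].
  for (const run of skeleton.match(/([a-zA-Z])\1*/g) || []) {
    const len = run.length;
    switch (run[0]) {
      case 'y':
        options.year = numericOrTwoDigit(len);
        break;
      case 'M':
        options.month = ['numeric', '2-digit', 'short', 'long', 'narrow'][len - 1];
        break;
      case 'd':
        options.day = numericOrTwoDigit(len);
        break;
      case 'E':
        options.weekday = len === 4 ? 'long' : len === 5 ? 'narrow' : 'short';
        break;
      case 'h':
        options.hourCycle = 'h12';
        options.hour = numericOrTwoDigit(len);
        break;
      case 'H':
        options.hourCycle = 'h23';
        options.hour = numericOrTwoDigit(len);
        break;
      case 'm':
        options.minute = numericOrTwoDigit(len);
        break;
      case 's':
        options.second = numericOrTwoDigit(len);
        break;
      default:
        throw new Error(`Symbol not covered by this sketch: ${run[0]}`);
    }
  }
  return options;
}

const skeletonDate = new Date(Date.UTC(2024, 0, 15));
const skeletonOptions = {...skeletonToOptions('yMMMd'), timeZone: 'UTC'};
console.log(new Intl.DateTimeFormat('en-US', skeletonOptions).format(skeletonDate));
// "Jan 15, 2024"
```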

`{select}` Format#

The {key, select, matches} argument is used to choose output by matching a value to one of many choices. (It is similar to the switch statement available in some programming languages.) The key is looked up in the input data, its value is compared against each of the matches, and the corresponding output is returned. The key argument must follow Unicode Pattern_Syntax.

The matches is a space-separated list of individual match cases, where each match case has the format match {output}. (A match case is similar to the case statement of the switch found in some programming languages.) The match is a literal value. If it is the same as the value for key then the corresponding output will be used.

The output is itself a message, so it can be a literal string or also have more arguments nested inside of it.

The other match case is special and is used if nothing else matches. (This is similar to the default case of the switch found in some programming languages.) Note that other is treated as a reserved keyword in the match pattern, not as a variable lookup, so it will always act as the fallback case even if your input data contains a field named other.

Danger

other is required as per icu4j implementation. We will throw an error if select is used without other.
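
The matching rules, including the mandatory other case, can be sketched as follows. `select` here is a hypothetical helper that works on plain strings; in the real syntax each output is itself a message.

```javascript
// Hypothetical helper: picks the output whose match equals the value,
// falling back to the required "other" case.
function select(value, cases) {
  const has = (key) => Object.prototype.hasOwnProperty.call(cases, key);
  if (!has('other')) {
    throw new Error('select requires an "other" case');
  }
  const key = String(value);
  return has(key) ? cases[key] : cases.other;
}

const pronouns = {male: 'He', female: 'She', other: 'They'};
console.log(`${select('female', pronouns)} will respond shortly.`);
// "She will respond shortly."
console.log(`${select('unknown', pronouns)} will respond shortly.`);
// "They will respond shortly."
```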

`{plural}` Format#

The {key, plural, matches} argument is used to choose output based on the pluralization rules of the current locale. It is very similar to the {select} format above, except that the value is expected to be a number and is mapped to a plural category.

The match is a literal value and is matched to one of these plural categories. Not all languages use all plural categories.

zero: This category is used for languages that have grammar specialized specifically for zero number of items. (Examples are Arabic and Latvian.)
one: This category is used for languages that have grammar specialized specifically for one (singular) item. Many languages, but not all, use this plural category. (Many popular Asian languages, such as Chinese and Japanese, do not use this category.)
two: This category is used for languages that have grammar specialized specifically for two (dual) items. (Examples are Arabic and Welsh.)
few: This category is used for languages that have grammar specialized specifically for a small number (paucal) of items. For some languages this is used for 2-4 items, for some 3-10 items, and other languages have even more complex rules.
many: This category is used for languages that have grammar specialized specifically for a larger number of items. (Examples are Arabic, Polish, and Russian.)
other: This category is used if the value doesn't match one of the other plural categories. Note that this is used for "plural" for languages (such as English) that have a simple "singular" versus "plural" dichotomy.
=value: This is used to match a specific value regardless of the plural categories of the current locale.

Info

Don't use =1 in place of one. The one category doesn't always mean the number 1; it means the locale's singular form, which in some locales matches more than just 1. In Russian, for example, numbers that end in 1 but not in 11 (such as 1, 21, and 31) all map to one.

Danger

other is required as per icu4j implementation. We will throw an error if plural is used without other.
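
The built-in Intl.PluralRules API exposes the same plural categories, which makes it easy to check how a number is classified in a given locale. The `plural` function below is a hypothetical sketch of the resolution order (exact =value match first, then category, then other), not the library's implementation.

```javascript
const categoryOf = (locale, n) => new Intl.PluralRules(locale).select(n);

console.log(categoryOf('en', 1));  // "one"
console.log(categoryOf('en', 2));  // "other"
console.log(categoryOf('ja', 1));  // "other"
console.log(categoryOf('ru', 21)); // "one"
console.log(categoryOf('ru', 11)); // "many"
console.log(categoryOf('ar', 0));  // "zero"
console.log(categoryOf('ar', 2));  // "two"

// Hypothetical resolver: =value wins over the plural category, and other is the fallback.
function plural(n, locale, cases) {
  const has = (key) => Object.prototype.hasOwnProperty.call(cases, key);
  if (!has('other')) {
    throw new Error('plural requires an "other" case');
  }
  if (has(`=${n}`)) {
    return cases[`=${n}`];
  }
  const category = categoryOf(locale, n);
  return has(category) ? cases[category] : cases.other;
}

const photoCases = {'=0': 'no photos', one: 'one photo', other: 'several photos'};
console.log(plural(0, 'en', photoCases)); // "no photos"
console.log(plural(1, 'en', photoCases)); // "one photo"
console.log(plural(5, 'en', photoCases)); // "several photos"
```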

In the output of the match, you can use the # special token as a placeholder for the numeric value and it'll be formatted as if it were {key, number}. This is the style we prefer to use.
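
The effect of the # token can be illustrated with a small sketch: the matched output is chosen first, then every # is replaced by the locale-formatted number. `pluralWithCount` is a hypothetical helper, and it ignores quoting and nested arguments.

```javascript
// Hypothetical helper: resolves the plural case, then substitutes # with the formatted number.
function pluralWithCount(n, locale, cases) {
  const category = new Intl.PluralRules(locale).select(n);
  const template = cases[`=${n}`] ?? cases[category] ?? cases.other;
  const formatted = new Intl.NumberFormat(locale).format(n);
  return template.replace(/#/g, () => formatted);
}

const counted = {one: '# photo', other: '# photos'};
console.log(pluralWithCount(1, 'en', counted));    // "1 photo"
console.log(pluralWithCount(1000, 'en', counted)); // "1,000 photos"
```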

`{selectordinal}` Format#

The {key, selectordinal, matches} argument is used to choose output based on the ordinal pluralization rules (1st, 2nd, 3rd, etc.) of the current locale. It is very similar to the {plural} format above, except that the value is mapped to an ordinal plural category.

The match is a literal value and is matched to one of these plural categories. Not all languages use all plural categories.

zero: This category is used for languages that have grammar specialized specifically for zero number of items. (Examples are Arabic and Latvian.)
one: This category is used for languages that have grammar specialized specifically for one item. Many languages, but not all, use this plural category. (Many popular Asian languages, such as Chinese and Japanese, do not use this category.)
two: This category is used for languages that have grammar specialized specifically for two items. (Examples are Arabic and Welsh.)
few: This category is used for languages that have grammar specialized specifically for a small number of items. For some languages this is used for 2-4 items, for some 3-10 items, and other languages have even more complex rules.
many: This category is used for languages that have grammar specialized specifically for a larger number of items. (Examples are Arabic, Polish, and Russian.)
other: This category is used if the value doesn't match one of the other plural categories. Note that this is used for "plural" for languages (such as English) that have a simple "singular" versus "plural" dichotomy.
=value: This is used to match a specific value regardless of the plural categories of the current locale.

Danger

other is required as per icu4j implementation. We will throw an error if selectordinal is used without other.

In the output of the match, the # special token can be used as a placeholder for the numeric value and will be formatted as if it were {key, number}.
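
Ordinal categories are also available through Intl.PluralRules with the `type: 'ordinal'` option. The sketch below uses them to build English ordinals; the suffix table and the `ordinal` helper are illustrative only.

```javascript
const ordinalRules = new Intl.PluralRules('en', {type: 'ordinal'});

// English uses four ordinal categories; each one maps to a suffix.
const suffixes = {one: 'st', two: 'nd', few: 'rd', other: 'th'};

function ordinal(n) {
  return `${n}${suffixes[ordinalRules.select(n)]}`;
}

console.log([1, 2, 3, 4, 11, 21, 22, 23].map(ordinal).join(', '));
// "1st, 2nd, 3rd, 4th, 11th, 21st, 22nd, 23rd"
```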

Rich Text Formatting#

We also support embedded rich text formatting in our message using tags. This allows developers to embed as much text as possible so sentences don't have to be broken up into chunks.

NOTE: These tags are not XML/HTML tags.

Message Format#

Our price is <boldThis>{price, number, ::currency/USD precision-integer}</boldThis>
with <link>{pct, number, ::percent} discount</link>

JavaScript Usage#

You need to provide handler functions for each tag to specify how they should be rendered:

import IntlMessageFormat from 'intl-messageformat'

const message = new IntlMessageFormat(
  `Our price is <boldThis>{price, number, ::currency/USD precision-integer}</boldThis> with <link>{pct, number, ::percent} discount</link>`,
  'en'
)

const output = message.format({
  price: 2,
  pct: 0.2,
  // Handler functions for rich text tags
  boldThis: chunks => `<b>${chunks.join('')}</b>`,
  link: chunks => `<a href="https://example.com">${chunks.join('')}</a>`,
})

console.log(output)
// "Our price is <b>$2</b> with <a href="https://example.com">20% discount</a>"

React Example#

In React applications, you can return React elements from the handler functions:

import {FormattedMessage} from 'react-intl'

function PriceDisplay() {
  return (
    <FormattedMessage
      id="price.message"
      defaultMessage="Our price is <boldThis>{price, number, ::currency/USD precision-integer}</boldThis> with <link>{pct, number, ::percent} discount</link>"
      values={{
        price: 2,
        pct: 0.2,
        boldThis: chunks => <b>{chunks}</b>,
        link: chunks => <a href="https://example.com">{chunks}</a>,
      }}
    />
  )
}

// Renders: Our price is <b>$2</b> with <a href="https://example.com">20% discount</a>

Custom Behavior

Rich text formatting requires handler functions to process the tags. The tag names (like boldThis and link) are arbitrary - you define them in your message and provide corresponding handler functions in the values object.

You can also configure default handlers globally using the defaultRichTextElements configuration in IntlProvider, so you don't have to pass them for every message.

Quoting / Escaping#

The ASCII apostrophe ' (U+0027) can be used to escape syntax characters in the text portion of the message, which mimics the behavior of ICU's quoting/escaping.

Two consecutive ASCII apostrophes represent one ASCII apostrophe, just as %% in printf represents one %. However, we recommend using the curly apostrophe ’ (U+2019) for human-readable strings and using the ASCII apostrophe ' (U+0027) only in ICU message syntax.
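
The quoting rules can be sketched with a small function that resolves apostrophes in literal text. `unquote` is a hypothetical helper: it treats only { and } as syntax characters and does not parse arguments, so it is a simplification of what the real parser does.

```javascript
// Hypothetical helper: resolves apostrophe quoting in the literal text of a message.
function unquote(text) {
  let out = '';
  let i = 0;
  while (i < text.length) {
    if (text[i] !== "'") {
      out += text[i];
      i += 1;
    } else if (text[i + 1] === "'") {
      // Two consecutive apostrophes produce one apostrophe.
      out += "'";
      i += 2;
    } else if (text[i + 1] === '{' || text[i + 1] === '}') {
      // An apostrophe before a syntax character opens a quoted section,
      // which runs until the next unpaired apostrophe.
      i += 1;
      while (i < text.length) {
        if (text[i] === "'" && text[i + 1] === "'") {
          out += "'";
          i += 2;
        } else if (text[i] === "'") {
          i += 1;
          break;
        } else {
          out += text[i];
          i += 1;
        }
      }
    } else {
      // A lone apostrophe elsewhere is kept as literal text.
      out += "'";
      i += 1;
    }
  }
  return out;
}

console.log(unquote("This is not an interpolation: '{word}'"));
// "This is not an interpolation: {word}"
console.log(unquote("Don''t")); // "Don't"
```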
