Question 1

Why is an emoji counted as one character?

Accepted Answer

The tool iterates by Unicode code point rather than by UTF-16 code unit, so an astral (supplementary-plane) character such as an emoji is a single entry even though it occupies two code units, called a surrogate pair, inside a JavaScript string.
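As a rough illustration of the difference, here is a minimal sketch. The helper name toCodePoints is hypothetical and is not taken from the tool's source; it relies only on the standard string iterator, which walks code points.

```javascript
// Hypothetical helper: split a string into code points, not UTF-16 code units.
function toCodePoints(text) {
  // The built-in string iterator keeps a surrogate pair together as one entry.
  return Array.from(text);
}

const grinning = "\u{1F600}"; // grinning face, outside the Basic Multilingual Plane

console.log(grinning.length);               // 2 (UTF-16 code units)
console.log(toCodePoints(grinning).length); // 1 (code point)

// The two code units that make up the surrogate pair.
console.log(grinning.charCodeAt(0).toString(16)); // d83d
console.log(grinning.charCodeAt(1).toString(16)); // de00
```

Counting with .length would report 2 for this emoji, which is why a code-point iteration is the better fit for a per-character table.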

Question 2

What is the difference between the code point and the UTF-8 bytes?

Accepted Answer

The code point is the abstract Unicode number for the character, shown as U+ hex and in decimal. The UTF-8 bytes are how that code point is actually stored on disk or sent over the network, which takes one to four bytes.
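To make the distinction concrete, the sketch below prints the decimal code point next to the UTF-8 bytes for a few characters. The helper utf8Hex is a hypothetical name used for illustration; it assumes an environment that provides the standard TextEncoder (modern browsers and Node.js).

```javascript
// Hypothetical helper: UTF-8 bytes of a string as uppercase hex pairs.
function utf8Hex(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");
}

console.log("A".codePointAt(0), utf8Hex("A"));                 // 65 41
console.log("\u00E9".codePointAt(0), utf8Hex("\u00E9"));       // 233 C3 A9
console.log("\u20AC".codePointAt(0), utf8Hex("\u20AC"));       // 8364 E2 82 AC
console.log("\u{1F600}".codePointAt(0), utf8Hex("\u{1F600}")); // 128512 F0 9F 98 80
```

The four samples take one, two, three and four bytes respectively, covering the full range of UTF-8 lengths.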

Question 3

How are combining characters handled?

Accepted Answer

A base letter and a separate combining mark are shown as two rows because they are two code points. That mirrors how the text is stored. Note that a single precomposed é is one code point, while e plus a combining accent is two.
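The two spellings of é can be compared directly, as in this small sketch. The variable names are illustrative only. The normalize call is standard JavaScript and is shown just to demonstrate that the two forms are canonically equivalent; the answer above does not say the tool normalizes input.

```javascript
const precomposed = "\u00E9"; // é as a single code point (U+00E9)
const decomposed = "e\u0301"; // e (U+0065) followed by a combining acute accent (U+0301)

console.log(Array.from(precomposed).length); // 1 (one row)
console.log(Array.from(decomposed).length);  // 2 (two rows)

// The strings render alike but are not equal as stored.
console.log(precomposed === decomposed); // false

// Standard normalization (NFC) composes the pair into the single code point.
console.log(decomposed.normalize("NFC") === precomposed); // true
```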

Question 4

What format is the hex value in?

Accepted Answer

It uses the standard Unicode notation U+XXXX, padded to at least four hex digits, for example U+0041 for A and U+1F600 for a grinning face. Code points above the four-digit range simply use more digits.
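The padding rule can be sketched in a few lines. The function name formatCodePoint is hypothetical, chosen for this example rather than taken from the tool.

```javascript
// Hypothetical helper: format a numeric code point in U+XXXX notation.
function formatCodePoint(codePoint) {
  // Uppercase hex, left-padded with zeros to at least four digits.
  return "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0");
}

console.log(formatCodePoint("A".codePointAt(0)));         // U+0041
console.log(formatCodePoint("\u{1F600}".codePointAt(0))); // U+1F600
```

Because padStart only adds zeros when the value is shorter than four digits, five- and six-digit code points pass through unchanged.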

Question 5

What is the HTML entity column for?

Accepted Answer

It gives the numeric character reference, such as &#233; for é, which you can paste directly into HTML to render that exact character regardless of the page encoding. The Copy HTML entities button concatenates them for a whole string.
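A decimal numeric character reference is just the code point wrapped in &# and ;. The sketch below uses a hypothetical helper named toNumericReference and is only an illustration of the format, not the tool's actual code.

```javascript
// Hypothetical helper: decimal numeric character reference for one character.
function toNumericReference(char) {
  // codePointAt(0) reads the full code point, even for a surrogate pair.
  return "&#" + char.codePointAt(0) + ";";
}

console.log(toNumericReference("\u00E9"));    // &#233;
console.log(toNumericReference("\u{1F600}")); // &#128512;
```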

Question 6

Does it show the official Unicode name of each character?

Accepted Answer

No. It reports the numeric identifiers (code point, decimal value, HTML entity and UTF-8 bytes) but not the descriptive Unicode name such as LATIN SMALL LETTER A. Pair it with a character-name database if you need the label.

Question 7

How are emoji made of several code points shown?

Accepted Answer

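Sequences joined with a zero-width joiner (U+200D), such as family emoji or the rainbow flag, are stored as multiple code points and therefore appear as multiple rows. Country flags also span several rows, but for a different reason: each one is a pair of regional indicator code points rather than a joiner sequence. Each underlying code point is listed on its own line.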
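For example, one common family emoji is a sequence of three person emoji linked by two joiners. The sketch below lists its code points; listCodePoints is a hypothetical helper written for this illustration.

```javascript
// Hypothetical helper: list every code point of a string in U+XXXX notation.
function listCodePoints(text) {
  return Array.from(text, (char) =>
    "U+" + char.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Man, joiner, woman, joiner, girl: renders as one family glyph where supported.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";

console.log(listCodePoints(family));
// [ 'U+1F468', 'U+200D', 'U+1F469', 'U+200D', 'U+1F467' ]
console.log(family.length); // 8 (UTF-16 code units)
```

A table built this way has five rows for what looks like a single glyph, including one row for each joiner.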

Question 8

Can I copy several values at once?

Accepted Answer

Yes. The Copy code points button joins every U+ value with spaces, and the Copy HTML entities button joins the &#...; references with no separator so they paste straight into markup.
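The two joining rules can be sketched as follows. The function names copyCodePoints and copyHtmlEntities are hypothetical stand-ins for whatever the buttons call internally, and the sketch returns the strings instead of writing to the clipboard.

```javascript
// Hypothetical sketch: U+ values for a whole string, separated by spaces.
function copyCodePoints(text) {
  return Array.from(text, (char) =>
    "U+" + char.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

// Hypothetical sketch: numeric character references with no separator.
function copyHtmlEntities(text) {
  return Array.from(text, (char) => "&#" + char.codePointAt(0) + ";").join("");
}

console.log(copyCodePoints("A\u00E9"));   // U+0041 U+00E9
console.log(copyHtmlEntities("A\u00E9")); // &#65;&#233;
```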

Question 9

Is my text private?

Accepted Answer

Yes. Every character is analyzed in your browser and nothing is uploaded, so sensitive text never leaves your device.

Unicode Character Lookup
