Boneyard Tools

LLM Token Counter

Count exactly how many tokens your text uses with the model's actual tokenizer, not a rough words-times-1.3 guess. Switch between GPT-4o, GPT-4, Llama 3 and BERT, watch the count update as you type, see each token colour-coded so you can tell how the text is split, and get an instant cost estimate across popular APIs. Compare every tokenizer side by side in one click. The tokenizers run entirely in your browser (a small download of a few megabytes, no model weights), so your text is never uploaded.

How to count LLM tokens

  1. Paste or type your text into the box.
  2. Pick a tokenizer (GPT-4o, GPT-4, Llama 3, or BERT) to see the live token count and the colour-coded split.
  3. Read the cost estimate, or click Compare all models to see the token count for every tokenizer at once.

Examples

Counting a prompt

The quick brown fox jumps over the lazy dog.
GPT-4o: ~10 tokens, each shown colour-coded, with an estimated input cost per model.

Frequently asked questions

Are these real token counts or an estimate?

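Real. The tool loads the model's actual tokenizer (for example o200k_base for GPT-4o, cl100k_base for GPT-4, the Llama 3 tokenizer, or WordPiece for BERT) via transformers.js and runs it in your browser, so the count reflects how the model really splits your text rather than a heuristic like characters divided by four. Chat APIs typically add a few tokens of message formatting per request, so a billed total can be slightly higher than the count for the text alone.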
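For contrast, here is a minimal sketch of the two heuristics mentioned on this page (words times 1.3 and characters divided by four). The function names are made up for illustration and are not part of the tool; neither function looks at a real vocabulary, which is why the results drift from the true count.

```typescript
// Rough heuristics only: neither function consults a tokenizer vocabulary.

// Words-times-1.3 guess: split on whitespace and scale the word count.
function estimateByWords(text: string): number {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  return Math.ceil(words.length * 1.3);
}

// Characters-divided-by-four guess.
function estimateByChars(text: string): number {
  return Math.ceil(text.length / 4);
}

const sample = "The quick brown fox jumps over the lazy dog.";
console.log(estimateByWords(sample)); // 12
console.log(estimateByChars(sample)); // 11
```

Both guesses land above the roughly 10 tokens shown for GPT-4o in the example above, and the gap usually widens for code, non-English text, or unusual words.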

Is my text uploaded anywhere?

No. The tokenizer runs entirely in your browser, so your text never leaves your device. Tokenizer files are small (a few megabytes, no model weights), so there is very little to download.

Why do GPT, Llama and BERT give different counts?

Each model is trained with its own vocabulary and its own splitting rules (byte-pair merges for GPT and Llama, WordPiece for BERT), so the same text splits into a different number of tokens. That is why the tool lets you compare them: a prompt that is 100 tokens for GPT-4o may come out to a different number for Llama 3 or BERT.
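A toy sketch can make this concrete. The splitter below is a simple greedy longest-match routine, not real BPE or WordPiece, and both vocabularies are invented for illustration. It only shows that the same word yields a different token count once the vocabulary changes.

```typescript
// Toy greedy longest-match splitter. Real BPE and WordPiece are more involved;
// this only illustrates how the vocabulary drives the split.
function toyTokenize(text: string, vocab: Set<string>): string[] {
  const tokens: string[] = [];
  let i = 0;
  while (i < text.length) {
    let end = text.length;
    // Shrink the candidate until it is in the vocabulary or is one character long.
    while (end > i + 1 && !vocab.has(text.slice(i, end))) {
      end--;
    }
    tokens.push(text.slice(i, end));
    i = end;
  }
  return tokens;
}

// Two hypothetical vocabularies, not taken from any real model.
const vocabA = new Set(["token", "izer", "s"]);
const vocabB = new Set(["tok", "en", "iz", "er", "s"]);

const word = "tokenizers";
console.log(toyTokenize(word, vocabA)); // ["token", "izer", "s"] (3 tokens)
console.log(toyTokenize(word, vocabB)); // ["tok", "en", "iz", "er", "s"] (5 tokens)
```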

How accurate is the cost estimate?

It multiplies your token count by each model's published input price, which providers usually quote per million tokens. Prices change often and differ between providers and between input and output tokens, so treat the figure as an approximate input-side estimate and confirm current pricing with your provider.
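The arithmetic is simple enough to sketch. The model names and prices below are placeholders chosen for illustration, not real published prices, and the function name is an assumption rather than the tool's own code.

```typescript
// Prices are in US dollars per one million input tokens.
// These figures are placeholders, NOT real published prices.
const examplePricesPerMillion: Record<string, number> = {
  "model-a": 2.5,
  "model-b": 10,
};

// Input-side estimate only: output tokens are usually priced separately.
function estimateInputCost(tokenCount: number, pricePerMillion: number): number {
  return (tokenCount / 1_000_000) * pricePerMillion;
}

for (const [model, price] of Object.entries(examplePricesPerMillion)) {
  console.log(model, "$" + estimateInputCost(1200, price).toFixed(6));
}
```

With these placeholder prices, a 1,200-token prompt comes to about $0.003 for model-a and $0.012 for model-b.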

What is the coloured token view?

Each box is one token as the tokenizer sees it, coloured so adjacent tokens are easy to tell apart. Leading spaces show as a middle dot and newlines as a return glyph, which makes it clear that, for many tokenizers, a leading space is part of the token.
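As a rough sketch of that display step, the helper below swaps whitespace for visible glyphs. The exact glyphs (U+00B7 and U+23CE), the function name, and the sample tokens are assumptions made for illustration; the tokens are written by hand, not produced by a real tokenizer.

```typescript
// Make whitespace inside a token visible.
// Every space becomes a middle dot and every newline a return symbol.
function showWhitespace(token: string): string {
  return token.replace(/ /g, "\u00B7").replace(/\n/g, "\u23CE");
}

// Hypothetical token strings, written by hand for this example.
const exampleTokens = ["The", " quick", " fox", "\n"];
console.log(exampleTokens.map(showWhitespace).join("|"));
```

Here the second and third tokens start with a middle dot, which shows that the space belongs to the token that follows it rather than standing alone.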
