Question 1

How is this different from sorting or grouping by keyword?

Accepted Answer

Keyword grouping only works when items share the same words. Clustering compares meaning: 'the puppy chased a ball' and 'the dog fetched the stick' group together even though they share no keywords, because their sentence embeddings are close. Items are compared by the cosine similarity of their embeddings, not by string matching.
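To make the comparison concrete, here is a minimal sketch of cosine similarity in plain Python. The three vectors are made-up 3-dimensional stand-ins chosen for illustration, not real model output, and the variable names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Made-up toy vectors standing in for sentence embeddings (assumption:
# real embeddings come from the model and have many more dimensions).
puppy_ball = [0.9, 0.1, 0.2]   # 'the puppy chased a ball'
dog_stick = [0.8, 0.2, 0.1]    # 'the dog fetched the stick'
tax_form = [0.1, 0.9, 0.3]     # an unrelated sentence

print(cosine_similarity(puppy_ball, dog_stick))  # high: vectors point the same way
print(cosine_similarity(puppy_ball, tax_form))   # much lower: different direction
```

The score depends only on the direction of the vectors, so two sentences can score high without sharing a single keyword.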

Question 2

Is my text uploaded anywhere?

Accepted Answer

No. The MiniLM embedding model runs entirely in your browser via WebAssembly. Your list is processed on your device and never sent to a server. The only network transfer is the model download, which happens once and is then cached.

Question 3

Which AI model does this use?

Accepted Answer

all-MiniLM-L6-v2, a compact sentence-transformer (about 23 MB) that maps text to 384-dimensional vectors. It is fast, widely used for semantic grouping, and runs locally through transformers.js and ONNX.

Question 4

What does the tightness slider do?

Accepted Answer

It sets the minimum cosine similarity for an item to join an existing cluster. A higher value makes clusters tighter, so you get more, smaller groups of very similar items. A lower value merges loosely related items into fewer, broader groups. Re-cluster after changing it.
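The effect of the slider can be sketched with a simple greedy pass. This is an illustration under stated assumptions, not the tool's actual code: the answer above does not say what an item is compared against, so this sketch compares it to each cluster's first member, and the `cluster` function and 2-dimensional toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def cluster(vectors, threshold):
    """Greedy single pass. Each item joins the most similar existing cluster
    if its cosine similarity to that cluster's first member is >= threshold
    (an assumption for this sketch); otherwise it starts a new cluster.
    Returns a list of clusters, each a list of item indices."""
    clusters = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine_similarity(vec, vectors[members[0]])
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])   # nothing close enough: new cluster
        else:
            best.append(i)         # join the closest qualifying cluster
    return clusters

# Made-up 2-D vectors fanning out from the x-axis toward the y-axis.
vectors = [[1.0, 0.0], [0.95, 0.31], [0.7, 0.7], [0.0, 1.0]]

print(cluster(vectors, 0.9))  # [[0, 1], [2], [3]]  tighter: more, smaller groups
print(cluster(vectors, 0.6))  # [[0, 1, 2], [3]]    looser: fewer, broader groups
```

Raising the threshold from 0.6 to 0.9 splits item 2 out of the first group, which matches the behaviour described for the slider.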

Question 5

How big a list can it handle?

Accepted Answer

It comfortably handles hundreds to a few thousand lines. Embedding time scales with the amount of text, so very large lists take longer to embed on the first pass, but clustering itself is fast once the vectors exist. Everything still runs in your browser; nothing is uploaded.

AI Text Clustering
