AI Text Clustering
Paste a list of items, one per line, and group them into topics by meaning instead of shared words. A MiniLM embedding model maps every line to a vector, then items are clustered by cosine similarity: lines about dogs land in one group, finance lines in another, and cooking lines in a third, even when they share no words. A tightness slider sets how similar items must be to join a cluster (higher = tighter, more clusters). Everything runs in your browser, so nothing is uploaded; the model downloads once on first use and is cached after that.
How to cluster a list by meaning
- Paste your items into the box, one per line.
- Set the tightness slider (higher means tighter, more granular clusters), then click Group similar items (the first run loads the model).
- Read each cluster card, titled by a representative item and listing its members; the largest clusters show first.
Examples
Group a mixed list into topics
Lines: 'The puppy chased a ball.' / 'My savings account earns interest.' / 'I baked sourdough bread.' / 'The dog fetched the stick.'
Cluster 1 (dogs): the puppy and the dog lines. Cluster 2: the savings line. Cluster 3: the bread line.
Frequently asked questions
How is this different from sorting or grouping by keyword?
Keyword grouping needs the same words to appear. Clustering compares meaning: 'the puppy chased a ball' and 'the dog fetched the stick' group together even though they share no words, because their sentence embeddings are close. It uses cosine similarity, not string matching.
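To make the comparison concrete, here is a minimal sketch of cosine similarity in TypeScript. The three-dimensional vectors are hypothetical stand-ins invented for illustration; real MiniLM embeddings have 384 dimensions, and the function name is an assumption, not the tool's own code.

```typescript
// Cosine similarity: the dot product of two vectors divided by the
// product of their lengths. Ranges from -1 to 1; higher means closer.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical toy vectors, NOT real model output.
const puppy = [0.9, 0.1, 0.0];
const dog = [0.8, 0.2, 0.1];
const savings = [0.0, 0.1, 0.9];

const puppyDog = cosineSimilarity(puppy, dog);         // close to 1
const puppySavings = cosineSimilarity(puppy, savings); // close to 0
console.log(puppyDog, puppySavings);
```

The two dog-related vectors point in nearly the same direction, so their score is high, while the savings vector points elsewhere and scores near zero. No words are compared at any point.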
Is my text uploaded anywhere?
No. The MiniLM embedding model runs entirely in your browser via WebAssembly, so your list is processed on your device and never sent to a server. The only network transfer is the one-time model download, which is then cached.
Which AI model does this use?
all-MiniLM-L6-v2, a compact sentence-transformer (about 23 MB) that maps text to 384-dimensional vectors. It is fast, widely used for semantic grouping, and runs locally through transformers.js and ONNX.
What does the tightness slider do?
It sets the minimum cosine similarity for an item to join an existing cluster. A higher value makes clusters tighter, so you get more, smaller groups of very similar items. A lower value merges loosely related items into fewer, broader groups. Re-cluster after changing it.
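One simple scheme that matches this description is greedy threshold clustering, sketched below in TypeScript. This is an assumption about how such a slider could work, not a statement of the tool's actual algorithm: the `Cluster` type, the `clusterByThreshold` function, the choice of the first member as the representative, and the toy two-dimensional vectors are all hypothetical. The vectors are assumed to be unit length, so a plain dot product equals cosine similarity.

```typescript
interface Cluster {
  representative: number; // index of the item that seeded the cluster
  members: number[];      // indices of every item in the cluster
}

// Dot product; equals cosine similarity when both vectors are unit length.
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Each item joins the existing cluster whose representative is most
// similar to it, provided that similarity reaches the threshold.
// Otherwise the item starts a new cluster of its own.
function clusterByThreshold(vectors: number[][], threshold: number): Cluster[] {
  const clusters: Cluster[] = [];
  for (let index = 0; index < vectors.length; index++) {
    let best: Cluster | null = null;
    let bestSim = threshold;
    for (const cluster of clusters) {
      const sim = dot(vectors[index], vectors[cluster.representative]);
      if (sim >= bestSim) {
        best = cluster;
        bestSim = sim;
      }
    }
    if (best !== null) {
      best.members.push(index);
    } else {
      clusters.push({ representative: index, members: [index] });
    }
  }
  // Largest clusters first.
  return clusters.sort((x, y) => y.members.length - x.members.length);
}

// Hypothetical unit-length toy vectors, NOT real model output.
const items = [
  [1, 0],
  [0.96, 0.28],
  [0, 1],
  [0.6, 0.8],
];

const tight = clusterByThreshold(items, 0.9); // high threshold
const loose = clusterByThreshold(items, 0.5); // low threshold
console.log(tight.length, loose.length);
```

With these toy vectors the higher threshold leaves the last item on its own, while the lower threshold lets it join its nearest neighbor, so the loose run produces fewer, broader groups. A greedy pass like this depends on input order; other approaches, such as comparing against a cluster centroid, would behave differently.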
How big a list can it handle?
It comfortably handles hundreds to a few thousand lines. Embedding scales with the amount of text, so very large lists take longer to embed on the first pass, but clustering itself is fast once the vectors exist. Everything still runs in your browser, and nothing is uploaded.
Related tools
Semantic Search
Search any text by meaning, not keywords. Paste a list or document, type a query, and an AI ranks the closest matches in your browser. Nothing is uploaded.
Paraphrase Detector
Check whether two texts say the same thing. AI compares their meaning, not their words, and returns a similarity score and a verdict. Nothing is uploaded.
Semantic Diff
Diff two texts by meaning, not characters. AI matches reworded or reordered ideas that a line diff misses, then groups them as common, removed, or added. Nothing is uploaded.
Zero-Shot Text Classifier
Label text with your own categories using AI, no training needed. Paste content, add labels, and BART-MNLI classifies it in your browser. Nothing is uploaded.
AI Text Summarizer
Summarize any text with AI, right in your browser. Paste an article, pick a length, and get a short abstractive summary. Nothing is uploaded.
Acronym Generator
Turn any phrase into an acronym from the first letter of each main word. Skip small words like the and of, keep them, or add dot separators.