AI Find Similar Lines
Sort a list by how close each line is in MEANING to one reference line. Type a reference (a sentence, a question, a description) and paste your list, and a MiniLM embedding model scores every line by semantic similarity and ranks them, closest first. Because it compares meaning, it surfaces paraphrases that share no words with your reference, which keyword search and string matching miss. Everything runs in your browser, so nothing is uploaded; the model downloads once on first use, then is cached.
How to rank a list by similarity to a reference
- Type your reference line: the sentence or idea you want to match against.
- Paste your list into the box, one item per line.
- Click Find similar; the first run loads the model, then the list comes back ranked by closeness in meaning.
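The ranking step behind these instructions can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the names `cosine`, `rankBySimilarity`, and `Ranked` are hypothetical, and the toy 3-dimensional vectors stand in for the 384-dimensional embeddings the model would produce for the reference and each line.

```typescript
// Hypothetical names throughout; the page does not show the tool's real code.
type Ranked = { line: string; score: number };

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as "no similarity"
}

// Score every line against the reference vector, then sort closest first.
function rankBySimilarity(
  refVec: number[],
  lines: string[],
  lineVecs: number[][],
): Ranked[] {
  return lines
    .map((line, i) => ({ line, score: cosine(refVec, lineVecs[i]) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors (assumed values) in place of real embeddings.
const ranked = rankBySimilarity(
  [1, 0, 0],
  ["The puppy chased a ball.", "Interest rates rose.", "She baked bread."],
  [
    [0.9, 0.1, 0],
    [0, 1, 0],
    [0.1, 0, 1],
  ],
);
console.log(ranked.map((r) => r.line));
```

In the real tool the vectors would come from the embedding model rather than being typed in by hand; only the score-and-sort step is shown here.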
Examples
Surface a paraphrase with no shared keywords
Reference: 'a young dog playing'
List: 'The puppy chased a ball.' / 'Interest rates rose.' / 'She baked bread.'
Ranks 'The puppy chased a ball.' first by a wide margin; the finance and baking lines score far lower.
Frequently asked questions
How is this different from keyword search or Ctrl+F?
Keyword search needs the reference words to appear in a line. This tool compares meaning using sentence embeddings, so a reference like 'a young dog playing' ranks 'The puppy chased a ball.' at the top even though the two share no words beyond the article 'a'. It scores every line by cosine similarity rather than matching strings.
Is my text uploaded anywhere?
No. The MiniLM embedding model runs entirely in your browser via WebAssembly. Your reference and list are processed on your device and never sent to a server. Only the model is downloaded, once, then cached.
Which AI model does this use?
all-MiniLM-L6-v2, a compact sentence-transformer (about 23 MB) that maps the reference and each line to a 384-dimensional vector. It is fast, widely used for semantic similarity, and runs locally through transformers.js and ONNX.
What does the similarity percentage mean?
It is the cosine similarity between the reference and that line, shown as 0 to 100 percent. Higher means closer in meaning. Scores are relative, so use them to rank and compare lines rather than as an absolute pass or fail cutoff.
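As a rough illustration of how a cosine score could become a percentage, here is a small sketch. The function names are hypothetical, and the clamping of negative cosines to 0 is an assumption: the page only says the score is shown as 0 to 100 percent, not how out-of-range values are handled.

```typescript
// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Assumption: values below 0 are clamped to 0 before scaling to a percent.
function toPercent(cos: number): number {
  const clamped = Math.max(0, Math.min(1, cos));
  return Math.round(clamped * 100);
}

// Vectors at a 45 degree angle: cosine is about 0.707, so roughly 71 percent.
const examplePercent = toPercent(cosineSimilarity([1, 0], [1, 1]));
console.log(examplePercent);
```

Because the percentage is just a rescaled cosine, it is most useful for ordering lines against each other, which matches the advice above to avoid a fixed cutoff.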
How long can the list be?
It comfortably handles hundreds to a few thousand lines. The reference and every line are embedded in one pass and ranked, and the top matches are shown. Longer lists take a little longer to embed.
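Showing only the top matches from a long scored list might look something like this. It is a sketch under assumptions: `topK` and the cutoff `k` are hypothetical, since the page does not say how many matches the tool displays.

```typescript
// Return the k highest-scoring items without mutating the input array.
function topK<T extends { score: number }>(items: T[], k: number): T[] {
  return [...items]
    .sort((a, b) => b.score - a.score)
    .slice(0, Math.max(0, k));
}

// Toy scores (assumed values) for three lines.
const scored = [
  { line: "a", score: 0.2 },
  { line: "b", score: 0.9 },
  { line: "c", score: 0.5 },
];
const best = topK(scored, 2);
console.log(best.map((item) => item.line));
```

A full sort is simple and fast enough for a few thousand lines; a partial selection would only start to matter for much larger lists.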
Related tools
Semantic Search
Search any text by meaning, not keywords. Paste a list or document, type a query, and an AI ranks the closest matches in your browser. Nothing is uploaded.
Near-Duplicate Finder
Find semantically near-duplicate lines in a list, including paraphrases that share no words. An AI compares meaning in your browser. Nothing is uploaded.
Semantic Dedupe
Remove near-duplicate lines from a list, including paraphrases that share no words, and get a clean output. An AI runs in your browser. Nothing is uploaded.
Zero-Shot Text Classifier
Label text with your own categories using AI, no training needed. Paste content, add labels, and BART-MNLI classifies it in your browser. Nothing is uploaded.
Acronym Generator
Turn any phrase into an acronym from the first letter of each main word. Skip small words like the and of, keep them, or add dot separators.
Add Line Numbers
Add line numbers to any text online. Set the start value, step, and separator, pad numbers to align, and copy or download the result. Free and private.