Boneyard Tools

AI Find Similar Lines

Sort a list by how close each line is in MEANING to one reference line. Type a reference (a sentence, a question, a description) and paste your list, and a MiniLM embedding model scores every line by semantic similarity and ranks them, closest first. Because it compares meaning, it surfaces paraphrases that share no words with your reference, which keyword search and string matching miss. Everything runs in your browser, so nothing is uploaded; the model downloads once on first use, then is cached.

How to rank a list by similarity to a reference

  1. Type your reference line: the sentence or idea you want to match against.
  2. Paste your list into the box, one item per line.
  3. Click Find similar; the first run loads the model, then the list comes back ranked by closeness in meaning.
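
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `rank_by_similarity`, `toy_embed`, and the three-number vectors in `TOY_VECTORS` are hypothetical stand-ins invented so the sketch runs without downloading a model, and skipping blank lines is an assumption about how the list is parsed. The real tool would use 384-dimensional MiniLM embeddings in place of `toy_embed`.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    reference: str, pasted: str, embed: Callable[[str], Vector]
) -> list[tuple[str, float]]:
    # Step 2: one item per line (assumption: blank lines are skipped).
    lines = [ln.strip() for ln in pasted.splitlines() if ln.strip()]
    # Step 1: embed the reference once, then score every line against it.
    ref_vec = embed(reference)
    scored = [(ln, cosine(ref_vec, embed(ln))) for ln in lines]
    # Step 3: closest in meaning first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical vectors, invented for illustration only (not real model output).
TOY_VECTORS = {
    "a young dog playing": (0.9, 0.1, 0.0),
    "The puppy chased a ball.": (0.8, 0.2, 0.1),
    "Interest rates rose.": (0.0, 0.9, 0.1),
    "She baked bread.": (0.05, 0.0, 0.9),
}


def toy_embed(text: str) -> Vector:
    """Stand-in for the embedding model: looks up a made-up vector."""
    return TOY_VECTORS[text]


ranked = rank_by_similarity(
    "a young dog playing",
    "Interest rates rose.\nThe puppy chased a ball.\n\nShe baked bread.",
    toy_embed,
)
for line, score in ranked:
    print(f"{score:.2f}  {line}")
```

Passing the embedding function in as a parameter keeps the ranking logic independent of whichever model produces the vectors.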

Examples

Surface a paraphrase with no shared keywords

Reference: 'a young dog playing'
List: 'The puppy chased a ball.' / 'Interest rates rose.' / 'She baked bread.'
Result: 'The puppy chased a ball.' ranks first by a wide margin; the finance and baking lines score far lower.

Frequently asked questions

How is this different from keyword search or Ctrl+F?

Keyword search needs the reference words to appear in a line. This tool compares meaning using sentence embeddings, so a reference like 'a young dog playing' ranks 'the puppy chased a ball' at the top even though the two share no words beyond the article 'a'. It scores every line by cosine similarity rather than matching strings.
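
To see why string matching misses the example, here is a small sketch of what a word-overlap or Ctrl+F style check finds for the same reference and list. It is not part of the tool; the helper names `words` and `keyword_hits` are hypothetical. The only overlap with the puppy line is the article 'a', which carries no meaning, and a verbatim search finds nothing.

```python
import re


def words(text: str) -> set[str]:
    """Lowercased word set with punctuation ignored."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def keyword_hits(reference: str, lines: list[str]) -> list[tuple[str, set[str]]]:
    """For each line, the set of reference words that also appear in it."""
    ref_words = words(reference)
    return [(ln, ref_words & words(ln)) for ln in lines]


reference = "a young dog playing"
lines = ["The puppy chased a ball.", "Interest rates rose.", "She baked bread."]

hits = keyword_hits(reference, lines)

# Ctrl+F style: does the whole reference appear verbatim in any line?
verbatim = [ln for ln in lines if reference.lower() in ln.lower()]

for line, shared in hits:
    print(f"{line!r}: shared words = {sorted(shared)}")
print("verbatim matches:", verbatim)
```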

Is my text uploaded anywhere?

No. The MiniLM embedding model runs entirely in your browser via WebAssembly. Your reference and list are processed on your device and never sent to a server. Only the model is downloaded, once, then cached.

Which AI model does this use?

all-MiniLM-L6-v2, a compact sentence-transformer (about 23 MB) that maps the reference and each line to a 384-dimensional vector. It is fast, widely used for semantic similarity, and runs locally through transformers.js and ONNX.

What does the similarity percentage mean?

It is the cosine similarity between the reference and that line, shown as 0 to 100 percent. Higher means closer in meaning. Scores are relative, so use them to rank and compare lines rather than as an absolute pass or fail cutoff.
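
As a rough sketch of how such a score could be computed: cosine similarity is the dot product of two vectors divided by the product of their lengths, and it ranges from -1 to 1. How the tool maps that onto 0 to 100 percent is not stated here, so the clamping and rounding in `to_percent` below are assumptions, and both function names are hypothetical.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def to_percent(similarity: float) -> int:
    """ASSUMPTION: negative scores are clamped to 0 before scaling to 0-100."""
    clamped = max(0.0, min(1.0, similarity))
    return round(clamped * 100)


print(to_percent(cosine((1.0, 0.0), (1.0, 0.0))))   # same direction: 100
print(to_percent(cosine((1.0, 0.0), (1.0, 1.0))))   # 45 degrees apart: 71
print(to_percent(cosine((1.0, 0.0), (0.0, 1.0))))   # orthogonal: 0
print(to_percent(cosine((1.0, 0.0), (-1.0, 0.0))))  # opposite, clamped: 0
```

Because the percentage depends only on the angle between the two vectors, it is best read as a relative measure for ordering lines, which matches the advice above.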

How long can the list be?

It comfortably handles hundreds to a few thousand lines. The reference and every line are embedded in one pass, then ranked, and the top matches are shown. Embedding time grows with the length of the list.
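
The "rank, then show the top matches" step can be sketched as below, assuming the vectors have already been produced by the embedding pass. The function name `top_matches` and the cutoff `k` are hypothetical; the number of matches the tool actually displays is not specified here.

```python
import heapq
import math
from typing import Sequence


def normalize(v: Sequence[float]) -> list[float]:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)


def top_matches(
    ref_vec: Sequence[float], line_vecs: Sequence[Sequence[float]], k: int
) -> list[tuple[int, float]]:
    """Return up to k (line index, cosine score) pairs, best first."""
    ref = normalize(ref_vec)
    # With unit-length vectors, the dot product equals the cosine similarity.
    scores = (
        (i, sum(r * x for r, x in zip(ref, normalize(vec))))
        for i, vec in enumerate(line_vecs)
    )
    # nlargest keeps only the k best instead of sorting the whole list.
    return heapq.nlargest(k, scores, key=lambda pair: pair[1])


# Made-up 2-number vectors, for illustration only.
line_vecs = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (-1.0, 0.0)]
top = top_matches((1.0, 0.0), line_vecs, k=2)
print(top)
```

Scoring is a single dot product per line, so for lists of this size the embedding pass, not the ranking, is likely to dominate the running time.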
