Boneyard Tools

AI Paraphrase Detector

Paste two texts and find out whether they say the same thing, even when the wording is completely different. A MiniLM embedding model maps each text to a vector; the tool then measures the cosine similarity between the two vectors and labels the pair Paraphrase, Related, or Different. 'The cat sat on the mat' and 'a feline rested on the rug' score high; an unrelated sentence scores low. Everything runs in your browser, so nothing is uploaded; the model downloads once on first use and is then cached.

How to check if two texts mean the same thing

  1. Paste the first text into Text A and the second into Text B.
  2. Click Compare (the first run loads the model).
  3. Read the similarity score and the verdict: Paraphrase, Related, or Different.

Examples

Two paraphrases with almost no shared words

Text A: 'The cat sat on the mat.'  Text B: 'A feline rested on the rug.'
High similarity, verdict: Paraphrase, even though the only words the two sentences share are 'the' and 'on'.
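For contrast, here is a minimal sketch of what a plain word-overlap measure would report for the same pair. It uses Jaccard overlap on lowercase word sets, which is a stand-in chosen for illustration and not something this tool computes; the function names are hypothetical.

```typescript
// Illustrative only: a word-overlap baseline, NOT what the detector uses.
// Split a text into a set of lowercase words (letters and digits only).
function wordSet(text: string): Set<string> {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  return new Set<string>(words);
}

// Jaccard overlap: shared words divided by total distinct words.
function jaccardOverlap(a: string, b: string): number {
  const setA = wordSet(a);
  const setB = wordSet(b);
  let shared = 0;
  setA.forEach((word) => {
    if (setB.has(word)) shared++;
  });
  const union = setA.size + setB.size - shared;
  return union === 0 ? 0 : shared / union;
}

// The two sentences share only "the" and "on" out of 9 distinct words.
const overlap = jaccardOverlap(
  "The cat sat on the mat.",
  "A feline rested on the rug."
);
console.log(overlap.toFixed(2)); // prints "0.22"
```

A word-overlap score this low would suggest the sentences are unrelated, which is the gap that comparing embeddings is meant to close.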

Frequently asked questions

How is this different from a plagiarism or string-match checker?

String matching needs the same words to appear. This tool compares meaning: 'the cat sat on the mat' and 'a feline rested on the rug' are flagged as paraphrases even though they share only 'the' and 'on', because their sentence embeddings are close. It measures cosine similarity between embeddings, not character or word overlap.
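Cosine similarity itself is a small calculation: the dot product of the two vectors divided by the product of their lengths. The sketch below is a generic version written for illustration, not the tool's own code, and the toy vectors stand in for real 384-dimensional embeddings.

```typescript
// Cosine similarity between two equal-length vectors.
// Returns 1 for vectors pointing the same way, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Assumption for this sketch: a zero vector is treated as similarity 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors, not real embeddings.
console.log(cosineSimilarity([1, 0], [1, 0])); // prints 1
console.log(cosineSimilarity([1, 0], [0, 1])); // prints 0
```

Because only the direction of the vectors matters, scaling one of them does not change the score.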

Is my text uploaded anywhere?

No. The MiniLM embedding model runs entirely in your browser via WebAssembly. Both texts are processed on your device and never sent to a server. The only network transfer is the model download, which happens once and is then cached.

Which AI model does this use?

all-MiniLM-L6-v2, a compact sentence-transformer (about 23 MB) that maps text to 384-dimensional vectors. It is fast, widely used for semantic similarity, and runs locally through transformers.js and ONNX.

What do the Paraphrase, Related, and Different verdicts mean?

They are bands on the cosine similarity score. At or above 80% the texts are near-identical in meaning, so the verdict is Paraphrase. From 55% up to, but not including, 80% they are Related, sharing a topic or part of their meaning. Below 55% they are Different. The score itself is always shown so you can judge borderline cases.
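The bands can be written as a small lookup. This is a sketch of the thresholds described above, assuming the score is a cosine similarity expressed as a fraction (0.80 for 80%) and that the bands apply to the unrounded value; the function name is hypothetical.

```typescript
type Verdict = "Paraphrase" | "Related" | "Different";

// Map a cosine similarity (as a fraction, e.g. 0.8 for 80%) to a verdict.
function verdictFor(similarity: number): Verdict {
  if (similarity >= 0.8) return "Paraphrase"; // at or above 80%
  if (similarity >= 0.55) return "Related"; // 55% up to, not including, 80%
  return "Different"; // below 55%
}

console.log(verdictFor(0.91)); // prints "Paraphrase"
console.log(verdictFor(0.62)); // prints "Related"
console.log(verdictFor(0.3)); // prints "Different"
```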

How long can each text be?

Short sentences and paragraphs work best, since the model averages meaning across the whole text. Very long inputs still work, but their meaning gets blended, which can soften the score. Everything still runs in your browser, and nothing is uploaded.
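The averaging mentioned above can be pictured as mean pooling: one vector per token, averaged element by element into a single vector for the text. The sketch below assumes that form of pooling and uses tiny two-dimensional toy vectors; it is an illustration of why mixed content blends, not the tool's actual code.

```typescript
// Average a list of per-token vectors into one vector for the whole text.
function meanPool(tokenVectors: number[][]): number[] {
  if (tokenVectors.length === 0) {
    throw new Error("Need at least one token vector");
  }
  const dim = tokenVectors[0].length;
  const mean = new Array<number>(dim).fill(0);
  for (const vec of tokenVectors) {
    if (vec.length !== dim) {
      throw new Error("All token vectors must have the same length");
    }
    for (let d = 0; d < dim; d++) {
      mean[d] += vec[d] / tokenVectors.length;
    }
  }
  return mean;
}

// Toy example: one token "about" topic X, one "about" topic Y.
// The pooled vector sits halfway between the two topics.
console.log(meanPool([[1, 0], [0, 1]])); // prints [ 0.5, 0.5 ]
```

The longer and more varied the text, the more tokens pull the average toward the middle, which is why a long multi-topic input tends to score less sharply against any single sentence.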
