The Similarity Band

The Similarity Band is a visual representation of the similarities that our system finds in books.

Each Band is a generated watermark that represents the book’s text from beginning to end, running from left to right. This is the text as it appears inside the digital file uploaded into the Similar Works archive, so it includes things like the copyright notice, the table of contents, disclaimers, back matter, and samples of other books.

When a book is run through the master algorithm, its text is split into logical chunks, each usually no longer than a sentence or two. The Band is generated by lining up all the chunks in order and assigning each one a color depending on whether a similarity has been detected within that chunk.
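The chunk-and-color idea can be sketched in a few lines of Python. This is only an illustration, not the actual master algorithm: the regex-based sentence splitter, the two-sentence chunk size, the `has_match` callback, and the `-`/`|` text rendering are all assumptions made for this example.

```python
import re
from typing import Callable, List


def split_into_chunks(text: str, sentences_per_chunk: int = 2) -> List[str]:
    """Split text into chunks of at most `sentences_per_chunk` sentences.

    Assumption: a naive split on '.', '!' or '?' followed by whitespace.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def build_band(chunks: List[str], has_match: Callable[[str], bool]) -> List[bool]:
    """Record, in reading order, whether each chunk has a detected similarity.

    `has_match` is a hypothetical stand-in for the similarity lookup.
    """
    return [has_match(chunk) for chunk in chunks]


def render_band(band: List[bool], plain: str = "-", stripe: str = "|") -> str:
    """Draw the band as text, left to right: one character per chunk."""
    return "".join(stripe if flagged else plain for flagged in band)


text = "One. Two. Three. Four. Five."
chunks = split_into_chunks(text)
band = build_band(chunks, has_match=lambda chunk: "Three" in chunk)
print(render_band(band))
```

Here the second of three chunks is flagged, so the rendered band shows a single stripe in the middle. A real Band would draw colored pixels rather than characters, but the left-to-right ordering works the same way.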

If you see a plain band with no stripes, then no similarities have been found.

A Similarity Band showing no text matches

The Band can tell you a lot about how books are related to each other! For example, if a book has a lot of stripes on the far right side of the Band, then there are similarities detected near the end of the text. That probably indicates that the same back matter or samples appear in another book. Stripes on the left indicate similarities detected near the start of the text, and they are probably disclaimers or generic copyright notices.

(We do our best to filter out disclaimers and other generic language used by a lot of authors, so hopefully you won’t see too many of those.)
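To make the left/right reading concrete, here is a small Python sketch that counts how many stripes fall near the start, the middle, or the end of a Band. It assumes a Band is available as a list of true/false flags, one per chunk; the function names and the 10% cutoff for "near the start" and "near the end" are arbitrary choices for this example, not values used by our system.

```python
from typing import Dict, List


def stripe_positions(band: List[bool]) -> List[float]:
    """Return each stripe's position as a fraction of the book (0.0 to 1.0)."""
    n = len(band)
    # Use the midpoint of each chunk so positions fall strictly between 0 and 1.
    return [(i + 0.5) / n for i, flagged in enumerate(band) if flagged]


def summarize_stripes(band: List[bool], edge: float = 0.1) -> Dict[str, int]:
    """Count stripes in the first `edge`, the last `edge`, and the middle.

    `edge` is a hypothetical cutoff chosen for illustration.
    """
    counts = {"start": 0, "middle": 0, "end": 0}
    for pos in stripe_positions(band):
        if pos < edge:
            counts["start"] += 1
        elif pos > 1 - edge:
            counts["end"] += 1
        else:
            counts["middle"] += 1
    return counts


# A 20-chunk band with one stripe at each edge and one about two-thirds in.
example = [False] * 20
example[0] = example[13] = example[19] = True
print(summarize_stripes(example))
```

Stripes counted under "start" are the ones most likely to be copyright notices or disclaimers, and stripes under "end" are the ones most likely to be shared back matter or samples. The counts say nothing about why a match exists.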

Unfortunately, the algorithm can only identify similarities. It can’t tell us why the similarity exists.

Common Phrases or Quotations

If you see only one or two stripes, then those are likely common phrases or quotations. The sensitivity of the master algorithm is carefully tuned to avoid flagging common phrases, but it’s not always successful.

Similarity Band for The Best of Relations

Here’s an example of a Similarity Band for The Best of Relations, by Catherine Bilson. You can see that there is a single white stripe indicating a similarity about two-thirds of the way through the book. That similarity was identified as coming from none other than Pride and Prejudice, by Jane Austen – which is not unusual, as The Best of Relations is based on Pride and Prejudice! In this case, it’s a famous line from Jane Austen’s classic that Catherine Bilson added to her novel.

I was given good principles, but left to follow them in pride and conceit.

The Best of Relations/Pride and Prejudice

What Does Plagiarism Look Like?

Royal Love, by Cristiane Serruya

This is the Similarity Band for Royal Love, a romance novel by Cristiane Serruya. Royal Love is part of an ongoing court case that famed romance author Nora Roberts filed against Cristiane Serruya in April 2019, accusing her of plagiarizing lines from as many as forty other romance authors.

The Similar Works system has identified many similarities in Royal Love, spread throughout the book.