Similar Works is an ambitious, independent project that aims to help authors protect their work from being maliciously copied.

It’s easy to detect if someone copies your whole book. But what if they copied just a paragraph? A few lines here and there? What if they copied snippets from a hundred books, and not just yours? How do you find the needles in a haystack composed of billions of words?

No one was looking for this, until it happened. So we built a system that could do it, and then built the architecture around it, and we’re going to do our best to make sure it doesn’t happen again.

Add your book to the Similar Works archive here. It’s free, and it always will be.

So what exactly is the archive?

In short, it is a massive, custom-built database designed to operate with the Similar Works master algorithm. Every time a new book is added to the archive, the algorithm checks it against every book already there and identifies matching patterns of text.
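
The details of the master algorithm aren't given here, so the following is only a rough illustration of the general idea of checking one book against many. It uses a common textbook technique, overlapping runs of words (often called shingles), which is an assumption and not necessarily what Similar Works does. The names `shingles`, `shared_passages`, and the window size `n` are hypothetical.

```python
import re


def shingles(text, n=5):
    """Return the set of every run of n consecutive words in the text."""
    # Lowercase and strip punctuation so trivial edits don't hide a match.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_passages(new_book, archive, n=5):
    """Compare a new book against each archived book.

    `archive` is assumed to be a dict mapping a title to its full text.
    Returns a dict of title -> sorted list of word runs found in both.
    """
    new_shingles = shingles(new_book, n)
    matches = {}
    for title, text in archive.items():
        common = new_shingles & shingles(text, n)
        if common:
            matches[title] = sorted(common)
    return matches


archive = {
    "Book A": "The quick brown fox jumps over the lazy dog near the river.",
    "Book B": "An entirely different sentence about cooking pasta at home.",
}
suspect = "She saw the quick brown fox jumps over the fence."
result = shared_passages(suspect, archive)
```

Here `result` contains an entry for "Book A" only, listing the five-word runs the two texts share. A real archive would precompute and index these runs instead of rebuilding them for every comparison.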

It could take you a week to open two books and go through them page by page, looking for similar phrases or sentences. The Similar Works system can do it in seconds, with almost the same accuracy as a human.

Every book that’s added to the archive helps us to identify common phrases to be filtered out, and it protects that book in the future. If a suspect work that has copied elements from any other book in the archive is added, we’ll be able to identify those elements and inform the rightsholders.
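
As a hedged sketch of how common phrases might be filtered out, the example below counts how many books each phrase appears in and ignores any phrase found in more than a set share of them. This is an assumption for illustration, not a description of the real system; `phrases`, `common_phrases`, `distinctive_matches`, and the `max_share` threshold are all hypothetical names and values.

```python
import re
from collections import Counter


def phrases(text, n=4):
    """Return the set of every run of n consecutive words in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def common_phrases(archive_texts, n=4, max_share=0.5):
    """Phrases that appear in more than max_share of the archived books."""
    book_count = Counter()
    for text in archive_texts:
        # phrases() returns a set, so each book counts a phrase only once.
        book_count.update(phrases(text, n))
    limit = max_share * len(archive_texts)
    return {p for p, count in book_count.items() if count > limit}


def distinctive_matches(suspect, original, ignore, n=4):
    """Phrases shared by two texts, minus the ones too common to matter."""
    return (phrases(suspect, n) & phrases(original, n)) - ignore


archive_texts = [
    "Once upon a time there lived a dragon",
    "Once upon a time a knight rode out",
    "Once upon a time the silver moon wept quietly",
    "The silver moon wept quietly over the hills",
]
ignore = common_phrases(archive_texts)
found = distinctive_matches(
    "Once upon a time the silver moon wept", archive_texts[2], ignore
)
```

With these four sample texts, "once upon a time" appears in three of them and is ignored, while "the silver moon wept" is still reported. The more books the archive holds, the better such counts separate stock phrases from distinctive ones.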

Similar Works grew out of a prototype built by Claire Ryan in February 2019. Claire is a fantasy author you’ve never heard of, who wrote a couple of books you’ve never read. She is also a senior web developer specializing in data processing and scaling.

If you have questions, you can tweet us @SimilarWorks.