About the BookID™ Copyright Protection System

Article author
Jason

BookID™ is Scribd’s automated copyright protection system. Scribd developed BookID to help safeguard intellectual property on Scribd.com and the Scribd mobile app. Our team is working to bring BookID's protections to SlideShare in the near future.

How it works

BookID algorithmically analyzes computer-readable text for semantic data, which it then encodes into a digital “fingerprint.” BookID stores the fingerprints of known copyrighted works on a secure server that cannot be reached from the Internet or by the general public.

BookID scans every document uploaded to Scribd and removes those that have the same, or a substantially similar, fingerprint. BookID intermittently scans the entire Scribd library to remove matching content that was uploaded prior to fingerprinting. BookID’s approach reduces misidentifications and enables the detection of infringing works even if they have been altered to some degree.
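BookID's actual algorithm is not published, so the following is only a minimal sketch of the general idea of matching by "substantially similar" fingerprints. It assumes a common stand-in technique (hashed word n-grams, or "shingles," compared by Jaccard similarity) rather than BookID's semantic analysis. The function names, the shingle size of 5, and the 0.5 threshold are all hypothetical.

```python
import hashlib
import re


def shingles(text, size=5):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Too short for a full shingle: treat the whole text as one unit.
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def fingerprint(text, size=5):
    """Hash each shingle to a 64-bit integer; the set of hashes is the fingerprint.

    Assumption: this is an illustrative scheme, not BookID's real encoding.
    """
    return frozenset(
        int.from_bytes(hashlib.sha256(s.encode("utf-8")).digest()[:8], "big")
        for s in shingles(text, size)
    )


def similarity(a, b):
    """Jaccard similarity: shared hashes divided by all distinct hashes."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_match(upload_text, known_fingerprint, threshold=0.5):
    """Flag an upload whose fingerprint is the same or substantially similar.

    The threshold value is a hypothetical choice for this sketch.
    """
    return similarity(fingerprint(upload_text), known_fingerprint) >= threshold
```

In this sketch, an upload identical to a fingerprinted work scores 1.0, and a copy with a few words appended still scores well above the threshold because most shingles are unchanged. That is one way a system can keep detecting a work that has been altered to some degree.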

Content is fingerprinted by BookID when:

  • it is uploaded directly to BookID by a verified author or publisher enrolled in the BookID for Authors and Publishers Program;
  • it is added to Scribd’s subscription reading and listening service through a verified publisher or distributor; or
  • it is removed from Scribd pursuant to a DMCA notification.

Limitations

BookID relies on computer-readable text, which is not necessarily the same as text that is readable by humans. Content scanned from paper sources may not contain computer-readable text, which makes those sources unsuitable for fingerprinting. Similarly, text that is encoded with optical character recognition (OCR) technology may contain garbled or partial data. These conditions make it very difficult, if not impossible, to detect matches.
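To illustrate why garbled OCR output is hard to match, here is a small sketch under the assumption that a fingerprint is built from overlapping three-word sequences. This is a hypothetical stand-in, not BookID's documented method, and the sample sentences and function names are invented for the example.

```python
import re


def word_shingles(text, n=3):
    """Return the set of overlapping n-word sequences found in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def similarity(text_a, text_b, n=3):
    """Jaccard similarity between the shingle sets of two texts."""
    a, b = word_shingles(text_a, n), word_shingles(text_b, n)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical sample: the same sentence as clean text and as OCR output.
CLEAN = "the quick brown fox jumps over the lazy dog near the quiet river bank"
# Five of the fourteen words carry a simulated OCR misread (1 for i, 0 for o, ...).
GARBLED = "the qu1ck brown f0x jumps ovcr the lazy d0g near the qulet river bank"

score = similarity(CLEAN, GARBLED)  # no three-word sequence survives intact
```

A human reader can still follow the garbled sentence, yet every three-word sequence in it contains at least one misread word, so the two shingle sets share nothing and the score is 0.0. Real systems may be more tolerant than this sketch, but the same effect explains why scattered recognition errors can hide a match.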

BookID’s fingerprint scanner cannot detect specific keywords, titles, names, copyright notices, or other disclaimers that are part of a document's text. In other words, BookID cannot be programmed to block all documents that contain a book’s title. Likewise, BookID cannot match content across languages. If BookID fingerprints an English-language document, it can only detect subsequent uploads that are also in English.

BookID cannot detect images, illustrations, or sheet music at this time.

False positives

BookID contains fingerprints of educational textbooks and other works that contain long excerpts of classic literature, religious texts, legal documents, and government publications in the public domain. This occasionally results in the temporary removal of non-copyrighted, authorized, or public domain material from Scribd.com and the mobile app.

While occasional false positives are unavoidable, we continually tune BookID to reduce their occurrence. Unfortunately, the high volume of fingerprints and uploads to Scribd prevents us from conducting manual oversight or notifying uploaders before removing a match. If you have received a notification that your work was removed by BookID and you feel the removal was improper, forward that notification to copyright@scribd.com along with an explanation of your concern. We will review each case as quickly as possible.

BookID for Authors and Publishers

BookID for Authors and Publishers is a free program for approved content providers that wish to submit copies of their works to be fingerprinted by BookID. Click here to learn more.
