
Authors and publishers get to choose how the text in their books may be used. In fact, it’s common practice to state how a book may be used on the copyright page using disclosures like:
“No part of this book may be reproduced or transmitted in any form or by any means, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.”
Authors control what formats their books may appear in, whether the book can be translated into other languages, and whether Steven Spielberg has permission to film a movie adaptation of their work. They also have the right to deny or allow the use of their book for training AI.
The Anthropic Settlement
So yes, a lot of books have already been used to train AI systems. I can’t argue against that. There was a major class-action lawsuit against Anthropic for training their AI system, Claude, on books without obtaining the authors’ permission.
I wish the lawsuit had been settled simply because training on books without permission is wrong. But the case took a different turn: the issue wasn’t just AI training. The company had allegedly obtained many of those books through piracy.
AI companies stole books to train AI.
Learn more about the Anthropic Settlement
Honestly, I wasn’t terribly surprised to learn this. Large AI companies have often treated publicly available content on the internet as “fair game” (it isn’t) when building training datasets.
But my focus here isn’t on what has already happened. I’m not here to convince you not to use AI tools trained on questionable data. I simply needed a starting point for this discussion, which is this:
Bad actors are going to act badly.
The most practical response is to protect yourself and your work. Make it clear that your books are meant for human readers, and put barriers in place so that if another Anthropic-style incident happens, you have legal standing to seek compensation.
Whether or not you think it would be cool for an AI chatbot to mimic your writing style, companies should obtain your permission before using your text to train AI tools.
Opt Out Where You Can
As a first line of defense, always read the terms of service for online tools before inputting text. Some platforms may use posts or user-generated content to train AI systems. Some platforms also offer ways to opt out of having your data used for AI training.
And remember: any publicly available content is easier for bad actors to scrape and reuse.
Say “NO” to AI Training on Your Copyright Page
An easy way to make others aware that your book should not be used as training data is to include a “No AI Training” disclosure on the copyright page.
The Authors Guild has put together an excellent disclosure, which it encourages all authors and publishers to use:
“NO AI TRAINING: Without in any way limiting the author’s [and publisher’s] exclusive rights under copyright, any use of this publication to “train” generative artificial intelligence (AI) technologies to generate text is expressly prohibited. The author reserves all rights to license uses of this work for generative AI training and development of machine learning language models.”
Learn more on the Authors Guild website
Of course, you can tailor this disclosure, but it serves as a strong first barrier.
Register Your Book’s Copyright Claim
Your work is automatically copyrighted the moment you create it. However, registering your work with the appropriate copyright authority, depending on where you live, is vital if you ever need to enforce your rights.
Remember the settlement mentioned earlier? Under its terms, Anthropic agreed to pay authors and publishers whose books fall within the settlement class.
In the United States, a book generally must be registered with the U.S. Copyright Office before you can file an infringement lawsuit, and statutory damages are generally available only if the book was registered before the infringement began or within three months of publication. So to the indie authors who claim registering copyright isn’t that important: it becomes extremely important the moment you need to protect your book.
Registering copyright for a book in the U.S. is super simple and can be done online through the U.S. Copyright Office’s website. Registration currently costs about $45 (as of 2026), can be done right after you publish, and typically doesn’t require a mailed copy of the book.
Closing Thoughts
The internet is a wonderful place where people can freely share thoughts, ideas, and their creations. Unfortunately, it’s also very easy for bad actors to grab publicly available content and use it in ways the creator never intended.
I’m glad we have legal protections against theft and copyright infringement, and I’m hopeful that clearer laws and policies will emerge that require companies to train AI models only on data they have explicit permission to use.
There’s always a balance between protecting your work and hiding it in obscurity. I hope creators continue to put themselves out there, make new things, and share beauty with the world. I’m personally going to keep writing stories and making books. It’s something I enjoy and something I’m good at.
Want to keep the discussion going?
Find me on Instagram, Bluesky, and Threads.