It doesn't matter how much of my book the AI is scraping; it shouldn't have any access to my book.
So then, no human should have access to your book. It literally makes no difference.
On the other hand, if a human is being inspired by my book, then you're right; it's a lot lower-profile and I may never notice or find out...but if I do find out, then I can examine the book myself and compare it to my own work. But AI is churning out hundreds or thousands of works a day, while human creators are significantly less prolific.
That's fair, but I don't really see how it changes anything.
So ok, fine. The New Big AI discloses all of its sources, and your book is among them.
New Big AI generates 90,000 works a day.
Are you going to have the resources to go through every single one of the 90,000 works in order to compare and see if any of them may meet the threshold of violating a copyright you own?
Or I'll even toss a bone here. Say the AI company develops a way to not only disclose what works it has fed into the AI, but also, for every single work generated, provide a word-by-word breakdown of what source it used (which is... absolutely ridiculous and is just not in any way reasonable or feasible). How do you then break it down? And the issue still remains that you will still have to go through 90,000 works created per day to see what percentage of your words each work used.
AND THEN, just because it used your work as a source for, say, 4.3% of the new work generated, now you need to read the work and compare it to see if it was actually a copyright violation or if it literally just used words from your work. 90,000 times. Per day.
It would be absolutely ridiculous to demand that you are paid by the percentage that the AI detects it has used from a source, because even if it used 100% of the words from your work... that doesn't actually mean it has violated any copyright. It just means it used the same words you did.
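To make that last point concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the function names `word_overlap_pct` and `longest_shared_run`, the two sample sentences, and the idea that "percentage used" would be measured as plain word overlap. It is not how any real AI company attributes sources, and a long shared run of words is not a legal test for infringement either. It just shows that a word-overlap percentage and actual copied passages are two different measurements.

```python
def word_overlap_pct(source: str, generated: str) -> float:
    """Percentage of words in `generated` that also appear anywhere in `source`."""
    source_vocab = set(source.lower().split())
    gen_words = generated.lower().split()
    if not gen_words:
        return 0.0
    shared = sum(1 for word in gen_words if word in source_vocab)
    return 100.0 * shared / len(gen_words)


def longest_shared_run(source: str, generated: str) -> int:
    """Length, in words, of the longest word-for-word passage found in both texts."""
    a = source.lower().split()
    b = generated.lower().split()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the matching passage that ended at the previous word pair.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


# Hypothetical texts: same vocabulary, different meaning.
source = "the dog bit the man and the man ran home"
generated = "the man bit the dog and the dog ran home"

print(f"word overlap: {word_overlap_pct(source, generated):.1f}%")
print(f"longest copied passage: {longest_shared_run(source, generated)} words")
```

The second sentence scores 100% word overlap against the first, yet the two never share more than a couple of words in a row and they describe opposite events. A per-word percentage can't tell "used the same words" apart from "copied the work", so somebody still has to read and compare.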
Any way you package it, it either doesn't make any sense or is just entirely not feasible and not workable. The BEST thing you can possibly get is for AI companies to disclose their sources so that you can potentially ensure the source was legally obtained.
My serious question... when it comes to compensation, as an author, what do you want? Specifically? What do you want to be compensated on, how would the compensation work? I'm genuinely curious because I haven't actually seen any true, specific demand other than "pay me".
EDIT -
From a purely pragmatic point of view, the "try and stop us" people once again do have the most convincing argument.
For the sake of argument, let's say a universal law is passed: AI devs must disclose what works the AI was trained on.
Ok. Law is done.
AI dev doesn't include your work. You suspect your work was used. Prove it. From a US perspective, if you're trying to press criminal charges, prove it beyond a reasonable doubt. In a civil suit, prove it by a preponderance of the evidence, i.e. more likely than not (the "51%" standard). Either way, prove your work was used.
I'm fairly certain it's impossible.