It is safe to say that technologies lumped under “Artificial Intelligence” (AI) will have an effect on the publishing and writing process, but I believe that much of this will be a continuation of trends that have been in place for more than a decade. Since I largely ignore developments in AI, these comments are agnostic with respect to specific technologies. As a bit of a spoiler, one place where AI is taking over an economic niche is in the production of scam electronic books (e-books).
Scam E-Books: Pre-AI Situation
Although AI may make it easier to produce scam e-books, as the name suggests, they were not exactly difficult to produce in the first place.
People who do not dig too deeply into niches in online bookstores could easily be unaware of the existence of scam e-books. For myself, I only became aware of them once I started researching self-publishing. I mainly search for books in not very popular non-fiction areas like economics, and there is not a large enough volume of book buyers or even new books in those niches to allow scam books to make any money or survive detection. Scams need to hide amongst popular genres for self-published authors, which are mainly fiction like romance and erotica, fan fiction, young adult, topics like vampires, etc. To understand modern publishing, one needs to keep in mind that most books sold are fiction. Most people who want to be writers want to write novels, not primers on the bond market.
Non-fiction writing used to be a way of making a reasonable amount of money. Non-fiction books get out of date, so there is always a market for new books on a topic. The major best sellers are normally fiction, and the top authors make spectacular amounts of money. The problem is that older (possibly dead) fiction authors crowd out a significant portion of the critical bookstore shelf inches, making it hard to break into the superstar club. There are high volume non-fiction genres, like cookbooks and biographies of famous people, which behave more like fiction. The decline of the traditional publishing industry meant that publishers cut support to the non-superstar authors who write bread and butter non-fiction books.
Scam books were only possible after the rise of online booksellers selling e-books. Bookstore shelf space is too valuable to waste on garbage: if a book is not selling, it rapidly disappears from shelves. Scam books only make sense at an extremely low price point (not possible when printed), and as e-books they do not take up inventory space.
Before AI, scam books were lovingly created by artisans the old-fashioned way: scraping text off of websites and getting a cheap cover image, possibly by a $5 internet hire. The scraped text would then have some extra payloads added in the form of links. (Electronic books are a group of web pages, or rarely PDFs, bound into the equivalent of a zip file.) These files are then uploaded to electronic bookstores, where they go through automated testing before being put on sale. (The main skill associated with scam book production is knowing how to get past those automated tests.)
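To make the "web pages bound into a zip file" description concrete, here is a minimal sketch using only the Python standard library. It is not a valid e-book: a real EPUB container also needs a META-INF/container.xml file and a package document, which are omitted here. The function names and chapter file names are hypothetical, chosen only for illustration.

```python
import io
import zipfile


def build_toy_ebook(chapters):
    """Bundle XHTML chapter strings into an in-memory zip, EPUB-style."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # EPUB containers begin with an uncompressed "mimetype" entry.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        for number, body in enumerate(chapters, start=1):
            # Each chapter is just a small web page (hypothetical file names).
            page = f"<html><body>{body}</body></html>"
            archive.writestr(f"chapter{number}.xhtml", page,
                             compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_entries(ebook_bytes):
    """Return the file names stored inside the zipped bundle."""
    with zipfile.ZipFile(io.BytesIO(ebook_bytes)) as archive:
        return archive.namelist()


ebook = build_toy_ebook(["<h1>One</h1>", "<h1>Two</h1>"])
print(list_entries(ebook))
```

Since each chapter is an ordinary web page, swapping scraped text or extra links into it takes no special tooling.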
Back when I started self-publishing (2015 or so), the pay-offs used to be as follows.
Impersonate real books before they go on sale. It is possible to sell books related to other (good) books by labelling them as being “supplementary materials.” This article from the Authors Guild discusses this in the context of AI developments.
Get people to click on malicious links. Although it sounds crazy to go through the steps of producing a fake book to get somebody to click on a link, it should not be surprising. When I launched my blog on the Blogger platform, it was hit immediately by “page hits” following links from dubious websites. Those hits came from bots whose sole purpose is to get unwary bloggers to click the links on their “referral” analytics page. That is, there was a continuous bombardment of web activity whose only target was the handful of people who set up blogs. In other words, malicious links seem to pay.
Get people to buy the book and hope that they do not read it. This appears to be a fairly common online phenomenon: you can see it in Steam games. (If you look at Steam achievements, there are some that should be triggered by a few minutes of play, yet the global achievement rate is surprisingly low.) A consumer loads their shopping cart with 10 cheap e-books, starts reading them in order of expected quality, but then buys another 10 before finishing the first 10.
A variant of the previous: a good number of books sold are not read past the first chapter. All you need is a mediocre first chapter, and many people may not even notice that the rest of the book is random unrelated texts scraped from the web. The explanation of this is that a lot of books sold are (unwanted) gifts, many people get frustrated by reading, and books can be purely lifestyle purchases. (A local home decoration store sells hardcover book bundles based on the colour scheme of the covers.) Most best-selling nonfiction books have catchy premises that can be summarised in a few sentences for very good commercial reasons.
Most scam books (by volume) seemed to be aimed at taking advantage of monthly reading subscriptions. At the time, the trick was to put a link at the beginning of the book that jumped to the end. By jumping to the end, the book was counted as “fully read” for the purposes of the book subscription pay-out, even though the reader only briefly looked at 2 out of 400 pages. (The e-books are divided into “pages” of a standard size, even though chapters are scrolling webpages.)
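A toy sketch of why the jump-to-the-end link worked is below. It assumes the store credited every page up to the furthest position reached; that rule is my assumption for illustration, not a documented description of any store's system, and both function names are hypothetical.

```python
def pages_credited_by_furthest(visited_pages):
    """Assumed naive rule: credit everything up to the furthest page reached."""
    return max(visited_pages, default=0)


def pages_credited_by_distinct(visited_pages):
    """Stricter alternative: credit only the pages actually displayed."""
    return len(set(visited_pages))


# The reader opens page 1, then follows the link straight to page 400.
visits = [1, 400]
print(pages_credited_by_furthest(visits))   # 400
print(pages_credited_by_distinct(visits))   # 2
```

Counting distinct pages displayed is one way such a loophole could be closed, which is consistent with the trick being patchable.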
Although the trick of embedding a link to the end of the book can be patched, I expect the last category of scam to be durable. The reasoning is straightforward: it only damages the interests of other publishers (and their authors). The consumer just skips to the next book, and they don’t care if one of the “free” books borrowed during the monthly subscription is a dud. The electronic store that sells subscriptions gets a relatively fixed portion of the total subscription volume as revenue. The economic losers are publishers/authors: their cash pay-out is diluted to include scammers. A cynic might note that large publishers are the only entities that have bargaining power vis-à-vis large electronic bookstores — so this situation is rather convenient.
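The dilution argument can be illustrated with a small calculation. The sketch assumes a fixed pay-out pool that is split in proportion to pages credited; the pool size, the page counts, and the publisher names are all made up for the example.

```python
def payouts(pool, pages_by_publisher):
    """Split a fixed pool in proportion to pages credited (assumed rule)."""
    total_pages = sum(pages_by_publisher.values())
    if total_pages == 0:
        return {name: 0.0 for name in pages_by_publisher}
    return {name: pool * pages / total_pages
            for name, pages in pages_by_publisher.items()}


# Hypothetical month without scam books in the pool.
honest = payouts(1000.0, {"author_a": 6000, "author_b": 4000})

# Same pool, same honest reading, plus scam books credited with 10,000 pages.
diluted = payouts(1000.0, {"author_a": 6000, "author_b": 4000, "scammer": 10000})

print(honest)
print(diluted)
```

In this example the pool stays at 1000 in both cases while each honest author's pay-out is halved, so the loss falls on the authors and the total paid out is unchanged.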
Given the economic incentives, it is a safe bet that hyper-specific niches in online libraries will continue to be filled with low-effort books that survive off skimming a portion of reading subscriptions.
We now turn towards the issue of non-scam books.
Book Covers
Using AI art for book covers is presumably already common, and is a natural tool. For e-books, the cover is generally only seen at thumbnail size before purchase, and so e-book covers naturally want to be simpler than the covers that are seen full-size in consumers’ hands in bookstores. In that environment, tuned AI art could easily be commercially superior to most inexpensive human-produced art.
AI book covers also eliminate certain issues with book covers. If you have spent any time looking at (pre-AI) book covers in the romance and murder fiction sections of a bookstore, you will have noticed that book covers that are photographs of people generally do not show full faces. A fairly common trope is focussing in on a woman’s lips, with the rest of the face clipped off or obscured. This is done for legal and cultural reasons: publishers do not want real-life models being linked to the morally dubious activities described within potboiler fiction. If such a book cover has a photograph of an identifiable model, the model would have probably been paid a decent fee to waive legal claims. To the extent that AI art can avoid such hassles, it will have an advantage.
Publishing: Editing
There are a few types of “editing” in the publishing industry. (I am probably mangling terms of art herein.)
Content editing. Someone who interacts with the author early in the process, and helps set the content and scope of the book. They will also offer “strategic feedback,” along the lines of “Chapter Two stinks, rewrite it completely.”
Copy-Editing. This is late-stage editing, and only offers “tactical feedback” on a manuscript. This is often called “proof-reading” in common parlance, but that is actually something different in publishing — proofreading is the now-obsolete task of ensuring that the printed proof of a text matches the final edited manuscript. (In the old days, workers had to assemble type to match the manuscript.) Copy-editing can be divided (at least) into grammar, text style, and technical copy-editing.
Content editing is the most subjective and hardest skill to replace. The decline of the traditional publishing industry has meant that deluxe content editing is reserved for the authors with the highest expected sales. If AI tools can replicate these tasks, they could greatly improve the quality of the output of authors across the board.
Grammar copy-editing was to a certain extent already automated by the time I started self-publishing. It was very easy to hire people online to do such copy-editing at extremely low rates. Was the secret low wages? Although the low-wage gig economy played a part, the real secret was that the copy-editors just turned Microsoft Word’s grammar check settings to maximum and made the suggested changes. The people hiring them were presumably unaware of the ability to change the settings, and/or did not notice that this was all the copy-editor had done. (The default grammar checking settings are lenient, presumably because the maximum settings make users sad.) Although I am not sure about the quality of the grammar check in other languages, the Antidote software was crucial for my ability to function in a French working environment.
Stylistic copy-editing is more subjective, but I would not be surprised if a lot of writing tics could be cleaned up by an automated sweep.
Technical copy-editing is going to be more difficult to replace. I used to do technical copy-editing duty for a few hours a week when I worked for a financial/economic research publisher. You need to go through a text and make sure it makes sense analytically (e.g., did the author write “sell” instead of “buy”?). Although such a switch sounds stupid, it is a common result of editing that reverses sentence structure. The original author cannot see the problem because humans tend to read what they intended to write instead of the actual words on the page when the text is fresh in their mind. Maybe AI can do this eventually, but based on what I have seen, AI-generated text creates a need for human technical copy-editing.
Publishing: Legal
Publishing a book opens one up to a lot of legal risk. AI trained on internet discourse is going to create a lot of potential legal problems. However, it might be possible that AI could sniff through text and find passages that the lawyers need to look at.
Publishing: Translation
Automated translation appears to be quite powerful, but one needs to keep in mind the previous sub-section. You really do not want to be legally liable for unchecked machine-generated published content. You also need to be aware that foreign countries have different laws regarding “freedom of expression.”
Writing: Entire (Good) Books?
Since I mainly read non-fiction, the current probabilistic output of AI makes the prospect of an entire book written by AI unappealing. However, I can easily accept that AI could produce certain types of fiction (particularly fan fiction or erotic fiction) that would be as good as 95% of human-produced works.
AI Writing Support
The most realistic prospect from my perspective is the use of AI to speed up writing. I will just give examples from my experience.
Academic texts can have a lot of boilerplate text that follows existing patterns. If AI could incorporate references without hallucinating them, then it would have been possible to generate probably 80% of the text in my academic papers. The other 20% is the theorems and proofs that represent the new content. The use of AI could also help clean up writers’ mathematical notation usage, as well as help people who are writing in their second language.
I could see how AI text/images could allow someone with a good imagination but lacking in writing/graphical skills to produce illustrated children’s texts. Since the volume of text is small, it would be fairly easy to clean up.
Although some non-fiction is “literary”, other non-fiction that is aimed to be instructive tends to be formulaic. A typical structure is described as “tell them what you are going to tell them, tell them, and then tell them what you told them.” (This is a pattern that shows up in journalism.) The repetition is needed to defeat the tendency of readers to skim texts. Using AI to generate mini-summaries could speed up writing and make the manuscript less likely to have inconsistencies created by editing one part and not changing the summary.
Based on what I have seen in AI-generated code, building a primer programming text would not be that difficult; the skill of the author would be in deciding what examples to put in the book. The downside of this being a powerful use case is the collapse of sales for programming books: people just use free resources on the internet.
Will AI Revolutionise Publishing?
The only revolutionary possibility I see is AI replacing authors completely — that is, generating entire books that are good enough to make Weekly Top 20 bestseller charts. (If a half dozen people randomly purchase one of my books within an hour at one store, that book would temporarily be the Number 1 seller in my categories on most online bookstores. That just tells us about the sales volume of the category.) In which case, people could just bypass bookstores and have an AI generate new books for them to read.
Outside of that, I do not think much will change from the perspective of most consumers, or most of the industry. If you go into a medium-sized general bookstore, there is only a handful of new authors that make it onto bookshelves in a given year. Online bookstores have a wider variety of authors (like me), but said authors do not get the sales volumes that the big authors (who show up in bookstores) do. What AI will do is change the mix of books that are on the margins of the industry. In addition to having scam e-books that are only visible to hyper-active subscription borrowers, there will be books with moderate sales that were improved with AI tools. Given that most authors already had basic writing tools (like grammar checking) built into word processing software, it is debatable how much of a change this is. Furthermore, since these are books that are generating a tiny slice of total book sales, there will be no observable difference in the productivity of the industry as measured by national statistics.
Concluding Remarks
The task of writing has not really changed much since I was writing my thesis in the early 1990s. I sit in front of a computer, drink coffee, and type. Software tools have improved, but unless they eliminate the need to sit in front of a computer and drink coffee, I am still doing what I have always done. From an economic perspective, the publishing industry sells books. Electronic delivery changes things versus printing on paper, but that is now proven technology. The constraint facing publishing is not producing texts; the constraint is getting people to pay for the works. Software tools are not putting money into book-readers’ pockets, so the value of the industry is unchanged.