Aug 7, 2023 · edited Aug 7, 2023 · Liked by Lincoln Michel
I am a bit more sanguine about the possibility of leveraging computational and quantitative methods for literary criticism. Iterations like this are useless though. I think the real indicator of this possibility is the existence of large language models that support naturalistic text generation (e.g. ChatGPT). There is clearly some mapping here of how language works, and it should be able to illuminate something about why certain texts are compelling while others are not. A secondary indicator comes from experiments with language like lipograms (wrt Oulipo).
Yes, I'm being a bit hyperbolic here. There are useful ways to analyze books for data, certainly. I think the problem comes with treating it like an equation to solve. This % of adverbs needed. This exact emotional arc shape.
Since I complained about the positive/negative word study, I'll say I actually reference that study sometimes in my classes NOT because of the "six possible arcs" thing--which I think is silly as explained--but because the charts demonstrate the wave-like motion of stories. Basically, stories have more ups and downs than the typical Freytag's Pyramid model shows.
But yeah this specific Prosecraft one seems very useless to me.
Shaxpir? Lol. With that adverb/vividness/etc breakdown, should've gone with Wyrdzwyrth.
😂
On a related note, I have a Google alert set to inform me whenever my name appears on the internet or in the news. Ninety percent of the time the alert is for a pirated copy of one of my books that's available for download.
This was a really wonderful post, but what gets me worried (as a fiction author myself) is the progress in LLMs over time. While these limitations may hold (for now), it's unclear what generative models will be capable of in 2, 3, 5, 10 years, etc. Quite possibly they will usurp all but the most creative writers. Richard Ngo (highly credible), I believe, predicted that LLMs will be able to write high-quality short stories within 2-3 years or something like that.
Anyways, I've got my own related cope about this here lol: https://www.decentralizedfiction.com/p/butterflies-for-the-machine-god-fiction
Really a big fan of your work. Keep it up!
Thank you! It's hard to know. The kind of data analysis I'm mostly talking about here is different than LLMs certainly. The latter have interesting potential and lots of problems. I personally am skeptical they will get that good in the near term, and I definitely build in extra skepticism because the biggest AI hypers are the ones who were hyping crypto, NFTs, and the metaverse a year ago.
It's quite possible current LLM tech will plateau and not improve much for a while. But it's possible a new LLM tech will change things dramatically.
Honestly, the most promising LLM uses I'm seeing are tied into practical applications in things like agtech. Or providing better weather prediction models--which gets away from the A.I. training decline issue because in both cases there are new data inputs on a regular basis that are monitored/modified by human observation.
Yes absolutely! I'm plugged into the world of cleantech and there the applications of AI are so much more useful and meaningful. The art/text/music gen AI applications are a little too vanity-oriented tbh. The world has some serious problems we need to solve as a civilization. Building algorithms that can write novels is not one of them.
Agreed. It just grates that so much emphasis is put upon art/text/music when there are SO MANY OTHER important applications for AI that could seriously solve a lot of issues.
Update: in the minutes (!) since posting this, Shaxpir has announced they're taking down Prosecraft.
Explanation here: https://blog.shaxpir.com/taking-down-prosecraft-io-37e189797121
Thanks for sharing this.
I use Grammarly to help me edit my stories, and I'm always arguing with the damn thing. It keeps wanting to change my sentences and the spellings of my characters' names, and doesn't really understand other languages. So, I basically use it for spelling and grammar checks.
Ugh I can't stand Grammarly. Tried it and stopped using it ages ago. If I'm a professional I shouldn't need a software program to tell me how to write.
I use it mainly for the spellcheck and punctuation check.
That works... Do you find Grammarly better at it than say Google docs/MS Word or about the same?
This was a fascinating piece. Slightly off topic, but Shaxpir reminds me a bit of Jellybooks, which also claims to use 'science' to understand 'good' books. Rather than focusing on the books themselves, its focus is on reading behaviours as a proxy for evaluating the books. But there are a variety of assumptions built into it about what makes a book good. For example, it assumes completion rate is a sign of engagement and makes a lot of the fact that people don't finish business books. However, while that probably makes sense as a metric for fiction, it makes zero sense for a lot of non-fiction, which people tend to use as a reference text (unless it's a memoir or popular non-fiction that is written with the narrative arc of a fiction book).
Likewise, they treat 'velocity' (i.e., reading speed) as a measure of engagement. But based on that metric, romance novels are the height of literature (of course, some unquestionably are, but this isn’t exactly a universal feature of the genre), because they tend to be read quickly. The idea of wanting to savour a book to make it last longer, or being so emotionally connected to a book that you can only read it for short stretches because it makes you so tense, or so sad, or you're terrified about what's coming next is completely outside the logic of their metrics.
Why do people watch sports? I mean they can just see the results and stats right?
67% ball possession! Whoa, that was great…
The interesting counterpoint to this is the widespread adoption by authors of 'grammar checking' tools, which are utilising ever more sophisticated natural language processing techniques for statistical analysis and line editing recommendations.
I'd rather that authors have tools like grammar checking and actively use them (cough cough mainstream understaffed news media) but it does say a lot about our elementary educational system. My middle schooler told me, to my shock, that they still had not learned what definite/indefinite articles were.
Pity the Spanish or Italian pupil (and their teacher) - four indefinites and seven definites respectively. But at least they've got their genders down pat...
I suggested using a spreadsheet for analysis of a book. I realise now that I should have included passive verbs and adverbs: https://open.substack.com/pub/terryfreedman/p/use-a-spreadsheet-for-literary-criticism?r=18suih&utm_campaign=post&utm_medium=web
We must be plugged into the creative hive mind :) -- just this weekend I gave a talk in SF about generative AI and the creative process of humans. One of the things I said was "Do we really think we can parse human intelligence and creative genius into data, algorithms, patterns, or neural networks?"
Everything you say here Lincoln rings profoundly true — a story, a narrative, is so much more than the letters and words and punctuation.
The key to Mr. Smith's journey lies in the very first thing he says about how he started. He wanted to write a memoir, his first book, and he "didn't know how many words I should write." Starting off with focusing on data is, IMHO, not the best way to approach writing. If you have a client or a submission with a word count, all good of course. But apart from that... focus on what drives you to write that book/novella/story, and worry about word count later. Focus on story, character, structure, texture... all those things that help you hone your craft. A master storyteller does not need to count words.
The second thing that strikes me in his letter is when he says "After I published that book, I was so moved by the experience that I started my own company to make tools for authors." Again, all good, and great he was so moved that he wanted to help authors. Yet he never (at least not that he shares) proactively reached out to authors of different types and asked them how they work, what they need, what would be most helpful. He just went ahead with his idea of what would be helpful to authors. So after just one book, without talking to other authors, would you say you know the craft of writing enough to build a suite of tools?
Mr. Smith says, at the end of his apology, "In the future, I would love to rebuild this library with the consent of authors and publishers." Why this occurred to him only now, when he's facing an author outcry, is bewildering, and perhaps more than a little telling. It sounds awfully like the AI companies who scraped and only now are starting to ask for a conversation.
In simpler terms, what's happened here is the classic IT ‘trope’ of software being developed without consulting the business it's being targeted at. This unfortunately still happens for and within companies both large and small, and has done for decades. It remains a prevalent issue within the startup/VC community, where supply vs demand is skewed towards naive optimism.
I used to spend hours every week ensuring this kind of insular, basement-room thinking didn't happen and/or mitigating its impact.
The issue is primarily caused by a singularly innate or tribal ‘we know best’ unwillingness to communicate.
Don't get me wrong, it happens the other way round just as much: IT folks being asked to do the impossible, to unachievable deadlines, out of - often wilful and lauded - ignorance.
See what Steve Jobs had to say on the precedence of customer vs. technology here:
https://www.linkedin.com/posts/pascalbornet_tech-leadership-activity-7094886306123534336-t5EM
Oh yes, I said they were right to be pissed.
I just thought enough people had covered that part of it online.