On Prosecraft and why data analysis of fiction rarely says anything at all
Shaxpir? Lol. With that adverb/vividness/etc breakdown, should've gone with Wyrdzwyrth.
I agree with what you're saying, but what I'm also seeing online is that writers are mad that this service scraped their work in the first place without permission of the author or their publishing companies. Said authors are pissed. Said A.I. is hackneyed, at best.
On a related note, I have a Google alert set to inform me whenever my name appears on the internet or in the news. Ninety percent of the time the alert is for a pirated copy of one of my books that's available for download.
This was a really wonderful post, but what gets me worried (as a fiction author myself) is the progress in LLMs over time. While these limitations may hold (for now), it's unclear what generative models will be capable of in 2, 3, 5, or 10 years. Quite possibly they will usurp all but the most creative writers. Richard Ngo (highly credible), I believe, predicted that LLMs will be able to write high-quality short stories within 2-3 years or something like that.
Anyways, I've got my own related cope about this here lol: https://www.decentralizedfiction.com/p/butterflies-for-the-machine-god-fiction
Really a big fan of your work. Keep it up!
I am a bit more sanguine about the possibility of leveraging computational and quantitative methods for literary criticism. Iterations like this are useless, though. I think the real indicator of this possibility is the existence of large language models that support naturalistic text generation (e.g. ChatGPT). There is clearly some mapping here of how language works, and it should be able to illuminate something about why certain texts are compelling while others are not. A secondary indicator comes from experiments with language like lipograms (wrt Oulipo).
Update: in the minutes (!) since posting this, Shaxpir has announced they're taking down Prosecraft.
Explanation here: https://blog.shaxpir.com/taking-down-prosecraft-io-37e189797121
I use Grammarly to help me edit my stories, and I'm always arguing with the damn thing. It keeps wanting to change my sentences and the spellings of my characters' names, and doesn't really understand other languages. So, I basically use it for spelling and grammar checks.
This was a fascinating piece. Slightly off topic, but Shaxpir reminds me a bit of Jellybooks, which also claims to use 'science' to understand 'good' books. Rather than analysing the books themselves, it focuses on reading behaviours as a proxy for evaluating them. But there are a variety of assumptions built into it about what makes a book good. For example, it assumes completion rate is a sign of engagement and makes a lot of the fact that people don't finish business books. However, while that probably makes sense as a metric for fiction, it makes zero sense for a lot of non-fiction, which people tend to use as a reference text (unless it's a memoir or popular non-fiction that is written with the narrative arc of a fiction book).
Likewise, they treat 'velocity' (i.e., reading speed) as a measure of engagement. But based on that metric, romance novels are the height of literature (of course, some unquestionably are, but this isn’t exactly a universal feature of the genre), because they tend to be read quickly. The idea of wanting to savour a book to make it last longer, or being so emotionally connected to a book that you can only read it for short stretches because it makes you so tense, or so sad, or you're terrified about what's coming next is completely outside the logic of their metrics.
Why do people watch sports? I mean they can just see the results and stats right?
67% ball possession! Whoa, that was great…
The interesting counterpoint to this is the widespread adoption by authors of 'grammar checking' tools, which are utilising ever more sophisticated natural language processing techniques for statistical analysis and line editing recommendations.
I suggested using a spreadsheet for analysis of a book. I realise now that I should have included passive verbs and adverbs: https://open.substack.com/pub/terryfreedman/p/use-a-spreadsheet-for-literary-criticism?r=18suih&utm_campaign=post&utm_medium=web
We must be plugged into the creative hive mind :) -- just this weekend I gave a talk in SF about generative AI and the creative process of humans. One of the things I said was "Do we really think we can parse human intelligence and creative genius into data, algorithms, patterns, or neural networks?"
Everything you say here Lincoln rings profoundly true — a story, a narrative, is so much more than the letters and words and punctuation.
The key to Mr. Smith's journey lies in the very first thing he says about how he started. He wanted to write a memoir, his first book, and he "didn't know how many words I should write." Starting off with focusing on data is, IMHO, not the best way to approach writing. If you have a client or a submission with a word count, all good of course. But apart from that... focus on what drives you to write that book/novella/story, and worry about word count later. Focus on story, character, structure, texture... all those things that help you hone your craft. A master storyteller does not need to count words.
The second thing that strikes me in his letter is when he says "After I published that book, I was so moved by the experience that I started my own company to make tools for authors." Again, all good, and great he was so moved that he wanted to help authors. Yet he never (at least not that he shares) proactively reached out to authors of different types and asked them how they work, what they need, what would be most helpful. He just went ahead with his idea of what would be helpful to authors. So after just one book, without talking to other authors, would you say you know the craft of writing enough to build a suite of tools?
Mr. Smith says, at the end of his apology, "In the future, I would love to rebuild this library with the consent of authors and publishers." Why this occurred to him only now, when he's facing an author outcry, is bewildering, and perhaps more than a little telling. It sounds awfully like the AI companies who scraped and only now are starting to ask for a conversation.