What Algorithms Can't Tell You about Art

Lincoln Michel

Aug 7, 2023

On Prosecraft and why data analysis of fiction rarely says anything at all


25 Comments

Emma Smith-Stevens

Aug 7, 2023

Shaxpir? Lol. With that adverb/vividness/etc breakdown, should've gone with Wyrdzwyrth.


Terry Freedman

Aug 8, 2023

😂


Bill Adler

Aug 8, 2023

On a related note, I have a Google alert set to inform me whenever my name appears on the internet or in the news. Ninety percent of the time the alert is for a pirated copy of one of my books that's available for download.


ARX-Han

Aug 7, 2023

This was a really wonderful post, but what gets me worried (as a fiction author myself) is the progress in LLMs over time. While these limitations may hold (for now), it's unclear what generative models will be capable of in 2, 3, 5, 10 years, etc. Quite possibly they will usurp all but the most creative writers. Richard Ngo (highly credible), I believe, predicted that LLMs will be able to write high quality short stories within 2-3 years or something like that.

Anyways, I've got my own related cope about this here lol: https://www.decentralizedfiction.com/p/butterflies-for-the-machine-god-fiction

Really a big fan of your work. Keep it up!


Lincoln Michel

Aug 7, 2023

Thank you! It's hard to know. The kind of data analysis I'm mostly talking about here is different than LLMs certainly. The latter have interesting potential and lots of problems. I personally am skeptical they will get that good in the near term, and I definitely build in extra skepticism because the biggest AI hypers are the ones who were hyping crypto, NFTs, and the metaverse a year ago.

It's quite possible current LLM tech will plateau and not improve much for a while. But it's possible a new LLM tech will change things dramatically.


Joyce Reynolds-Ward

Aug 7, 2023

Honestly, the most promising LLM uses I'm seeing are tied into practical applications in things like agtech. Or providing better weather prediction models--which gets away from the A.I. training decline issue because in both cases there are new data inputs on a regular basis that are monitored/modified by human observation.


Birgitte Rasine

Aug 8, 2023

Yes absolutely! I'm plugged into the world of cleantech and there the applications of AI are so much more useful and meaningful. The art/text/music gen AI applications are a little too vanity-oriented tbh. The world has some serious problems we need to solve as a civilization. Building algorithms that can write novels is not one of them.


Joyce Reynolds-Ward

Aug 8, 2023

Agreed. It just grates that so much emphasis is put upon art/text/music when there are SO MANY OTHER important applications for AI that could seriously solve a lot of issues.


Stetson

Aug 7, 2023 (Edited)

I am a bit more sanguine about the possibility of leveraging computational and quantitative methods for literary criticism. Iterations like this are useless though. I think the real indicator of this possibility is the existence of large language models that support naturalistic text generation (e.g. ChatGPT). There is clearly some mapping here of how language works, and it should be able to illuminate something about why certain texts are compelling while others are not. A secondary indicator comes from experiments with language like lipograms (wrt Oulipo).


Lincoln Michel

Aug 8, 2023

Yes, I'm being a bit hyperbolic here. There are useful ways to analyze books for data certainly. I think the problem comes with treating it like an equation to solve. This % of adverbs needed. This exact emotional arc shape.

Since I complained about the positive/negative word study, I'll say I actually reference that study sometimes in my classes NOT because of the "six possible arcs" thing--which I think is silly as explained--but because the charts demonstrate the wavelike motion of stories. Basically stories have more ups and downs than the typical Freytag's Pyramid model shows.

But yeah this specific Prosecraft one seems very useless to me.


Lincoln Michel

Aug 7, 2023

Update: in the minutes (!) since posting this, Shaxpir has announced they're taking down Prosecraft.

Explanation here: https://blog.shaxpir.com/taking-down-prosecraft-io-37e189797121


Susan

Aug 8, 2023

Thanks for sharing this.


Joseph L. Wiess

Aug 8, 2023

I use Grammarly to help me edit my stories, and I'm always arguing with the damn thing. It keeps wanting to change my sentences and the spellings of my characters' names, and doesn't really understand other languages. So, I basically use it for spelling and grammar checks.


Birgitte Rasine

Aug 8, 2023

Ugh I can't stand Grammarly. Tried it and stopped using it ages ago. If I'm a professional I shouldn't need a software program to tell me how to write.


Joseph L. Wiess

Aug 9, 2023

I use it mainly for the spellcheck and punctuation check.


Birgitte Rasine

Aug 9, 2023

That works... Do you find Grammarly better at it than say Google docs/MS Word or about the same?


Kirsten Bell

Aug 8, 2023

This was a fascinating piece. Slightly off topic, but Shaxpir reminds me a bit of Jellybooks, which also claims to use 'science' to understand 'good' books. Rather than focusing on the books themselves, its focus is on reading behaviours as a proxy for evaluating the books. But there are a variety of assumptions built into it about what makes a book good. For example, it assumes completion rate is a sign of engagement and makes a lot of the fact that people don't finish business books. However, while that probably makes sense as a metric for fiction, it makes zero sense for a lot of non-fiction, which people tend to use as a reference text (unless it's a memoir or popular non-fiction that is written with the narrative arc of a fiction book).

Likewise, they treat 'velocity' (i.e., reading speed) as a measure of engagement. But based on that metric, romance novels are the height of literature (of course, some unquestionably are, but this isn’t exactly a universal feature of the genre), because they tend to be read quickly. The idea of wanting to savour a book to make it last longer, or being so emotionally connected to a book that you can only read it for short stretches because it makes you so tense, or so sad, or you're terrified about what's coming next is completely outside the logic of their metrics.


Aug 8, 2023

Why do people watch sports? I mean they can just see the results and stats right?

67% ball possession! Whoa, that was great…


Johnathan Reid

Aug 8, 2023

The interesting counterpoint to this is the widespread adoption by authors of 'grammar checking' tools, which are utilising ever more sophisticated natural language processing techniques for statistical analysis and line editing recommendations.


Birgitte Rasine

Aug 8, 2023

I'd rather that authors have tools like grammar checking and actively use them (cough cough mainstream understaffed news media) but it does say a lot about our elementary educational system. My middle schooler told me, to my shock, that they still had not learned what definite/indefinite articles were.


Johnathan Reid

Aug 9, 2023

Pity the Spanish or Italian pupil (and their teacher) - four indefinites and seven definites respectively. But at least they've got their genders down pat...


Terry Freedman

Aug 8, 2023

I suggested using a spreadsheet for analysis of a book. I realise now that I should have included passive verbs and adverbs: https://open.substack.com/pub/terryfreedman/p/use-a-spreadsheet-for-literary-criticism?r=18suih&utm_campaign=post&utm_medium=web


Birgitte Rasine

Aug 8, 2023

We must be plugged into the creative hive mind :) -- just this weekend I gave a talk in SF about generative AI and the creative process of humans. One of the things I said was "Do we really think we can parse human intelligence and creative genius into data, algorithms, patterns, or neural networks?"

Everything you say here Lincoln rings profoundly true — a story, a narrative, is so much more than the letters and words and punctuation.

The key to Mr. Smith's journey lies in the very first thing he says about how he started. He wanted to write a memoir, his first book, and he "didn't know how many words I should write." Starting off with focusing on data is, IMHO, not the best way to approach writing. If you have a client or a submission with a word count, all good of course. But apart from that... focus on what drives you to write that book/novella/story, and worry about word count later. Focus on story, character, structure, texture... all those things that help you hone your craft. A master storyteller does not need to count words.

The second thing that strikes me in his letter is when he says "After I published that book, I was so moved by the experience that I started my own company to make tools for authors." Again, all good, and great he was so moved that he wanted to help authors. Yet he never (at least not that he shares) proactively reached out to authors of different types and asked them how they work, what they need, what would be most helpful. He just went ahead with his idea of what would be helpful to authors. So after just one book, without talking to other authors, would you say you know the craft of writing enough to build a suite of tools?

Mr. Smith says, at the end of his apology, "In the future, I would love to rebuild this library with the consent of authors and publishers." Why this occurred to him only now, when he's facing an author outcry, is bewildering, and perhaps more than a little telling. It sounds awfully like the AI companies who scraped and only now are starting to ask for a conversation.


Johnathan Reid

Aug 9, 2023

In simpler terms, what's happened here is the classic IT ‘trope’ of software being developed without consulting the business it's being targeted at. This unfortunately still happens for and within companies both large and small, and has done for decades. It remains a prevalent issue within the startup/VC community, where supply vs demand is skewed towards naive optimism.

I used to spend hours every week ensuring this kind of insular, basement-room thinking didn't happen and/or mitigating its impact.

The issue is primarily caused by a singularly innate or tribal ‘we know best’ unwillingness to communicate.

Don't get me wrong, it happens the other way round just as much: IT folks being asked to do the impossible, to unachievable deadlines, out of - often wilful and lauded - ignorance.

See what Steve Jobs had to say on the precedence of customer vs. technology here:

https://www.linkedin.com/posts/pascalbornet_tech-leadership-activity-7094886306123534336-t5EM


Comment deleted

Aug 7, 2023


Lincoln Michel

Aug 7, 2023

Oh yes, I said they were right to be pissed.

I just thought enough people had covered that part of it online.


Counter Craft
