Sep 6, 2022 · edited Sep 6, 2022 · Liked by Lincoln Michel
Hey y'all, it's Kristen McLean, lead industry analyst from NPD BookScan. I thought I would chime in with some numbers here, since that statistic from the DOJ is super-misleading, and I'm not sure where it originally came from, since we did not provide it directly.
It is possible it came from our data, and was provided by one of the publisher parties, but based on the 58,000 figure, it's not obvious what exactly it includes in terms of "publisher frontlist". 58,000 titles is way too small a number for "all frontlist books published in a year by every publisher"--that's more like 487,000 frontlist titles--so it's clear it's a slice but I'm not sure HOW it was sliced.
NPD BookScan (BookScan is owned by The NPD Group, not Nielsen, BTW) collects data on print book sales from 16,000 retail locations, including Amazon print book sales. Included in those numbers are any print book sales from self-publishing platforms where the author has opted for extended distribution and a print book was sold by Amazon or another retailer. So that 487K "new book" figure is all frontlist books in our data showing at least 1 unit sale over the last 52 weeks coming from publishers of all sizes, including individuals.
Lots of press outlets have been calling about it today, so I did a little digging to see if I could reverse-engineer the citation, and am happy to share our numbers here for clarity.
Because this is clearly a slice, and most likely provided by one of the parties to the suit, I decided to limit my data to the frontlist sales for the top 10 publishers by unit volume in the U.S. Trade market. My ISBN list is a little smaller than the one quoted in the DOJ, but the principles will be the same.
The data below includes frontlist titles from Penguin Random House, Simon & Schuster, Hachette Book Group, HarperCollins, Scholastic, Disney, Macmillan, Abrams, Sourcebooks, and John Wiley. The figures below only include books published by these publishers themselves, not publishers they distribute.
Here is what I found. Collectively, 45,571 unique ISBNs appear for these publishers in our frontlist sales data for the last 52 weeks (thru week ending 8-24-2022).
In this dataset:
>>>0.4% or 163 books sold 100,000 copies or more
>>>0.7% or 320 books sold between 50,000-99,999 copies
>>>2.2% or 1,015 books sold between 20,000-49,999 copies
>>>3.4% or 1,572 books sold between 10,000-19,999 copies
>>>5.5% or 2,518 books sold between 5,000-9,999 copies
>>>21.6% or 9,863 books sold between 1,000-4,999 copies
>>>51.4% or 23,419 sold between 12-999 copies
>>>14.7% or 6,701 books sold under 12 copies
So, only about 15% of all of those publisher-produced frontlist books sold fewer than 12 copies. That's not nothing, but nowhere near as janky as what has been reported.
BUT, I think the real story is that roughly 66% of those books from the top 10 publishers sold less than 1,000 copies over 52 weeks. (Those last two points combined)
And less than 2% sold more than 50,000 copies. (The top two points)
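For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the shares and the two roll-ups from the bucket counts posted above. The variable and function names are illustrative only, not anything from BookScan's tooling.

```python
# Bucket counts exactly as posted above (top-10 publishers, last 52 weeks).
buckets = [
    ("100,000+", 163),
    ("50,000-99,999", 320),
    ("20,000-49,999", 1015),
    ("10,000-19,999", 1572),
    ("5,000-9,999", 2518),
    ("1,000-4,999", 9863),
    ("12-999", 23419),
    ("under 12", 6701),
]

# Total unique ISBNs in the slice.
total = sum(count for _, count in buckets)


def share(count, total=total):
    """Percentage of all ISBNs in the slice, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(count) for label, count in buckets}

# Roll-ups quoted in the comment: the bottom two and the top two buckets.
under_1000 = share(23419 + 6701)
at_least_50000 = share(163 + 320)

for label, pct in shares.items():
    print(f"{label:>15}: {pct}%")
print(f"sold under 1,000 copies: {under_1000}%")
print(f"sold 50,000 copies or more: {at_least_50000}%")
```

The counts sum to the 45,571 ISBNs quoted above, and the two roll-ups come out consistent with the "roughly 66%" and "less than 2%" figures.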
Now data is a funny thing. It can be sliced and diced to create different types of views. For instance we could run the same analysis on ALL of those 487K new books published in the last 52 weeks, which includes many small press and independently published titles, and we would find that about 98% of them sold fewer than 5,000 copies in the "trade bookstore market" that NPD BookScan covers. (I know this IS a true statistic because that data was produced by us for The New York Times.)
But that data does not include direct sales from publishers. It does not include sales by authors at events, or through their websites. It does not include eBook sales which we track in a separate tool, and it doesn't include any of the amazing reading going on through platforms like Substack, Wattpad, Webtoons, Kindle Direct, or library lending platforms like OverDrive or Hoopla.
BUT, it does represent the general reality of the ECONOMICS of the publishing market. In general, most of the revenue that keeps publishers in business comes from the very narrow band of publishing successes in the top 8-10% of new books, along with the 70% of overall sales that come from BACKLIST books in the current market. (Backlist books have gained about 4% in share from frontlist books since the pandemic began, but that is a whole other story.)
The long and short of it is publishing is very much a gambler's game, and I think that has been clear from the testimony in the DOJ case. It is true that most people in publishing up to and including the CEOs cannot tell you for sure what books are going to make their year. The big advantage that publisher consolidation has brought to the top of the market is deeper pockets and more resources to roll those dice. More money to get a hot project. More money to influence outcomes through marketing, more access to sales and distribution mechanisms, and easier access to the gatekeepers who decide what books make it onto retailers' shelves. And better ability to distribute risk across a bigger list of gambles.
It is largely a numbers game and I'm not just saying that because I'm a numbers gal. It's a tough business.
Hope this is helpful.
If anyone has questions, they are welcome to reach out to me directly at kristen.mclean@npd.com.
My pleasure. If you ever want to do an AMA here, I'm always happy to do one. Part of my mission is to de-mystify the data as much as possible. It's a big complex industry, and no-one has a complete view--it would be impossible. But I'm happy to pull back the curtain on anything that I do in any way that's helpful.
Oh I'd love that! I did have one quick Q on the data you posted. Does "unique titles" mean separate ISBNs or are hardcover/paperback/etc. editions of the same book combined here?
Separate ISBNs -- which could include newly released PB editions of previous HCs. It's pretty rare that both HC & PB editions appear in the same 12-month "frontlist" period. Most reissues keep the previous ISBNs, so it's mostly brand new books, or new editions of previous HC releases. Also, for us "frontlist sales" are all sales that occur within 12 months of a book's release. So on the 366th day, sales flip into the "backlist" bucket. So the sales referenced in the data above are "frontlist sales" from those 45K+ books in this dataset that appeared within 12 months of the book's release, and that appear in the last 52-week period. Make sense?
Yeah it does! Does this mean some of these titles would only have a few weeks or months of sales? Like frontlist titles from, say, three weeks ago would be included?
(probably wouldn't dramatically change the numbers though.)
Yes, that's right. It includes the frontlist sales of the books published last week. So one week of sales. But it also includes the week 52 sales of a book that will drop into the other bucket next week, so it kind of all washes through.
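The frontlist/backlist rule described in this exchange can be sketched in a few lines. This is only an illustration of the rule as stated in the thread, under two assumptions that are mine and not confirmed by the source: "12 months" is treated as 365 days, and the release day counts as day 1. The function name `sales_bucket` is hypothetical.

```python
from datetime import date, timedelta

# Assumption: "12 months" is treated as 365 days, with release day = day 1,
# so that sales on the 366th day land in the backlist bucket.
FRONTLIST_DAYS = 365


def sales_bucket(release: date, sale: date) -> str:
    """Classify a single sale as 'frontlist' or 'backlist' by its age."""
    days_since_release = (sale - release).days
    if days_since_release < 0:
        raise ValueError("sale date is before the release date")
    return "frontlist" if days_since_release < FRONTLIST_DAYS else "backlist"


release = date(2022, 1, 1)  # hypothetical release date
print(sales_bucket(release, release + timedelta(days=6)))    # first week
print(sales_bucket(release, release + timedelta(days=364)))  # 365th day
print(sales_bucket(release, release + timedelta(days=365)))  # 366th day
```

Under this reading, a 52-week report mixes week-1 sales of brand new titles with week-52 sales of titles about to age out, which is the "washes through" effect described above.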
Correct, we track audio and eBooks separately. And digital formats do not necessarily follow the print patterns. For instance, for someone with a prominent audio platform like a podcast, or a well-known musician or artist writing a memoir, their audio-share can climb well above 50% of sales when we line up all of the formats. Likewise, authors in specific genre-driven fiction will see their eBook sales share much higher than print or audio. Or, their audio may be much higher if their fans prefer the format--that happens a lot in sci-fi. It all depends. But overall, the print market dwarfs the digital markets at the top line, and all of it underlines the fact that topline data is only good for looking at big trends and directionality. Every book and author can perform differently than the common wisdom, and the exceptions are sometimes the most interesting thing to watch for futurists like me.
That is so useful and illuminating. Thank you for bringing clarity to this discussion - you're clearly a writing gal as well as a numbers gal. PS Sorry if my tweet led to those calls.
Amazingly generous analysis! I’m pumped to now know where my bestselling business book fits into the NPD data verse! What’s even more sobering is that your total ISBN count is NOT a count of authors.
Because of the original weird stat about how many books had sold less than 12. I think it’s a pretty useless benchmark. I generally use 5k as a threshold for commercial viability at the big publishers.
Great info and fascinating background to the stats- My question is how much of this concentration of sales/success on to a few blockbusters and the low sales among the 66 percent is driven by the marketing practices of the big publishers, who may pay a big advance for a book they want to "own" and keep out of the market but will then fail to provide the marketing backup for the same book? Basically, are they often betting big but then not following through on their investment? And the person left out in the cold in that scenario is the writer, and their art. This could be death to the artistic process, to the energy around our publishing culture. It's tempting for us to see sales as reader-driven market magic around a title but it's also I suspect driven not by readers but by the actual decisions and profit-motivations of the publishing industry. And then I'm curious about how much of this discourse about author "careerism" and a perceived lethargy and lack of innovation in our literary culture is down to the business-focused practices of the big publishers? This is a whole other story I realize, and one that is present in the discourse. Thanks for this fascinating discussion!
Hi Kristen, thanks for this, really good data. It's similar to people claiming "as many as 50% of papers are never read by anyone other than their authors, referees and journal editors. They also claim that 90 percent of papers published are never cited." It's simply people reading the data wrong, or reading the data to suit a narrative to drive views and clicks.
I am a fellow of a UN Sustainable Development Goals group and one of the things we have been trying to figure out is which are the best selling textbooks in sustainability, energy, climate change, etc. Can you possibly help with this?
Thanks, Dilip. I don't have an easy answer on the best textbooks for those topics as it's a bit out of my wheelhouse. I do know there is an upcoming webinar co-sponsored by Publishers Weekly and Westchester Publishing Services on sustainability and accessibility in publishing, and some of the panelists (besides me) may have an answer for your question because they are better versed in the world of university and academic presses. More information: https://us06web.zoom.us/webinar/register/WN__zOqIa7kRaan7q-wJBnKKg Good luck!
Dilip, I'm not a publishing professional in any way whatever, but I cannot miss the chance to underscore the recent work of the one author your group should read before any others this year: Stuart L. Hart, "Beyond Shareholder Primacy" (Stanford Business Books). Seriously, this is absolutely at the heart of how to meet those goals. (I admit he has long been a friend - doesn't change the fact!)
My reading of this statistic—a version of which surfaces from time to time, as you note in the opening—is that it's counting (1) new titles requiring an ISBN, which can include re-issues or new formats or POD editions, and (2) what those sales were during the year of release or first year of sales. It's also including many, many different types of publishers which may or may not sell in the bookstore market. That latter part, I think, is the biggest factor in why a book may look like a poor seller when it isn't.
On the other hand, traditionally published authors tend to highly value bookstore sales, as we saw recently with the B&N policy change. So in that regard, I think it's a helpful reminder that print bookstore sales may not drive a book's success and a lot depends on the publisher and category of book.
Hey peeps, I just recorded an episode inspired by this thread for the BBC podcast More or Less, because of all of the great engagement in the community here. Here it is for your listening pleasure: https://www.bbc.co.uk/programmes/p0d8nb1w
"And it’s true that publishers often have no idea what will sell. It’s a throw-against-the-wall-and-see-what-sticks industry. Are there a lot of problems with it? A lot of things that could be fixed? Ways that publishers could better market the backlist or frontlist? Yes! But it’s not quite as dire as some of these statistics suggest."
This is my issue though. There -are- ways to track this and as a tech-nerd, it befuddles me that I'm not picking up on omni-channel dashboards being used in publishing. There are just no APIs for some of that—even foot-traffic data—when there absolutely could be.
If it's not quite as dire, that data needs to exist and transparently. Right?
I don't disagree! Although I should note that many publishers internally provide good data for their authors. Hachette, my novel's publisher, gives me a weekly breakdown of sales by format as well as gross and net sales. (Another tricky thing with books is that publishers sell to bookstores who can then return the copies if they don't sell to customers, so it takes a while to know what truly sold.)
This data isn't available to the public, but then most authors probably wouldn't want it to be. Sales feel kinda private.
That said, I think the "throw against the wall" tactic is, while perhaps heightened in publishing, a common thing in arts/entertainment in general. Movie companies make lots of movies that don't do well, then a few blockbusters that sustain Hollywood and so on. That's part of why Hollywood has been moving more and more toward corporate IP films, like superhero movies, where you can expect a good audience even for a movie with bad reviews.
I get that. I'm not traditionally published, so I don't actually have a bird's-eye. What I do have is what industry tools are available to me. I do understand not wanting all sales data available. There is precedent for more refined benchmarks here, however. Hard to track progress without comparisons and conflicting info. Does this make sense?
Do you, if you can comment, think "throw against the wall tactic" is something that can float going forward? Entertainment is moving towards big data, imho. I feel like that may be a big 2022 and beyond focus for publishing, more than it has been. Then again, I don't know.
Thanks for responding. Sorry if my questions are annoying, it's just very rare for me to get a chance to speak with someone actually inside the industry. I can only estimate and research so much, you know?
I'm sure publishing could use better data, but I do think art (broadly speaking) is always a bit of a crapshoot. You never really know what people will love. I'm sure Hollywood and video game companies use a lot of data, and still movies and games flop while others are surprise hits.
One of the bestselling novels of the last two years has been Song of Achilles, a book published over a decade ago that got popular on BookTok. Can any amount of data tell you what book will go viral in a decade? There's something kind of magical about the randomness, despite its problems.
As an outsider, I see a desperate need by both editors and writers to believe that they have control over their destiny (in the form of book sales).
And yes, obviously writers and editors have *some* sort of influence on the success of their books, but it's sadly obvious that that connection is so tenuous that much advice is about as useful as knowing which deity to propitiate when you send your book out into the wild.
(And yes, I exaggerate. Good advice gives you another percent (maybe 2!) to the chance of success. And there are lots of things that can _destroy_ one's chances that advice can help one avoid.)
However, the reality is that a true understanding of how little control one has over something as important to one's hopes and dreams as one's book would be detrimental to most authors' mental health, so I don't expect the situation to change any time soon. The desire for something/anything that promises more agency is just too deep to go unfulfilled.
A very good article. One other thing to note, especially for traditionally published authors (I'm the business manager of fantasy author Michael J. Sullivan, BTW, which is where I get my data from), is that for PHYSICAL books it can be almost impossible to tell how many books you've sold. Why? Well, there are several numbers to consider:
- How many books were printed?
- How many were provided as "comps" for reviewers or industry folks?
- How many were disposed of through pulping?
- How many were disposed of through remaindering?
- How many are in the warehouse?
- How many left the warehouse and are now in the warehouses of Amazon or Barnes & Noble etc?
- How many came back as returns?
It's this last point that I want to talk more about because it can be a big number and hard to determine (especially early on) due to something called "a return reserve." When I received the first royalty report from Michael's books I was disappointed. They hadn't sold as well as I thought they had - but I only had the number of books Michael was paid for - not the number of books that walked out of the stores after being bought.
I later found out - almost by accident that there is another report (The Unit Ledger) which shows the difference between
- books shipped out of the warehouse
- books paid for on the royalty report
- and books that are being held in a "return reserves" pool.
This last group may eventually make it onto a royalty report - or they may be gone forever (hence one of the big differences between books printed and books sold).
If you look at your contract, there is some language like "Royalties on each Work shall be based upon actual sales less actual returns and less a reasonable reserve for returnable copies." But what is reasonable? If memory serves, 65% of all sales on Michael's first royalty report were "held in reserve," so my disappointment came from only seeing 35% of the books that "were bought by someone." I'd question whether 65% is a "reasonable" amount, but I have heard that many books will see 50% returns. Now, with several years' track record under my belt, Michael's returns were only 6% - 7% (depending on title). So yeah, not so reasonable.
Now I should note that those copies retained for returns eventually "work themselves out" and are usually essentially gone after the first year, and it's only then that you have a good idea of "actual sales."
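To make the reserve arithmetic concrete, here is a rough sketch using the percentages mentioned above (a 65% reserve on the first report, and actual returns of about 7%). The 10,000-unit figure and the function names are hypothetical, and the sketch assumes the reserve is simply withheld as a flat share of units shipped, which is a simplification of how a real royalty statement works.

```python
def royalty_report_units(units_shipped: int, reserve_rate: float) -> int:
    """Units that show up as paid on a royalty report after the reserve is withheld."""
    return round(units_shipped * (1 - reserve_rate))


def eventual_units(units_shipped: int, actual_return_rate: float) -> int:
    """Units that count as sold once actual returns are known."""
    return round(units_shipped * (1 - actual_return_rate))


shipped = 10_000  # hypothetical number of copies shipped out of the warehouse

reported = royalty_report_units(shipped, 0.65)  # first report, 65% held in reserve
eventual = eventual_units(shipped, 0.07)        # after actual returns of about 7%
released_later = eventual - reported            # copies paid out on later reports

print(f"on first royalty report: {reported}")
print(f"after returns settle:    {eventual}")
print(f"released from reserve:   {released_later}")
```

With these illustrative inputs, the first report shows 3,500 copies even though 9,300 eventually count as sold, which is the gap between the early disappointment and the later picture.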
As for BookScan data, each author's is going to be a bit different. For Michael, his BookScan numbers relative to actual sales seem to run around 65%.
Lincoln, Kristen and Jane - some comments from the music side of the fence that will resonate.
The challenge this blog highlights is counting units, logging street dates and defining the time period. All are similar to the challenges we face in music.
First, when we talk about how many songs are on the digital shelf, the answer has to be 'de-duplicated' - an ISRC for a single release of a song might be different from the ISRC on the album. They need to be rolled up into one. Second, the debate over frontline and catalogue definitions has been going on in music since 2017 (see my work here: https://tinyurl.com/3unbh5ba). We've seen click-bait headlines like 'is old music killing new music' whereas the truth is that music between 18 and 36 months old (that's old but not *that* old) is seeing a surge in demand. Finally, claims like this need to be specific of the time period, is it all time or in the most recent calendar year. If you were to ask how many songs hadn't received a click last year, then lots of them would have received a click in the years before.
The most recent example of this type of story in music is the excellent work of MBW which showed that 80% (78.4%) of artists on Spotify today – around 6.3 million of them – have a monthly audience on the platform smaller than 50 people. This is correct, but is it fair? Should hobbyists be lumped in with serious artists? Should it be a monthly stat, or all time? Should dead artists (and those who are no longer active) be grouped with those who are alive and kicking?
Anyway, just to sign off that I feel your pain and trust me - the longer the long tail, the more you'll see these headlines.
Most of a publisher’s profits come from the backlist titles that typically account for around 60% of gross sales in any given year. Way back in 1981 I looked at the previous five years of sales for the top 20 titles sold at a large publisher where I worked, and 16 were the same backlist titles for each of the five years, then four were frontlist that rarely stayed on the list past their launch year. In many cases, the returns in years two and three turned those former stars into money losers.
Backlist is definitely a key component of long-term publisher revenue, and it's rarely discussed in any depth. A "core backlist title," one that sells perennially at a predictable rate, is the foundation of most publishers' businesses. It's not really a joke to say that Goodnight Moon built HarperCollins. Those core backlist titles are worth 100x or more compared with any "here now gone later" bestseller. That's the real diamond in the mud.
> Most of a publisher’s profits come from the backlist titles
Is this still true, though? Publishers putting anything but big sellers out of print has been a bête noire of authors for decades. (Nothing like having volume 3 of your trilogy published, but volume 1 is already out of print.)
I suspect business concerns (such as tax law) have driven publishers to the "go big right now or go home" business model over the last 40 years, although authors were complaining even in the late 80's.
I love how you re-de-mystified a popular topic. There’s a lot of negative press about traditional publishing, and for good reason, but I have a hard time believing things are as dire for writers as they are presented. I love a good contextual, hopeful explanation.
I mean, with outlets like substack and the ever-growing digital space, creatives really have a ton of opportunity. It’s hard to see in all the noise, though. Maybe it’s just easier to believe that things are getting worse instead of both getting better and getting worse?
I'd say the bottom line is that relatively few books even from major houses sell a substantial number of copies or make anyone a substantial amount of money.
The book industry has always been a place where a few big sellers subsidize everyone else. What's changed is the big houses are less likely to take on a book just because it's good, knowing a cookbook would probably balance it out. When the publishers strike out now it's because of a miscalculation on a book they had high hopes for.
Is there a reliable source for finding out average or typical numbers of sales for ebooks? I'm an author of 5 instruction books for writers, published by Perigee/Penguin Random House. Four of the five have been steady backlist sellers for nearly 30 years. I learned through my agent that I would never again get a contract to write another book because of the slow sales in the first 12 months. Yet, I'm a bestseller by my count because over 60,000 copies have sold, combining them all.
I've been indie publishing two booklets on writing craft and I see on my Amazon dashboard that half of the sales are ebooks. That has made me more than curious about where TOTAL sales, print and ebook, might be found, as separate totals or as combined totals, and then those figures used to break down the 1-99 lifetime sales, 100 to whatever, and so forth.
This is a stellar article that clearly you put a ton of time into writing. Thanks for sharing. If I had a gripe with the whole thing, it's that the publishers are the ones trying to portray the woes of the industry to further merge and reduce competition. Worse still, many authors accept the state of publishing and have taken to boasting that a lack of "commercial appeal" indicates quality.
Extremely, thank you so so much for commenting!
And thank you so much again!
You asked the question I was wondering about.
Assuming this does not include audiobooks?
Thank you so much for sharing! This is immensely helpful.
That is so useful and illuminating. Thank you for bringing clarity to this discussion - you're clearly a writing gal as well as a numbers gal. PS Sorry if my tweet led to those calls.
No worries. It's important, and I'm happy to be able to help folks understand, and I DEFINITELY want the press to ask to get at the accurate facts.
Amazingly generous analysis! I’m pumped to now know where my bestselling business book fits into the NPD data verse! What’s even more sobering is that your total ISBN count is NOT a count of authors.
Random question - why 12 and not 10 or 100 as the bottom rung?
Because of the original weird stat about how many books had sold less than 12. I think it’s a pretty useless benchmark. I generally use 5k as a threshold for commercial viability at the big publishers.
Thank you for this! If you had a Substack I'd be your number one subscriber.
Great info and fascinating background to the stats- My question is how much of this concentration of sales/success on to a few blockbusters and the low sales among the 66 percent is driven by the marketing practices of the big publishers, who may pay a big advance for a book they want to "own" and keep out of the market but will then fail to provide the marketing backup for the same book? Basically, are they often betting big but then not following through on their investment? And the person left out in the cold in that scenario is the writer, and their art. This could be death to the artistic process, to the energy around our publishing culture. It's tempting for us to see sales as reader-driven market magic around a title but it's also I suspect driven not by readers but by the actual decisions and profit-motivations of the publishing industry. And then I'm curious about how much of this discourse about author "careerism" and a perceived lethargy and lack of innovation in our literary culture is down to the business-focused practices of the big publishers? This is a whole other story I realize, and one that is present in the discourse. Thanks for this fascinating discussion!
Hi Kristen, thanks for this, really good data. It's similar to people claiming that "as many as 50% of papers are never read by anyone other than their authors, referees and journal editors." They also claim that 90 percent of papers published are never cited. It's simply people reading the data wrong, or reading the data to suit a narrative that drives views and clicks.
I am a fellow of a UN Sustainable Development Goals group and one of the things we have been trying to figure out is which are the best selling textbooks in sustainability, energy, climate change, etc. Can you possibly help with this?
Thank you!
Thanks, Dilip. I don't have an easy answer on the best textbooks for those topics as it's a bit out of my wheelhouse. I do know there is an upcoming webinar co-sponsored by Publishers Weekly and Westchester Publishing Services on sustainability and accessibility in publishing, and some of the panelists (besides me) may have an answer for your question because they are better versed in the world of university and academic presses. More information: https://us06web.zoom.us/webinar/register/WN__zOqIa7kRaan7q-wJBnKKg Good luck!
Thank you Kristen!
Dilip, I'm not a publishing professional in any way whatever, but I cannot miss the chance to underscore the recent work of the one author your group should read before any others this year: Stuart L. Hart, "Beyond Shareholder Primacy" (Stanford Business Books). Seriously, this is absolutely at the heart of how to meet those goals. (I admit he has long been a friend - doesn't change the fact!)
My reading of this statistic—a version of which surfaces from time to time, as you note in the opening—is that it's counting (1) new titles requiring an ISBN, which can include re-issues or new formats or POD editions, and (2) what those sales were during the year of release or first year of sales. It's also including many, many different types of publishers which may or may not sell in the bookstore market. That latter part, I think, is the biggest factor in why a book may look like a poor seller when it isn't.
On the other hand, traditionally published authors tend to highly value bookstore sales, as we saw recently with the B&N policy change. So in that regard, I think it's a helpful reminder that print bookstore sales may not drive a book's success and a lot depends on the publisher and category of book.
Hey peeps, I just recorded an episode inspired by this thread for the BBC podcast More or Less, because of all of the great engagement in the community here. Here it is for your listening pleasure: https://www.bbc.co.uk/programmes/p0d8nb1w
Thanks, Kristen. Excited to listen! And if you're still ever up for an AMA about BookScan and book sales I'd love to do one.
I'd be happy to! December is good for me because I'm in a slower cycle.
Thank you! I just sent an email (which I'm only noting here because I emailed you once before so want to make sure it isn't going to spam.)
I had no idea this would go viral - and your explanations make a lot of sense. Thanks!
A good read, both enlightening, and a bit disheartening.
"And it’s true that publishers often have no idea what will sell. It’s a throw-against-the-wall-and-see-what-sticks industry. Are there a lot of problems with it? A lot of things that could be fixed? Ways that publishers could better market the backlist or frontlist? Yes! But it’s not quite as dire as some of these statistics suggest."
This is my issue though. There -are- ways to track this and as a tech-nerd, it befuddles me that I'm not picking up on omni-channel dashboards being used in publishing. There are just no APIs for some of that—even foot-traffic data—when there absolutely could be.
If it's not quite as dire, that data needs to exist and transparently. Right?
I don't disagree! Although I should note that many publishers internally provide good data for their authors. Hachette, my novel's publisher, gives me a weekly breakdown of sales by format as well as gross and net sales. (Another tricky thing with books is that publishers sell to bookstores who can then return the copies if they don't sell to customers, so it takes a while to know what truly sold.)
This data isn't available to the public, but then most authors probably wouldn't want it to be. Sales feel kinda private.
That said, I think the "throw against the wall" tactic, while perhaps heightened in publishing, is a common thing in arts/entertainment in general. Movie companies make lots of movies that don't do well, then a few blockbusters that sustain Hollywood and so on. That's part of why Hollywood has been moving more and more toward corporate IP films, like superhero movies, where you can expect a good audience even for a movie with bad reviews.
I get that. I'm not traditionally published, so I don't actually have a bird's-eye. What I do have is what industry tools are available to me. I do understand not wanting all sales data available. There is precedent for more refined benchmarks here, however. Hard to track progress without comparisons and conflicting info. Does this make sense?
Do you, if you can comment, think "throw against the wall tactic" is something that can float going forward? Entertainment is moving towards big data, imho. I feel like that may be a big 2022 and beyond focus for publishing, more than it has been. Then again, I don't know.
Thanks for responding. Sorry if my questions are annoying, it's just very rare for me to get a chance to speak with someone actually inside the industry. I can only estimate and research so much, you know?
I'm sure publishing could use better data, but I do think art (broadly speaking) is always a bit of a crapshoot. You never really know what people will love. I'm sure Hollywood and video game companies use a lot of data, and still movies and games flop while others are surprise hits.
One of the bestselling novels of the last two years has been Song of Achilles, a book published over a decade ago that got popular on BookTok. Can any amount of data tell you what book will go viral in a decade? There's something kind of magical about the randomness, despite its problems.
As an outsider, I see a desperate need by both editors and writers to believe that they have control over their destiny (in the form of book sales).
And yes, obviously writers and editors have *some* sort of influence on the success of their books, but it's sadly obvious that that connection is so tenuous that much advice is about as useful as knowing which deity to propitiate when you send your book out into the wild.
(And yes, I exaggerate. Good advice gives you another percent (maybe 2!) to the chance of success. And there are lots of things that can _destroy_ one's chances that advice can help one avoid.)
However, the reality is that a true understanding of how little control one has over something as important to one's hopes and dreams as one's book would be detrimental to most authors' mental health, so I don't expect the situation to change any time soon. The desire for something/anything that promises more agency is just too deep to go unfulfilled.
Great reminder that facts, or assertions, need context!
A very good article. One other thing to note, especially for traditionally published authors (I'm the business manager of fantasy author Michael J. Sullivan, BTW, which is where I get my data from), is that for PHYSICAL books it can be almost impossible to tell how many books you've sold. Why? Well, there are several numbers to consider:
- How many books were printed?
- How many were provided as "comps" for reviewers or industry folks?
- How many were disposed of through pulping?
- How many were disposed of through remaindering?
- How many are in the warehouse?
- How many left the warehouse and are now in the warehouses of Amazon or Barnes & Noble etc?
- How many came back as returns?
It's this last point that I want to talk more about because it can be a big number and hard to determine (especially early on) due to something called "a return reserve." When I received the first royalty report from Michael's books I was disappointed. They hadn't sold as well as I thought they had - but I only had the number of books Michael was paid for - not the number of books that walked out of the stores after being bought.
I later found out - almost by accident - that there is another report (The Unit Ledger) which shows the difference between
- books shipped out of the warehouse
- books paid for on the royalty report
- and books that are being held in a "return reserves" pool.
This last group may eventually make it onto a royalty report - or they may be gone forever (hence one of the big differences between books printed and books sold).
If you look at your contract there is some language like "Royalties on each Work shall be based upon actual sales less actual returns and less a reasonable reserve for returnable copies." But what is reasonable? If memory serves, 65% of all sales on Michael's first royalty report were "held in reserve," so my disappointment came from only seeing 35% of the books that "were bought by someone." I'd question if 65% is a "reasonable" amount - but I have heard that many books will see 50% returns. Now, with several years' track record under my belt, Michael's returns were only 6% - 7% (depending on title). So yeah, not so reasonable.
Now I should note that those copies retained for returns eventually "work themselves out" and are usually essentially gone after the first year - and it's only then that you have a good idea of "actual sales."
As for Bookscan data - each author's is going to be a bit different. For Michael, the ratio of his Bookscan numbers to actual sales seems to run around 65%.
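To make the gap between a 65% reserve and 6% - 7% actual returns concrete, here is a rough sketch of the arithmetic. The 10,000-copy figure is purely hypothetical, the function names are invented for illustration, and applying the reserve as a flat percentage of units shipped is an assumption; real contracts and statements may calculate it differently.

```python
def units_on_first_statement(shipped: int, reserve_pct: int) -> int:
    """Units credited on a statement after a flat reserve is held back."""
    held_in_reserve = shipped * reserve_pct // 100
    return shipped - held_in_reserve


def units_after_actual_returns(shipped: int, return_pct: int) -> int:
    """Units that stay sold once actual returns are known."""
    returned = shipped * return_pct // 100
    return shipped - returned


shipped = 10_000  # hypothetical number of copies shipped
early_view = units_on_first_statement(shipped, reserve_pct=65)
settled_view = units_after_actual_returns(shipped, return_pct=7)

print(early_view)    # 3500 copies visible on the first report
print(settled_view)  # 9300 copies once returns settle at 7%
```

In this hypothetical, the first statement shows well under half of what the book eventually turns out to have sold, which is why early royalty reports can look so discouraging.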
Lincoln, Kristen and Jane - some comments from the music side of the fence that will resonate.
The challenge this blog highlights is counting units, logging street dates and defining the time period. All are similar to the challenges we face in music.
First, when we talk about how many songs are on the digital shelf, the answer has to be 'de-duplicated' - an ISRC for a single release of a song might be different from the ISRC on the album. They need to be rolled up into one. Second, the debate over frontline and catalogue definitions has been going on in music since 2017 (see my work here: https://tinyurl.com/3unbh5ba). We've seen click-bait headlines like 'is old music killing new music' whereas the truth is that music between 18 and 36 months old (that's old but not *that* old) is seeing a surge in demand. Finally, claims like this need to be specific about the time period: is it all time, or the most recent calendar year? If you were to ask how many songs hadn't received a click last year, then lots of them would have received a click in the years before.
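As a rough illustration of the "roll up" step, here is a naive sketch that merges play counts for recordings that carry different ISRCs but are the same song. Matching on a normalized artist and title is an assumption made for this example only; it is not how any particular chart or platform does it, and the ISRC-style codes and names below are placeholders.

```python
from collections import defaultdict


def roll_up(plays):
    """Merge play counts across ISRCs that share an artist and title.

    plays: iterable of (isrc, artist, title, count) tuples.
    Returns a dict keyed by normalized (artist, title).
    """
    totals = defaultdict(int)
    for _isrc, artist, title, count in plays:
        # Naive matching key (assumption): case-insensitive artist + title.
        key = (artist.strip().casefold(), title.strip().casefold())
        totals[key] += count
    return dict(totals)


# Placeholder rows: the single and album versions carry different codes.
rows = [
    ("XX-AAA-22-00001", "Example Artist", "Example Song", 120),
    ("XX-AAA-22-00002", "example artist", "Example Song ", 80),
    ("XX-BBB-22-00003", "Another Artist", "Other Song", 5),
]

songs = roll_up(rows)
print(len(songs))  # 2 distinct songs from 3 ISRCs
```

Counting the three codes separately would overstate the number of songs and understate the plays for each one, which is the same problem as counting ISBNs instead of books or authors.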
The most recent example of this type of story in music is the excellent work of MBW which showed that 80% (78.4%) of artists on Spotify today – around 6.3 million of them – have a monthly audience on the platform smaller than 50 people. This is correct, but is it fair? Should hobbyists be lumped in with serious artists? Should it be a monthly stat, or all time? Should dead artists (and those who are no longer active) be grouped with those who are alive and kicking?
Anyway, just to sign off that I feel your pain and trust me - the longer the long tail, the more you'll see these headlines.
Most of a publisher’s profits come from the backlist titles that typically account for around 60% of gross sales in any given year. Way back in 1981 I looked at the previous five years of sales for the top 20 titles sold at a large publisher where I worked, and 16 were the same backlist titles for each of the five years, then four were frontlist that rarely stayed on the list past their launch year. In many cases, the returns in years two and three turned those former stars into money losers.
Backlist is definitely a key component of long-term publisher revenue, and it's rarely discussed in any depth. A "core backlist title", one that sells perennially at a predictable rate, is the foundation for most publishers' business. It's not really a joke to say that Goodnight Moon built HarperCollins. Those core backlist titles are worth 100x or more than any "here now gone later" bestseller. That's the real diamond in the mud.
Believe me, I know. I have no backlist at Publerati and worked for a start-up decades ago also without a backlist. I’m a masochist for sure.
> Most of a publisher’s profits come from the backlist titles
Is this still true, though? Publishers putting anything but big sellers out of print has been a bête noire of authors for decades. (Nothing like having volume 3 of your trilogy published, but volume 1 is already out of print.)
I suspect business concerns (such as tax law) have driven publishers to the "go big right now or go home" business model over the last 40 years, although authors were complaining even in the late 80's.
I love how you re-de-mystified a popular topic. There’s a lot of negative press about traditional publishing, and for good reason, but I have a hard time believing things are as dire for writers as they are presented. I love a good contextual, hopeful explanation.
I mean, with outlets like substack and the ever-growing digital space, creatives really have a ton of opportunity. It’s hard to see in all the noise, though. Maybe it’s just easier to believe that things are getting worse instead of both getting better and getting worse?
I'd say the bottom line is that relatively few books even from major houses sell a substantial number of copies or make anyone a substantial amount of money.
The book industry has always been a place where a few big sellers subsidize everyone else. What's changed is the big houses are less likely to take on a book just because it's good, knowing a cookbook would probably balance it out. When the publishers strike out now it's because of a miscalculation on a book they had high hopes for.
Is there a reliable source for finding out average or typical numbers of sales for ebooks? I'm an author of 5 instruction books for writers, published by Perigee/Penguin Random House. Four of the five have been steady backlist sellers for nearly 30 years. I learned through my agent that I would never again get a contract to write another book because of the slow sales in the first 12 months. Yet, I'm a bestseller by my count because over 60,000 copies have sold, combining them all.
I've been indie publishing two booklets on writing craft and I see on my Amazon dashboard that half of the sales are to ebooks. That has made me more than curious about where TOTAL sales, print and ebook, might be found, as separate totals or as combined totals, and then those figures used to break down the 1-99 lifetime sales, 100 to whatever, and so forth.
"There are 3 kinds of lies: lies, damned lies, and statistics." --- Mark Twain.
They don't sell only a dozen copies? So, they sell far more? If your title is any hint at your writing skill, no wonder your books don't sell.
You read the title correctly, or else bungled the zing. Cheers.
This is a stellar article that clearly you put a ton of time into writing. Thanks for sharing. If I had a gripe with the whole thing, it's that the publishers are the ones trying to portray the woes of the industry to further merge and reduce competition. Worse still, many authors accept the state of publishing and have taken to boasting that a lack of "commercial appeal" indicates quality.