A recent conversation with Fred Zimmerman, a longtime friend and publishing entrepreneur, woke me up to the fact that a part of the publishing industry that has long resisted technology may finally be ripe for transformation. The key question is: Does algorithmic content creation that uses machine learning and automation have a role to play in content creation?
The first impulse of most people like me, who have spent much of their careers writing for love and money, is to loudly answer NO WAY. I firmly believe that it is impossible to replace the creativity of the human mind and the skill of writing learned over years with an algorithm.
But Zimmerman, who is CEO of Nimble Books, is pioneering a new technique he calls combinatorial publishing that can create a useful book in seconds for pennies. He persuasively argues that algorithmic content creation has an important role to play, even if human virtuosity will always be the beating heart of content creation.
After talking to Zimmerman, I realized that my knee-jerk rejection of machine learning and automation frames the question too narrowly. He persuaded me that the following observations are true:
- Algorithmic content creation will play a role in creating types of content that we are not using now.
- Algorithmic content creation will accelerate and enhance the traditional process of content creation.
- Algorithmic content creation will support new types of hybrid content that are collaboratively created by humans and machines.
- Algorithmic content creation changes the economics of publishing.
- The automation pioneered by algorithmic content creation will improve traditional publishing.
It is important to note that publishing has been transformed in many ways already. Machines and automation play a huge role in helping find much of the content we read today through search and recommendation engines. Web publishing and e-books have changed the form that content takes.
Algorithmic Content Creation: A New Frontier
But a look at current practice for content creation suggests that there is much to be learned. Almost all the content we find through machine-assisted means was created the same way that it has been for hundreds of years. A person wrote the content and it was then published using a system for mass distribution.
Zimmerman’s Nimble Books uses a different model. His system starts with a seed of some sort that indicates the topic that the book should cover. The seed could be an existing article, a set of articles, or a set of keywords. Using a variety of search techniques, the combinatorial publishing system then searches a corpus of content and selects articles that match. These articles are then organized into a consistent form with a table of contents, formatted for publishing, and then wrapped in an automatically generated cover. Right now, a word cloud is used as the cover for most books, but Zimmerman is working on other ways to generate covers as well. Once created, the books are automatically published through Amazon and other e-book distributors.
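To make the seed-to-book flow concrete, here is a minimal sketch in Python. It is not Nimble's actual system, whose internals are not described here. It assumes a keyword seed, a tiny in-memory corpus, and a simple keyword-count match standing in for the "variety of search techniques." All function names, the sample corpus, and the plain-text "cover" line (a list of the most frequent terms, standing in for a word cloud) are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "by", "are", "with"}

# Hypothetical stand-in for a searchable corpus: title -> article text.
SAMPLE_CORPUS = {
    "Solar Power": "Solar panels convert sunlight into electricity. Solar energy is renewable.",
    "Wind Turbines": "Wind turbines generate electricity from moving air.",
    "Sourdough Bread": "Sourdough bread uses a fermented starter.",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def select_articles(seed_keywords, corpus, limit=10):
    """Rank articles by how often the seed keywords occur; keep only matches."""
    seeds = set(tokenize(" ".join(seed_keywords)))
    scored = []
    for title, body in corpus.items():
        counts = Counter(tokenize(title + " " + body))
        score = sum(counts[s] for s in seeds)
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


def cover_terms(titles, corpus, top_n=5):
    """Return the most frequent terms in the selected articles (word-cloud input)."""
    counts = Counter()
    for title in titles:
        counts.update(tokenize(corpus[title]))
    return [word for word, _ in counts.most_common(top_n)]


def assemble_book(seed_keywords, corpus, limit=10):
    """Build a plain-text book: cover line, table of contents, then chapters."""
    titles = select_articles(seed_keywords, corpus, limit)
    lines = ["COVER: " + " ".join(cover_terms(titles, corpus)), "", "CONTENTS"]
    lines += [f"{i}. {title}" for i, title in enumerate(titles, 1)]
    for i, title in enumerate(titles, 1):
        lines += ["", f"Chapter {i}: {title}", corpus[title]]
    return "\n".join(lines)
```

Calling `assemble_book(["solar", "electricity"], SAMPLE_CORPUS)` would produce a two-chapter book from the energy articles and leave out the bread article. The final step in the description above, automatic publishing to e-book distributors, is deliberately omitted because it depends on distributor-specific interfaces not covered here.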
To make this process work, searchable corpuses of available content must exist. Right now, Wikipedia and other troves of content in the public domain are the foundation for Zimmerman’s prototyping efforts. Zimmerman uses open source software as the foundation for his technology and runs it all in the cloud.
When Zimmerman first explained the idea to me, I thought, why would anyone pay for a book created through an algorithm? But consider the success of iTunes. When iTunes was launched, it was possible to obtain many of the songs for free from illegal download sites. The same people who used those sites were willing to pay 99 cents a song for the convenience of having them delivered to their iPod in a legal manner. Zimmerman doesn’t have to pretend his books are masterpieces written by John Updike. As long as they collect relevant information in an important form, it is likely they will find an audience. It is worth recalling that there are many books written by humans that don’t really offer much value.
As with other forms of machine learning, it is important to recognize that the entire product does not have to be created by the machine. Algorithms can provide acceleration for steps in content creation that are better performed by machines, just as traders look at charts that display distillations of huge amounts of information.
Algorithmic Economics
The economics make this model work. Zimmerman can create thousands of books a day for almost nothing. The key skills will be in selecting the right seeds and then marketing the books once published. Zimmerman’s model turns publishing into the process of selecting the right seeds, a process that is quite similar to the keyword research done for Search Engine Optimization.
It is easy to see internal corporate applications for this technique, such as assembling dossiers from proprietary and public collections of content. There are a variety of competitive intelligence services that do just this, but they stop short of creating a book. Hybrid applications will also make sense, in which the starting point for a book is created using an algorithm and then human-created chapters are added. As companies like Google and Bing add to their semantic search capabilities and Internet-era data providers like Factual offer huge collections of information through APIs, it is easy to see how more structured information could be brought into algorithmically created books.
Zimmerman’s success rests on his ability to define new and useful forms of assembling and adding value to content. In my work as a technology analyst, I certainly could use the type of books Nimble creates when getting started in learning about a new area. I’m sure many other areas could benefit from his approach. Domain-specific search engines are natural partners for Nimble.
Zimmerman knows he will fail if he promises too much, but so far he doesn’t have to. His books are honestly presented as algorithmic products and are starting to find an audience.
Zimmerman also is well aware that competitors will quickly emerge. The content farms such as Demand Media would be doing this already if Google’s page ranking algorithm didn’t punish them for offering duplicate content. But once he has proven the model, it is likely Nimble will become partners with or be acquired by a Google or an Amazon who could take his model and scale it.
The Birth of a Web 3.0 Application
Zimmerman sees this project in a broader context. “Arguably, Web 1.0 was about search—giving users a list of documents—and Web 2.0 was about social—giving users people to talk with. I believe we need to push ourselves to go beyond giving users answer sets and like buttons; those things have proven to be incredibly useful, but in some sense, they are just laying the groundwork for the task that the user is actually trying to accomplish. Web 3.0 needs to be about creating work product.”
Seen in that light, Nimble Combinatorial Publishing is a Web 3.0 vertical application for the publishing industry, since it addresses algorithmically the entire value chain involved in creating that industry’s major work product: books. If Zimmerman can nurture his startup and expand on his lead, Nimble can provide unique and novel value propositions not just for publishers, but also for authors, booksellers, and readers, who can all use Nimble’s form-based interface to create their own algorithmically created books. If such books become popular, Zimmerman will be a prime example of Web 3.0 come to life and will likely transform the whole “industry of reading.”
Dan Woods is CTO and editor of CITO Research, a publication that helps CIOs and CTOs advance the craft of technology leadership. For more stories like this one, visit www.CITOResearch.com.