Google’s AI has read thousands of novels to improve itself


It is a known fact that reading improves your mind, and the good thing about artificial intelligence is that it can keep reading for as long as nobody stops it: for hours, days, weeks, months, even years. This Guardian link says that Google has “swallowed” 11,000 novels to improve its AI’s conversation skills. Google’s AI is reading to improve its intelligence as well as its vocabulary.

Of course, some of the authors are not comfortable with the ‘blatantly commercial use of expressive authorship’: they believe that their work is being used to create a commercial product. Google, after all, will be using its AI to build a highly interactive interface that may respond to user queries just like a human.

These 11,000-odd novels have been consumed at Google Brain to improve the conversational style of Google’s AI. Google is feeding all these books into its neural network (yesterday you read about how Google Translate uses neural networks for machine translation) so that it can generate fluent, natural-sounding sentences.

According to people working at Google, the novels used for building the AI’s vocabulary and speaking style were taken from the “Books Corpus”, a collection of 11,038 books gathered from the web. According to Google, these books were written by as-yet-unpublished authors and were available to download for free.

One of the authors objecting to Google’s AI reading thousands of these novels to improve itself is Rebecca Forster, who has by now commercially published 29 novels. She says that even if one of her novels is available to download for free, the least Google could have done was to ask her before using it for a commercial enterprise. It is as if, someday, a filmmaker used her book to make a film and then refused to pay her, saying that the book was freely available on the web anyway. When writers like her made their works publicly available, the only intention was to get people to read their work, not to have it used for commercial purposes or even for scientific purposes. Well, this is a different issue.

But why use novels to improve speaking style? Why feed the Google AI with more than 11,000 novels?

The engineers working at Google believe that a novel mostly sticks to a particular theme, so different characters end up talking about the same thing. That helps the AI learn to say the same thing in different ways. Also, most of the characters in these novels speak in a conversational style, so the Google AI should also be able to use a conversational style while interacting with users through various apps.

Has Google done the right thing by feeding 11,000 novels into its AI without the writers’ permission? Obtaining permission from this many writers would have been a project in itself, but if it is necessary, then it should have been done. Even when Google started digitizing books for its digital library, people objected, but then the court ruled in favour of Google. If writers sue, Google may be able to prove that it hasn’t used the content of these novels commercially and that it is only being used for research purposes.

About Amrit Hallan
Amrit Hallan is the founder of He writes about technology not because "he loves to write about technology" but because he actually believes that it makes the world a better place. On Twitter you can follow him at @amrithallan.

