Google Books and its associated ngram indices represent one of the largest publicly available databases in the world. At last count, Google had scanned and indexed over 25 million books containing over 1,000,000,000,000 terms (ngrams), roughly comparable to all of the text on all of the pages of the internet. The database is impressive not only in its scale and availability, but also in the wealth of knowledge it holds about our culture over the last 200 years.
Here are some examples of the insights you can gain with Google Books. These graphs show the relative occurrence of the specified words and phrases in printed material, by year, and to a good approximation reflect what people were thinking (and writing) about during this time.
Modes of transportation:
Many more examples are here.
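To make "relative occurrence" concrete, here is a minimal Python sketch of how such a per-year curve could be computed: the number of times a phrase appears in a given year, divided by the total number of ngrams printed that year. The function name, the data layout, and the toy counts are assumptions for illustration only; they are not real Google Books data or Google's actual method.

```python
from collections import Counter

def relative_frequency(counts_by_year, totals_by_year, phrase):
    """Return {year: share of that year's ngrams that are `phrase`}."""
    return {
        year: counts_by_year.get(year, Counter())[phrase] / total
        for year, total in sorted(totals_by_year.items())
        if total > 0  # skip years with no printed material
    }

# Made-up toy counts, for illustration only (not real Google Books data).
counts_by_year = {
    1900: Counter({"railroad": 30, "automobile": 5}),
    1950: Counter({"railroad": 10, "automobile": 40}),
}
totals_by_year = {1900: 1000, 1950: 2000}

result = relative_frequency(counts_by_year, totals_by_year, "automobile")
print(result)  # {1900: 0.005, 1950: 0.02}
```

Dividing by the yearly total is what lets different years be compared fairly, since far more text was printed in later years than in earlier ones.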
Working with data sets this large required Google to pioneer new concepts in highly scalable parallel data processing, such as MapReduce, which is widely known through its popular open-source implementation, Hadoop. These techniques allowed Google to break the massive problem of indexing this vast database into manageable chunks that could be processed by many machines working in parallel. These systems and techniques are now used by many companies for big data problems, such as customer analytics and machine learning.
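The idea of splitting the work into chunks can be sketched in a few lines of Python. This is a toy, single-machine illustration of the map, shuffle, and reduce steps applied to counting bigrams (two-word ngrams); it is not Google's or Hadoop's actual code, and all function names here are hypothetical. A thread pool stands in for the many machines that would each process one chunk.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def map_chunk(chunk):
    """Map step: emit a (bigram, 1) pair for each adjacent word pair in one chunk."""
    words = chunk.lower().split()
    return [(" ".join(words[i:i + 2]), 1) for i in range(len(words) - 1)]

def shuffle(mapped):
    """Shuffle step: group every emitted value under its key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}

def count_bigrams(chunks, workers=4):
    # Each chunk is mapped independently, so the map calls can run in parallel.
    # Threads here are only a stand-in for separate machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_counts(shuffle(mapped))

# Two tiny "books", each treated as one chunk of work.
chunks = ["the horse and carriage", "the horse and the train"]
counts = count_bigrams(chunks)
print(counts["the horse"])  # 2
```

Because each map call looks only at its own chunk, adding more workers does not change the result; only the shuffle step needs to bring together the pairs that share a key.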