This posting outlines a concept I call the Great Ideas Coefficient — an additional type of metadata used to denote the qualities of a text.
[Figure: Great Ideas Coefficient]
In the 1950s, Mortimer Adler and colleagues brought together what they thought were the most significant written works of Western civilization. They called this collection the Great Books of the Western World. Before they created the collection, they outlined what they thought were the 102 most significant ideas of Western civilization. These are “great ideas” such as, but not limited to, beauty, courage, education, law, liberty, nature, sin, truth, and wisdom. Interesting.
Suppose you were able to weigh the value of a book based on these “great ideas”. Suppose you had a number of texts and you wanted to rank them according to how prominently they mention the “great ideas”. Such a thing can be done through the application of TFIDF (term frequency-inverse document frequency), a measure that weighs how often a term appears in one text against how common it is across the whole corpus. Here’s how:
1. create a list of the “great ideas”
2. calculate the TFIDF score for each idea in a given book
3. sum the scores of all the ideas
4. assign the total to the book as its Great Ideas Coefficient
5. repeat Steps #2 through #4 for each book in the corpus
6. sort the corpus based on the total scores
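The steps above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the code used to produce the figure: it uses one common TFIDF variant (relative term frequency multiplied by the natural log of corpus size over document frequency), it handles single-word ideas only, and the function names (`tokenize`, `great_ideas_coefficient`) and the three toy texts are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def great_ideas_coefficient(corpus, ideas):
    """Return {title: sum of TFIDF scores over the given ideas}.

    corpus: dict mapping a title to its full text
    ideas:  iterable of single-word terms (hypothetical helper input)
    """
    ideas = [idea.lower() for idea in ideas]
    counts = {title: Counter(tokenize(text)) for title, text in corpus.items()}
    n_docs = len(counts)

    # Document frequency: the number of texts that mention each idea.
    df = {idea: sum(1 for c in counts.values() if c[idea] > 0) for idea in ideas}

    scores = {}
    for title, c in counts.items():
        n_words = sum(c.values())
        total = 0.0
        for idea in ideas:
            if df[idea] == 0 or n_words == 0:
                continue  # idea absent from the corpus, or empty text
            tf = c[idea] / n_words               # relative term frequency
            idf = math.log(n_docs / df[idea])    # assumed IDF variant
            total += tf * idf
        scores[title] = total
    return scores


# Toy corpus, invented purely for illustration.
corpus = {
    "Text A": "truth and beauty and truth",
    "Text B": "animals eat plants",
    "Text C": "beauty of law",
}
ideas = ["truth", "beauty", "law"]

scores = great_ideas_coefficient(corpus, ideas)

# Sort the corpus from the highest coefficient to the lowest.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for title, score in ranked:
    print(f"{score:.4f}  {title}")
```

One consequence of this particular variant is worth noting: an idea that appears in every text of the corpus gets an inverse document frequency of zero and so contributes nothing to any score. A smoothed variant would behave differently, and the list of terms is just a parameter, so any other list of words can be passed in its place.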
Once the scores are calculated, they can be graphed, and once they are graphed they can be illustrated.
An example of this technique is shown above. For each item in a list of works by Aristotle, a Great Ideas Coefficient was calculated and assigned. The list was then ordered by score, and each score was plotted graphically. Finally, all the graphs were joined together as an animated GIF image to show the range of scores in the list. Luckily, the process seems to work because Aristotle’s Metaphysics ranks at the top with the highest Great Ideas Coefficient, and his History of Animals ranks the lowest. That seems to make sense.
The concept behind the Great Ideas Coefficient is not limited to “great ideas”. Any set of words or phrases could be used. For example, one could create a list of “big names” (Plato, Shakespeare, Galileo, etc.) and calculate a Big Names Coefficient. Alternatively, a person could create a list of words or phrases for any topic or genre (biology, mathematics, literature, etc.) and weigh a set of texts against it.
Finding is not the problem that needs to be solved nowadays. The problem of use and understanding is more pressing. People can find plenty of information. They need (want) assistance in putting that information into context. “Books are for use.” The application of something like the Great Ideas Coefficient may be just one example of such assistance.