Welcome to the bloggy home of Noah Brier. I'm the co-founder of Percolate and general internet tinkerer. This site is about media, culture, technology, and randomness. It's been around since 2004 (I'm pretty sure). Feel free to get in touch.

You can subscribe to this site via RSS (the humanity!).

Short Words, Short Data

Earlier this year I read a short article from MIT about research showing that word length has more to do with the information a word communicates than with how often it occurs. I ran across it again and thought it was worth sharing:

Why are some words short and others long? For decades, a prominent theory has held that words used frequently are short in order to make language efficient: It would not be economical if “the” were as long as “phenomenology,” in this view. But now a team of MIT cognitive scientists has developed an alternative notion, on the basis of new research: A word’s length reflects the amount of information it contains.

The article goes on to explain, “For English words, 9 percent of the variation in length is due to amount of information, and 1 percent stems from frequency.” Not entirely sure what to do with this yet, but seems worth knowing.

November 27, 2011

Comments

  • Michal Migurski says:

    This seems intuitive to me – a word like phenomenology is long because it’s made up of a bunch of smaller words that have been put together to make its meaning clear. Frequency and information content are related concepts, I think.
