Information

March 15, 2006 at 6:00 pm — Glossary
information
  1. n. Data that reduces uncertainty.

I once defined information as “that which informs.” I was unsatisfied with this definition, because I didn’t have a good definition for “inform”. The dictionary definitions didn’t help (e.g. “to impart information” or “to impart facts”), so I turned to the web to seek other people’s ideas.

The definition I found most useful is Claude Shannon’s: Information is that which reduces uncertainty. I’ve refined Shannon’s definition by replacing the fuzzy word “that” with the sharper word “data,” which I define as descriptions of events or conditions.
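
One way to make “reduces uncertainty” concrete is to measure uncertainty as Shannon entropy, in bits, and compare it before and after a datum arrives. The sketch below is only an illustration under assumptions that are not part of this post: the fair six-sided die, the datum “the roll was even,” and the helper name `entropy_bits` are all hypothetical.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical scenario: a single roll of a fair six-sided die.
before = [1 / 6] * 6  # nothing known yet: six equally likely faces
after = [1 / 3] * 3   # after the datum "the roll was even": 2, 4, or 6

uncertainty_before = entropy_bits(before)
uncertainty_after = entropy_bits(after)
reduction = uncertainty_before - uncertainty_after

print(f"uncertainty before the datum: {uncertainty_before:.3f} bits")
print(f"uncertainty after the datum:  {uncertainty_after:.3f} bits")
print(f"reduction:                    {reduction:.3f} bits")
```

In this toy case the datum removes about one bit of uncertainty, because it halves the number of equally likely outcomes. A datum that left the distribution unchanged would show a reduction of zero.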

For a while I was worried that this definition, focused so specifically on reducing uncertainty, was too limiting. Why uncertainty? And why reducing uncertainty? What about data that increases uncertainty? Suppose I discover a datum that invalidates a key “fact” that I thought I knew, and therefore leaves me uncertain about many other “facts.” It seems to me that that would be information, too.

I was tempted to substitute the more general word “alters,” and to find some variable more general than “uncertainty” as the central variable that is altered by information. But I’ve found that the limitation—the intense focus on reducing uncertainty—turns out to be helpful when I’m seeking information. In most cases where I’m gathering information my goal is to reduce my uncertainty. Of course, I may end up being informed by data I wasn’t seeking, and I may be informed in ways that increase my uncertainty. But when I’m seeking information, I’m almost always trying to reduce my uncertainty about something. This definition reminds me to ask myself which uncertainties are most important to me, and which I’m most uncertain about. Then I can focus more productively on gathering the data that may reduce my uncertainty.

I’ve made good use of this definition in a number of contexts. I’ve found it especially helpful in talking about testing, because a central purpose of testing is to deliver information, specifically information that reduces stakeholders’ uncertainty about quality.

The definition also helps me when I’m estimating. It invites me to assess my uncertainty about the variables that affect the thing I’m estimating. That assessment helps me to focus my search for data.

Another definition I like is Peter Drucker’s: Information is data endowed with relevance or purpose. Drucker’s definition emphasizes purpose and the relevance of data to our purposes. I’ve seen numerous metrics programs flounder because they started by collecting data rather than by clarifying their purpose for measuring. They end up collecting lots of data, and then not knowing how to make sense of it.

7 Comments

Comment by Moof — March 16, 2006 at 2:21 am

Dale - this is a great entry, and it makes a tremendous amount of sense. Now I’m concerned about why I wasn’t concerned about it before I read your post …

Did that just make sense? 0.o

Could I have permission to add your definition to my tiny, but growing, “Blogtionary” along with a link to this post?

Thanks for your time and trouble, Dale!

Comment by Dale Emery — March 16, 2006 at 6:01 am

Yes, Moof, you have my permission.

Comment by Moof — March 20, 2006 at 6:14 pm

Thanks Dale!

Comment by Karl — March 21, 2006 at 9:40 pm

How about “data that reduces error”?

There are two points to consider. First, I can be certain yet wrong (truth vs. knowledge of truth). Second, how does your definition affect the concept of “misinformation”? Would it then be “data that increases uncertainty”? That seems problematic because the point of misinformation is to make someone believe (with certainty) something that is false.

Comment by Jason Gorman — March 25, 2006 at 12:10 pm

That’s a bit silly.

1. There’s already an accepted definition of “information”, which is data that has been communicated and understood:

http://wordnet.princeton.edu/perl/webwn?s=information

http://en.wikipedia.org/wiki/Information_(book)

http://dictionary.reference.com/search?q=information

There’s one definition that relates to uncertainty, from information theory, in which “information” is actually a measure of the uncertainty (or improbability) of some thing. (An indirect measure of entropy). The more information something contains, the more improbable it is (e.g., DNA is more improbable than a water molecule).

2. Your definition doesn’t work. If I give you two facts, which you can read and understand (“information” according to accepted definitions), then it doesn’t necessarily reduce uncertainty. For example:

FACT #1: The dice came up 6
FACT #2: Then the dice came up 4

What number will come up next? Have I reduced uncertainty by giving you more information? Or are you saying this doesn’t count as information?

Comment by George Dinwiddie — April 17, 2006 at 4:59 pm

Jason Gorman, I think you miss the point. “Information” implies significance, which “data” does not.

Your example of two facts about a dice game has no significance to me, and therefore does not inform me. To a player in the game, the information content might be different.

Also, the reduction of uncertainty is not about predicting the future, but understanding the present. After receiving your data about the dice game, I have no less uncertainty. Which dice? Were there throws between the “6” and the “4”? Were you lying and the dice actually came up “snake-eyes” both times? Were you making a hypothetical statement and the dice didn’t exist, at all?

Comment by Jason Gorman — April 18, 2006 at 11:27 am

“Information” implies significance, which “data” does not?

That’s a new one on me as well, I’m afraid. I couldn’t find any definitions of information that implied that it had to be significant.

If I tell you that the dice came up 6 and then 4, and you understand me (or even misunderstand me), then that’s information, and I quite agree that it doesn’t reduce your uncertainty :-)
