- n. Data that reduces uncertainty.
I once defined information as “that which informs.” I was unsatisfied with this definition, because I didn’t have a good definition for “inform.” The dictionary definitions didn’t help (e.g., “to impart information” or “to impart facts”), so I turned to the web to seek other people’s ideas.
The definition I found most useful is Claude Shannon’s: Information is that which reduces uncertainty. I’ve refined Shannon’s definition by replacing the fuzzy word “that” with the sharper word “data,” which I define as descriptions of events or conditions.
For a while I was worried that this definition, focused so specifically on reducing uncertainty, was too limiting. Why uncertainty? And why reducing uncertainty? What about data that increases uncertainty? Suppose I discover a datum that invalidates a key “fact” that I thought I knew, and therefore leaves me uncertain about many other “facts.” It seems to me that that would be information, too.
I was tempted to substitute the more general word “alters,” and to find some variable more general than “uncertainty” as the thing that information alters. But I’ve found that the limitation, the intense focus on reducing uncertainty, turns out to be helpful. Of course, I may end up being informed by data I wasn’t seeking, and I may be informed in ways that increase my uncertainty. But when I’m seeking information, I’m almost always trying to reduce my uncertainty about something. This definition reminds me to ask myself which uncertainties are most important to me, and which things I’m most uncertain about. Then I can focus more productively on gathering the data that may reduce those uncertainties.
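One way to make this concrete is to put numbers on uncertainty. The sketch below is only an illustration, not part of the definition. It assumes Shannon entropy as the measure of uncertainty, and the bug-hunting scenario, the probabilities, and the names `entropy_bits` and `uncertainty_change` are all invented for the example.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def uncertainty_change(before, after):
    """Entropy after a datum minus entropy before it.

    Negative means the datum reduced uncertainty; positive means it increased it.
    """
    return entropy_bits(after) - entropy_bits(before)


# Hypothetical beliefs about which of four modules contains a bug.
no_idea = [0.25, 0.25, 0.25, 0.25]        # every module equally suspect
after_log_entry = [0.7, 0.1, 0.1, 0.1]    # a datum that points at module one

# Hypothetical case where I thought I knew, and a datum undermines that "fact".
confident = [0.97, 0.01, 0.01, 0.01]
after_surprise = [0.4, 0.3, 0.2, 0.1]

helpful = uncertainty_change(no_idea, after_log_entry)      # negative
unsettling = uncertainty_change(confident, after_surprise)  # positive
```

In this toy model the log entry lowers the entropy and the surprising datum raises it, which matches the two cases above: the data I go looking for is usually the first kind, even though the second kind informs me too.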
I’ve made good use of this definition in a number of contexts. I’ve found it especially helpful in talking about testing, because a central purpose of testing is to deliver information, specifically information that reduces stakeholders’ uncertainty about quality.
The definition also helps me when I’m estimating. It invites me to assess my uncertainty about the variables that affect the thing I’m estimating. That assessment helps me to focus my search for data.
Another definition I like is Peter Drucker’s: Information is data endowed with relevance or purpose. Drucker’s definition emphasizes purpose and the relevance of data to our purposes. I’ve seen numerous metrics programs flounder because they started by collecting data rather than by clarifying their purpose for measuring. They end up collecting lots of data, and then not knowing how to make sense of it.