Entropy
Entropy is a measure of the amount of uncertainty or randomness in a random variable. In information theory, entropy quantifies the amount of information contained in a message or signal: the more uncertain or random the message or signal, the higher its entropy. Entropy is often measured in bits, and it plays a crucial role in the design of communication systems, natural language processing (particularly through the cross-entropy loss), and cryptographic protocols. Claude Shannon introduced the concept of entropy in information theory, and his work laid the foundation for modern digital communication and cryptography.
Formula
Given a discrete random variable $X$, which takes values in the alphabet $\mathcal{X}$ and is distributed according to $p \colon \mathcal{X} \to [0, 1]$, the entropy of $X$ is defined as:

$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x)$$
Let's break down the formula and explain each part:
- $H(X)$ represents the entropy of the random variable $X$, which takes values from a set of possible outcomes, denoted by $\mathcal{X}$.
- $\sum_{x \in \mathcal{X}}$ (sigma) indicates summation, which means we add up the results of the expression to the right of the symbol for each possible outcome $x$ in the set $\mathcal{X}$.
- $p(x)$ represents the probability of a specific outcome $x$ occurring.
- $\log p(x)$ is the logarithm of the probability of outcome $x$. The base of the logarithm depends on the application; using base 2 gives entropy in bits, as mentioned earlier.
- The negative sign in front of the summation ensures that the entropy value is non-negative, as probabilities are always between 0 and 1, and logarithms of values between 0 and 1 are negative.
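
To make this concrete, consider a fair coin: the two outcomes each have probability $\tfrac{1}{2}$, so with a base-2 logarithm,

$$H(X) = -\left(\tfrac{1}{2}\log_2\tfrac{1}{2} + \tfrac{1}{2}\log_2\tfrac{1}{2}\right) = 1 \text{ bit.}$$

A biased coin with $p(\text{heads}) = 0.9$ is more predictable, and its entropy drops to roughly $0.47$ bits.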
To summarize, the formula computes the entropy by summing the product of each outcome's probability and the logarithm of that probability, then negating the sum so the result is non-negative. This gives the average amount of information, or uncertainty, associated with the random variable $X$.
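
As a rough sketch of how this computation looks in practice, the following Python snippet (the function name `entropy` and the default of base-2 logarithms are illustrative choices, not taken from the text above) applies the formula to a list of probabilities:

```python
import math

def entropy(probs, base=2):
    """Shannon entropy: -sum of p(x) * log(p(x)) over all outcomes.

    Outcomes with zero probability are skipped, following the usual
    convention that 0 * log(0) is treated as 0.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin: two equally likely outcomes give exactly 1 bit of entropy.
print(entropy([0.5, 0.5]))      # 1.0

# A heavily biased coin is more predictable, so its entropy is lower.
print(entropy([0.9, 0.1]))      # ~0.47

# A fair six-sided die: log2(6) ~ 2.585 bits.
print(entropy([1 / 6] * 6))     # ~2.585
```

Changing the `base` argument to $e$ or 10 measures the same uncertainty in nats or hartleys instead of bits.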