Computes the Shannon entropy of a vector and the mutual information of two variables. The entropy quantifies the expected amount of information contained in a vector; the mutual information measures the mutual dependence of two random variables.

Entropy(x, y = NULL, base = 2, ...)

MutInf(x, y, base = 2, ...)

Arguments

x

a vector or a matrix of numerical or categorical type. If only x is supplied, it will be interpreted as a contingency table.

y

a vector of the same type and dimension as x. If y is not NULL, the entropy of table(x, y, ...) will be calculated.

base

base of the logarithm to be used, defaults to 2.

...

further arguments are passed to the function table(), allowing, for example, useNA to be set.
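
A minimal sketch of this forwarding (the vectors x.na and y.na are made up for illustration): passing useNA = "ifany" lets missing values form their own level in the underlying table(x, y).

x.na <- c("a", "b", NA, "a", "b", NA)
y.na <- c("b", "a", "a", NA, "c", "b")
Entropy(x.na, y.na)                    # NA cells dropped by table()
Entropy(x.na, y.na, useNA = "ifany")   # NAs counted as an own level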

Details

The Shannon entropy provides a way to estimate the average minimum number of bits needed to encode a string of symbols, based on the frequency of the symbols.
It is given by the formula \(H = -\sum_i p_i \log(p_i)\), where \(p_i\) is the probability of character number \(i\) appearing in the stream of characters.
The entropy ranges from 0 to Inf.
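
As a small illustration of the formula (a sketch, not part of the function's interface), the entropy of a uniform distribution over 8 symbols can be computed by hand and agrees with Entropy():

p <- rep(1/8, 8)
-sum(p * log2(p))   # 3 bits, the same value as Entropy(as.matrix(p))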

Value

a numeric value.

References

Shannon, Claude E. (1948) A Mathematical Theory of Communication, Bell System Technical Journal, 27 (3): 379-423.

Ihara, Shunsuke (1993) Information theory for continuous systems, World Scientific. p. 2. ISBN 978-981-02-0985-8.

Author

Andri Signorell <andri@signorell.net>

See also

package entropy, which implements various estimators of entropy

Examples


Entropy(as.matrix(rep(1/8, 8)))
#> [1] 3

# http://r.789695.n4.nabble.com/entropy-package-how-to-compute-mutual-information-td4385339.html
x <- as.factor(c("a","b","a","c","b","c")) 
y <- as.factor(c("b","a","a","c","c","b")) 

Entropy(table(x), base=exp(1))
#> [1] 1.098612
Entropy(table(y), base=exp(1))
#> [1] 1.098612
Entropy(x, y, base=exp(1))
#> [1] 1.791759

# Mutual information can be computed from the entropies: H(x) + H(y) - H(x, y)
Entropy(table(x), base=exp(1)) + Entropy(table(y), base=exp(1)) - Entropy(x, y, base=exp(1))
#> [1] 0.4054651
MutInf(x, y, base=exp(1))
#> [1] 0.4054651

Entropy(table(x)) + Entropy(table(y)) - Entropy(x, y)
#> [1] 0.5849625
MutInf(x, y, base=2)
#> [1] 0.5849625
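
# Changing the base only rescales the result:
# the natural-log (base e) value divided by log(2) gives the base-2 value above.
MutInf(x, y, base=exp(1)) / log(2)
#> [1] 0.5849625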

# http://en.wikipedia.org/wiki/Cluster_labeling
tab <- matrix(c(60,10000,200,500000), nrow=2, byrow=TRUE)
MutInf(tab, base=2) 
#> [1] 0.0002806552

d.frm <- Untable(as.table(tab))
str(d.frm)
#> 'data.frame':	510260 obs. of  2 variables:
#>  $ Var1: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ Var2: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
#>  - attr(*, "out.attrs")=List of 2
#>   ..$ dim     : int [1:2] 2 2
#>   ..$ dimnames:List of 2
#>   .. ..$ Var1: chr [1:2] "Var1=A" "Var1=B"
#>   .. ..$ Var2: chr [1:2] "Var2=A" "Var2=B"
MutInf(d.frm[,1], d.frm[,2])
#> [1] 0.0002806552

table(d.frm[,1], d.frm[,2])
#>    
#>          A      B
#>   A     60  10000
#>   B    200 500000

MutInf(table(d.frm[,1], d.frm[,2]))
#> [1] 0.0002806552


# Ranking mutual information can help to describe clusters
#
#   r.mi <- MutInf(x, grp)
#   attributes(r.mi)$dimnames <- attributes(tab)$dimnames
# 
#   # calculating ranks of mutual information
#   r.mi_r <- apply( -r.mi, 2, rank, na.last=TRUE )
#   # show only first 6 ranks
#   r.mi_r6 <- ifelse( r.mi_r < 7, r.mi_r, NA) 
#   attributes(r.mi_r6)$dimnames <- attributes(tab)$dimnames
#   r.mi_r6
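
# A self-contained sketch of the idea above, using made-up data
# (the data frame d and the grouping factor grp are purely illustrative):
set.seed(42)
d <- data.frame(a = sample(letters[1:3], 100, replace=TRUE),
                b = sample(letters[1:4], 100, replace=TRUE),
                c = sample(letters[1:2], 100, replace=TRUE))
grp <- sample(c("grp1", "grp2"), 100, replace=TRUE)

# mutual information of every column with the grouping variable
r.mi <- sapply(d, MutInf, y=grp)

# rank the variables, highest mutual information first
rank(-r.mi)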