Entropy.Rd
Computes the Shannon entropy and the mutual information of two variables. The entropy quantifies the expected value of the information contained in a vector. The mutual information measures the mutual dependence between two random variables.
Entropy(x, y = NULL, base = 2, ...)
MutInf(x, y, base = 2, ...)
x: a vector or a matrix of numerical or categorical type. If only x is supplied, it will be interpreted as a contingency table.
y: a vector of the same type and dimension as x. If y is not NULL, the entropy of table(x, y, ...) will be calculated.
base: base of the logarithm to be used, defaults to 2.
...: further arguments, passed on to the function table, allowing e.g. useNA to be set.
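For example, the dots can be used to control how table() treats missing values (a small made-up illustration; the vectors here are assumptions, not from the original page):
x <- c("a", "b", NA, "a", "b", "c")
y <- c("b", "a", "b", NA, "b", "c")
Entropy(x, y)                   # NA pairs are silently dropped by table()
Entropy(x, y, useNA = "ifany")  # NA is counted as an additional category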
The Shannon entropy equation provides a way to estimate the average minimum number of bits needed to encode a string of symbols, based on the frequency of the symbols.
It is given by the formula \(H = -\sum_i p_i \log(p_i)\), where \(p_i\) is the
probability of character number \(i\) showing up in a stream of characters of the given "script".
The entropy ranges from 0 to Inf.
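The formula can be checked directly in base R; for a uniform distribution over eight symbols it gives 3 bits, matching the first example below:
p <- rep(1/8, 8)
-sum(p * log2(p))
#> [1] 3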
a numeric value.
Shannon, Claude E. (July/October 1948). A Mathematical Theory of Communication, Bell System Technical Journal 27 (3): 379-423.
Ihara, Shunsuke (1993) Information theory for continuous systems, World Scientific. p. 2. ISBN 978-981-02-0985-8.
See also the package entropy, which implements various estimators of entropy.
Entropy(as.matrix(rep(1/8, 8)))
#> [1] 3
# http://r.789695.n4.nabble.com/entropy-package-how-to-compute-mutual-information-td4385339.html
x <- as.factor(c("a","b","a","c","b","c"))
y <- as.factor(c("b","a","a","c","c","b"))
Entropy(table(x), base=exp(1))
#> [1] 1.098612
Entropy(table(y), base=exp(1))
#> [1] 1.098612
Entropy(x, y, base=exp(1))
#> [1] 1.791759
# Mutual information is H(x) + H(y) - H(x, y)
Entropy(table(x), base=exp(1)) + Entropy(table(y), base=exp(1)) - Entropy(x, y, base=exp(1))
#> [1] 0.4054651
MutInf(x, y, base=exp(1))
#> [1] 0.4054651
Entropy(table(x)) + Entropy(table(y)) - Entropy(x, y)
#> [1] 0.5849625
MutInf(x, y, base=2)
#> [1] 0.5849625
# http://en.wikipedia.org/wiki/Cluster_labeling
tab <- matrix(c(60,10000,200,500000), nrow=2, byrow=TRUE)
MutInf(tab, base=2)
#> [1] 0.0002806552
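The same value follows from the entropy identity used above, applied to the table margins (rowSums() and colSums() give the marginal counts):
Entropy(rowSums(tab)) + Entropy(colSums(tab)) - Entropy(tab)
# same result as MutInf(tab, base = 2) above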
d.frm <- Untable(as.table(tab))
str(d.frm)
#> 'data.frame': 510260 obs. of 2 variables:
#> $ Var1: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
#> $ Var2: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
#> - attr(*, "out.attrs")=List of 2
#> ..$ dim : int [1:2] 2 2
#> ..$ dimnames:List of 2
#> .. ..$ Var1: chr [1:2] "Var1=A" "Var1=B"
#> .. ..$ Var2: chr [1:2] "Var2=A" "Var2=B"
MutInf(d.frm[,1], d.frm[,2])
#> [1] 0.0002806552
table(d.frm[,1], d.frm[,2])
#>
#> A B
#> A 60 10000
#> B 200 500000
MutInf(table(d.frm[,1], d.frm[,2]))
#> [1] 0.0002806552
# Ranking mutual information can help to describe clusters
# (x, grp and tab are placeholders here: a matrix of term indicators,
# a vector of cluster memberships and their contingency table;
# a runnable sketch follows after this block)
#
# r.mi <- MutInf(x, grp)
# attributes(r.mi)$dimnames <- attributes(tab)$dimnames
#
# # calculating ranks of mutual information
# r.mi_r <- apply( -r.mi, 2, rank, na.last=TRUE )
# # show only first 6 ranks
# r.mi_r6 <- ifelse( r.mi_r < 7, r.mi_r, NA)
# attributes(r.mi_r6)$dimnames <- attributes(tab)$dimnames
# r.mi_r6
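A minimal runnable version of the sketch above, on simulated data; the cluster labels, the binary term indicators, and all object names are assumptions made up for this illustration:
set.seed(42)
n   <- 300
grp <- factor(sample(c("clust.A", "clust.B", "clust.C"), n, replace = TRUE))

# six binary term indicators with varying association to the clusters
terms <- sapply(1:6, function(j)
           rbinom(n, 1, c(0.1 * j, 0.5, 0.9 - 0.1 * j)[as.integer(grp)]))
colnames(terms) <- paste0("term", 1:6)

# mutual information of every term with every cluster indicator
r.mi <- sapply(levels(grp), function(g)
          apply(terms, 2, function(tt) MutInf(tt, grp == g)))

# rank terms within each cluster, 1 = most informative
r.mi_r <- apply(-r.mi, 2, rank, na.last = TRUE)
r.mi_r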