6.3.2 Extended Example: Extracting a Subtable "The Art of R Programming"

이현봉 2013. 2. 15. 16:35

> cttab   # cttab is a two-dimensional table
          Voted.For.X.Last.Time   # "Voted.For.X.Last.Time" dimension has 2 levels
Vote.for.X No Yes               # "Vote.for.X" dimension also has 2 levels (No, Yes) 
  No        2   0
  Not Sure  0   1
  Yes       1   1
> class(cttab)       # cttab is an object of class "table"
[1] "table"
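
To follow along without the original data, the table can be typed in by hand. A minimal sketch, assuming only the counts printed above; the matrix()/as.table() construction is my own shortcut and not necessarily how the book builds cttab:

```r
# Counts copied from the printed table; matrix() fills column by column.
counts <- matrix(c(2, 0, 1,    # column "No":  rows No, Not Sure, Yes
                   0, 1, 1),   # column "Yes": rows No, Not Sure, Yes
                 nrow = 3,
                 dimnames = list(Vote.for.X = c("No", "Not Sure", "Yes"),
                                 Voted.For.X.Last.Time = c("No", "Yes")))
cttab <- as.table(counts)   # attach the "table" class
print(cttab)
```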

We would like a function "subtable" that works like this:
> st = subtable(cttab,list(Vote.for.X=c("No","Yes"),
+ Voted.for.X.Last.Time=c("No","Yes")))
> st
          Voted.for.X.Last.Time
Vote.for.X No Yes
       No   2   0
       Yes  1   1
> class(st)
[1] "table"

where
* cttab : The table of interest, of class "table".
* list(...) : A list specifying the desired subtable extraction. Each component
of this list is named after some dimension of "cttab", and the value of
that component is a vector of the names of the desired levels.

If all you want is to drop the "Not Sure" row in the simplest possible way, the following also works:
> ctt = cttab[c(1,3),]
> ctt
          Voted.For.X.Last.Time
Vote.for.X No Yes
       No   2   0
       Yes  1   1
> class(ctt)
[1] "matrix"
> class(ctt) = "table"
> class(ctt)
[1] "table"
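
The same shortcut can select rows by level name instead of by position, which keeps working if the row order changes. A rough sketch, assuming cttab is rebuilt by hand from the counts shown earlier; the variable "keep" is just an illustrative name. The class of the indexed result is not relied on here, it is simply set explicitly:

```r
# Rebuild the example table by hand (counts taken from the printed cttab).
cttab <- as.table(matrix(c(2, 0, 1, 0, 1, 1), nrow = 3,
  dimnames = list(Vote.for.X = c("No", "Not Sure", "Yes"),
                  Voted.For.X.Last.Time = c("No", "Yes"))))

# Every row level except "Not Sure".
keep <- setdiff(rownames(cttab), "Not Sure")

# Index rows by name; drop = FALSE keeps both dimensions in all cases.
ctt <- cttab[keep, , drop = FALSE]
class(ctt) <- "table"   # make sure the result is a table again
print(ctt)
```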

Let's look closely at the "subtable" function in the book. First, to match the function's argument names, call cttab "tbl" and the list(...) "subnames".
> subnames = list(Vote.for.X=c("No","Yes"),Voted.for.X.Last.Time=c("No","Yes"))
> subnames
$Vote.for.X
[1] "No"  "Yes"

$Voted.for.X.Last.Time
[1] "No"  "Yes"
> tbl = cttab
> tbl
          Voted.For.X.Last.Time
Vote.for.X No Yes
  No        2   0
  Not Sure  0   1
  Yes       1   1

-----------------

> tblarray <- unclass(tbl)    # tbl's class attribute is stripped
> tblarray
          Voted.For.X.Last.Time
Vote.for.X No Yes
  No        2   0
  Not Sure  0   1
  Yes       1   1
> class(tblarray)  # tblarray is just an array now. 
[1] "matrix"

> dcargs <- list(tblarray)
> dcargs
[[1]]
          Voted.For.X.Last.Time
Vote.for.X No Yes
  No        2   0
  Not Sure  0   1
  Yes       1   1
> class(dcargs)     # dcargs is a list now
[1] "list"

> ndims <- length(subnames) # number of dimensions
> for (i in 1:ndims) {
+ dcargs[[i+1]] <- subnames[[i]]   # insert components to dcargs list
+ }
> dcargs
[[1]]
          Voted.For.X.Last.Time
Vote.for.X No Yes
  No        2   0
  Not Sure  0   1
  Yes       1   1

[[2]]                 # row index component
[1] "No"  "Yes"

[[3]]                 # col index component
[1] "No"  "Yes"
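
The argument list can also be built without the loop. A small sketch on a toy matrix (the matrix and its level names are made up for illustration), using c() and unname() as an alternative to the loop above; it produces the same unnamed list:

```r
# Toy stand-ins for tblarray and subnames (hypothetical values).
tblarray <- matrix(1:6, nrow = 3,
                   dimnames = list(r = c("a", "b", "c"), k = c("x", "y")))
subnames <- list(r = c("a", "c"), k = c("x", "y"))

# Loop version, as in the walkthrough (seq_along() is safe for empty lists).
dcargs_loop <- list(tblarray)
for (i in seq_along(subnames)) {
  dcargs_loop[[i + 1]] <- subnames[[i]]
}

# Loop-free version: prepend the array to the unnamed index vectors.
dcargs_c <- c(list(tblarray), unname(subnames))

print(identical(dcargs_loop, dcargs_c))
```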


> subarray <- do.call("[",dcargs)  
* The use of do.call here is subtle. dcargs is a list of 3 components: the first is the array to act on, and the next two are the index vectors for its rows and columns (for a 2-D array). The call is therefore equivalent to tblarray[c("No","Yes"), c("No","Yes")].
> subarray
          Voted.For.X.Last.Time
Vote.for.X No Yes
       No   2   0
       Yes  1   1
> class(subarray)
[1] "matrix"
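
Why go through do.call at all? Because the number of index arguments depends on the number of dimensions, which is not known when the function is written. A short sketch on made-up arrays (all names and values here are illustrative), showing that do.call("[", ...) gives the same result as direct indexing in both the 2-D and 3-D case:

```r
# 2-D case: do.call("[", list(m, rows, cols)) is the same as m[rows, cols].
m <- matrix(1:6, nrow = 3, dimnames = list(c("a", "b", "c"), c("x", "y")))
rows <- c("a", "c")
cols <- c("x", "y")
direct   <- m[rows, cols]
via_call <- do.call("[", list(m, rows, cols))
print(identical(direct, via_call))

# 3-D case: only the length of the argument list changes, not the code.
a <- array(1:24, dim = c(2, 3, 4),
           dimnames = list(c("p", "q"), c("u", "v", "w"), NULL))
idx  <- list("p", c("u", "w"), 1:2)
sub3 <- do.call("[", c(list(a), idx))
print(identical(sub3, a["p", c("u", "w"), 1:2]))
```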

# now we'll build the new table, consisting of the subarray, the
# numbers of levels in each dimension, and the dimnames() value, plus
# the "table" class attribute

> dims <- lapply(subnames,length) # apply length() to each component of subnames

# Build a new array with the actual arguments we've been working on
> subtbl <- array(subarray,dims,dimnames=subnames)
> subtbl
          Voted.for.X.Last.Time
Vote.for.X No Yes
       No   2   0
       Yes  1   1
> class(subtbl)
[1] "matrix"
> class(subtbl) <- "table"  # finally give subtbl its "table" class attribute
> class(subtbl)
[1] "table"
return(subtbl) # All done. Last line of subtable(): return what we've built (only valid inside the function body, not at the prompt)
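
Putting the steps above together gives a function along these lines. This is a sketch assembled from the walkthrough, not a verbatim copy of the book's code: the vapply() call for dims (instead of lapply()) and seq_len() in the loop are my own adjustments, and cttab is rebuilt by hand from the printed counts.

```r
subtable <- function(tbl, subnames) {
  # Strip the "table" class to get the plain array of cell counts.
  tblarray <- unclass(tbl)

  # Build the argument list for "[": the array, then one index vector
  # per dimension, in the order given by subnames.
  dcargs <- list(tblarray)
  ndims <- length(subnames)
  for (i in seq_len(ndims)) {
    dcargs[[i + 1]] <- subnames[[i]]
  }
  subarray <- do.call("[", dcargs)

  # Rebuild an array with the new extents and dimnames, then mark it
  # as a table again.
  dims <- vapply(subnames, length, integer(1), USE.NAMES = FALSE)
  subtbl <- array(subarray, dims, dimnames = subnames)
  class(subtbl) <- "table"
  subtbl
}

# Example table, typed in from the counts shown at the top.
cttab <- as.table(matrix(c(2, 0, 1, 0, 1, 1), nrow = 3,
  dimnames = list(Vote.for.X = c("No", "Not Sure", "Yes"),
                  Voted.For.X.Last.Time = c("No", "Yes"))))

st <- subtable(cttab, list(Vote.for.X = c("No", "Yes"),
                           Voted.for.X.Last.Time = c("No", "Yes")))
print(st)
```

Note that the index vectors are passed to "[" by position, so the component names in subnames are not matched against the dimension names of tbl; they only become the dimnames of the result. That is why st shows "Voted.for.X.Last.Time" (lowercase f) even though cttab uses "Voted.For.X.Last.Time".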