Saturday, 19 November 2016

R convert data to factor will corrupt all other data.frame columns



I have a data.frame in which all columns are numeric. I want to convert one integer column to a factor, but doing so seems to convert all the other columns to class character. Is there any way to convert just one column to a factor?



The example is from Converting variables to factors in R:




myData <- data.frame(A=rep(1:2, 3), B=rep(1:3, 2), Pulse=20:25)
myData$A <- as.factor(myData$A)


The result



apply(myData,2,class)
# A B Pulse
# "character" "character" "character"



sessionInfo()



R version 3.1.2 (2014-10-31) 
Platform: x86_64-apple-darwin13.4.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines stats graphics grDevices utils datasets methods base ...


str(myData$A)
# Factor w/ 2 levels "1","2": 1 2 1 2 1 2

Answer



Your code actually works when I test it.



This is my output from str(myData):



'data.frame':	6 obs. of  3 variables:
 $ A    : Factor w/ 2 levels "1","2": 1 2 1 2 1 2
 $ B    : int  1 2 3 1 2 3
 $ Pulse: int  20 21 22 23 24 25


The problem is in how you check the classes, not in the conversion itself. As ?apply states:




‘apply’ attempts to coerce it to an array via ‘as.matrix’ if it is two-dimensional (e.g., a data frame)





This coercion happens before the function is applied to each column. A matrix can hold only one type, so as.matrix(myData) forces every column to a single class. Because column A is now a factor, that class is character:



is.character(as.matrix(myData))
#[1] TRUE
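
To see the real class of each column, one option is to iterate over the data frame as a list of columns, which skips the as.matrix step entirely. The sketch below uses sapply for this. It is a suggestion that goes beyond the original answer, and the variable name col_classes is only illustrative:

```r
# Rebuild the example data from the question.
myData <- data.frame(A = rep(1:2, 3), B = rep(1:3, 2), Pulse = 20:25)
myData$A <- as.factor(myData$A)

# sapply() treats the data frame as a list of columns, so each column
# is passed to class() unchanged; no matrix coercion takes place.
col_classes <- sapply(myData, class)
print(col_classes)
#         A         B     Pulse
#  "factor" "integer" "integer"
```

Only A has changed; B and Pulse are still integer columns. str(myData) gives the same information in a more detailed form.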
