Tuesday, 21 March 2017

r - Converting string to numeric

As csgillespie said. stringsAsFactors is default on TRUE, which converts any text to a factor. So even after deleting the text, you still have a factor in your dataframe.



Now regarding the conversion, there's a more optimal way to do so. So I put it here as a reference :



> x <- factor(sample(4:8,10,replace=T))

> x
[1] 6 4 8 6 7 6 8 5 8 4
Levels: 4 5 6 7 8
> as.numeric(levels(x))[x]
[1] 6 4 8 6 7 6 8 5 8 4


To show it works.



The timings :




> x <- factor(sample(4:8,500000,replace=T))
> system.time(as.numeric(as.character(x)))
user system elapsed
0.11 0.00 0.11
> system.time(as.numeric(levels(x))[x])
user system elapsed
0 0 0



It's a big improvement, but not always a bottleneck. It gets important however if you have a big dataframe and a lot of columns to convert.

No comments:

Post a Comment

c++ - Does curly brackets matter for empty constructor?

Those brackets declare an empty, inline constructor. In that case, with them, the constructor does exist, it merely does nothing more than t...