Strings

Markus Wamser

2017-10-03

A few examples of the string helper functions, compared with substr and other base R functions.

First, load and attach the package, along with microbenchmark for the benchmarks.

library(Wmisc)
library(microbenchmark)

The sample string is taken from http://slipsum.com/.

s <- "You think water moves fast? You should see ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."

Head and Tail

demo

strHead(s)
## [1] "Y"
strHeadLower(s)
## [1] "y"
strTail(s)
## [1] "ou think water moves fast? You should see ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."

compare to built-in substr and substring

substr(s,1,1)
## [1] "Y"
tolower(substr(s,1,1))
## [1] "y"
substring(s,2)
## [1] "ou think water moves fast? You should see ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."
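The correspondence shown above can be written down as small wrappers around base R. This is only a sketch: the names head_chr, head_lwr and tail_chr are hypothetical, and the bodies are not the Wmisc implementations, just the base calls used in the comparison.

```r
# Hypothetical base-R stand-ins; these are NOT the Wmisc implementations.
head_chr <- function(x) substr(x, 1L, 1L)            # first character
head_lwr <- function(x) tolower(substr(x, 1L, 1L))   # first character, lower-cased
tail_chr <- function(x) substring(x, 2L)             # everything after the first character

x <- "You think water moves fast?"
head_chr(x)
## [1] "Y"
head_lwr(x)
## [1] "y"
tail_chr(x)
## [1] "ou think water moves fast?"

# Head and tail together reconstruct the input.
identical(paste0(head_chr(x), tail_chr(x)), x)
## [1] TRUE
```

The last line is a quick sanity check: splitting a string into head and tail loses nothing.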

benchmark

Benchmark results vary considerably with the version of R used. With R 3.3, the built-in functions should be preferred; in the runs shown in this document, they have the lower median time in every comparison.

microbenchmark(substr(s,1,1),strHead(s),times=100000)
## Unit: microseconds
##             expr   min    lq     mean median    uq       max neval
##  substr(s, 1, 1) 1.182 1.576 1.897011  1.970 1.970  2855.704 1e+05
##       strHead(s) 3.151 3.939 6.963364  3.939 4.333 85141.762 1e+05
microbenchmark(tolower(substr(s,1,1)),strHeadLower(s),times=100000)
## Unit: microseconds
##                      expr   min    lq     mean median    uq      max neval
##  tolower(substr(s, 1, 1)) 2.757 3.151 3.450931  3.152 3.545  148.871 1e+05
##           strHeadLower(s) 3.545 4.333 5.151262  4.333 4.726 4405.841 1e+05
microbenchmark(substring(s,2),strTail(s),times=100000)
## Unit: microseconds
##             expr   min    lq     mean median    uq      max neval
##  substring(s, 2) 3.939 4.726 4.939813  4.727 5.120  385.566 1e+05
##       strTail(s) 5.120 5.515 6.961758  5.908 5.909 5163.186 1e+05
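Because the timings depend on the R version, it can be worth repeating the comparison locally. The following is a rough sketch that uses only base R, so it runs without microbenchmark; the helper time_per_call is hypothetical, and its numbers include loop and closure overhead, so they are much coarser than the microbenchmark figures above.

```r
# Rough timing helper using only base R; far less precise than microbenchmark.
# time_per_call is a hypothetical name, not part of any package.
time_per_call <- function(f, n = 10000L) {
  elapsed <- system.time(for (i in seq_len(n)) f())[["elapsed"]]
  elapsed / n * 1e6  # microseconds per call, including loop overhead
}

x <- "You think water moves fast? You should see ice."

R.version.string  # record which R version produced the timings
t_head <- time_per_call(function() substr(x, 1L, 1L))
t_tail <- time_per_call(function() substring(x, 2L))
c(substr = t_head, substring = t_tail)
```

To include the package functions, pass them in the same way, for example time_per_call(function() strHead(x)) with Wmisc attached.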

Take and Drop

substr(s,1,42)
## [1] "You think water moves fast? You should see"
strTake(s,42)
## [1] "You think water moves fast? You should see"
substring(s,43)
## [1] " ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."
strDrop(s,42)
## [1] " ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."
microbenchmark(substr(s,1,42),strTake(s,42),times=100000)
## Unit: microseconds
##              expr   min    lq     mean median    uq      max neval
##  substr(s, 1, 42) 1.576 1.970 2.113599  1.970 2.364   161.08 1e+05
##    strTake(s, 42) 3.939 4.726 6.451249  4.727 4.727 92959.40 1e+05
microbenchmark(substring(s,43),strDrop(s,42),times=100000)
## Unit: microseconds
##              expr   min    lq     mean median    uq      max neval
##  substring(s, 43) 3.939 4.333 4.767279  4.727 4.727  216.611 1e+05
##    strDrop(s, 42) 5.121 5.908 7.010260  5.909 6.302 5893.752 1e+05
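Note the offset in the comparison above: dropping 42 characters corresponds to substring starting at position 43. A minimal base-R sketch of that relationship follows; take_chr and drop_chr are hypothetical names, not the Wmisc implementations, and a shorter string is used to keep the output readable.

```r
# Hypothetical base-R stand-ins for taking and dropping the first n characters.
take_chr <- function(x, n) substr(x, 1L, n)       # characters 1..n
drop_chr <- function(x, n) substring(x, n + 1L)   # characters n+1..end

x <- "You think water moves fast? You should see ice."
take_chr(x, 42)
## [1] "You think water moves fast? You should see"
drop_chr(x, 42)
## [1] " ice."

# Taking and dropping the same n reconstructs the input.
identical(paste0(take_chr(x, 42), drop_chr(x, 42)), x)
## [1] TRUE
```

With these base calls, n = 0 yields an empty take and the full string as the drop, and an n larger than the string length yields the full string and an empty drop. Whether strTake and strDrop behave the same way at those edges is not shown here.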