Strings

Markus Wamser

2017-02-10

A few examples of the string helper functions, compared with substr and other base R functions.

First, load and attach the package, along with microbenchmark, which is used for the timings.

library(Wmisc)
library(microbenchmark)

The test string is taken from http://slipsum.com/.

s <- "You think water moves fast? You should see ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."

Head and Tail

demo

strHead(s)
## [1] "Y"
strHeadLower(s)
## [1] "y"
strTail(s)
## [1] "ou think water moves fast? You should see ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."

comparison with built-in substr, tolower and substring

substr(s,1,1)
## [1] "Y"
tolower(substr(s,1,1))
## [1] "y"
substring(s,2)
## [1] "ou think water moves fast? You should see ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."
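
The correspondence shown above can be written down as thin wrappers around base R. The following is a minimal sketch and not the Wmisc implementation; the names headChar, headLower and tailStr are hypothetical and exist only for this illustration.

```r
# Hypothetical base R stand-ins for the helpers demonstrated above.
# These are NOT the Wmisc implementations, only the equivalent base R calls.
headChar  <- function(x) substr(x, 1, 1)           # first character
headLower <- function(x) tolower(substr(x, 1, 1))  # first character, lower-cased
tailStr   <- function(x) substring(x, 2)           # everything after the first character

x <- "You think water moves fast?"
headChar(x)
headLower(x)
tailStr(x)

# Head and tail together reconstruct the original string.
identical(paste0(headChar(x), tailStr(x)), x)
```

Wrappers like these add one R-level function call on top of the built-in, so they are a convenience for readability rather than a way to gain speed.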

benchmark

The benchmark results vary considerably with the version of R used. In the timings below the built-in functions have the lower median in every comparison, so with R 3.3 the built-in functions should be preferred.
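
Since the timings depend on the R version, it can help to record the version next to any benchmark output. A small base R sketch; the variable names are illustrative, and the 3.3 range is only an example taken from the remark above.

```r
# Record which R session produced the timings.
r_version <- getRversion()
cat("Timings recorded with", R.version.string, "\n")

# getRversion() returns a version object that can be compared with strings.
# Illustrative check: is this session an R 3.3.x release?
is_r_3_3 <- r_version >= "3.3.0" && r_version < "3.4.0"
is_r_3_3
```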

microbenchmark(substr(s,1,1),strHead(s),times=100000)
## Unit: microseconds
##             expr   min    lq     mean median    uq       max neval
##  substr(s, 1, 1) 1.182 1.576 1.921486  1.970 1.970  2522.124 1e+05
##       strHead(s) 3.151 3.545 5.359704  3.546 3.939 83443.543 1e+05
microbenchmark(tolower(substr(s,1,1)),strHeadLower(s),times=100000)
## Unit: microseconds
##                      expr   min    lq     mean median    uq      max neval
##  tolower(substr(s, 1, 1)) 2.757 3.151 3.418916  3.152 3.545   132.33 1e+05
##           strHeadLower(s) 3.151 3.939 5.601933  3.939 4.333 83933.08 1e+05
microbenchmark(substring(s,2),strTail(s),times=100000)
## Unit: microseconds
##             expr   min    lq     mean median    uq      max neval
##  substring(s, 2) 3.939 4.727 4.982714  4.727 5.121  137.056 1e+05
##       strTail(s) 4.726 5.121 6.295885  5.515 5.908 4536.988 1e+05
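
To make the size of the difference easier to read, the medians printed above can be turned into ratios. This is a sketch that only redoes arithmetic on the numbers already shown; the data frame and its column names are made up for the illustration.

```r
# Median timings in microseconds, copied from the benchmark output above.
medians <- data.frame(
  task    = c("head", "head, lower-cased", "tail"),
  builtin = c(1.970, 3.152, 4.727),  # substr / tolower(substr) / substring
  helper  = c(3.546, 3.939, 5.515)   # strHead / strHeadLower / strTail
)

# A ratio above 1 means the helper took longer than the built-in call.
medians$ratio <- round(medians$helper / medians$builtin, 2)
medians
```

In this run the helpers take roughly 1.2 to 1.8 times as long as the built-ins at the median; the maxima are dominated by outliers and are less informative.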

Take and Drop

substr(s,1,42)
## [1] "You think water moves fast? You should see"
strTake(s,42)
## [1] "You think water moves fast? You should see"
substring(s,43)
## [1] " ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."
strDrop(s,42)
## [1] " ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."
microbenchmark(substr(s,1,42),strTake(s,42),times=100000)
## Unit: microseconds
##              expr   min    lq     mean median    uq      max neval
##  substr(s, 1, 42) 1.576 1.970 2.125501  1.970 2.363 3687.485 1e+05
##    strTake(s, 42) 3.545 4.333 5.130718  4.333 4.727 4408.203 1e+05
microbenchmark(substring(s,43),strDrop(s,42),times=100000)
## Unit: microseconds
##              expr   min    lq     mean median    uq      max neval
##  substring(s, 43) 3.939 4.333 4.863169  4.727 4.727 3565.396 1e+05
##    strDrop(s, 42) 4.727 5.515 6.462964  5.908 5.909 4394.420 1e+05
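
Taking and dropping at the same position split a string into two pieces that concatenate back to the original. A minimal base R sketch of that property; takeChars and dropChars are hypothetical stand-ins and not the Wmisc code.

```r
# Hypothetical base R stand-ins for take and drop (not the Wmisc code).
takeChars <- function(x, n) substr(x, 1, n)      # first n characters
dropChars <- function(x, n) substring(x, n + 1)  # everything after the first n

x <- "You think water moves fast? You should see ice."
n <- 42
takeChars(x, n)
dropChars(x, n)

# The two pieces concatenate back to the original string.
identical(paste0(takeChars(x, n), dropChars(x, n)), x)
```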