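| Name | Date | Size | #Lines | LOC |
|------|------|------|--------|-----|
| .. | 03-May-2022 | - | | |
| R/ | 07-Feb-2019 | - | 1,715 | 559 |
| build/ | 03-May-2022 | - | | |
| data/ | 03-May-2022 | - | | |
| inst/ | 09-Feb-2019 | - | 1,031 | 682 |
| man/ | 09-Feb-2019 | - | 1,553 | 1,344 |
| tests/ | 27-Nov-2014 | - | 849 | 630 |
| vignettes/ | 09-Feb-2019 | - | 897 | 639 |
| DESCRIPTION | 10-Feb-2019 | 1.2 KiB | 33 | 32 |
| LICENSE | 02-Jan-2017 | 17.7 KiB | 340 | 281 |
| MD5 | 10-Feb-2019 | 5.8 KiB | 113 | 112 |
| NAMESPACE | 07-Feb-2019 | 1,009 | 54 | 52 |
| NEWS.md | 09-Feb-2019 | 9.1 KiB | 253 | 173 |
| README.md | 07-Feb-2019 | 6.5 KiB | 208 | 160 |
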
README.md

<!-- README.md is generated from README.Rmd. Please edit that file -->

# stringr <a href='https://stringr.tidyverse.org'><img src='man/figures/logo.png' align="right" height="139" /></a>

<!-- badges: start -->

[![CRAN
status](https://www.r-pkg.org/badges/version/stringr)](https://cran.r-project.org/package=stringr)
[![Travis build
status](https://travis-ci.org/tidyverse/stringr.svg?branch=master)](https://travis-ci.org/tidyverse/stringr)
[![AppVeyor Build
Status](https://ci.appveyor.com/api/projects/status/github/tidyverse/stringr?branch=master&svg=true)](https://ci.appveyor.com/project/tidyverse/stringr)
[![Codecov test
coverage](https://codecov.io/gh/tidyverse/stringr/branch/master/graph/badge.svg)](https://codecov.io/gh/tidyverse/stringr?branch=master)
[![Lifecycle:
stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://www.tidyverse.org/lifecycle/#stable)
<!-- badges: end -->

## Overview

Strings are not glamorous, high-profile components of R, but they do
play a big role in many data cleaning and preparation tasks. The stringr
package provides a cohesive set of functions designed to make working
with strings as easy as possible. If you’re not familiar with strings,
the best place to start is the [chapter on
strings](http://r4ds.had.co.nz/strings.html) in R for Data Science.

stringr is built on top of
[stringi](https://github.com/gagolews/stringi), which uses the
[ICU](http://site.icu-project.org) C library to provide fast, correct
implementations of common string manipulations. stringr focusses on the
most important and commonly used string manipulation functions, whereas
stringi provides a comprehensive set covering almost anything you can
imagine. If you find that stringr is missing a function that you need,
try looking in stringi. Both packages share similar conventions, so once
you’ve mastered stringr, you should find stringi similarly easy to use.

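
To illustrate the shared conventions, here is a minimal sketch that
pairs a few stringr calls with what appear to be their closest stringi
counterparts. It assumes both packages are installed; the sample vector
`x` and the `same_*` variable names are made up for illustration.

``` r
library(stringr)
library(stringi)

x <- c("why", "video", "cross")

# Each stringr call is paired with a stringi counterpart. The stri_ names
# carry the matching engine (here "_regex") in the function name.
same_length <- identical(str_length(x), stri_length(x))
same_detect <- identical(str_detect(x, "[aeiou]"), stri_detect_regex(x, "[aeiou]"))
same_sub    <- identical(str_sub(x, 1, 2), stri_sub(x, 1, 2))

c(same_length, same_detect, same_sub)
```

In both packages the vector of strings comes first and the pattern
second, which is why moving between them tends to be straightforward.
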
## Installation

``` r
# Install the released version from CRAN:
install.packages("stringr")

# Install the cutting edge development version from GitHub:
# install.packages("devtools")
devtools::install_github("tidyverse/stringr")
```

## Cheatsheet

<a href="https://github.com/rstudio/cheatsheets/blob/master/strings.pdf"><img src="https://raw.githubusercontent.com/rstudio/cheatsheets/master/pngs/thumbnails/strings-cheatsheet-thumbs.png" width="630" height="242"/></a>

## Usage

All functions in stringr start with `str_` and take a vector of strings
as the first argument.

``` r
library(stringr)

x <- c("why", "video", "cross", "extra", "deal", "authority")
str_length(x)
#> [1] 3 5 5 5 4 9
str_c(x, collapse = ", ")
#> [1] "why, video, cross, extra, deal, authority"
str_sub(x, 1, 2)
#> [1] "wh" "vi" "cr" "ex" "de" "au"
```

Most string functions work with regular expressions, a concise language
for describing patterns of text. For example, the regular expression
`"[aeiou]"` matches any single character that is a vowel:

``` r
str_subset(x, "[aeiou]")
#> [1] "video"     "cross"     "extra"     "deal"      "authority"
str_count(x, "[aeiou]")
#> [1] 0 3 1 2 2 4
```

There are eight main verbs that work with patterns:

  - `str_detect(x, pattern)` tells you if there’s any match to the
    pattern.

    ``` r
    str_detect(x, "[aeiou]")
    #> [1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
    ```

  - `str_count(x, pattern)` counts the number of matches.

    ``` r
    str_count(x, "[aeiou]")
    #> [1] 0 3 1 2 2 4
    ```

  - `str_subset(x, pattern)` extracts the matching components.

    ``` r
    str_subset(x, "[aeiou]")
    #> [1] "video"     "cross"     "extra"     "deal"      "authority"
    ```

  - `str_locate(x, pattern)` gives the position of the match.

    ``` r
    str_locate(x, "[aeiou]")
    #>      start end
    #> [1,]    NA  NA
    #> [2,]     2   2
    #> [3,]     3   3
    #> [4,]     1   1
    #> [5,]     2   2
    #> [6,]     1   1
    ```

  - `str_extract(x, pattern)` extracts the text of the match.

    ``` r
    str_extract(x, "[aeiou]")
    #> [1] NA  "i" "o" "e" "e" "a"
    ```

  - `str_match(x, pattern)` extracts parts of the match defined by
    parentheses.

    ``` r
    # extract the characters on either side of the vowel
    str_match(x, "(.)[aeiou](.)")
    #>      [,1]  [,2] [,3]
    #> [1,] NA    NA   NA
    #> [2,] "vid" "v"  "d"
    #> [3,] "ros" "r"  "s"
    #> [4,] NA    NA   NA
    #> [5,] "dea" "d"  "a"
    #> [6,] "aut" "a"  "t"
    ```

  - `str_replace(x, pattern, replacement)` replaces the matches with new
    text.

    ``` r
    str_replace(x, "[aeiou]", "?")
    #> [1] "why"       "v?deo"     "cr?ss"     "?xtra"     "d?al"      "?uthority"
    ```

  - `str_split(x, pattern)` splits up a string into multiple pieces.

    ``` r
    str_split(c("a,b", "c,d,e"), ",")
    #> [[1]]
    #> [1] "a" "b"
    #>
    #> [[2]]
    #> [1] "c" "d" "e"
    ```

As well as regular expressions (the default), there are three other
pattern matching engines:

  - `fixed()`: match exact bytes
  - `coll()`: match text using locale-aware collation rules
  - `boundary()`: match boundaries between characters, words, lines, or
    sentences

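
The difference between the engines is easiest to see side by side. The
following is a small sketch rather than an exhaustive tour; the input
strings are invented for illustration, and the `coll()` call relies on
its default English locale.

``` r
library(stringr)

# A bare string is treated as a regular expression, so "." matches any character.
str_detect(c("a.b", "ab"), ".")
#> [1] TRUE TRUE

# fixed() compares the literal characters, so only a real "." matches.
str_detect(c("a.b", "ab"), fixed("."))
#> [1]  TRUE FALSE

# coll() compares with collation rules; ignore_case = TRUE folds case.
str_detect(c("I", "i", "j"), coll("i", ignore_case = TRUE))
#> [1]  TRUE  TRUE FALSE

# boundary() matches boundaries; here it counts the words in a sentence.
str_count("This is a sentence.", boundary("word"))
#> [1] 4
```
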
## RStudio Addin

The [RegExplain RStudio
addin](https://www.garrickadenbuie.com/project/regexplain/) provides a
friendly interface for working with regular expressions and functions
from stringr. This addin allows you to interactively build your regexp,
check the output of common string matching functions, consult the
interactive help pages, or use the included resources to learn regular
expressions.

This addin can easily be installed with devtools:

``` r
# install.packages("devtools")
devtools::install_github("gadenbuie/regexplain")
```

## Compared to base R

R provides a solid set of string operations, but because they have grown
organically over time, they can be inconsistent and a little hard to
learn. Additionally, they lag behind the string operations in other
programming languages, so that some things that are easy to do in
languages like Ruby or Python are rather hard to do in R.

By comparison, stringr:

  - Uses consistent function and argument names. The first argument is
    always the vector of strings to modify, which makes stringr work
    particularly well in conjunction with the pipe:

    ``` r
    letters %>%
      .[1:10] %>%
      str_pad(3, "right") %>%
      str_c(letters[2:11])
    #>  [1] "a  b" "b  c" "c  d" "d  e" "e  f" "f  g" "g  h" "h  i" "i  j" "j  k"
    ```

  - Simplifies string operations by eliminating options that you don’t
    need 95% of the time.

  - Produces outputs that can easily be used as inputs. This includes
    ensuring that missing inputs result in missing outputs, and zero
    length inputs result in zero length outputs.

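
The last point can be checked directly. This is a minimal sketch using
two of the functions shown earlier; the input vectors are made up for
illustration.

``` r
library(stringr)

# A missing input gives a missing output rather than an error or a guess.
str_length(c("abc", NA))
#> [1]  3 NA
str_detect(c("abc", NA), "b")
#> [1] TRUE   NA

# A zero-length input gives a zero-length output of the expected type.
str_length(character())
#> integer(0)
str_detect(character(), "b")
#> logical(0)
```

Because `NA` and empty vectors pass through unchanged in shape, the
result of one call can be fed to the next without special-casing.
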