# tidyselect 1.1.1

* Fix for CRAN checks.

* tidyselect has been re-licensed as MIT (#217).


# tidyselect 1.1.0

* Predicate functions must now be wrapped with `where()`.

  ```{r}
  iris %>% select(where(is.factor))
  ```

  We made this change to avoid puzzling error messages when a variable
  is unexpectedly missing from the data frame and there is a
  corresponding function in the environment:

  ```{r}
  # Attempts to invoke the `data()` function
  data.frame(x = 1) %>% select(data)
  ```

  Now tidyselect will correctly complain about a missing variable
  rather than trying to invoke a function.

  For compatibility we will support predicate functions starting with
  `is` for one version.

* `eval_select()` gains an `allow_rename` argument. If set to `FALSE`,
  renaming variables with the `c(foo = bar)` syntax is an error.
  This is useful to implement purely selective behaviour (#178).

* Fixed an issue preventing repeated deprecation messages when
  `tidyselect_verbosity` is set to `"verbose"` (#184).

* `any_of()` now preserves the order of the input variables (#186).

* The return value of `eval_select()` is now always named, even when
  inputs are constant (#173).

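As a minimal sketch of the `allow_rename` argument described above,
a client package could implement a purely selective verb as follows.
The `select_only()` wrapper is hypothetical and only serves as an
illustration; it is not part of tidyselect.

```r
library(tidyselect)

# Hypothetical wrapper: selects columns but refuses to rename them
select_only <- function(data, ...) {
  loc <- eval_select(rlang::expr(c(...)), data, allow_rename = FALSE)
  data[loc]
}

# Plain selections work as usual
selected <- names(select_only(mtcars, cyl, mpg))
selected
#> [1] "cyl" "mpg"

# The `c(foo = bar)` renaming syntax is now an error
renamed <- tryCatch(
  select_only(mtcars, cylinders = cyl),
  error = function(cnd) "renaming not allowed"
)
renamed
#> [1] "renaming not allowed"
```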

# tidyselect 1.0.0

This is the 1.0.0 release of tidyselect. It features a more solidly
defined and implemented syntax, support for predicate functions, new
boolean operators, and much more.


## Documentation

* New Get started vignette for client packages. Read it with
  `vignette("tidyselect")` or at
  <https://tidyselect.r-lib.org/articles/tidyselect.html>.

* The definition of the tidyselect language has been consolidated. A
  technical description is now available:
  <https://tidyselect.r-lib.org/articles/syntax.html>.


## Breaking changes

* Selecting non-column variables with bare names now triggers an
  informative message suggesting that you use `all_of()` instead.
  Referring to contextual objects with a bare name is brittle because
  it might be masked by a data frame column. Using `all_of()` is safe
  (#76).

tidyselect now uses vctrs for validating inputs. These changes may
reveal programming errors that were previously silent. They may also
cause failures if your unit tests make faulty assumptions about the
content of error messages created in tidyselect:

* Out-of-bounds errors are thrown when a name doesn't exist or a
  location is too large for the input.

* Logical vectors now fail properly.

* Selected variables now must be unique. It was previously possible to
  return duplicate selections in some circumstances.

* The input names can no longer contain `NA` values.

Note that we recommend `testthat::verify_output()` for monitoring
error messages thrown from packages that you don't control. Unlike
`expect_error()`, `verify_output()` does not cause `R CMD check`
failures when error messages have changed. See
<https://www.tidyverse.org/blog/2019/11/testthat-2-3-0/> for more
information.

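The following sketch illustrates two of the breaking changes above:
referring to a contextual character vector through `all_of()`, and
the out-of-bounds error raised for a location that is too large. It
calls `eval_select()` directly on `mtcars` for brevity. The exact
wording of the error message is not shown because it may differ
between versions.

```r
library(tidyselect)

vars <- c("cyl", "mpg")

# `all_of()` makes it explicit that `vars` is a contextual object,
# not a column of the data frame
loc <- eval_select(quote(all_of(vars)), mtcars)
names(loc)
#> [1] "cyl" "mpg"

# `mtcars` has 11 columns, so location 100 is out of bounds
oob <- tryCatch(
  eval_select(quote(100), mtcars),
  error = function(cnd) conditionMessage(cnd)
)
is.character(oob)
#> [1] TRUE
```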

## Syntax

* The boolean operators can now be used to create selections (#106).

  - `!` negates a selection.
  - `|` takes the union of two selections.
  - `&` takes the intersection of two selections.

  These patterns can currently be achieved using `-`, `c()` and
  `intersect()` respectively. The boolean operators should be more
  intuitive to use.

  Many thanks to Irene Steves (@isteves) for suggesting this UI.

* You can now use predicate functions in selection contexts:

  ```r
  iris %>% select(is.factor)
  iris %>% select(is.factor | is.numeric)
  ```

  This feature is not available in functions that use the legacy
  interface of tidyselect. These need to be updated to use
  the new `eval_select()` function instead of `vars_select()`.

* Unary `-` inside nested `c()` is now consistently syntax for set
  difference (#130).

* Improved support for named elements. It is now possible to assign
  the same name to multiple elements, if the input data structure
  doesn't require unique names (i.e. anything but a data frame).

* The selection engine has been rewritten to support a clearer
  separation between data-expressions (calls to `:`, `-`, and `c`) and
  env-expressions (anything else). This means you can now safely use
  expressions of the type:

  ```r
  data %>% select(1:ncol(data))
  data %>% pivot_longer(1:ncol(data))
  ```

  Even if the data frame `data` contains a column also named `data`,
  the subexpression `ncol(data)` is still correctly evaluated.
  The `data:ncol(data)` expression is equivalent to `2:3` because
  `data` is looked up in the relevant context without ambiguity:

  ```r
  data <- tibble(foo = 1, data = 2, bar = 3)
  data %>% dplyr::select(data:ncol(data))
  #> # A tibble: 1 x 2
  #>    data   bar
  #>   <dbl> <dbl>
  #> 1     2     3
  ```

  While the example above is a bit contrived, there are many realistic
  cases where these changes make it easier to write safe code:

  ```{r}
  select_from <- function(data, var) {
    data %>% dplyr::select({{ var }} : ncol(data))
  }
  data %>% select_from(data)
  #> # A tibble: 1 x 2
  #>    data   bar
  #>   <dbl> <dbl>
  #> 1     2     3
  ```

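The boolean operators listed at the top of this section can be tried
out on `iris`. This is a small sketch that assumes dplyr is
installed; the variable names `negated`, `either` and `both` are
only illustrative.

```r
library(dplyr)

# `!` negates a selection: everything except `Species`
negated <- names(select(iris, !Species))
negated
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"

# `|` takes the union of two selections
either <- names(select(iris, starts_with("Sepal") | ends_with("Width")))
either
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Width"

# `&` takes the intersection of two selections
both <- names(select(iris, starts_with("Sepal") & ends_with("Width")))
both
#> [1] "Sepal.Width"
```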

## User-facing improvements

* The new selection helpers `all_of()` and `any_of()` are strict
  variants of `one_of()`. The former always fails if some variables
  are unknown, while the latter does not. `all_of()` is safer to use
  when you expect all selected variables to exist. `any_of()` is
  useful in other cases, for instance to ensure variables are selected
  out:

  ```
  vars <- c("Species", "Genus")
  iris %>% dplyr::select(-any_of(vars))
  ```

  Note that `all_of()` and `any_of()` are a bit more conservative in
  their function signature than `one_of()`: they do not accept dots.
  The equivalent of `one_of("a", "b")` is `all_of(c("a", "b"))`.

* Selection helpers like `all_of()` and `starts_with()` are now
  available in all selection contexts, even when they haven't been
  attached to the search path. The most visible consequence of this
  change is that it is now easier to use selection functions without
  attaching the host package:

  ```r
  # Before
  dplyr::select(mtcars, dplyr::starts_with("c"))

  # After
  dplyr::select(mtcars, starts_with("c"))
  ```

  It is still recommended to export the helpers from your package so
  that users can easily look up the documentation with `?`.

* `starts_with()`, `ends_with()`, `contains()`, and `matches()` now
  accept vector inputs (#50). For instance these are now equivalent
  ways of selecting all variables that start with either `"a"` or `"b"`:

  ```{r}
  starts_with(c("a", "b"))
  starts_with("a") | starts_with("b")
  ```

* `matches()` has a new `perl` argument to allow for Perl-like regular
  expressions (@fmichonneau, #71).

* Better support for selecting with S3 vectors. For instance, factors
  are treated as characters.

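To make the difference between `all_of()` and `any_of()` concrete,
here is a short sketch on `iris`, which has a `Species` column but
no `Genus` column. It also exercises the vector input of
`starts_with()`. dplyr is assumed to be installed, and the result
variable names are only illustrative.

```r
library(dplyr)

vars <- c("Species", "Genus")

# `any_of()` silently skips the unknown "Genus" column
any_names <- names(select(iris, any_of(vars)))
any_names
#> [1] "Species"

# `all_of()` fails because "Genus" does not exist
all_result <- tryCatch(
  select(iris, all_of(vars)),
  error = function(cnd) "error: unknown column"
)
all_result
#> [1] "error: unknown column"

# Vector input: columns starting with either prefix
prefix_names <- names(select(iris, starts_with(c("Sepal", "Spec"))))
prefix_names
#> [1] "Sepal.Length" "Sepal.Width"  "Species"
```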

## API

New `eval_select()` and `eval_rename()` functions for client
packages. These replace `vars_select()` and `vars_rename()`, which are
now deprecated. These functions:

* Take the full data rather than just names. This makes it possible to
  use function predicates in selection context.

* Return a numeric vector of locations rather than a vector of
  names. This makes it possible to use tidyselect with inputs that
  support duplicate names, like regular vectors.

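A brief sketch of what these functions return, using `mtcars` as the
full data. How the locations are then applied (subsetting and
renaming) is left to the client package; the two lines that build
`out` are just one possible approach.

```r
library(tidyselect)

# Locations of the selected columns, named with the output names
loc <- eval_select(quote(c(mpg, cylinders = cyl)), mtcars)
names(loc)
#> [1] "mpg"       "cylinders"

# One way a client package could apply the selection
out <- mtcars[loc]
names(out) <- names(loc)

# `eval_rename()` returns only the locations of the renamed columns
ren <- eval_rename(quote(c(cylinders = cyl)), mtcars)
names(ren)
#> [1] "cylinders"
```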

## Other features and fixes

* The `.strict` argument of `vars_select()` now works more robustly
  and consistently.

* Using arithmetic operators in selection context now fails more
  informatively (#84).

* It is now possible to select columns in data frames containing
  duplicate variables (#94). However, the duplicates can't be part of
  the final selection.

* `eval_rename()` no longer ignores the names of unquoted character
  vectors of length 1 (#79).

* `eval_rename()` now fails when a variable is renamed to an existing
  name (#70).

* `eval_rename()` has better support for existing duplicates (but
  creating new duplicates is an error).

* `eval_select()`, `eval_rename()` and `vars_pull()` now detect
  missing values uniformly (#72).

* `vars_pull()` now includes the faulty expression in error messages.

* The performance issues of `eval_rename()` with many arguments have
  been fixed. This makes `dplyr::rename_all()` with many columns much
  faster (@zkamvar, #92).

* tidyselect is now much faster with many columns, thanks to a
  performance fix in `rlang::env_bind()` as well as internal fixes.

* `vars_select()` ignores vectors with only zeros (#82).


# tidyselect 0.2.5

This is a maintenance release for compatibility with rlang 0.3.0.


# tidyselect 0.2.4

* Fixed a warning that occurred when a vector of column positions was
  supplied to `vars_select()` or functions depending on it such as
  `tidyr::gather()` (#43 and tidyverse/tidyr#374).

* Fixed compatibility issue with rlang 0.2.0 (#51).


# tidyselect 0.2.3

* Internal fixes in anticipation of using `tidyselect` within `dplyr`.

* `vars_select()` and `vars_rename()` now correctly support unquoting
  character vectors that have names.

* `vars_select()` now ignores missing variables.


# tidyselect 0.2.2

* `dplyr` is now correctly mentioned as a suggested package.


# tidyselect 0.2.1

* `-` now supports character vectors in addition to strings. This
  makes it easy to unquote column names to exclude from the set:

  ```{r}
  vars <- c("cyl", "am", "disp", "drat")
  vars_select(names(mtcars), - !!vars)
  ```

* `last_col()` now issues an error when the variable vector is empty.

* `last_col()` now returns column positions rather than column names
  for consistency with other helpers. This also makes it compatible
  with functions like `seq()`.

* `c()` now supports character vectors the same way as `-` and `seq()`.
  (#37 @gergness)


# tidyselect 0.2.0

The main point of this release is to revert a troublesome behaviour
introduced in tidyselect 0.1.0. It also includes a few features.


## Evaluation rules

The special evaluation semantics for selection have been changed
back to the old behaviour because the new rules were causing too
much trouble and confusion. From now on data expressions (symbols
and calls to `:` and `c()`) can refer to both registered variables
and to objects from the context.

However the semantics for context expressions (any calls other than
to `:` and `c()`) remain the same. Those expressions are evaluated
in the context only and cannot refer to registered variables.

If you're writing functions and refer to contextual objects, it is
still a good idea to avoid data expressions. Registered variables
change as a function of user input, so you never know whether your
local objects might be shadowed by a variable. Consider:

```
n <- 2
vars_select(letters, 1:n)
```

Should that select up to the second element of `letters` or up to
the 14th? Since the variables have precedence in a data expression,
this will select the first 14 letters. This can be made more robust
by turning the data expression into a context expression:

```
vars_select(letters, seq(1, n))
```

You can also use quasiquotation since unquoted arguments are
guaranteed to be evaluated without any user data in scope. While
equivalent because of the special rules for context expressions,
this may be clearer to the reader accustomed to tidy eval:

```{r}
vars_select(letters, seq(1, !! n))
```

Finally, you may want to be more explicit in the opposite direction.
If you expect a variable to be found in the data but not in the
context, you can use the `.data` pronoun:

```{r}
vars_select(names(mtcars), .data$cyl : .data$drat)
```

## New features

* The new select helper `last_col()` is helpful to select over a
  custom range: `vars_select(vars, 3:last_col())`.

* `:` and `-` now handle strings as well. This makes it easy to
  unquote a column name: `(!!name) : last_col()` or `- !!name`.

* `vars_select()` gains a `.strict` argument similar to
  `rename_vars()`.  If set to `FALSE`, errors about unknown variables
  are ignored.

* `vars_select()` now treats `NULL` as empty inputs. This follows a
  trend in the tidyverse tools.

* `vars_rename()` now handles variable positions (integers or round
  doubles) just like `vars_select()` (#20).

* `vars_rename()` is now implemented with the tidy eval framework.
  Like `vars_select()`, expressions are evaluated without any user
  data in scope. In addition a variable context is now established so
  you can write rename helpers. Those should return a single round
  number or a string (variable position or variable name).

* `has_vars()` is a predicate that tests whether a variable context
  has been set (#21).

* The selection helpers are now exported in a list
  `vars_select_helpers`.  This is intended for APIs that embed the
  helpers in the evaluation environment.

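As an illustration of the `last_col()` helper introduced above, the
sketch below selects a custom range of columns from `mtcars`, which
has 11 columns. It uses `dplyr::select()` rather than
`vars_select()`; that substitution is an assumption made so the
example runs with a current dplyr, and the `offset` form
`last_col(1)` comes from later versions of the helper.

```r
library(dplyr)

# From the 9th column up to the last one
tail_names <- names(select(mtcars, 9:last_col()))
tail_names
#> [1] "am"   "gear" "carb"

# A range that starts at a named column
range_names <- names(select(mtcars, qsec:last_col()))
range_names
#> [1] "qsec" "vs"   "am"   "gear" "carb"

# The column just before the last one
second_to_last <- names(select(mtcars, last_col(1)))
second_to_last
#> [1] "gear"
```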

## Fixes

* `one_of()` argument `vars` has been renamed to `.vars` to avoid
  spurious matching.


# tidyselect 0.1.1

tidyselect is the new home for the legacy functions
`dplyr::select_vars()`, `dplyr::rename_vars()` and
`dplyr::select_var()`.


## API changes

We took this opportunity to make a few changes to the API:

* `select_vars()` and `rename_vars()` are now `vars_select()` and
  `vars_rename()`. This follows the tidyverse convention that a prefix
  corresponds to the input type while suffixes indicate the output
  type. Similarly, `select_var()` is now `vars_pull()`.

* The arguments are now prefixed with dots to limit argument matching
  issues. While the dots help, it is still a good idea to splice a
  list of captured quosures to make sure dotted arguments are never
  matched to `vars_select()`'s named arguments:

  ```
  vars_select(vars, !!! quos(...))
  ```

* Error messages can now be customised. For consistency with dplyr,
  error messages refer to "columns" by default. This assumes that the
  variables being selected come from a data frame. If this is not
  appropriate for your DSL, you can now add an attribute `vars_type`
  to the `.vars` vector to specify alternative names. This must be a
  character vector of length 2 whose first component is the singular
  form and the second is the plural. For example, `c("variable",
  "variables")`.


## Establishing a variable context

tidyselect provides a few more ways of establishing a variable
context:

* `scoped_vars()` sets up a variable context along with an exit
  hook that automatically restores the previous variables. It is the
  preferred way of changing the variable context.

  `with_vars()` takes variables and an expression and evaluates the
  latter in the context of the former.

* `poke_vars()` establishes a new variable context. It returns the
  previous context invisibly and it is your responsibility to restore
  it after you are done. This is for expert use only.

  `current_vars()` has been renamed to `peek_vars()`. This naming is a
  reference to [peek and poke](https://en.wikipedia.org/wiki/PEEK_and_POKE)
  from legacy languages.

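The functions above can be sketched as follows. `first_var()` and
`count_vars()` are hypothetical helpers written for this example and
are not part of tidyselect; the sketch assumes that `with_vars()`,
`poke_vars()` and `peek_vars()` are still exported by the installed
version.

```r
library(tidyselect)

vars <- names(iris)

# Hypothetical helper that reads the current variable context
first_var <- function() peek_vars()[[1]]

# `with_vars()` evaluates the expression with `vars` registered
first <- with_vars(vars, first_var())
first
#> [1] "Sepal.Length"

# Selection helpers return positions within the registered variables
sepal_loc <- with_vars(vars, starts_with("Sepal"))

# With `poke_vars()` the previous context must be restored manually
count_vars <- function(vars) {
  old <- poke_vars(vars)
  on.exit(poke_vars(old), add = TRUE)
  length(peek_vars())
}
n_vars <- count_vars(vars)
n_vars
#> [1] 5
```

Restoring the context in `on.exit()` keeps the helper safe even if
the body fails, which is the bookkeeping that `scoped_vars()` and
`with_vars()` take care of automatically.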

## New evaluation semantics

The evaluation semantics for selecting verbs have changed. Symbols are
now evaluated in a data-only context that is isolated from the calling
environment. This means that you can no longer refer to local variables
unless you are explicitly unquoting these variables with `!!`, which
is mostly for expert use.

Note that since dplyr 0.7, helper calls (like `starts_with()`) obey
the opposite behaviour and are evaluated in the calling context
isolated from the data context. To sum up, symbols can only refer to
data frame objects, while helpers can only refer to contextual
objects. This differs from usual R evaluation semantics where both
the data and the calling environment are in scope (with the former
prevailing over the latter).