cmc__data <- new.env(parent = emptyenv()) #' Metadata cache for a CRAN-like repository #' #' This is an R6 class that implements the metadata cache of a CRAN-like #' repository. For a higher level interface, see the [meta_cache_list()], #' [meta_cache_deps()], [meta_cache_revdeps()] and [meta_cache_update()] #' functions. #' #' The cache has several layers: #' * The data is stored inside the `cranlike_metadata_cache` object. #' * It is also stored as an RDS file, in the session temporary directory. #' This ensures that the same data is used for all queries of a #' `cranlike_metadata_cache` object. #' * It is stored in an RDS file in the user's cache directory. #' * The downloaded raw `PACKAGES*` files are cached, together with HTTP #' ETags, to minimize downloads. #' #' It has a synchronous and an asynchronous API. #' #' @section Usage: #' ``` #' cmc <- cranlike_metadata_cache$new( #' primary_path = NULL, replica_path = tempfile(), #' platforms = default_platforms(), r_version = getRversion(), #' bioc = TRUE, cran_mirror = default_cran_mirror(), #' repos = getOption("repos"), #' update_after = as.difftime(7, units = "days")) #' #' cmc$list(packages = NULL) #' cmc$async_list(packages = NULL) #' #' cmc$deps(packages, dependencies = NA, recursive = TRUE) #' cmc$async_deps(packages, dependencies = NA, recursive = TRUE) #' #' cmc$revdeps(packages, dependencies = NA, recursive = TRUE) #' cmc$async_revdeps(packages, dependencies = NA, recursive = TRUE) #' #' cmc$update() #' cmc$async_update() #' cmc$check_update() #' cmc$asnyc_check_update() #' #' cmc$summary() #' #' cmc$cleanup(force = FALSE) #' ``` #' #' @section Arguments: #' * `primary_path`: Path of the primary, user level cache. Defaults to #' the user level cache directory of the machine. #' * `replica_path`: Path of the replica. Defaults to a temporary directory #' within the session temporary directory. #' * `platforms`: see [default_platforms()] for possible values. #' * `r_version`: R version to create the cache for. #' * `bioc`: Whether to include BioConductor packages. #' * `cran_mirror`: CRAN mirror to use, this takes precedence over `repos`. #' * `repos`: Repositories to use. #' * `update_after`: `difftime` object. Automatically update the cache if #' it gets older than this. Set it to `Inf` to avoid updates. Defaults #' to seven days. #' * `packages`: Packages to query, character vector. #' * `dependencies`: Which kind of dependencies to include. Works the same #' way as the `dependencies` argument of [utils::install.packages()]. #' * `recursive`: Whether to include recursive dependencies. #' * `force`: Whether to force cleanup without asking the user. #' #' @section Details: #' #' `cranlike_metadata_cache$new()` creates a new cache object. Creation #' does not trigger the population of the cache. It is only populated on #' demand, when queries are executed against it. In your package, you may #' want to create a cache instance in the `.onLoad()` function of the #' package, and store it in the package namespace. As this is a cheap #' operation, the package will still load fast, and then the package code #' can refer to the common cache object. #' #' `cmc$list()` lists all (or the specified) packages in the cache. #' It returns a tibble, see the list of columns below. #' #' `cmc$async_list()` is similar, but it is asynchronous, it returns a #' `deferred` object. #' #' `cmc$deps()` returns a tibble, with the (potentially recursive) #' dependencies of `packages`. #' #' `cmc$async_deps()` is the same, but it is asynchronous, it #' returns a `deferred` object. #' #' `cmc$revdeps()` returns a tibble, with the (potentially recursive) #' reverse dependencies of `packages`. #' #' `cmc$async_revdeps()` does the same, asynchronously, it returns an #' `deferred` object. #' #' `cmc$update()` updates the the metadata (as needed) in the cache, #' and then returns a tibble with all packages, invisibly. #' #' `cmc$async_update()` is similar, but it is asynchronous. #' #' `cmc$check_update()` checks if the metadata is current, and if it is #' not, it updates it. #' #' `cmc$async_check_update()` is similar, but it is asynchronous. #' #' `cmc$summary()` lists metadata about the cache, including its #' location and size. #' #' `cmc$cleanup()` deletes the cache files from the disk, and also from #' memory. #' #' @section Columns: #' The metadata tibble contains all available versions (i.e. sources and #' binaries) for all packages. It usually has the following columns, #' some might be missing on some platforms. #' * `package`: Package name. #' * `title`: Package title. #' * `version`: Package version. #' * `depends`: `Depends` field from `DESCRIPTION`, or `NA_character_`. #' * `suggests`: `Suggests` field from `DESCRIPTION`, or `NA_character_`. #' * `built`: `Built` field from `DESCIPTION`, if a binary package, #' or `NA_character_`. #' * `imports`: `Imports` field from `DESCRIPTION`, or `NA_character_`. #' * `archs`: `Archs` entries from `PACKAGES` files. Might be missing. #' * `repodir`: The directory of the file, inside the repository. #' * `platform`: This is a character vector. See [default_platforms()] for #' more about platform names. In practice each value of the `platform` #' column is either #' * `"source"` for source packages, #' * a platform string, e.g. `x86_64-apple-darwin17.0` for macOS #' packages compatible with macOS High Sierra or newer. #' * `needscompilation`: Whether the package needs compilation. #' * `type`: `bioc` or `cran` currently. #' * `target`: The path of the package file inside the repository. #' * `mirror`: URL of the CRAN/BioC mirror. #' * `sources`: List column with URLs to one or more possible locations #' of the package file. For source CRAN packages, it contains URLs to #' the `Archive` directory as well, in case the package has been #' archived since the metadata was cached. #' * `filesize`: Size of the file, if known, in bytes, or `NA_integer_`. #' * `sha256`: The SHA256 hash of the file, if known, or `NA_character_`. #' * `deps`: All package dependencies, in a tibble. #' * `license`: Package license, might be `NA` for binary packages. #' * `linkingto`: `LinkingTo` field from `DESCRIPTION`, or `NA_character_`. #' * `enhances`: `Enhances` field from `DESCRIPTION`, or `NA_character_`. #' * `os_type`: `unix` or `windows` for OS specific packages. Usually `NA`. #' * `priority`: "optional", "recommended" or `NA`. (Base packages are #' normally not included in the list, so "base" should not appear here.) #' * `md5sum`: MD5 sum, if available, may be `NA`. #' * `sysreqs`: For CRAN packages, the `SystemRequirements` field, the #' required system libraries or other software for the package. For #' non-CRAN packages it is `NA`. #' * `published`: The time the package was published at, in GMT, #' `POSIXct` class. #' #' The tibble contains some extra columns as well, these are for internal #' use only. #' #' @export #' @examplesIf pkgcache:::run_examples() #' dir.create(cache_path <- tempfile()) #' cmc <- cranlike_metadata_cache$new(cache_path, bioc = FALSE) #' cmc$list() #' cmc$list("pkgconfig") #' cmc$deps("pkgconfig") #' cmc$revdeps("pkgconfig", recursive = FALSE) cranlike_metadata_cache <- R6Class( "cranlike_metadata_cache", public = list( initialize = function(primary_path = NULL, replica_path = tempfile(), platforms = default_platforms(), r_version = getRversion(), bioc = TRUE, cran_mirror = default_cran_mirror(), repos = getOption("repos"), update_after = as.difftime(7, units = "days")) cmc_init(self, private, primary_path, replica_path, platforms, r_version, bioc, cran_mirror, repos, update_after), deps = function(packages, dependencies = NA, recursive = TRUE) synchronise(self$async_deps(packages, dependencies, recursive)), async_deps = function(packages, dependencies = NA, recursive = TRUE) cmc_async_deps(self, private, packages, dependencies, recursive), revdeps = function(packages, dependencies = NA, recursive = TRUE) synchronise(self$async_revdeps(packages, dependencies, recursive)), async_revdeps = function(packages, dependencies = NA, recursive = TRUE) cmc_async_revdeps(self, private, packages, dependencies, recursive), list = function(packages = NULL) synchronise(self$async_list(packages)), async_list = function(packages = NULL) cmc_async_list(self, private, packages), update = function() synchronise(self$async_update()), async_update = function() cmc_async_update(self, private), check_update = function() synchronise(self$async_check_update()), async_check_update = function() cmc_async_check_update(self, private), summary = function() cmc_summary(self, private), cleanup = function(force = FALSE) cmc_cleanup(self, private, force) ), private = list( get_cache_files = function(which = c("primary", "replica")) cmc__get_cache_files(self, private, match.arg(which)), async_ensure_cache = function(max_age = private$update_after) cmc__async_ensure_cache(self, private, max_age), get_current_data = function(max_age) cmc__get_current_data(self, private, max_age), get_memory_cache = function(max_age) cmc__get_memory_cache(self, private, max_age), load_replica_rds = function(max_age) cmc__load_replica_rds(self, private, max_age), load_primary_rds = function(max_age) cmc__load_primary_rds(self, private, max_age), load_primary_pkgs = function(max_age) cmc__load_primary_pkgs(self, private, max_age), update_replica_pkgs = function() cmc__update_replica_pkgs(self, private), update_replica_rds = function(alert = TRUE) cmc__update_replica_rds(self, private, alert), update_primary = function(rds = TRUE, packages = TRUE, lock = TRUE) cmc__update_primary(self, private, rds, packages, lock), update_memory_cache = function() cmc__update_memory_cache(self, private), copy_to_replica = function(rds = TRUE, pkgs = FALSE, etags = FALSE) cmc__copy_to_replica(self, private, rds, pkgs, etags), ## We use this to make sure that different versions of pkgcache can ## share the same metadata cache directory. It is used to calculate ## the hash of the cached RDS file. cache_version = "3", data = NULL, data_time = NULL, data_messaged = NULL, update_deferred = NULL, chk_update_deferred = NULL, primary_path = NULL, replica_path = NULL, platforms = NULL, r_version = NULL, bioc = NULL, repos = NULL, update_after = NULL, dirs = NULL, lock_timeout = 10000 ) ) #' @importFrom filelock lock unlock cmc_init <- function(self, private, primary_path, replica_path, platforms, r_version, bioc, cran_mirror, repos, update_after) { "!!DEBUG Init metadata cache in '`replica_path`'" r_version <- as.character(r_version) private$primary_path <- primary_path %||% get_user_cache_dir()$root private$replica_path <- replica_path private$platforms <- platforms private$r_version <- get_minor_r_version(r_version) private$bioc <- bioc private$repos <- cmc__get_repos(repos, bioc, cran_mirror, r_version) private$update_after <- update_after private$dirs <- get_all_package_dirs(platforms, r_version) invisible(self) } cmc_async_deps <- function(self, private, packages, dependencies, recursive) { assert_that( is_character(packages), is_dependencies(dependencies), is_flag(recursive)) "!!DEBUG Getting deps" private$async_ensure_cache()$ then(~ extract_deps(., packages, dependencies, recursive)) } cmc_async_revdeps <- function(self, private, packages, dependencies, recursive) { assert_that( is_character(packages), is_dependencies(dependencies), is_flag(recursive)) "!!DEBUG Getting revdeps" private$async_ensure_cache()$ then(~ extract_revdeps(., packages, dependencies, recursive)) } cmc_async_list <- function(self, private, packages) { assert_that(is.null(packages) || is_character(packages)) "!!DEBUG Listing packages" private$async_ensure_cache()$ then(function(x) { if (is.null(packages)) x$pkgs else x$pkgs[x$pkgs$package %in% packages,] }) } cmc_async_update <- function(self, private) { self; private if (!is.null(private$update_deferred)) return(private$update_deferred) private$update_deferred <- async(private$update_replica_pkgs)()$ then(~ private$update_replica_rds())$ then(~ private$update_primary())$ then(~ private$data)$ catch(error = function(err) { err$message <- msg_wrap( conditionMessage(err), "\n\n", "Could not load or update metadata cache. If you think your local ", "cache is broken, try deleting it with `meta_cache_cleanup()`, or ", "the `$cleanup()` method.") stop(err) })$ finally(function() private$update_deferred <- NULL)$ share() } cmc_async_check_update <- function(self, private) { self; private if (!is.null(private$update_deferred)) return(private$update_deferred) if (!is.null(private$chk_update_deferred)) return(private$chk_update_deferred) private$chk_update_deferred <- async(private$update_replica_pkgs)()$ then(function(ret) { ## Some might be NULL, if failure was allowed and indeed it happened. ## For these we just pretend that they did not change, so they do ## not trigger an update. The metadata RDS builder is robust for ## these files to be empty or non-existing. stat <- viapply(ret, function(x) x$response$status_code %||% 304L) rep_files <- private$get_cache_files("replica") pkg_times <- file_get_time(rep_files$pkgs$path) if (! file.exists(rep_files$rds) || any(file_get_time(rep_files$rds) < pkg_times) || any(stat < 300)) { private$update_replica_rds(alert = FALSE) private$update_primary() private$data } else { private$async_ensure_cache() } })$ finally(function() private$chk_update_deferred <- NULL)$ share() } cmc_summary <- function(self, private) { dirs <- private$get_cache_files("primary") pgz <- dir(dirs$meta, recursive = TRUE, pattern = "^PACKAGES\\.gz$", full.names = TRUE) all <- dir(dirs$meta, recursive = TRUE, full.names = TRUE) list( cachepath = dirs$meta, lockfile = dirs$lock, current_rds = dirs$rds, raw_files = pgz, rds_files = dir(dirs$meta, pattern = "\\.rds$", full.names = TRUE), size = sum(file.size(all)) ) } #' @importFrom cli cli_alert_info cmc_cleanup <- function(self, private, force) { if (!force && !interactive()) { stop("Not cleaning up cache, please specify `force = TRUE`") } cache_dir <- private$get_cache_files("primary")$meta if (!force) { msg <- glue::glue( "Are you sure you want to clean up the cache in `{cache_dir}` (y/N)? ") ans <- readline(msg) if (! ans %in% c("y", "Y")) stop("Aborted") } local_cache_dir <- private$get_cache_files("replica") unlink(local_cache_dir, recursive = TRUE, force = TRUE) private$data <- NULL private$data_messaged <- NULL cli_alert_info("Cleaning up cache directory {.path {cache_dir}}.") unlink(cache_dir, recursive = TRUE, force = TRUE) } #' @importFrom digest digest #' @importFrom utils URLencode repo_encode <- function(repos) { paste0( vcapply(repos$name, URLencode, reserved = TRUE), "-", substr(vcapply(repos$url, digest), 1, 10) ) } cran_metadata_url <- function() { Sys.getenv( "R_PKG_CRAN_METADATA_URL", "https://cran.r-pkg.org/metadata/") } cmc__get_cache_files <- function(self, private, which) { root <- private[[paste0(which, "_path")]] repo_hash <- digest(list(repos = private$repos, dirs = private$dirs, version = private$cache_version)) str_platforms <- paste(private$platforms, collapse = "+") rds_file <- paste0("pkgs-", substr(repo_hash, 1, 10), ".rds") repo_enc <- rep(repo_encode(private$repos), each = nrow(private$dirs)) pkgs_dirs <- rep(private$dirs$contriburl, nrow(private$repos)) pkgs_files <- file.path(pkgs_dirs, "PACKAGES.gz") pkgs_files2 <- file.path(pkgs_dirs, "PACKAGES") mirror <- rep(private$repos$url, each = nrow(private$dirs)) name <- tolower(rep(private$repos$name, each = nrow(private$dirs))) type <- rep(private$repos$type, each = nrow(private$dirs)) r_version <- rep(private$dirs$rversion, nrow(private$repos)) bioc_version <- rep(private$repos$bioc_version, each = nrow(private$dirs)) pkg_path <- file.path(root, "_metadata", repo_enc, pkgs_files) meta_path <- ifelse( type == "cran" | name == "rspm", file.path(root, "_metadata", repo_enc, pkgs_dirs, "METADATA2.gz"), NA_character_) meta_etag <- ifelse( !is.na(meta_path), paste0(meta_path, ".etag"), NA_character_) meta_url <- ifelse( !is.na(meta_path), paste0(cran_metadata_url(), pkgs_dirs, "/METADATA2.gz"), NA_character_) list( root = root, meta = file.path(root, "_metadata"), lock = file.path(root, "_metadata.lock"), rds = file.path(root, "_metadata", rds_file), pkgs = tibble::tibble( path = pkg_path, etag = file.path(root, "_metadata", repo_enc, paste0(pkgs_files, ".etag")), basedir = pkgs_dirs, base = pkgs_files, mirror = mirror, url = paste0(mirror, "/", pkgs_files), fallback_url = paste0(mirror, "/", pkgs_files2), platform = rep(private$dirs$platform, nrow(private$repos)), type = type, r_version = r_version, bioc_version = bioc_version, meta_path = meta_path, meta_etag = meta_etag, meta_url = meta_url ) ) } #' Load the cache, asynchronously, with as little work as possible #' #' 1. If it is already loaded, and fresh return it. #' 2. Otherwise try the replica RDS. #' 3. Otherwise try the primary RDS. #' 4. Otherwise try the primary PACKAGES files. #' 5. Otherwise update the replica PACKAGES files, #' the replica RDS, and then also the primary PACKAGES and RDS. #' #' @param self self #' @param private private self #' @param max_age Maximum age allowed to consider the data current. #' @return Metadata. #' @keywords internal cmc__async_ensure_cache <- function(self, private, max_age) { max_age r <- try_catch_null(private$get_current_data(max_age)) %||% try_catch_null(private$get_memory_cache(max_age)) %||% try_catch_null(private$load_replica_rds(max_age)) %||% try_catch_null(private$load_primary_rds(max_age)) %||% try_catch_null(private$load_primary_pkgs(max_age)) if (is.null(r)) { self$async_update() } else { async_constant(r) } } cmc__get_current_data <- function(self, private, max_age) { "!!DEBUG Get current data?" if (is.null(private$data)) stop("No data loaded") if (is.null(private$data_time) || Sys.time() - private$data_time > max_age) { stop("Loaded data outdated") } "!!DEBUG Got current data!" if (! isTRUE(private$data_messaged)) { private$data_messaged <- TRUE } private$data } cmc__get_memory_cache <- function(self, private, max_age) { "!!DEBUG Get from memory cache?" rds <- private$get_cache_files("primary")$rds hit <- cmc__data[[rds]] if (is.null(hit)) stop("Not in the memory cache") if (is.null(hit$data_time) || Sys.time() - hit$data_time > max_age) { stop("Memory cache outdated") } private$data <- hit$data private$data_time <- hit$data_time private$data_messaged <- NULL private$data } #' Try to load the package metadata asynchronously, from the replica RDS #' #' If the replica has the RDS data, it is loaded and returned. #' Otherwise throws an error. #' #' @param self Self. #' @param private Private self. #' @param max_age Maximum age allowed for the RDS file to be considered #' as current. #' @return The metadata. #' @keywords internal #' @importFrom cli cli_process_start cli_process_done cmc__load_replica_rds <- function(self, private, max_age) { "!!DEBUG Load replica RDS?" rds <- private$get_cache_files("replica")$rds if (!file.exists(rds)) stop("No replica RDS file in cache") time <- file_get_time(rds) if (Sys.time() - time > max_age) stop("Replica RDS cache file outdated") sts <- cli_process_start("Loading metadata database") private$data <- readRDS(rds) private$data_time <- time private$data_messaged <- NULL "!!DEBUG Loaded replica RDS!" private$update_memory_cache() cli_process_done(sts) private$data } #' Load the metadata from the primary cache's RDS file #' #' If it exists and current, then the replica RDS is updated to it as well, #' and the data is returned. Otherwise throws an error. #' #' @inheritParams cmc__load_replica_rds #' @return Metadata. #' @keywords internal #' @importFrom cli cli_process_start cli_process_done cmc__load_primary_rds <- function(self, private, max_age) { "!!DEBUG Load primary RDS?" pri_files <- private$get_cache_files("primary") rep_files <- private$get_cache_files("replica") mkdirp(dirname(pri_files$lock)) l <- lock(pri_files$lock, exclusive = FALSE, private$lock_timeout) if (is.null(l)) stop("Cannot acquire lock to copy RDS") on.exit(unlock(l), add = TRUE) if (!file.exists(pri_files$rds)) stop("No primary RDS file in cache") time <- file_get_time(pri_files$rds) if (Sys.time() - time > max_age) stop("Primary RDS cache file outdated") ## Metadata files might be missing or outdated, that's ok (?) pkgs_times <- file_get_time(pri_files$pkgs$path) if (any(is.na(pkgs_times)) || any(pkgs_times >= time)) { unlink(pri_files$rds) stop("Primary PACKAGES missing or newer than replica RDS, removing") } sts <- cli_process_start("Loading metadata database") file_copy_with_time(pri_files$rds, rep_files$rds) unlock(l) private$data <- readRDS(rep_files$rds) private$data_time <- time private$data_messaged <- NULL private$update_memory_cache() cli_process_done(sts) private$data } #' Load metadata from the primary cache's PACKAGES files #' #' If they are not available, or outdated, it throws an error. #' Otherwise they are copied to the replica cache, and then used #' to create the RDS file. The RDS file is then written back to the #' primary cache and also loaded. #' #' @param self self #' @param private private self #' @param max_age Max age to consider the files current. #' @return Metadata. #' @keywords internal #' @importFrom cli cli_process_start cli_process_done cmc__load_primary_pkgs <- function(self, private, max_age) { "!!DEBUG Load replica PACKAGES*?" pri_files <- private$get_cache_files("primary") rep_files <- private$get_cache_files("replica") ## Lock mkdirp(dirname(pri_files$lock)) l <- lock(pri_files$lock, exclusive = FALSE, private$lock_timeout) if (is.null(l)) stop("Cannot acquire lock to copy PACKAGES files") on.exit(unlock(l), add = TRUE) ## Check if PACKAGES exist and current. It is OK if metadata is missing pkg_files <- pri_files$pkgs$path if (!all(file.exists(pkg_files))) { stop("Some primary PACKAGES files don't exist") } time <- file_get_time(pkg_files) if (any(Sys.time() - time > max_age)) { stop("Some primary PACKAGES files are outdated") } ## Copy to replica, if we cannot copy the etags, that's ok sts <- cli_process_start("Loading metadata database") private$copy_to_replica(rds = FALSE, pkgs = TRUE, etags = TRUE) ## Update RDS in replica, this also loads it private$update_replica_rds(alert = FALSE) ## Update primary, but not the PACKAGES private$update_primary(rds = TRUE, packages = FALSE, lock = FALSE) cli_process_done(sts) private$data } #' Update the PACKAGES files in the replica cache #' #' I.e. download them, if they have changed. #' #' @param self self #' @param private private self #' @keywords internal cmc__update_replica_pkgs <- function(self, private) { "!!DEBUG Update replica PACKAGES" tryCatch( private$copy_to_replica(rds = TRUE, pkgs = TRUE, etags = TRUE), error = function(e) e) rep_files <- private$get_cache_files("replica") pkgs <- rep_files$pkgs meta <- !is.na(pkgs$meta_url) dls <- data.frame( stringsAsFactors = FALSE, url = c(pkgs$url, pkgs$meta_url[meta]), fallback_url = c(pkgs$fallback_url, rep(NA_character_, sum(meta))), path = c(pkgs$path, pkgs$meta_path[meta]), etag = c(pkgs$etag, pkgs$meta_etag[meta]), timeout = rep(c(200, 100), c(nrow(pkgs), sum(meta))), mayfail = TRUE ) download_files(dls)$ then(function(result) { missing_pkgs_note(pkgs, result) result }) } # E.g. "R 4.1 macos packages are missing from CRAN and Bioconductor" missing_pkgs_note <- function(pkgs, result) { bad <- vlapply(result[seq_len(nrow(pkgs))], inherits, "error") if (!any(bad)) return() repo_name <- function(type, url) { if (type == "cran") return("CRAN") if (type == "bioc") return("Bioconductor") sub("^https?://([^/]*).*$", "\\1", url) } msgs <- lapply(which(bad), function(i) { list( paste0( if (pkgs$r_version[i] != "*") paste0("R ", pkgs$r_version[i], " "), pkgs$platform[i] ), repo_name(pkgs$type[i], pkgs$mirror[i]) ) }) what <- vcapply(msgs, "[[", 1) where <- vcapply(msgs, "[[", 2) for (wt in unique(what)) { wh <- unique(where[what == wt]) cli_alert_info("{wt} packages are missing from {wh}") } } #' Update the replica RDS from the PACKAGES files #' #' Also loads it afterwards. #' #' @param self self #' @param private private self #' @param alert whether to show message about the update #' @keywords internal #' @importFrom cli cli_process_start cli_process_done cmc__update_replica_rds <- function(self, private, alert) { "!!DEBUG Update replica RDS" if (alert) sts <- cli_process_start("Updating metadata database") rep_files <- private$get_cache_files("replica") data_list <- lapply_rows( rep_files$pkgs, function(r) { rversion <- if (r$platform == "source") "*" else private$r_version tryCatch( read_packages_file(r$path, mirror = r$mirror, repodir = r$basedir, platform = r$platform, rversion = rversion, type = r$type, meta_path = r$meta_path), error = function(x) { message() warning(msg_wrap( "Cannot read metadata information from `", r$path, "`. ", "The file is corrupt. Try deleting the metadata cache with ", "`pkgcache::meta_cache_cleanup()` or the `$cleanup()` method"), immediate. = TRUE) NULL } ) }) data_list <- data_list[!vlapply(data_list, is.null)] if (length(data_list) == 0) stop("No metadata available") private$data <- merge_packages_data(.list = data_list) saveRDS(private$data, file = rep_files$rds, version = 2) private$data_time <- file_get_time(rep_files$rds) private$data_messaged <- NULL private$update_memory_cache() if (alert) cli_process_done(sts) private$data } #' Update the primary cache from the replica cache #' #' @param self self #' @param private private self #' @param rds Whether to update the RDS file. #' @param packages Whether to update the PACKAGES files (+ ETag files). #' @return Nothing. #' #' @keywords internal cmc__update_primary <- function(self, private, rds, packages, lock) { "!!DEBUG Updata primary cache" if (!rds && !packages) return() pri_files <- private$get_cache_files("primary") rep_files <- private$get_cache_files("replica") if (lock) { mkdirp(dirname(pri_files$lock)) l <- lock(pri_files$lock, exclusive = TRUE, private$lock_timeout) if (is.null(l)) stop("Cannot acquire lock to update primary cache") on.exit(unlock(l), add = TRUE) } if (rds) { file_copy_with_time(rep_files$rds, pri_files$rds) } if (packages) { file_copy_with_time(rep_files$pkgs$path, pri_files$pkgs$path) file_copy_with_time(rep_files$pkgs$etag, pri_files$pkgs$etag) file_copy_with_time(na_omit(rep_files$pkgs$meta_path), na_omit(pri_files$pkgs$meta_path)) file_copy_with_time(na_omit(rep_files$pkgs$meta_etag), na_omit(pri_files$pkgs$meta_etag)) } invisible() } cmc__update_memory_cache <- function(self, private) { rds <- private$get_cache_files("primary")$rds cmc__data[[rds]] <- list(data = private$data, data_time = private$data_time) } cmc__copy_to_replica <- function(self, private, rds, pkgs, etags) { pri_files <- private$get_cache_files("primary") rep_files <- private$get_cache_files("replica") mkdirp(dirname(pri_files$lock)) l <- lock(pri_files$lock, exclusive = FALSE, private$lock_timeout) if (is.null(l)) stop("Cannot acquire lock to copy primary cache") on.exit(unlock(l), add = TRUE) if (rds) { file_copy_with_time(pri_files$rds, rep_files$rds) } if (pkgs) { file_copy_with_time(pri_files$pkgs$path, rep_files$pkgs$path) file_copy_with_time(na_omit(pri_files$pkgs$meta_path), na_omit(rep_files$pkgs$meta_path)) } if (etags) { file_copy_with_time(pri_files$pkgs$etag, rep_files$pkgs$etag) file_copy_with_time(na_omit(pri_files$pkgs$meta_etag), na_omit(rep_files$pkgs$meta_etag)) } } extract_deps <- function(pkgs, packages, dependencies, recursive) { realdep <- interpret_dependencies(dependencies) dep <- tolower(realdep$direct) new <- packages repeat { new <- setdiff( pkgs$deps$package[pkgs$deps$upstream %in% new & pkgs$deps$type %in% dep], packages) if (!length(new)) break packages <- c(packages, new) if (!recursive) break dep <- tolower(realdep$indirect) } packages <- setdiff(packages, "R") res <- pkgs$pkgs[pkgs$pkgs$package %in% packages, ] base <- intersect(packages, base_packages()) attr(res, "base") <- base attr(res, "unknown") <- setdiff(packages, c(res$package, base)) res } extract_revdeps <- function(pkgs, packages, dependencies, recursive) { realdep <- interpret_dependencies(dependencies) dep <- tolower(realdep$direct) new <- packages repeat { new <- setdiff( pkgs$deps$upstream[pkgs$deps$ref %in% new & pkgs$deps$type %in% dep], packages) if (!length(new)) break packages <- c(packages, new) if (!recursive) break dep <- tolower(realdep$indirect) } packages <- setdiff(packages, "R") res <- pkgs$pkgs[pkgs$pkgs$package %in% packages, ] base <- intersect(packages, base_packages()) attr(res, "base") <- base attr(res, "unknown") <- setdiff(packages, c(res$package, base)) res } cmc__get_repos <- function(repos, bioc, cran_mirror, r_version) { repos[["CRAN"]] <- cran_mirror repos <- unlist(repos) bioc_names <- bioconductor$get_repos() res <- tibble( name = names(repos), url = unname(repos), type = ifelse( names(repos) == "CRAN", "cran", ifelse(names(repos) %in% bioc_names, "bioc", "cranlike") ), r_version = "*", bioc_version = NA_character_ ) if (bioc) { for (rver in r_version) { bioc_version <- as.character(bioconductor$get_bioc_version(rver)) bioc_repos <- bioconductor$get_repos(bioc_version) bioc_res <- tibble( name = names(bioc_repos), url = unname(bioc_repos), type = "bioc", r_version = rver, bioc_version = bioc_version ) res <- rbind(res, bioc_res) } } res <- res[!duplicated(res$url), ] res } #' Query CRAN(like) package data #' #' It uses CRAN and BioConductor packages, for the current platform and #' R version, from the default repositories. #' #' `meta_cache_list()` lists all packages. #' #' `meta_cache_update()` updates all metadata. Note that metadata is #' automatically updated if it is older than seven days. #' #' `meta_cache_deps()` queries packages dependencies. #' #' `meta_cache_revdeps()` queries reverse package dependencies. #' #' `meta_cache_summary()` lists data about the cache, including its location #' and size. #' #' `meta_cache_cleanup()` deletes the cache files from the disk. #' #' @param packages Packages to query. #' @param dependencies Dependency types to query. See the `dependencies` #' parameter of [utils::install.packages()]. #' @param recursive Whether to query recursive dependencies. #' @param force Whether to force cleanup without asking the user. #' @return A data frame (tibble) of the dependencies. For #' `meta_cache_deps()` and `meta_cache_revdeps()` it includes the #' queried `packages` as well. #' #' @export #' @examplesIf pkgcache:::run_examples() #' meta_cache_list("pkgdown") #' meta_cache_deps("pkgdown", recursive = FALSE) #' meta_cache_revdeps("pkgdown", recursive = FALSE) meta_cache_deps <- function(packages, dependencies = NA, recursive = TRUE) { get_cranlike_metadata_cache()$deps(packages, dependencies, recursive) } #' @export #' @rdname meta_cache_deps meta_cache_revdeps <- function(packages, dependencies = NA, recursive = TRUE) { get_cranlike_metadata_cache()$revdeps(packages, dependencies, recursive) } #' @export #' @rdname meta_cache_deps meta_cache_update <- function() { invisible(get_cranlike_metadata_cache()$update()$pkgs) } #' @export #' @rdname meta_cache_deps meta_cache_list <- function(packages = NULL) { get_cranlike_metadata_cache()$list(packages) } #' @export #' @rdname meta_cache_deps meta_cache_cleanup <- function(force = FALSE) { get_cranlike_metadata_cache()$cleanup(force = force) } #' @export #' @rdname meta_cache_deps meta_cache_summary <- function() { get_cranlike_metadata_cache()$summary() }