Skip to content

Commit

Permalink
Merge pull request #16 from morrowcj/maxdist_fitcor
Browse files Browse the repository at this point in the history
vastly improved speed of fitCor and fixed memory error.
  • Loading branch information
morrowcj authored Dec 16, 2022
2 parents 07261a1 + 1a05f78 commit 0e6475e
Show file tree
Hide file tree
Showing 6 changed files with 116 additions and 32 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: remotePARTS
Title: Spatiotemporal Autoregression Analyses for Large Data Sets
Version: 1.0.1
Version: 1.0.2
Authors@R:
c(person(given = "Clay",
family = "Morrow",
Expand Down
7 changes: 7 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,10 @@
# v1.0.2

* fixed a bug where `fitCor()` was calculating a full distance matrix instead
of one among only the sampled pixels.
* added option to suppress saving the `nls` model in `fitCor()`
* added $\rho$ to the output of `fitAR_map()`

# v1.0.1

fixed bug where `fitGLS_partition` would not work properly when `ncores > 1` and
Expand Down
48 changes: 26 additions & 22 deletions R/fitCor.R
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@
#' For accurate results, \code{resids} and \code{coords} must be paired matrices.
#' Rows of both matrices should correspond to the same pixels.
#'
#' Distances between pixels are calculated with the function specified by
#' Distances between sapmled pixels are calculated with the function specified by
#' \code{distm_FUN}. This function can be any that takes a coordinate
#' matrix as input and returns a distance matrix between points. Some options
#' provided by \code{remotePARTS} are \code{distm_km()}, which returns distances
Expand All @@ -48,15 +48,26 @@
#' results, we recommend taking a random sample of pixels manually and passing in
#' those values as \code{index}.
#'
#' Parameter estimates always match the scale of distances calculated by \code{distm_FUN}.
#' It is very important that you re-scale these estimates if needed. For example,
#' Caution: Note that a distance matrix, of size \eqn{n \times n} must be fit to the
#' sampled data where \eqn{n} is either \code{fit.n} or \code{length(index)}.
#' Take your computer's memory and processing time into consideration when choosing
#' this size.
#'
#' Parameter estimates are always returned in the same scale of distances
#' calculated by \code{distm_FUN}. It is very important that these estimates
#' are re-scaled by users if output of \code{distm_FUN} use units different from
#' the desired scale. For example,
#' if the function \code{covar_FUN = function(d, r, a){-(d/r)^a}} is used
#' with \code{distm_FUN = "distm_scaled"}, the estimated range parameter \code{r}
#' can be re-scaled to units of your map by multiplying \code{r} by the
#' \code{max.distance}. The shape parameter \code{a} would not need re-scaling.
#' will be based on a unit-map. Users will likely want to re-scaled it to map
#' units by multiplying \code{r} by the maximum distance among points on your map.
#'
#' note that when \code{distm_FUN = "distm_scaled"}, \code{max.distance} is
#' the maximum distance before re-scaling (in km by default).
#' If the \code{distm_FUN} is on the scale of your map (e.g., "distm_km"),
#' re-scaling is not needed but may be preferable, since it is scaled to the
#' maximum distance among the sampled data rather than the true maximum
#' distance. For example, dividing the range parameter by \code{max.distance}
#' and then multiplying it by the true max distance may provide a better range
#' estimate.
#'
#' @return \code{fitCor} returns a list object of class "remoteCor", which contains
#' these elements:
Expand All @@ -65,7 +76,7 @@
#' \item{call}{the function call}
#' \item{mod}{the \code{nls} fit object, if \code{save_mod=TRUE}}
#' \item{spcor}{a vector of the estimated spatial correlation parameters}
#' \item{max.distance}{the maximum distance between pixels. Units are determined by \code{distm_FUN}}
#' \item{max.distance}{the maximum distance among the sampled pixels, as calculated by \code{dist_FUN}.}
#' \item{logLik}{the log-likelihood of the fit}
#' }
#'
Expand Down Expand Up @@ -135,18 +146,9 @@ fitCor <- function(resids, coords, distm_FUN = "distm_scaled", covar_FUN = "cova
sub.inx = index
} else { # random pixels
# dimensions
n = nrow(resids); p = ncol(resids)
fit.n = ifelse(fit.n < n, fit.n, n)

# max distance
max.d = dist.f(coords)
if("max.dist" %in% names(attributes(max.d))){
max.d = attr(max.d, "max.dist")
} else {
max.d = max(max.d, na.rm = TRUE)
}

# subset
n = nrow(resids)
fit.n = min(fit.n, n) # prevent fit.n > n
# random subset
sub.inx = sample.int(n, fit.n)
}
resids = resids[sub.inx, ]
Expand All @@ -157,6 +159,7 @@ fitCor <- function(resids, coords, distm_FUN = "distm_scaled", covar_FUN = "cova

# calculate scaled distance
D = dist.f(coords)
max_d = max(D) # max distance among sub-samples

# convert matrices to vectors and combine to data frame
w = data.frame(dist = D[upper.tri(D, diag = TRUE)],
Expand All @@ -176,13 +179,14 @@ fitCor <- function(resids, coords, distm_FUN = "distm_scaled", covar_FUN = "cova
if (err.bool) {
stop("nls failed to find a solution with the following error: ", "'", err$message, "'",
"\n Are values given by start on the same scale as distances calculated with distm_FUN?",
"\n If so, try different starting values or a different covar_FUN")
"\n If so, try different starting values or a different covar_FUN.",
"\n Simply re-running the funciton with different samples may also yield a solution.")
}

spcor = coef(fit)

out <- list(call = call, mod = if(save_mod){fit}else{NULL}, spcor = spcor,
max.distance = max.d, logLik = logLik(fit))
max.distance = max_d, logLik = logLik(fit))
class(out) <- append("remoteCor", class(out))
return(out)
}
Expand Down
29 changes: 29 additions & 0 deletions R/spatial-functions.R
Original file line number Diff line number Diff line change
Expand Up @@ -176,3 +176,32 @@ covar_exppow <- function(d, range, shape){
exp(-(d/range)^shape)
}


#' calculate maximum distance among a table of coordinates
#'
#' @param coords the coordinate matrix (or dataframe) from which a maximum distance is desired.
#' @param dist_FUN the distance function used to calculate distances
#'
#' @details First the outermost points are found by fitting a convex hull in
#' Euclidean space. Then, the distances between these outer points is calculated
#' with \code{dist_FUN}, and the maximum of these distances is returned
#'
#' This is a fast, simple way of determining the maximum distance.
#'
#' @return The maximum distance between two points (units determined by
#' \code{dist_FUN})
#'
#' @examples
#' coords <- matrix(stats::rnorm(20e6), ncol = 2) # cloud of 20 million pixels
#' remotePARTS:::max_dist(coords)
#'
#' remotePARTS:::max_dist(coords, dist_FUN = "distm_scaled")
max_dist <- function(coords, dist_FUN = "distm_km"){
convex_hull = grDevices::chull(coords)
boundary_coords = coords[convex_hull, ]
# match distance function
dist.f = match.fun(dist_FUN)
D = dist.f(boundary_coords)
max_d = max(D)
return(max(D))
}
29 changes: 20 additions & 9 deletions man/fitCor.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

33 changes: 33 additions & 0 deletions man/max_dist.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

0 comments on commit 0e6475e

Please sign in to comment.