Fine-tune Step Size

This tutorial demonstrates how to fine-tune the step size parameter in StarTrail. Note that this is a heuristic approach: tuning the step size is inherently difficult because spatial gradients cannot be observed directly, so real data offer no ground truth to tune against.
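
To illustrate why the step size matters, here is a minimal, self-contained sketch. It does not call StarTrail: it applies a generic central difference to a 1D curve, and `noise_sd` is an assumed stand-in for error in a fitted surface. The step values are arbitrary choices for illustration.

```r
set.seed(1)
x <- seq(0.1, 0.9, length.out = 200)       # evaluation points
f <- function(x) 10 * sin(3 * pi * x)      # smooth 1D curve
true_grad <- 30 * pi * cos(3 * pi * x)     # analytic derivative of f
noise_sd <- 0.1                            # assumed error in the fitted surface

steps <- c(0.001, 0.05, 0.5)               # small, intermediate, large step
mse <- sapply(steps, function(h) {
  # Central difference on noisy evaluations of f
  up   <- f(x + h) + rnorm(length(x), 0, noise_sd)
  down <- f(x - h) + rnorm(length(x), 0, noise_sd)
  est  <- (up - down) / (2 * h)
  mean((est - true_grad)^2)
})
print(data.frame(step = steps, mse = mse))
```

In this toy setting the intermediate step gives the lowest MSE: a very small step amplifies the noise, while a very large step smooths over the curvature of the pattern.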

Recommended Step Size

For regular grid data, we recommend using the minimal separation ι or 0.8 × ι, as used in all our analyses. However, if your spatial coordinates are irregular (e.g., having different densities in different regions), we present below a simulation-based approach to help you select a potentially better step size for your data.

Calculate Minimal Separation

First, calculate the minimal separation between spots in your spatial coordinates:

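# Pairwise distances between all spots
coord_dist = cdist_r(coords, coords)
coord_dist_temp = coord_dist
# Mask the zero self-distances on the diagonal so that min() ignores them
diag(coord_dist_temp) = Inf
min_sep = min(coord_dist_temp)
print(min_sep)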
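
As a cross-check, the same quantity can be sketched with base R alone. The example below uses a made-up regular grid (not real data) and `dist()`, which returns only the distances between distinct rows, so no diagonal masking is needed.

```r
# Toy coordinates: a regular 5 x 5 grid with spacing 0.25 (illustrative only)
coords_toy <- as.matrix(expand.grid(s1 = seq(0, 1, by = 0.25),
                                    s2 = seq(0, 1, by = 0.25)))

# Pairwise Euclidean distances between distinct spots
d <- dist(coords_toy)

# Smallest distance between any two spots
min_sep_toy <- min(d)
print(min_sep_toy)
```

If the coordinates contain exact duplicates, this minimum is zero; in that case it may be more useful to take the minimum over strictly positive distances, e.g. `min(d[d > 0])`.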

Generate Synthetic Pattern

Next, we generate a synthetic spatial pattern to test different step sizes. This allows us to evaluate performance against a known ground truth:

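set.seed(1)  # make the simulated pattern reproducible
tau = 1      # noise standard deviation
# Synthetic pattern: smooth mean surface plus Gaussian noise
y <- rnorm(nrow(coords), 10*(sin(3*pi*coords[,1])+cos(3*pi*coords[,2])), tau)
thread = 10  # number of threads
m.r = fit_NNGP(coords, y, neighbor = 10, threads = thread) # here we use 10 neighbors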
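
For reference, the ground-truth gradients of this pattern follow from differentiating the mean surface passed to `rnorm`. Writing the two coordinates as $s_1, s_2$ and the mean surface as $\mu$ (symbols introduced here for illustration):

```latex
\mu(s_1, s_2) = 10\left[\sin(3\pi s_1) + \cos(3\pi s_2)\right],
\qquad
\frac{\partial \mu}{\partial s_1} = 30\pi \cos(3\pi s_1),
\qquad
\frac{\partial \mu}{\partial s_2} = -30\pi \sin(3\pi s_2).
```

The constant factor $30\pi$ has no effect on a correlation with the estimated gradients, but it does affect any MSE computed against them.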

Evaluate Different Step Sizes

Now we test different step size scales and evaluate their performance using correlation and mean squared error (MSE) metrics:

library(data.table)

path = './result/'

set.seed(1)
# Step sizes to try, expressed as multiples of the minimal separation
result = data.table(scale = c(0.01, 0.1, 0.5, 0.8, 1, 5, 10, 20, 100))
result$cor_g1 = NA_real_; result$cor_g2 = NA_real_
result$mse_g1 = NA_real_; result$mse_g2 = NA_real_

for(i in 1:nrow(result)){
    scale = result$scale[i]
    gradient_all = finite_difference(coords, min_sep*scale, m.r, threads=thread,
                                     prefix = paste0('scale', scale), path=path)
    # Convert to a data frame so that columns can be accessed with `$`
    gradient_all = as.data.frame(cbind(coords, y, gradient_all))
    colnames(gradient_all) = c('s1', 's2', 'y', 'pred', 'g1', 'g2',
                               'g1_min', 'g1_max', 'g2_min', 'g2_max')

    # True gradients of the mean surface 10*(sin(3*pi*s1) + cos(3*pi*s2));
    # the chain rule contributes a factor of 3*pi
    true_g1 = 30*pi*cos(3*pi*gradient_all$s1)
    true_g2 = -30*pi*sin(3*pi*gradient_all$s2)

    result$cor_g1[i] = cor(gradient_all$g1, true_g1)
    result$cor_g2[i] = cor(gradient_all$g2, true_g2)
    result$mse_g1[i] = mean((gradient_all$g1 - true_g1)^2)
    result$mse_g2[i] = mean((gradient_all$g2 - true_g2)^2)
}

The result table contains the correlation and MSE for each step size scale. Choose the scale that provides the best balance between correlation (higher is better) and MSE (lower is better) for your specific dataset.
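
One possible way to make this choice less subjective is to rank the scales on all four metrics and sort by the average rank. The helper below is a hypothetical sketch, not part of StarTrail; `pick_scale` and `avg_rank` are names introduced here, and equal weighting of the four metrics is an assumption you may want to change.

```r
# Hypothetical helper (not part of StarTrail): rank every scale on each
# metric and return the table sorted by average rank (best first).
pick_scale <- function(result) {
  result <- as.data.frame(result)
  ranks <- cbind(
    rank(-result$cor_g1),  # higher correlation is better (rank 1 = best)
    rank(-result$cor_g2),
    rank(result$mse_g1),   # lower MSE is better (rank 1 = best)
    rank(result$mse_g2)
  )
  result$avg_rank <- rowMeans(ranks)
  result[order(result$avg_rank), ]
}

# Usage, once `result` has been filled in by the loop above:
# head(pick_scale(result), 3)
```

A rank-based summary ignores how large the differences between scales are, so it is still worth inspecting the raw correlation and MSE values before settling on a step size.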
