Retrieve pre-computed influence scores between BANC neurons from partitioned parquet files stored on Google Cloud Storage (GCS). Influence scores quantify how much a "seed" (upstream) neuron's activity affects a "target" (downstream) neuron's steady-state response, based on the connectome's synaptic weight matrix (see Bates et al. 2020).
Usage
banc_influence(
  upstream_ids = NULL,
  downstream_ids = NULL,
  const = 24,
  min_score = 0,
  method = c("arrow", "duckdb"),
  local_path = NULL,
  force_download = FALSE
)
Arguments
- upstream_ids
Character vector of upstream (seed) neuron root IDs. If NULL, all upstream neurons are included (use with caution: the dataset is very large).
- downstream_ids
Character vector of downstream (target) neuron root IDs. If NULL, all downstream neurons are included.
- const
Numeric constant for the adjusted influence calculation. Adjusted influence = max(0, log(raw_influence) + const). The default of 24 corresponds to a minimum meaningful influence of exp(-24) (approx. 3.78e-11).
- min_score
Minimum adjusted influence score to return. Pairs with adjusted influence below this threshold are filtered out. Default 0 returns all pairs with non-zero adjusted influence.
- method
Character, either "arrow" or "duckdb". Controls which backend is used to read the parquet files.
- local_path
Path to a local directory containing the parquet files. If NULL (default), uses a cache directory under tools::R_user_dir("bancr", "cache").
- force_download
Logical. If TRUE, re-download parquet files from GCS even if a local cache exists. Default FALSE.
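The `method = c("arrow", "duckdb")` default in the usage follows a common R idiom in which the first element of the vector acts as the default choice. The sketch below illustrates that idiom with `match.arg()`; `pick_backend` is a hypothetical stand-in written for this example, and whether `banc_influence()` resolves its argument exactly this way is an assumption.

```r
# Hypothetical stand-in for backend selection; not bancr's actual code.
pick_backend <- function(method = c("arrow", "duckdb")) {
  # match.arg() returns the first choice when `method` is not supplied,
  # and raises an error for values outside the allowed set.
  match.arg(method)
}

default_backend <- pick_backend()          # nothing supplied: first choice
explicit_backend <- pick_backend("duckdb") # explicit choice
```

Under this idiom, calling the function without `method` selects the "arrow" backend.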
Value
A data.frame with columns:
- upstream_id
Character. Root ID of the upstream (seed) neuron.
- downstream_id
Character. Root ID of the downstream (target) neuron.
- raw_influence
Numeric. Raw steady-state influence score.
- adjusted_influence
Numeric. max(0, log(raw_influence) + const).
Details
The parquet files contain the columns upstream_id,
downstream_id, and raw_influence. Adjusted influence is
computed on the fly as max(0, log(raw_influence) + const), where log is the natural logarithm.
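As a worked illustration of the formula above, the following sketch applies it to a few made-up raw scores. `adjust_influence` is a hypothetical helper written for this example, not a function exported by the package, and the input values are invented.

```r
# Hypothetical helper mirroring the documented formula; not part of bancr.
adjust_influence <- function(raw_influence, const = 24) {
  # log() is the natural logarithm; pmax() floors the result at 0 elementwise.
  pmax(0, log(raw_influence) + const)
}

# Made-up raw scores: one well above the floor, one at exp(-24),
# one below it, and an exact zero (log(0) is -Inf, which floors to 0).
raw <- c(1e-3, exp(-24), 1e-12, 0)
adj <- adjust_influence(raw)
```

With the default `const = 24`, a raw score of 1e-3 maps to roughly 17.09, while anything at or below exp(-24) maps to (approximately) 0.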
Data is read from a local cache directory. On first use (or when
force_download = TRUE), the parquet files are downloaded from GCS using the
gsutil command-line tool. Subsequent calls read directly from the cache.
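To see where the cache lives and whether a download is likely to be triggered, something like the following can be run before calling the function. This is a sketch under assumptions: it uses the default location named in the arguments, assumes cached files end in `.parquet` somewhere below that directory, and requires R >= 4.0.0 for `tools::R_user_dir()`.

```r
# Default cache location described in the arguments (R >= 4.0.0).
cache_dir <- tools::R_user_dir("bancr", which = "cache")

# Assumption: cached files carry a .parquet extension at or below cache_dir.
cached <- if (dir.exists(cache_dir)) {
  list.files(cache_dir, pattern = "\\.parquet$",
             recursive = TRUE, full.names = TRUE)
} else {
  character(0)
}

# With no cached files, the first call would need to download from GCS,
# which in turn needs gsutil to be discoverable on the PATH.
needs_download <- length(cached) == 0
has_gsutil <- nzchar(Sys.which("gsutil"))
```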
The "arrow" method uses arrow::open_dataset() with
predicate pushdown to scan only the relevant chunks. The "duckdb"
method registers a DuckDB view over the parquet files for fast SQL-based
filtering. For small queries (few upstream/downstream IDs), both methods
are fast; for large scans, "duckdb" may be faster.
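The two access patterns described above can be sketched on a small toy dataset. This is an illustration only, not the package's internal code: the directory, file name, view name and values are all invented, and the arrow, dplyr, DBI and duckdb packages are assumed to be installed.

```r
library(arrow)
library(dplyr)
library(DBI)

# Toy stand-in for the real parquet files (values are made up).
toy_dir <- file.path(tempdir(), "toy_influence")
dir.create(toy_dir, showWarnings = FALSE)
toy <- data.frame(
  upstream_id   = c("1", "1", "2"),
  downstream_id = c("10", "11", "10"),
  raw_influence = c(1e-3, 1e-12, 1e-5)
)
write_parquet(toy, file.path(toy_dir, "part-0.parquet"))

# arrow: the filter is evaluated during the scan, before rows reach R.
res_arrow <- open_dataset(toy_dir) %>%
  filter(upstream_id %in% c("1")) %>%
  collect()

# duckdb: register a view over the same files, then filter with SQL.
con <- dbConnect(duckdb::duckdb())
glob <- file.path(normalizePath(toy_dir, winslash = "/"), "*.parquet")
dbExecute(con, sprintf(
  "CREATE VIEW influence AS SELECT * FROM read_parquet('%s')", glob
))
res_duckdb <- dbGetQuery(
  con, "SELECT * FROM influence WHERE upstream_id = '1'"
)
dbDisconnect(con, shutdown = TRUE)
```

Both queries return the same two rows for the toy seed neuron "1"; only the way the filter is expressed differs between the backends.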
References
Bates, A.S., Schlegel, P., Roberts, R.J.V. et al. Complete Connectomic Reconstruction of Olfactory Projection Neurons in the Fly Brain. Curr Biol 30, 3183-3199.e6 (2020).
Examples
if (FALSE) { # \dontrun{
  # Get influence of one neuron on all targets
  inf <- banc_influence(upstream_ids = "720575941521131930")
  head(inf)

  # Get influence between specific pairs
  inf <- banc_influence(
    upstream_ids = c("720575941521131930", "720575941478275714"),
    downstream_ids = c("720575941555924992")
  )

  # Use duckdb backend
  inf <- banc_influence(
    upstream_ids = "720575941521131930",
    method = "duckdb"
  )

  # Only return strong connections
  inf <- banc_influence(
    upstream_ids = "720575941521131930",
    min_score = 5
  )
} # }
