Check and filter the stratified data by minimum required samples for modelling, and prepare data format for use by models.
Usage
prepare_data(
strata_data,
min_year = NULL,
max_year = NULL,
min_n_routes = 3,
min_max_route_years = 3,
min_mean_route_years = 1,
quiet = FALSE,
assume_observer_variation_log_normal = FALSE
)
Arguments
- strata_data
List. Stratified data generated by stratify().
- min_year
Numeric. Minimum year to use. Default (NULL) uses first year in data.
- max_year
Numeric. Maximum year to use. Default (NULL) uses last year in data.
- min_n_routes
Numeric. Required minimum number of routes per stratum on which the species has been observed. Default 3.
- min_max_route_years
Numeric. Required minimum number of years with non-zero observations of species on at least 1 route. Default 3. Only retain strata with at least one route where the species was observed at least once in this many years.
- min_mean_route_years
Numeric. Required minimum average of years per route with the species observed. Default 1. Only retain strata where the average number of years the species was observed per route is greater than this value.
- quiet
Logical. Suppress progress messages? Default FALSE.
- assume_observer_variation_log_normal
Logical. Default FALSE. If FALSE, the model generates indices that adjust only for temporal variation in observers, so the annual indices are the expected counts in a given year averaged across all observers and all routes in the stratum. If TRUE, the annual indices adjust for both spatial and temporal variation in observers by using the same retrans_obs retransformation factor for all strata, so the annual indices are the expected counts in a given year averaged across all routes in the stratum. The TRUE option assumes that all observer variation is 1) noise, 2) approximately log-normally distributed (retrans_obs = 0.5*sdobs^2), and 3) exchangeable among observers. TRUE gives a more appropriate estimate of the uncertainty in an expected count, because it generates a prediction error across all observers in the analysis; however, even small departures from the assumption of log-normally distributed errors can introduce bias into the annual indices, relative to the observed counts in the dataset. The FALSE option may be useful where one or more of these 3 assumptions are questionable, including where the observer variation is not log-normal (e.g., heavy-tailed or skewed). The difference can be important because it can change the relative scaling of annual indices among strata, which in turn influences the weight of each stratum-level trend on trends for composite regions.
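The retransformation used in the TRUE case rests on a standard property of the log-normal distribution: if observer effects on the log scale are normal with mean 0 and standard deviation sdobs, the mean of their exponentials is exp(0.5*sdobs^2). The following is a minimal base R sketch using simulated values, not the package's internals; the value of sdobs, the variable names, and the choice of a Laplace distribution as the heavy-tailed alternative are all illustrative assumptions. It checks the identity numerically and shows how the same correction is biased when the effects are heavier-tailed than normal.

```r
# Illustrative simulation only; this is not bbsBayes2 code.
set.seed(42)
sdobs <- 0.5   # assumed log-scale SD of observer effects
n <- 1e6       # number of simulated observer effects

# Log-normal case: observer effects are normal on the log scale
obs_normal <- rnorm(n, mean = 0, sd = sdobs)
retrans_obs <- 0.5 * sdobs^2            # log-scale correction from the text
analytic_mean <- exp(retrans_obs)       # mean of exp(effect) if log-normal holds
simulated_mean <- mean(exp(obs_normal))

# Heavy-tailed case: Laplace effects with the same mean (0) and SD (sdobs).
# The difference of two iid exponentials with scale b is Laplace(0, b),
# whose SD is b * sqrt(2).
b <- sdobs / sqrt(2)
obs_heavy <- rexp(n, rate = 1 / b) - rexp(n, rate = 1 / b)
heavy_mean <- mean(exp(obs_heavy))

# Compare the analytic correction with both simulated means
round(c(analytic = analytic_mean,
        normal = simulated_mean,
        heavy_tailed = heavy_mean), 3)
```

With normal effects the simulated mean should sit very close to the analytic value, while the Laplace effects give a slightly larger mean despite having the same standard deviation, which is the kind of bias the argument description warns about.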
Value
List of prepared (meta) data to be used for modelling and further steps.
- model_data - list of data formatted for use in Stan modelling
- meta_data - meta data defining the analysis
- meta_strata - data frame listing strata meta data
- raw_data - data frame of summarized counts used to create model_data (just formatted more nicely)
See also
Other Data prep functions:
prepare_model(),
prepare_spatial(),
stratify()
Examples
# Toy example with Pacific Wren sample data
# First, stratify the sample data
s <- stratify(by = "bbs_cws", sample_data = TRUE)
#> Using 'bbs_cws' (standard) stratification
#> Using sample BBS data...
#> Using species Pacific Wren (sample data)
#> Filtering to species Pacific Wren (7221)
#> Stratifying data...
#> Combining BCR 7 and NS and PEI...
#> Renaming routes...
# Prepare the stratified data for use in a model. In this
# toy example, we will set the minimum year as 2009 and
# maximum year as 2018, effectively only setting up to
# model 10 years of data.
p <- prepare_data(s, min_year = 2009, max_year = 2018)
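To make the three inclusion thresholds described under Arguments more concrete, here is a rough base R sketch that computes comparable per-stratum summaries on a small invented data frame. It is not the package's implementation: the toy data, the column names, and the exact comparison operators are assumptions read off the argument descriptions, and the per-route averages are taken only over routes where the species was detected at least once.

```r
# Hypothetical toy counts: one row per stratum, route and year
counts <- data.frame(
  strata = c(rep("A", 9), rep("B", 4)),
  route  = c(rep(c("A1", "A2", "A3"), each = 3), rep(c("B1", "B2"), each = 2)),
  year   = c(rep(2009:2011, 3), rep(2009:2010, 2)),
  count  = c(2, 1, 4,  0, 3, 1,  1, 0, 0,  5, 0,  0, 0)
)

# Number of years with non-zero counts on each route
# (routes with no detections drop out at this step)
detected <- counts[counts$count > 0, ]
route_years <- aggregate(year ~ strata + route, data = detected,
                         FUN = function(y) length(unique(y)))
names(route_years)[3] <- "n_years"

# Per-stratum summaries matching the three thresholds
by_stratum <- split(route_years$n_years, route_years$strata)
summ <- data.frame(
  strata           = names(by_stratum),
  n_routes         = vapply(by_stratum, length, integer(1)),
  max_route_years  = vapply(by_stratum, max, numeric(1)),
  mean_route_years = vapply(by_stratum, mean, numeric(1)),
  row.names = NULL
)

# Apply the documented defaults (operators assumed from the descriptions)
min_n_routes <- 3
min_max_route_years <- 3
min_mean_route_years <- 1
keep <- summ$n_routes >= min_n_routes &
  summ$max_route_years >= min_max_route_years &
  summ$mean_route_years > min_mean_route_years
retained <- summ$strata[keep]
retained
```

In this toy data, stratum A has three routes with detections in 3, 2 and 1 years and is retained, while stratum B has detections on a single route in a single year and is dropped.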
