Skip to contents
library(bbsBayes2)
library(sf)       # Spatial data manipulations
library(dplyr)    # General data manipulations
library(ggplot2)  # Plotting
library(patchwork) # mutli-plot

In this vignette we’ll explore the various ways you can stratify the BBS data in preparation for running the models.

You can use existing, pre-defined stratifications, subset an existing stratification (e.g., clip the data to your area of interest), or load your own custom stratification, either using a completely new set of spatial data, or by modifying the spatial polygons of an existing strata.

This vignette assumes that the BBS data have already been downloaded and that you are familiar with the basics of the bbsBayes2 workflow

Stratifying with built-in stratifications

The built-in stratifications are bbs_usgs, bbs_cws, bcr, latlong, prov_state.

  • bbs_cws – Intersections of Political regions X Bird Conservation Regions (Stratification used by the Canadian Wildlife Service [CWS] for national status reporting)
  • bbs_usgs – Intersections of Political regions X Bird Conservation Regions (Stratification used by the United Status Geological Survey [USGS] for national status reporting)
  • bcr – Bird Conservation Regions only
  • prov_state – Political regions only - states, provinces, and territories
  • latlong – Grid-cells of 1 degree of latitude X 1 degree of longitude, aka “degree-blocks”. These are the original survey design strata for the BBS. Routes are established at randomized locations within these degree-blocks.

You can visualize these stratifications by looking at the maps included in bbsBayes2 with load_map().

ggplot(data = load_map("bbs_usgs"), aes(fill = strata_name)) +
  geom_sf() +
  scale_fill_viridis_d(guide = "none")

Map of the BBS USGS stratification showing the regions that represent spatial intersections of Bird Conservation Regions with political jurisdictions (Provinces, Territories, States)

To stratify BBS data, you can use these existing stratifications by specifying by = "name" in the stratify() function.

s <- stratify(by = "bbs_usgs", species = "Canada Jay")
#> Using 'bbs_usgs' (standard) stratification
#> Loading BBS data...
#> Filtering to species Canada Jay (4840)
#> Stratifying data...
#>   Renaming routes...

The latlong stratification - special note

The latlong stratification by = "latlong" is the finest-scale stratification built into the package, and so it divides the BBS data into many more strata-units than other stratifications. Therefore, you may wish to adjust the minimum data inclusion criteria when preparing the data. Specifically, setting min_n_routes = 1 ensures that every grid-cell with at least one BBS route can be included. There are many degree-blocks that have only one route, as this is the original sampling design goal of the BBS (at least one route within each degree-block).

s <- stratify(by = "latlong", species = "Canada Jay")
#> Using 'latlong' (standard) stratification
#> Loading BBS data...
#> Filtering to species Canada Jay (4840)
#> Stratifying data...
#>   Renaming routes...
#>   Omitting 80/119,567 route-years that do not match a stratum.
#>     To see omitted routes use `return_omitted = TRUE` (see ?stratify)
p <- prepare_data(s, min_n_routes = 1)

Custom stratifications

bbsBayes2 can stratify the BBS data using any polygon map as input.

Load a custom stratification map

To define a completely different stratification, you’ll need to provide a spatial data object with polygons defining your strata.

In our example we’ll use WBPHS stratum boundaries for 2019. This is available from available from the US Fish and Wildlife Service Catalogue: https://ecos.fws.gov/ServCat/Reference/Profile/142628

You can either download it by hand, or with the following code.

z <- "output/WBPHS_Stratum_Boundaries_2019.zip" #

download.file(url = "https://ecos.fws.gov/ServCat/DownloadFile/213149",
              destfile = z,
              cacheOK = FALSE)
unzip(z)  # Unzip - if you get a file is corrupt message, download it manually
unlink(z) # Remove the zipped file

The zip file is also available in this Google Drive

To use this file in bbsBayes2, we need to load it as an sf object using the sf package.

map <- sf::read_sf("output/WBPHS_stratum_boundaries_2019.shp")
ggplot(map, aes(fill = factor(STRAT))) +
  geom_sf() +
  scale_fill_viridis_d(guide = "none")

Map of the strata used in the Waterfowl Breeding Population and Habitat Surveys ### Identify the strata names

We see that it has one column that reflects the stratum names. First we’ll rename this column to strata_name, so that the stratify() function knows what attribute includes the names that define each stratum.

map <- rename(map, strata_name = STRAT)

Stratify the data

Now we have the spatial data and relevant information to pass to stratify().

When using a custom stratification, the by argument is just a user-defined arbitrary name to document which stratification was used. This name gets passed into the meta data of the following steps and the final fitted model. Let’s use something informative, but short (although there’s no limit). We also need to give the function our map.

s <- stratify(by = "WBPHS_2019", species = "Canada Jay", strata_custom = map)
#> Using 'wbphs_2019' (custom) stratification
#> Loading BBS data...
#> Filtering to species Canada Jay (4840)
#> Stratifying data...
#> Preparing custom strata (EPSG:4326; WGS 84)...
#>   Summarizing strata...
#>   Calculating area weights...
#>   Joining routes to custom spatial data...
#>   Renaming routes...
#>   Omitting 100,155/119,567 route-years that do not match a stratum.
#>     To see omitted routes use `return_omitted = TRUE` (see ?stratify)

Note that strata names are automatically put into lower case for consistency.

We can take a quick look at the output, by looking at the meta data and routes contained therein.

s[["meta_data"]]
#> $stratify_by
#> [1] "wbphs_2019"
#> 
#> $stratify_type
#> [1] "custom"
#> 
#> $species
#> [1] "Canada Jay"

s[["routes_strata"]]
#> # A tibble: 19,412 × 33
#>    strata_name country_num state_num route route_name  active latitude longitude   bcr route_type_id
#>    <chr>             <dbl>     <dbl> <chr> <chr>        <dbl>    <dbl>     <dbl> <dbl>         <dbl>
#>  1 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#>  2 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#>  3 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#>  4 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#>  5 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#>  6 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#>  7 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#>  8 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#>  9 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#> 10 3                   840         3 3-8   TOWER BLUF…      1     63.6     -144.     4             1
#> # ℹ 19,402 more rows
#> # ℹ 23 more variables: route_type_detail_id <dbl>, route_data_id <dbl>, rpid <dbl>, year <dbl>,
#> #   month <dbl>, day <dbl>, obs_n <dbl>, total_spp <dbl>, start_temp <dbl>, end_temp <dbl>,
#> #   temp_scale <chr>, start_wind <dbl>, end_wind <dbl>, start_sky <dbl>, end_sky <dbl>,
#> #   start_time <dbl>, end_time <dbl>, assistant <dbl>, quality_current_id <dbl>, run_type <dbl>,
#> #   state <chr>, st_abrev <chr>, country <chr>

Visualise the new strata and data

To get a different look we can also plot this data on top of our map using ggplot2. Note that we use factor() to ensure the strata names are categorical.

rts <- s[["routes_strata"]] %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326)

ggplot() +
  geom_sf(data = map, aes(fill = factor(strata_name)), alpha = 0.3) +
  geom_sf(data = rts, aes(colour = factor(strata_name)), size = 1) +
  scale_fill_viridis_d(aesthetics = c("colour", "fill"), guide = "none")

map showing the strata in our custom stratification ### Omitted BBS routes

Based on the message we received during stratification (Omitting...) and this map, it looks as if our custom stratification is excluding some BBS data (i.e., routes with starting locations that are not overlapped by the strata map). This makes sense because the WBPHS survey area is much smaller than the region covered by the BBS. However, let’s confirm that the excluded routes are the ones we expect.

We can re-run the stratification with return_omitted = TRUE which will attach a data frame of omitted strata to the output.

s <- stratify(by = "WBPHS_2019", species = "Canada Jay", strata_custom = map,
              return_omitted = TRUE)
#> Using 'wbphs_2019' (custom) stratification
#> Loading BBS data...
#> Filtering to species Canada Jay (4840)
#> Stratifying data...
#> Preparing custom strata (EPSG:4326; WGS 84)...
#>   Summarizing strata...
#>   Calculating area weights...
#>   Joining routes to custom spatial data...
#>   Renaming routes...
#>   Omitting 100,155/119,567 route-years that do not match a stratum.
#>     Returning omitted routes.
s[["routes_omitted"]]
#> # A tibble: 100,155 × 11
#>     year strata_name country state   route route_name latitude longitude   bcr   obs_n total_spp
#>    <dbl> <chr>       <chr>   <chr>   <chr> <chr>         <dbl>     <dbl> <dbl>   <dbl>     <dbl>
#>  1  1967 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27 1140018        56
#>  2  1969 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27  990062        52
#>  3  1970 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27  990062        52
#>  4  1971 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27  990062        56
#>  5  1972 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27  990062        54
#>  6  1973 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27 1060057        52
#>  7  1974 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27 1060057        55
#>  8  1975 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27 1060057        59
#>  9  1976 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27 1060057        56
#> 10  1977 <NA>        US      ALABAMA 2-1   ST FLORIAN     34.9     -87.6    27 1060057        51
#> # ℹ 100,145 more rows

Let’s take a look.

omitted <- st_as_sf(s[["routes_omitted"]], coords = c("longitude", "latitude"),
                    crs= 4326)

ggplot() +
  geom_sf(data = map, aes(fill = factor(strata_name)), alpha = 0.3) +
  geom_sf(data = rts, aes(colour = factor(strata_name)), size = 1, alpha = 0.5) +
  geom_sf(data = omitted, size = 0.75, alpha = 0.5) +
  scale_fill_viridis_d(aesthetics = c("colour", "fill"), guide = "none")

map showing BBS route starting locations that are inside and outside of the custom stratification

The map shows that most of the omitted routes are routes that are clearly outside of our desired stratification. However, it also shows that there are some BBS route start-points that are just outside of the strata (e.g., some routes in Nova Scotia and Alaska). The user can decide what to do with these sorts of minor overlap issues. For example, buffering the original stratification map might make sense in some situations. For now, we will trust our custom strata map and retain only the BBS routes with start locations inside our strata polygons.

Fitting the model

To fit the model, we follow the standard workflow using our stratified data.

p <- prepare_data(s,
                  min_year = 2000,
                  max_year = 2021) #subset a shorter time-span to speed model-fit
mp <- prepare_model(p,model = "slope",
                   model_variant = "hier")
m <- run_model(mp,
               iter_warmup = 500,
               iter_sampling = 100)

Predictions from the model using the custom stratification

Now we can start to look at the indices and trends related to our model.

We can apply the generate_indices() and generate_trends() functions to the output from our model, the same as we would with the built-in stratifications.

i <- generate_indices(m)
#> Processing region continent
#> Processing region stratum

t <- generate_trends(i)

And with one additional argument, we can also use the plot_map() function.

trend_map <- plot_map(t, strata_custom = map)
trend_map

Generating state and province predictions from a custom stratification

A useful feature of the hierarchical Bayesian models for the BBS is the ability to generate formal estimates of indices (annual relative abundance) and trends (rates of population change) for any composite region. Formal estimates meaning we can estimate the full posterior distribution, including a point estimate and its associated uncertainty (credible limits). These composite regions can be defined based on any combination of the underlying strata used to fit the model. For example, using any custom stratification, we can generate estimates for political jurisdictions (countries, states, provinces), as long as we can comfortably designate each of the strata to one of these jurisdictions.

By default, generate_indices() creates indices at two levels “continent” (the combination of all strata used in the analysis) and “stratum” (estimates for individual strata). For the two bbs stratifications (“bbs_usgs” and “bbs_cws”), we can also add “prov_state”, “bcr”, “bcr_by_country” (where appropriate). For any custom stratification, we can also add the political jurisdictions and/or create our own regional divisions and provide them as a regions_index data frame.

For example, let’s imagine we would like to calculate regional indices for each stratum, country, province/state, as well as for a custom division of eastern and western regions.

First we’ll need to tell the function which strata belong to which province or state, and then which belong to the ‘east’ and which to the ’west.

We’ll start by using a helper function assign_prov_state(). This function takes a map of strata and assigns each strata to a province or state depending on the amount of overlap. By default it will warn if the amount of overlap is less than 75%, but in this case, we will lower that cutoff to 60%. The plot gives us a chance to make a quick assessment of whether we’re happy with how the various strata have been assigned.

rindex <- assign_prov_state(map, min_overlap = 0.6, plot = TRUE)
#> The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
#> which was just loaded, will retire in October 2023.
#> Please refer to R-spatial evolution reports for details, especially
#> https://r-spatial.org/r/2023/05/15/evolution4.html.
#> It may be desirable to make the sf package available;
#> package maintainers should consider adding sf to Suggests:.
#> The sp package is now running under evolution status 2
#>      (status 2 uses the sf package in place of rgdal)

Next we’ll define the east/west divide by hand. If we plot the strata by name, we can pick out which are eastern and which western.


ggplot(rindex) +
  geom_sf(data = load_map(type = "North America")) +
  geom_sf() +
  geom_sf_text(aes(label = strata_name))
#> Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not give correct results
#> for longitude/latitude data

The western and eastern strata seem to be split numerically, such that the western strata have numbers lower than 50, eastern strata have numbers higher 74. So we’ll add a column to the rindex dataframe with “east” and “west” character names to group the strata.

rindex <- mutate(
  rindex,
  east_west = if_else(as.numeric(strata_name) < 50 | as.numeric(strata_name) > 74,
                      "west",
                      "east"))

And now double check that we correctly grouped the strata!

ggplot(data = rindex) +
  geom_sf(data = load_map(type = "North America")) +
  geom_sf(data = rindex, aes(fill = east_west), alpha = 0.5)

Then supply the rindex object to the regions_index argument of the generate_indices() function and include the relevant column names from the object as regions.

i <- generate_indices(
  m,
  regions = c("stratum", "country", "prov_state", "east_west"),
  regions_index = rindex)
#> Processing region stratum
#> Processing region country
#> Processing region prov_state
#> Processing region east_west

t <- generate_trends(i)

We can plot the population trajectories for each of these regions with plot_indices().


p <- plot_indices(i)
names(p)
#>  [1] "1"                        "14"                       "17"                      
#>  [4] "18"                       "2"                        "22"                      
#>  [7] "23"                       "24"                       "26"                      
#> [10] "3"                        "31"                       "44"                      
#> [13] "50"                       "51"                       "52"                      
#> [16] "56"                       "62"                       "63"                      
#> [19] "64"                       "66"                       "67"                      
#> [22] "68"                       "69"                       "71"                      
#> [25] "72"                       "75"                       "76"                      
#> [28] "77"                       "Canada"                   "United_States_of_America"
#> [31] "AB"                       "AK"                       "MB"                      
#> [34] "ME"                       "NB"                       "NL"                      
#> [37] "NS"                       "NT"                       "ON"                      
#> [40] "QC"                       "SD"                       "SK"                      
#> [43] "east"                     "west"

p[["east"]] + p[["west"]]

Finally we can even create geofaceted plots (which is only possible in our case because we assigned our strata to Provinces and States and calculated indices for these regions). These geofacet plots can be useful for visualizing the population trajectories of species with broad ranges across many states and provinces.

plot_geofacet(i, trends = t, multiple = TRUE)

Subsetting an existing stratification

In general, it is often useful to use all of the data for a given species, even if you’re only interested in trends for a portion of the species’ range (estimates of observer effects are informed by all of the data in the analysis). However, there may be situations where the focus of your study is limited to a particular region. For example what if you want to use one of the standard stratifications, but you only want the analysis to be influenced by data from one region, say only Canadian data?

In this case you can subset the BBS CWS stratification to only Canadian regions, and use that subset of regions as a custom stratification in the stratify() function.

In addition to maps, stratifications are available as data frames in the bbs_strata object.

names(bbs_strata)
#> [1] "bbs_usgs"   "bbs_cws"    "bcr"        "latlong"    "prov_state"
head(bbs_strata[["bbs_cws"]])
#> # A tibble: 6 × 7
#>   strata_name area_sq_km country country_code prov_state   bcr bcr_by_country
#>   <chr>            <dbl> <chr>   <chr>        <chr>      <dbl> <chr>         
#> 1 CA-AB-10        52565. Canada  CA           AB            10 Canada-BCR_10 
#> 2 CA-AB-11       149352. Canada  CA           AB            11 Canada-BCR_11 
#> 3 CA-AB-6        445135. Canada  CA           AB             6 Canada-BCR_6  
#> 4 CA-AB-8          6987. Canada  CA           AB             8 Canada-BCR_8  
#> 5 CA-BC-10       383006. Canada  CA           BC            10 Canada-BCR_10 
#> 6 CA-BC-4        193180. Canada  CA           BC             4 Canada-BCR_4

We can now modify and use this data frame as we like.

my_cws <- filter(bbs_strata[["bbs_cws"]], country == "Canada")
s <- stratify(by = "bbs_cws", species = "Canada Jay", strata_custom = my_cws)
#> Using 'bbs_cws' (subset) stratification
#> Loading BBS data...
#> Filtering to species Canada Jay (4840)
#> Stratifying data...
#>   Combining BCR 7 and NS and PEI...
#>   Renaming routes...
#>   Omitting 101,474/119,567 route-years that do not match a stratum.
#>     To see omitted routes use `return_omitted = TRUE` (see ?stratify)

Note that the stratification is now “bbs_cws” and “subset”

s[["meta_data"]]
#> $stratify_by
#> [1] "bbs_cws"
#> 
#> $stratify_type
#> [1] "subset"
#> 
#> $species
#> [1] "Canada Jay"

We can see the strata included by looking at the meta_strata

print(s[["meta_strata"]], n = Inf)
#> # A tibble: 30 × 7
#>    strata_name area_sq_km country country_code prov_state   bcr bcr_by_country
#>    <chr>            <dbl> <chr>   <chr>        <chr>      <dbl> <chr>         
#>  1 CA-AB-10        52565. Canada  CA           AB            10 Canada-BCR_10 
#>  2 CA-AB-11       149352. Canada  CA           AB            11 Canada-BCR_11 
#>  3 CA-AB-6        445135. Canada  CA           AB             6 Canada-BCR_6  
#>  4 CA-BC-10       383006. Canada  CA           BC            10 Canada-BCR_10 
#>  5 CA-BC-4        193180. Canada  CA           BC             4 Canada-BCR_4  
#>  6 CA-BC-5        199820. Canada  CA           BC             5 Canada-BCR_5  
#>  7 CA-BC-6        106917. Canada  CA           BC             6 Canada-BCR_6  
#>  8 CA-BC-9         59939. Canada  CA           BC             9 Canada-BCR_9  
#>  9 CA-BCR7-7     1743744. Canada  CA           BCR7           7 Canada-BCR_7  
#> 10 CA-MB-11        70101. Canada  CA           MB            11 Canada-BCR_11 
#> 11 CA-MB-12        15312. Canada  CA           MB            12 Canada-BCR_12 
#> 12 CA-MB-6        127190. Canada  CA           MB             6 Canada-BCR_6  
#> 13 CA-MB-8        234151. Canada  CA           MB             8 Canada-BCR_8  
#> 14 CA-NB-14        72991. Canada  CA           NB            14 Canada-BCR_14 
#> 15 CA-NL-8        157083. Canada  CA           NL             8 Canada-BCR_8  
#> 16 CA-NSPE-14      61502. Canada  CA           NSPE          14 Canada-BCR_14 
#> 17 CA-NT-3        394769. Canada  CA           NT             3 Canada-BCR_3  
#> 18 CA-NT-6        509423. Canada  CA           NT             6 Canada-BCR_6  
#> 19 CA-NU-3       1969549. Canada  CA           NU             3 Canada-BCR_3  
#> 20 CA-ON-12       206181. Canada  CA           ON            12 Canada-BCR_12 
#> 21 CA-ON-13        83859. Canada  CA           ON            13 Canada-BCR_13 
#> 22 CA-ON-8        435545. Canada  CA           ON             8 Canada-BCR_8  
#> 23 CA-QC-12       174314. Canada  CA           QC            12 Canada-BCR_12 
#> 24 CA-QC-13        28409. Canada  CA           QC            13 Canada-BCR_13 
#> 25 CA-QC-14        67711. Canada  CA           QC            14 Canada-BCR_14 
#> 26 CA-QC-8        470310. Canada  CA           QC             8 Canada-BCR_8  
#> 27 CA-SK-11       241315. Canada  CA           SK            11 Canada-BCR_11 
#> 28 CA-SK-6        177763. Canada  CA           SK             6 Canada-BCR_6  
#> 29 CA-SK-8        188615. Canada  CA           SK             8 Canada-BCR_8  
#> 30 CA-YT-4        435349. Canada  CA           YT             4 Canada-BCR_4

Modifying existing BBS maps

Stratify by custom stratification, using sf map object. For example, let’s look at an east/west divide of southern Canada with BBS CWS strata.

First we’ll start with the CWS BBS data

map <- load_map("bbs_cws")

We’ll modify this by first looking only at provinces (omitting the northern territories), transforming to the GPS CRS (4326), and ensuring the resulting polygons are valid.

new_map <- map %>%
  filter(country_code == "CA", !prov_state %in% c("NT", "NU", "YT")) %>%
  st_transform(4326)%>%
  st_make_valid()

Now we can crop this map to make a western and an eastern portion, defined by longitude and latitude (which is why we first transformed to the GPS CRS).

west <- st_crop(new_map, xmin = -140, ymin = 42, xmax = -95, ymax = 68) %>%
  mutate(strata_name = "west")
#> Warning: attribute variables are assumed to be spatially constant throughout all geometries
east <- st_crop(new_map, xmin = -95, ymin = 42, xmax = -52, ymax = 68) %>%
  mutate(strata_name = "east")
#> Warning: attribute variables are assumed to be spatially constant throughout all geometries

Now we’ll bind these together and transform back to the original CRS

new_strata <- bind_rows(west, east) %>%
  st_transform(st_crs(map))

ggplot() +
  geom_sf(data = map) +
  geom_sf(data = new_strata, aes(fill = strata_name), alpha = 1)

Looks good! Let’s use it in our stratification and take a look at the points afterwards to ensure they’ve been categorized appropriately.

s <- stratify(by = "canada_ew", species = "Canada Jay",
              strata_custom = new_strata)
#> Using 'canada_ew' (custom) stratification
#> Loading BBS data...
#> Filtering to species Canada Jay (4840)
#> Stratifying data...
#> Preparing custom strata (ESRI:102008; North_America_Albers_Equal_Area_Conic)...
#>   Summarizing strata...
#>   Calculating area weights...
#>   Joining routes to custom spatial data...
#>   Renaming routes...
#>   Omitting 103,468/119,567 route-years that do not match a stratum.
#>     To see omitted routes use `return_omitted = TRUE` (see ?stratify)

s$meta_data
#> $stratify_by
#> [1] "canada_ew"
#> 
#> $stratify_type
#> [1] "custom"
#> 
#> $species
#> [1] "Canada Jay"
routes <- s$routes_strata %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326)

ggplot() +
  geom_sf(data = new_strata, aes(fill = strata_name), alpha = 1) +
  geom_sf(data = routes, aes(shape = strata_name))