
Introduction

BirdWeather is a global network of volunteer-run acoustic monitoring stations that continuously record and automatically identify bird vocalizations using the BirdNET neural network. As of February 2026, the network comprises over 18,000 stations across six continents, collectively generating tens of millions of detections per month, and the database recently passed 2 billion detections. Unlike citizen science platforms that rely on human observers, BirdWeather stations operate continuously and autonomously, which makes the dataset uniquely suited to studying fine-grained temporal patterns in bird behavior, including responses to weather events, light cycles, and astronomical phenomena.

Currently, BirdWeather offers three options for accessing the database: 1) a user-friendly GUI data explorer, 2) a GraphQL API, and 3) a REST API. While the latter two are particularly powerful, not all users are comfortable making API calls. birdweatheR provides an R interface to the BirdWeather API, enabling researchers to download detection data, explore species activity patterns, and integrate on-board sensor readings from BirdWeather PUC units, which record temperature, barometric pressure, humidity, air quality, and spectral light levels alongside acoustic detections.

Installation

# Install from GitHub
# install.packages("devtools")
devtools::install_github("BrentPease1/birdweatheR")

Connecting to the API

All functions require an active API connection; no API key is required. Establish a connection at the start of each session:

The connection is stored in a package-level environment and used automatically by all downstream functions. You do not need to pass it explicitly after the initial call.


Exploring the Platform

Before pulling raw detections, a few summary functions give a quick overview of the BirdWeather database.

Platform-wide summary

get_counts() returns a single-row summary of detections, species, and stations for a given time period:

get_counts(
  from = "2025-05-01T00:00:00.000Z",
  to   = "2025-05-02T00:00:00.000Z"
)
#>   detections species stations
#> 1   11190893    1664     3890

Over 11 million detections from 1,664 species across 3,890 stations in a single 24-hour period illustrates the scale of the BirdWeather network.
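The from and to arguments in these examples are ISO 8601 UTC strings with millisecond precision. As a convenience, a small helper can build them from dates. This is a minimal sketch: iso_utc() is a hypothetical helper written for illustration, not a birdweatheR function.

```r
# Hypothetical helper (not part of birdweatheR): format a date or date-time
# as an ISO 8601 UTC string with millisecond precision.
iso_utc <- function(x) {
  format(as.POSIXct(x, tz = "UTC"), "%Y-%m-%dT%H:%M:%OS3Z", tz = "UTC")
}

start <- as.POSIXct("2025-05-01", tz = "UTC")
from  <- iso_utc(start)
to    <- iso_utc(start + 24 * 60 * 60)  # exactly 24 hours later

from
to
```

The resulting strings can then be passed as from and to in place of hand-typed timestamps.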

Top species

get_top_species() returns the most frequently detected species for a given period, with a breakdown of detection confidence:

get_top_species(
  limit = 10,
  from  = "2025-05-01T00:00:00.000Z",
  to    = "2025-05-02T00:00:00.000Z"
)
#>   species_id        common_name        scientific_name  count almost_certain
#> 1          1      House Sparrow      Passer domesticus 829043         828755
#> 2         35 Eurasian Blackbird          Turdus merula 713296         713294
#> 3        134        House Finch   Haemorhous mexicanus 519312         519242
#> 4         51  Common Chiffchaff Phylloscopus collybita 436480         436477
#>   very_likely unlikely uncertain
#> 1          88        0         0
#> 2           2        0         0
#> 3          70        0         0
#> 4           3        0         0
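To compare species on detection confidence, the counts can be turned into proportions. The sketch below rebuilds the four rows printed above as a plain data frame (so it runs without an API connection) and computes the share of detections classed as almost_certain. The share_almost_certain column is added here for illustration.

```r
# Rows copied from the printed output above
top <- data.frame(
  common_name    = c("House Sparrow", "Eurasian Blackbird",
                     "House Finch", "Common Chiffchaff"),
  count          = c(829043, 713296, 519312, 436480),
  almost_certain = c(828755, 713294, 519242, 436477)
)

# Share of each species' detections in the highest confidence class
top$share_almost_certain <- top$almost_certain / top$count

# Sort from lowest to highest share
top[order(top$share_almost_certain), c("common_name", "share_almost_certain")]
```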

Discovering stations

get_stations() retrieves the full station network with coordinates, country, and station type. This is useful for identifying stations in a region of interest before pulling detections:

# Pull 10 stations
stations <- get_stations(limit = 10)

# Stations within a bounding box and time period
midwest_stations <- get_stations(
  from  = "2025-05-01T00:00:00.000Z",
  to    = "2025-05-02T00:00:00.000Z",
  ne = list(lat = 49.0, lon = -80.0),
  sw = list(lat = 36.0, lon = -97.0)
)

midwest_stations contains 569 observations of 10 variables, including station name, type (e.g., “PUC”, see below), longitude and latitude, and a station_id unique to the database that can be used in other queries.
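The bounding box is defined by its north-east (ne) and south-west (sw) corners. To make the semantics concrete, the sketch below applies the same box to a small mock table. The column names lat and lon and the helper in_bbox() are assumptions for illustration, and the check does not handle boxes that cross the antimeridian.

```r
ne <- list(lat = 49.0, lon = -80.0)
sw <- list(lat = 36.0, lon = -97.0)

# Mock station table; column names are assumed, not taken from the package
stations_demo <- data.frame(
  station_id = c("a", "b", "c"),
  lat        = c(41.9, 30.3, 44.9),
  lon        = c(-87.6, -97.7, -93.3)
)

# Hypothetical helper: TRUE where a point lies inside the box
in_bbox <- function(lat, lon, ne, sw) {
  lat >= sw$lat & lat <= ne$lat & lon >= sw$lon & lon <= ne$lon
}

inside <- stations_demo[in_bbox(stations_demo$lat, stations_demo$lon, ne, sw), ]
inside
```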


Species Detection Patterns

Finding species

find_species() searches the BirdWeather species database by common or scientific name. It supports complete and partial matches, including wildcards (e.g., "whip*"):

# Search by common name
find_species("Wood Thrush")

# Partial match
find_species("thrush")

# Wildcard match
find_species("whip*")
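For readers unfamiliar with wildcard patterns, the idea can be illustrated locally with base R. This sketch does not show how find_species() matches internally, which is not documented here; it only applies a glob pattern to a hand-written vector of names using utils::glob2rx().

```r
# Hand-written example names, not a query result
species_names <- c("Eastern Whip-poor-will", "Mexican Whip-poor-will",
                   "Wood Thrush", "Whimbrel")

# Convert a glob pattern to a regular expression, then match case-insensitively
pattern <- utils::glob2rx("*whip*")
matches <- species_names[grepl(pattern, species_names, ignore.case = TRUE)]
matches
```

Note that in this base R version a pattern such as "whip*" is anchored at the start of the string, so it would not match "Eastern Whip-poor-will"; the package may behave differently.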

Time-of-day activity

get_tod_counts() returns detection counts binned by 30-minute intervals across a time period, revealing the diel activity pattern of a species. Here we examine the American Robin (Turdus migratorius) — a species well known for its early dawn and late evening chorus:

robin_id <- find_species("American Robin")$species_id

robin_tod <- get_tod_counts(
  species_id = robin_id,
  from       = "2025-05-01T00:00:00.000Z",
  to         = "2025-05-31T00:00:00.000Z"
)
library(ggplot2)

ggplot(robin_tod, aes(x = hour, y = count)) +
  geom_col(fill = "tomato") +
  scale_x_continuous(breaks = seq(0, 24, by = 3),
                     labels = paste0(seq(0, 24, by = 3), ":00")) +
  labs(
    title    = "American Robin — Time of Day Activity",
    subtitle = "May 2025, all BirdWeather stations",
    x        = "Hour of Day",
    y        = "Total Detections"
  ) +
  theme_minimal()
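Beyond plotting, the binned counts can be summarized numerically. The sketch below uses mock data and assumes that hour is a numeric column in half-hour steps (0, 0.5, 1, ...), which is an assumption about the output of get_tod_counts() rather than something confirmed above. It finds the busiest 30-minute bin and collapses the bins to whole hours.

```r
# Mock time-of-day table: 48 half-hour bins with a dawn peak
tod_demo <- data.frame(
  hour  = seq(0, 23.5, by = 0.5),
  count = c(rep(5, 9), 40, 120, 300, 260, 150,
            rep(60, 24), 90, 140, 110, 40, rep(5, 6))
)

# Busiest 30-minute bin
peak_bin <- tod_demo$hour[which.max(tod_demo$count)]

# Collapse half-hour bins to whole hours
hourly    <- tapply(tod_demo$count, floor(tod_demo$hour), sum)
peak_hour <- names(which.max(hourly))

peak_bin
peak_hour
```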

Comparing this with a nocturnal species such as the Barred Owl (Strix varia) shows the power of continuous detection and sets BirdWeather apart from traditional sampling techniques and other avian databases:

baow_id <- find_species("Strix varia")$species_id

baow_tod <- get_tod_counts(
  species_id = baow_id,
  from       = "2025-05-01T00:00:00.000Z",
  to         = "2025-05-31T00:00:00.000Z"
)
library(ggplot2)

ggplot(baow_tod, aes(x = hour, y = count)) +
  geom_col(fill = "blue4") +
  scale_x_continuous(breaks = seq(0, 24, by = 3),
                     labels = paste0(seq(0, 24, by = 3), ":00")) +
  labs(
    title    = "Barred Owl — Time of Day Activity",
    subtitle = "May 2025, all BirdWeather stations",
    x        = "Hour of Day",
    y        = "Total Detections"
  ) +
  theme_minimal()

One shortcoming of get_tod_counts(), however, arises when a species is poorly detected. In that case, a manual approach is required (e.g., using get_detections()).
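A minimal sketch of such a manual approach is to bin raw detection timestamps into 30-minute intervals yourself. The example uses mock data and assumes that timestamp is an ISO 8601 string carrying the station's local clock time. That format is an assumption, so check the actual output of get_detections() before relying on the character positions used here.

```r
# Mock detections; the timestamp format is assumed
det_demo <- data.frame(
  timestamp = c("2025-05-01T04:12:33-04:00", "2025-05-01T04:47:02-04:00",
                "2025-05-01T21:05:10-04:00", "2025-05-02T04:31:59-04:00")
)

# Pull local hour and minute directly from the string
hh <- as.integer(substr(det_demo$timestamp, 12, 13))
mm <- as.integer(substr(det_demo$timestamp, 15, 16))

# Assign each detection to the start of its 30-minute bin (e.g., 4.5 = 04:30)
det_demo$tod_bin <- hh + ifelse(mm >= 30, 0.5, 0)

# Count detections per bin
tod_manual <- as.data.frame(table(hour = det_demo$tod_bin),
                            responseName = "count", stringsAsFactors = FALSE)
tod_manual$hour <- as.numeric(tod_manual$hour)
tod_manual
```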

get_daily_detection_counts() returns detection totals aggregated by day, optionally broken down by species. This is useful for visualizing seasonal trends without downloading raw detections:

daily <- get_daily_detection_counts( # approximate run time: 20 seconds
  from       = "2025-05-01T00:00:00.000Z",
  to         = "2025-05-31T00:00:00.000Z",
  by_species = FALSE
)

ggplot(daily, aes(x = as.Date(date), y = daily_total)) +
  geom_smooth(color = "steelblue", se = FALSE) +
  labs(
    title = "Daily BirdWeather Detections",
    x     = NULL,
    y     = "Detections per Day"
  ) +
  theme_minimal()

This function becomes a powerful summary tool when combined with its built-in station_ids, species_ids, or by_species arguments.
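As an alternative to geom_smooth(), daily totals can be smoothed explicitly with a centered 7-day rolling mean. The sketch below uses mock data with the same date and daily_total columns as the plot above; the roll7 column is added for illustration.

```r
# Mock daily totals with the column names used in the plot above
daily_demo <- data.frame(
  date        = as.character(seq(as.Date("2025-05-01"), by = "day",
                                 length.out = 10)),
  daily_total = c(100, 120, 90, 110, 130, 95, 105, 140, 100, 115)
)

# Centered 7-day moving average; the first and last three days are NA
daily_demo$roll7 <- as.numeric(
  stats::filter(daily_demo$daily_total, rep(1 / 7, 7), sides = 2)
)
daily_demo
```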

Raw Detections

For research, the most useful function will likely be get_detections(). It downloads raw data from the BirdWeather database, opening up a wide range of scientific questions.

get_detections

get_detections() retrieves bird detections from the BirdWeather API with optional filters. Because queries can match millions of rows, the function handles pagination automatically up to the user-specified limit. If no limit is specified, the function will return the total number of detections matching the query. It returns a fully flat data.table with all nested fields (coords, species, station) expanded into individual columns.
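To illustrate what paginating up to a limit means, here is a generic sketch of the pattern. It is not the package's implementation: fetch_page() is a mock standing in for one API request, and whether the real API pages by offset or by cursor is not stated here.

```r
# Mock page fetcher: pretends the server holds 2,500 records
fetch_page <- function(offset, page_size) {
  total <- 2500L
  if (offset >= total) return(integer(0))
  seq(offset + 1L, min(offset + page_size, total))
}

# Request pages until the limit is reached or the server runs out of records
fetch_all <- function(limit = NULL, page_size = 1000L) {
  pages <- list()
  n <- 0L
  repeat {
    want <- if (is.null(limit)) page_size else min(page_size, limit - n)
    if (want <= 0L) break
    page <- fetch_page(offset = n, page_size = want)
    if (length(page) == 0L) break
    pages[[length(pages) + 1L]] <- page
    n <- n + length(page)
  }
  unlist(pages)
}

length(fetch_all(limit = 1200))  # stops at the requested limit
length(fetch_all())              # no limit: fetches everything available
```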

Current functionality allows for filtering on:

* Dates (beginning and end)
* Stations
* Types of Stations
* Species
* Continents
* Countries
* BirdNET Confidence Scores
* Bounding Box
* Download Limits (e.g., 1000 rows)

# Get detections for a date range
detections <- get_detections(
  from  = "2025-01-01T00:00:00.000Z",
  to    = "2025-01-02T00:00:00.000Z",
  limit = 1000
)

# Filter by species name - if multiple matches are found, all are shown
# and the user is prompted to rerun with a more specific name
detections <- get_detections(
  from          = "2025-05-01T00:00:00.000Z",
  to            = "2025-05-02T00:00:00.000Z",
  species_names = "Eastern Whip-poor-will",
  limit         = 1000
)

# For faster pulls, look up species ID first and pass it directly
ewpw_id <- find_species("Eastern Whip-poor-will")$species_id

detections <- get_detections(
  from        = "2025-05-01T00:00:00.000Z",
  to          = "2025-05-02T00:00:00.000Z",
  species_ids = ewpw_id,
  limit       = 1000
)

# Filter by continent with a confidence threshold
detections <- get_detections(
  from           = "2025-01-01T00:00:00.000Z",
  to             = "2025-01-02T00:00:00.000Z",
  continents     = "North America",
  confidence_gte = 0.9,
  limit          = 1000
)

Returned columns

get_detections will return the following columns:

| Column | Description |
|---|---|
| id | Detection ID |
| timestamp | Detection timestamp in station timezone |
| confidence | BirdNET confidence |
| score | Calculated identification likelihood score |
| species_id | Species ID |
| common_name | Species common name |
| scientific_name | Species scientific name |
| station_id | Station ID |
| station_name | Station name |
| station_type | Station type |
| station_country | Country |
| station_continent | Continent |
| station_state | State / province |
| station_lat, station_lon | Station coordinates |
| location_privacy | Fuzzy lat/lons |
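Because the result is a flat table, standard summaries need no unnesting. The sketch below uses a mock data frame with three of the columns listed above, keeps detections with confidence of at least 0.9, and counts detections and distinct stations per species. Base R is used so that the example has no dependencies; the same logic carries over to data.table syntax.

```r
# Mock detections using column names from the table above
raw_demo <- data.frame(
  common_name = c("Wood Thrush", "Wood Thrush", "American Robin",
                  "American Robin", "American Robin"),
  station_id  = c("101", "102", "101", "101", "103"),
  confidence  = c(0.95, 0.72, 0.91, 0.88, 0.99)
)

# Keep high-confidence detections only
high <- raw_demo[raw_demo$confidence >= 0.9, ]

# Detections and distinct stations per species
n_det <- tapply(high$confidence, high$common_name, length)
n_sta <- tapply(high$station_id, high$common_name,
                function(x) length(unique(x)))

summary_tbl <- data.frame(
  common_name = names(n_det),
  detections  = as.integer(n_det),
  stations    = as.integer(n_sta)
)
summary_tbl
```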

Sensor Data

BirdWeather PUC (Portable Universe Codec) units are BirdWeather’s AI-powered bioacoustics platform. In addition to dual microphones, WiFi, and GPS, they carry on-board environmental sensors that record temperature, humidity, barometric pressure, air quality index, and sound pressure level at approximately 42-second intervals. PUCs also record light levels at the sensor, including broadband light, near-infrared light, and individual spectral bands f1–f8. This on-board sensor data enables researchers to link bird vocal activity directly to local environmental conditions, a capability not available in any other large-scale acoustic monitoring dataset.

Environmental sensor data

get_environment_data() retrieves temperature, humidity, barometric pressure, air quality, and sound pressure level readings for a PUC station over a specified time window:

env_data <- get_environment_data(
  station_id = "1733",
  from       = "2025-05-01T00:00:00.000Z",
  to         = "2025-05-02T00:00:00.000Z"
)
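One way to link detections to local conditions is to attach the most recent sensor reading to each detection. The sketch below does this with findInterval() on mock data. The column names time and temperature are assumptions, since the columns returned by get_environment_data() are not listed here.

```r
# Mock sensor readings at 42-second intervals (column names assumed)
env_demo <- data.frame(
  time        = as.POSIXct("2025-05-01 05:00:00", tz = "UTC") +
                  seq(0, by = 42, length.out = 5),
  temperature = c(10.1, 10.2, 10.2, 10.4, 10.5)
)

# Mock detection times
det_times <- as.POSIXct(c("2025-05-01 05:00:50", "2025-05-01 05:02:10"),
                        tz = "UTC")

# Index of the latest reading at or before each detection;
# readings must be sorted by time, and 0 means no earlier reading exists
idx <- findInterval(as.numeric(det_times), as.numeric(env_demo$time))

matched <- data.frame(
  detected_at = det_times,
  temperature = env_demo$temperature[idx]
)
matched
```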

Light sensor data

get_light_data() retrieves spectral light sensor readings from PUC stations, including broadband clear light, near-infrared, and individual spectral bands (f1–f8):

light_data <- get_light_data(
  station_id = "1733",
  from       = "2025-05-01T00:00:00.000Z",
  to         = "2025-05-02T00:00:00.000Z"
)
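Readings taken roughly every 42 seconds are often more detail than an analysis needs. The sketch below averages mock light readings to hourly values. The column names time and clear are assumptions for illustration, not confirmed output columns of get_light_data().

```r
# Mock light readings spanning an hour boundary (column names assumed)
light_demo <- data.frame(
  time  = as.POSIXct("2025-05-01 04:59:00", tz = "UTC") + c(0, 42, 84, 126),
  clear = c(2, 4, 10, 20)
)

# Label each reading with the hour it falls in, then average within hours
light_demo$hour <- format(light_demo$time, "%Y-%m-%d %H:00", tz = "UTC")
hourly_light <- aggregate(clear ~ hour, data = light_demo, FUN = mean)
hourly_light
```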

Further Reading

For worked examples of these functions applied to real research questions — including migration phenology, activity timing with human footprint, and behavioral responses to the 2024 total solar eclipse — see the Example Applications vignette.

For an example of large-scale ecological synthesis using BirdWeather detections, see Pease & Gilbert (2025), who used over 60 million BirdWeather detections across 583 diurnal species to demonstrate that light pollution prolongs avian vocal activity by up to 50 minutes in the brightest landscapes (Science, 387, 848–853).

Additionally, a full analysis of the 2023 Annular and 2024 Total Eclipses using data from the Eclipse Soundscapes project and BirdWeather can be found in Gilbert et al., 2026.

For questions, bug reports, or feature requests, please visit the package repository at github.com/BrentPease1/birdweatheR.