How to Calculate Bioacoustic Measurement Uncertainty


  • Biodiversity measurements and metrics are still gaining acceptance and trust.
  • Quantifying measurement uncertainty is a crucial part of building trust in any measurement.
  • Measurement uncertainty is calculated by creating an uncertainty budget, whereby each line describes a single source of uncertainty.
  • These values are then combined to give an overall uncertainty value.
  • At Carbon Rewild we actively quantify our measurement uncertainty. Get in contact if you want to discuss the process in detail.

This blog forms part of a series on the topic of measurement uncertainty for biodiversity measurements, specifically focusing on bioacoustics. The first blog in the series can be found here.

As mentioned in the previous blog, our society relies on a host of different measurements, many of which we use without questioning or even noticing.

Many of these measurements have ‘measurement uncertainties’ attached to them, which describe how accurate and precise the values are. Typically this takes the form of a plus-or-minus value defining the range within which the ‘correct’ measurement lies. For example, as the crow flies, the distance between the centres of London and Paris is between 212 and 214 miles [3]. This can be written as 213 ± 1 miles, i.e. 213 miles plus or minus 1 mile.

But how are these uncertainty values calculated? The easiest way is to compare your measurements against a known reference value. However, this is not always possible, as many measurements are multi-dimensional and their accuracy and precision depend on multiple factors. Typically, reference measurements can only be used in very controlled situations. As soon as other factors are involved, such as environmental ones, accounting for each of them becomes much more difficult. And biodiversity measurements could be considered exclusively environmental!
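As a rough illustration of the reference-value approach, the sketch below compares a handful of readings against a known reference and reports the average and worst-case error. The readings, the 10.000 mm reference and the function name are all made up for the example.

```python
def errors_against_reference(readings, reference):
    """Return the signed error of each reading relative to a known reference value."""
    return [reading - reference for reading in readings]


# Hypothetical readings (in mm) of a reference object known to be 10.000 mm
readings = [10.002, 10.001, 10.003, 10.002, 10.002]
errors = errors_against_reference(readings, reference=10.000)

mean_error = sum(errors) / len(errors)           # average offset from the reference
worst_error = max(abs(error) for error in errors)  # largest single deviation

print(f"mean error  = {mean_error:+.4f} mm")
print(f"worst error = {worst_error:.4f} mm")
```

This only works because a trusted reference exists. Once the result depends on several interacting factors, a single comparison like this is no longer enough.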

Uncertainty Budgets

An uncertainty budget is a well recognised method for calculating measurement uncertainty. It forms part of the certification and accreditation process for some pieces of measurement equipment to meet the ISO 17025 standard. ISO 17025 is described as being for “any organisation that performs testing, sampling or calibration and wants reliable results” [1]. So it is a well documented way of building trust in measurements, perfect for the growing area of biodiversity measurement.

Uncertainty budgets are documents which detail line-by-line any factor that contributes to the uncertainty of a measurement. Each line or item will then have a value associated with it. Getting to a value for each source of uncertainty may itself take lots of analysis and extra testing. But once you have created a list of uncertainties and values then you combine them using well documented methods.

Crucially, the hard parts are:

  1. Agreeing on the sources of uncertainty and,
  2. Calculating the values for each line of the uncertainty budget.

A Non-Biodiversity Example

Before we delve into the sources of uncertainty for bioacoustic measurements, it might be clearer to have an example with a ‘traditional’ measurement.

Length measurements are common in everyday life and you will rarely consider the effect of measurement uncertainty. But if you are measuring a precision engineered part then you will be very concerned about it, for example, when using a micrometer to measure the diameter of a turned metal part.

In this simplified case there are three sources of uncertainty we will look at:

  1. Temperature – the metal part and the micrometer itself will expand and contract with temperature.
  2. Repeatability – how the micrometer is set up, used, and brought into contact with the part will affect the measurement.
  3. Resolution – when reading off the final measurement the number of decimal places is a source of uncertainty itself.

Each of these will have a value. Resolution, in this simplified case, would be the smallest measurable change. Repeatability can be found by repeatedly measuring a part and plotting the distribution of results. And temperature effects can be calculated by performing multiple measurements at different temperatures (within a sensible range) or by modelling the effect of temperature. 
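To make this more concrete, here is a minimal sketch of how the repeatability and resolution values might be estimated. It assumes repeatability is summarised by the sample standard deviation of repeated readings and, as in the simplified case above, that resolution is simply the smallest readable step. Both are expressed as percentages of the part size. The readings and function names are hypothetical.

```python
from statistics import mean, stdev


def repeatability_percent(readings):
    """Sample standard deviation of repeated readings, as a percentage of their mean."""
    return 100.0 * stdev(readings) / mean(readings)


def resolution_percent(smallest_step, nominal):
    """Smallest readable step as a percentage of the nominal size (simplified case)."""
    return 100.0 * smallest_step / nominal


# Hypothetical repeated micrometer readings (in mm) of the same part
readings_mm = [25.01, 24.99, 25.00, 25.02, 24.98]

print(f"repeatability: {repeatability_percent(readings_mm):.3f}%")
print(f"resolution:    {resolution_percent(0.01, 25.0):.3f}%")
```

A formal uncertainty budget would usually treat resolution more carefully, but this keeps to the simplified definitions used in this blog.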

Once each value has been calculated then all the values can be combined using the Root Sum Square (RSS) method:

Source of Uncertainty | Value
--- | ---
Temperature | 0.3%
Repeatability | 0.4%*
Resolution | 0.1%*
Combined Uncertainty (RSS) | 0.51%

*These values aren’t usually expressed as percentages in length measurements; they have been simplified for illustrative purposes.
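The Root Sum Square calculation squares each value, adds the squares together and takes the square root of the total. The short sketch below reproduces the combined figure from the three values in the example. It assumes the sources are independent of one another and expressed in the same units, and the function name is our own.

```python
import math


def root_sum_square(values):
    """Combine independent uncertainty contributions using Root Sum Square."""
    return math.sqrt(sum(value ** 2 for value in values))


# Values from the simplified micrometer example, in percent
budget = {
    "Temperature": 0.3,
    "Repeatability": 0.4,
    "Resolution": 0.1,
}

combined = root_sum_square(budget.values())
print(f"Combined uncertainty (RSS): {combined:.2f}%")  # prints 0.51%
```

Because the values are squared before being added, the largest sources dominate the result. Here, removing the resolution line entirely would only reduce the combined value from 0.51% to 0.50%.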

Sources of Uncertainty in Bioacoustic Measurements

So what about biodiversity measurements? What are the sources of uncertainty? Below we will describe the sources of uncertainty which will affect the species richness (or taxonomic richness) ascertained by a bioacoustic identification.

It is important to note that there are many different values that can be calculated with bioacoustic data, not just species richness – all of which can help us to understand ecological health in different ways. The method set out in this blog will eventually be applied to all measurements produced by Carbon Rewild’s bioacoustic data.

AI Identification

Artificial intelligence is an amazing tool and Carbon Rewild uses it to sift through months of data to find all the species that are present at a site. But as excited as people get, artificial intelligence is still prone to errors. Therefore, we need to acutely understand how it behaves and how it contributes to measurement uncertainty.

Classification AIs have their own set of metrics to understand ‘correctness’ – for example, sensitivity, specificity, accuracy, precision, ROC, recall, f1-scores etc. (researchers and practitioners will be very familiar with the Wikipedia graphic explaining many of the possible evaluation metrics [2]). But those do not always translate easily into a measurement uncertainty, especially for a multi-classifier.
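For readers less familiar with these metrics, the sketch below shows how several of them are calculated for a single species from counts of true positives, false positives, false negatives and true negatives. The counts are invented for illustration and are not results from any real classifier.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics for one class from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    precision = tp / (tp + fp)       # of the detections made, how many were right
    recall = tp / (tp + fn)          # of the real calls, how many were found (sensitivity)
    specificity = tn / (tn + fp)     # of the non-calls, how many were correctly rejected
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for a single species
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Each species in a multi-classifier has its own set of these numbers, which is part of why they are hard to collapse into a single uncertainty value.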

As the species AIs are multi-classifiers relying on different training sets of variable quality for each species, there is an inherent bias away from rarer species or species which do not have as complete datasets. Furthermore, differences in the frequency/pitch of bird calls make identification less consistent between species as ones with lower frequencies are more likely to be masked by background noise than those with higher frequencies.

Naturally occurring sounds, e.g. from other species, along with anthropogenic noises like machinery, can also be misinterpreted/misclassified, adding further complexity. Given these challenges, there will always be some degree of uncertainty in AI-driven classification, even as advancements are made.

As more audio data becomes available, we will be able to refine the measurement uncertainty further. More training and test data generally means higher-confidence identifications, and improved filtering of results increases the likelihood that rare species will be detected.

Identification Verification

Once our AI and other algorithms have processed the data, additional verification steps occur, ending with human verification. These steps further increase the validity of the species richness measurement by filtering out species incorrectly identified by the AI.

In any given survey there will be several species that the AI confidently identifies but that are not true positive identifications. These are known as false positives. This can happen for a range of reasons, but most commonly a sound is recorded which is very similar to a species’ call. Extra verification is therefore needed, although mis-identifications can still occur at this stage as well. This source of uncertainty has been found to be much lower than that of the AI, but it must still be included in the calculation to ensure statistical rigour.
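One possible way to put a number on an error rate like this is to independently re-check a sample of identifications and treat the share of disagreements as a proportion with a binomial standard error. The sketch below is an assumption made for illustration rather than a description of Carbon Rewild's own procedure, and the counts are hypothetical.

```python
import math


def proportion_with_standard_error(count, total):
    """Estimate a proportion and its binomial standard error."""
    proportion = count / total
    standard_error = math.sqrt(proportion * (1 - proportion) / total)
    return proportion, standard_error


# Hypothetical: 6 of 200 re-checked identifications turned out to be wrong
rate, rate_se = proportion_with_standard_error(count=6, total=200)
print(f"mis-identification rate: {rate:.1%} ± {rate_se:.1%}")
```

The standard error shrinks as the number of re-checked identifications grows, so a larger sample gives a tighter estimate.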

In theory, the uncertainties due to AI identification and verification could be treated as a single value. However, they come from different sources, so splitting them out allows each to be refined individually, demonstrating tangible progress in either case.

Temporal Sampling Method

The sampling method can make a large difference to the measurement uncertainty. In order to even out variations in the weather and/or get a more representative picture of the species richness, one wants to measure for as long as possible.

However, memory, battery, accessibility, and ultimately economic constraints exist. Therefore, by recording only at intervals, you can still cover a representative enough total period of time. The trade-off is that rare or migratory species, which may only be present and vocalising for short periods of time, may be missed.

Environmental and Background Noise Effects

Environmental effects can also affect the total measurement uncertainty in a similar way to the temporal sampling method. Weather events such as persistent wind and rain risk masking vocalisations or suppressing activity altogether. Similarly, background noise (especially from man-made sources) can mask calls. However, depending on the source, that background noise is not always constant. For example:

  • Flight paths can change giving lulls in aircraft noise;
  • Wind direction can change resulting in a previously noisy road being a bit quieter;
  • Outside of rush hour or over the weekends roads may also be quieter.

Carbon Rewild mitigates these effects by balancing the total surveying time against the duty cycle (the percentage of time the recorder is recording).

A short total time with a high duty cycle risks a multi-day weather event or background noise disrupting the survey, resulting in few species being recorded. A long total time with a very low duty cycle, on the other hand, risks missing too many species which may only vocalise a few times during the survey. There is therefore a trade-off.
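The effect of duty cycle on infrequent vocalisers can be illustrated with a deliberately simple model. It assumes each vocalisation happens at an independent random moment, so the chance of any one call falling inside a recording window equals the duty cycle. It ignores weather, background noise and identification errors, and the function name and numbers are hypothetical.

```python
def capture_probability(duty_cycle, n_vocalisations):
    """Probability that at least one of n independent calls falls within recording time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return 1.0 - (1.0 - duty_cycle) ** n_vocalisations


# A species that only calls five times during the whole survey
for duty_cycle in (0.1, 0.25, 0.5):
    probability = capture_probability(duty_cycle, n_vocalisations=5)
    print(f"duty cycle {duty_cycle:.0%}: {probability:.1%} chance of recording it")
```

Under these assumptions, a species calling five times has roughly a 41% chance of being recorded at a 10% duty cycle and about 97% at a 50% duty cycle.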

Resource and setup time is also an important consideration here. A long total recording time and a high duty cycle will place significant demands on battery life and memory, requiring individuals to frequently access the recording units to change memory cards and batteries. Even if the units can transmit data wirelessly, this will require considerably more power, leading to increased effort and time needed for regular servicing of the recording units.
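To give a feel for the memory side of this, the sketch below estimates the storage needed for a survey. It assumes uncompressed mono audio at 48 kHz and 16 bits, which are illustrative settings rather than Carbon Rewild's actual recorder configuration, and it does not attempt to model battery life.

```python
def storage_bytes(total_days, duty_cycle, sample_rate_hz=48_000, bit_depth=16, channels=1):
    """Estimate storage for uncompressed PCM audio over a survey."""
    recording_seconds = total_days * 24 * 60 * 60 * duty_cycle
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return recording_seconds * bytes_per_second


# Hypothetical 30-day survey at different duty cycles
for duty_cycle in (0.1, 0.5, 1.0):
    gigabytes = storage_bytes(total_days=30, duty_cycle=duty_cycle) / 1e9
    print(f"duty cycle {duty_cycle:.0%}: about {gigabytes:.0f} GB")
```

Storage scales linearly with both the survey length and the duty cycle, so doubling either doubles the memory required.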

Spatial Sampling and Placement Errors

Device placement also risks suppressing the number of species detected, especially if comparison surveys are conducted year after year and the devices are not redeployed in the same location. Orientation matters too: a recorder oriented toward a woodland might pick up different species than one oriented away from it, even if it is in exactly the same location as before.

Hardware Changes

Microphones and other electronic components are subject to damage and degradation over time, especially when placed outside and exposed to large temperature swings and moisture. There is therefore the possibility that microphones with reduced sensitivity will not provide suitable audio for the analysis process.

Carbon Rewild rigorously tests and refurbishes all recording units between deployments, which helps to minimise uncertainty from equipment-related issues.

Conclusion

Measurement uncertainty is a powerful way of building trust in a measurement. Carbon Rewild is committed to using techniques developed in other industries to enhance biodiversity measurement and its wider acceptance.

One common method that can be employed is creating an uncertainty budget. This requires listing all the possible sources of uncertainty and calculating values for each one. Once the individual measurement uncertainties have been calculated for each category, then the overall measurement uncertainty can be calculated.

Please get in touch with Carbon Rewild if you would like to discuss our measurement uncertainties in more detail. The values are dependent on your survey requirements. Since we have broken down the uncertainties into individual categories we can tailor your survey to your own requirements for monitoring and measuring biodiversity.

References

  1. ISO, ISO/IEC 17025 Testing and calibration laboratories, 2024
  2. Wikipedia, Precision and Recall, 2024
  3. Specifically the direct route between Parliament Square and Notre Dame de Paris, following the Earth’s curvature, as measured using Google Maps. But it does beg the question: what is the ‘centre’ of either of these two cities? London – London Bridge, St Paul’s, Elizabeth Tower (Big Ben), Tower of London, Charing Cross; Paris – Place de la Concorde, Hotel de Ville, Eiffel Tower, Arc de Triomphe