Technical Blog: Overview of Carbon Rewild’s Analysis Process

Measuring the change in biodiversity predominantly involves quantifying the change in species richness or diversity across a site, as well as measuring the change of relative abundance of each species [1, 2].

At Carbon Rewild our aim is to make biodiversity measurement accessible to all through bioacoustic monitoring. This involves taking audio recordings and quantifying which species can be heard and how frequently they vocalise.

Over the past few years, we have developed a system by which data is collected by sending out audio recorders to landowners. The recorders are deployed at the site by the recipient for passive monitoring during the survey period. They are then retrieved and returned to us for data analysis.

Bioacoustic monitoring allows landowners to easily assess the biodiversity of the land under their management, which in turn lets them gauge the effect that their improvements are having on the natural world.

But how do we do the analysis? How do we turn a month of audio recordings into a report that tells us which species have been detected? How do we characterise the activity of each species over the course of the recording period?

Overview of the Analysis Process

Our analysis process can be divided into two parts from which we get two outcomes:

Part 1 – Species Detection – identification of the species detected during the survey period
Part 2 – Measuring Activity via vocalisations – a characterisation of species activity and/or relative measure of abundance

Species detection involves assessing which species are identifiable in the audio data. This is done by a three-step process:

AI analysis
Custom filtering
Human verification

The relative level of activity of each verified species is then measured by assessing the number of calls or vocalisations identified.

Part 1 – Species Detection

Once we have received the data, the first step is to pass the audio data through an artificial intelligence (AI) algorithm to filter the data.

The AI algorithm generates a list of the species it considers to be present in the recordings, keeping only detections above a specified confidence threshold.
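The thresholding step can be sketched as follows. This is a minimal illustration rather than our production code: the `Detection` record, the `candidate_species` function, the example species and the 0.7 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One AI detection (hypothetical record layout)."""
    species: str
    confidence: float  # classifier score between 0 and 1
    start_s: float     # offset of the detection within the recording, in seconds


def candidate_species(detections, threshold=0.7):
    """Return the species with at least one detection at or above the threshold."""
    return sorted({d.species for d in detections if d.confidence >= threshold})


detections = [
    Detection("Eurasian Wren", 0.92, 3.0),
    Detection("Tawny Owl", 0.41, 12.0),
    Detection("Eurasian Wren", 0.55, 21.0),
    Detection("Common Chiffchaff", 0.78, 30.0),
]

print(candidate_species(detections))
```

Raising the threshold shortens the candidate list and reduces false positives, at the cost of missing quieter or more distant calls.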

There are, however, limitations in automatic detection using AI due to various factors:

Some species are misidentified as other species
Anthropogenic noises are identified as wild animal vocalisations, such as:
- Cars, motorbikes, and planes
- Alarms
- Machinery noise
- Domestic animals – dogs, chickens, sheep
Noise from running water, rain and wind can mask a call or alter a recording

Due to these limitations we cannot take any AI’s output at face value. As a result we carry out two further filtering steps to improve the identification accuracy:

A series of custom rules, algorithms, and filters which use prior data and knowledge to assess confidence in the results
Human verification of species present. Once the results have been filtered we still need to manually listen to a subset of the recordings for each species in order to verify that it has been detected at a given site.
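To give a flavour of what a rule-based filter might look like, here is a small sketch. The two rules used (a regional species list and a minimum number of detections) are hypothetical stand-ins chosen for illustration, as are the function name and the example data; they are not a description of the actual rule set.

```python
from collections import Counter


def apply_rules(detected_species, regional_list, min_detections=3):
    """Split species into those passing the rules and those flagged for closer review.

    detected_species: list of species names, one entry per AI detection.
    regional_list: set of species expected in the survey region (assumed prior knowledge).
    """
    counts = Counter(detected_species)
    kept, flagged = [], []
    for species, n in sorted(counts.items()):
        if species in regional_list and n >= min_detections:
            kept.append(species)
        else:
            # Unexpected or rarely detected species get extra scrutiny
            # rather than being silently discarded.
            flagged.append(species)
    return kept, flagged


detected = ["Robin"] * 5 + ["Hoopoe"] * 4 + ["Wren"] * 1
kept, flagged = apply_rules(detected, regional_list={"Robin", "Wren"})
print(kept, flagged)
```

In this sketch a flagged species is not rejected outright; it is simply marked so that the human verification step can look at it more carefully.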

It is important to note that we do not check every single recording of a species. We only verify a subset of recordings to check that a certain species is present over the course of the survey period (i.e. one month).
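One simple way to choose which recordings to listen to is to take the highest-confidence clips for each species. The sketch below assumes each detection is a `(species, confidence, clip_id)` tuple and that three clips per species are enough; both are illustrative assumptions, and a random sample would be an equally valid choice.

```python
from collections import defaultdict


def clips_to_verify(detections, per_species=3):
    """Pick the highest-confidence clips for each species for manual listening."""
    by_species = defaultdict(list)
    for species, confidence, clip_id in detections:
        by_species[species].append((confidence, clip_id))
    return {
        species: [clip for _, clip in sorted(items, reverse=True)[:per_species]]
        for species, items in by_species.items()
    }


detections = [
    ("Robin", 0.90, "a.wav"),
    ("Robin", 0.60, "b.wav"),
    ("Robin", 0.95, "c.wav"),
    ("Wren", 0.80, "d.wav"),
]
print(clips_to_verify(detections, per_species=2))
```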

Once the species identification process has been completed we can take a deeper dive into the data associated with each species.

Part 2 – Measuring Activity

The principal data attached to each verified species is the number of individual identifications/vocalisations found by the AI. For some species this can be in the thousands; for others it can be only a handful.

Once we have verified that a given species has been successfully identified then we re-evaluate the total number of calls identified by the AI.

Like all measurements, however, this number is not perfect. Some of the calls counted will be false positives, so the true number of calls is lower than the raw count suggests. Other genuine calls will be missed (false negatives) because they are masked by other species or by background noise, so the true number is higher than the raw count suggests.

As a result we are currently devising a way of quantifying the error in the number of calls using standard engineering terminology.
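As an illustration of what such a quantification could look like, the sketch below estimates the false-positive side of the error from a manually checked sample, using a Wilson score interval for the proportion of confirmed calls. This is one possible approach under stated assumptions, not the method we are devising: the function names and example numbers are hypothetical, and the sketch does not account for missed calls (false negatives).

```python
import math


def wilson_interval(confirmed, checked, z=1.96):
    """Wilson score interval (about 95% for z=1.96) for the proportion of confirmed calls."""
    if checked <= 0:
        raise ValueError("at least one clip must be checked")
    p = confirmed / checked
    denom = 1 + z**2 / checked
    centre = (p + z**2 / (2 * checked)) / denom
    half = z * math.sqrt(p * (1 - p) / checked + z**2 / (4 * checked**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def corrected_call_range(ai_calls, confirmed, checked):
    """Scale the raw AI call count by the estimated precision and its interval."""
    low, high = wilson_interval(confirmed, checked)
    best = ai_calls * confirmed / checked
    return ai_calls * low, best, ai_calls * high


# Example: 1000 calls reported by the AI, 45 of 50 checked clips confirmed.
low, best, high = corrected_call_range(1000, confirmed=45, checked=50)
print(round(low), round(best), round(high))
```

Checking more clips narrows the interval, which is the trade-off between verification effort and the size of the quoted error.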

Furthermore, we are able to access when each of these calls was made, allowing us to build up a picture of how vocally active a species is over the course of a day.
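A daily activity profile of this kind can be built by counting calls in each hour of the day. The sketch below assumes each call carries a timestamp in local time; the function name and example timestamps are illustrative only.

```python
from collections import Counter
from datetime import datetime


def hourly_profile(timestamps):
    """Return a 24-element list: number of calls in each hour of the day."""
    counts = Counter(t.hour for t in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


timestamps = [
    datetime.fromisoformat(s)
    for s in [
        "2023-05-01T04:55:00",
        "2023-05-01T05:10:00",
        "2023-05-02T05:40:00",
        "2023-05-02T21:05:00",
    ]
]
profile = hourly_profile(timestamps)
print(profile)
```

Calls from different days fall into the same hourly bin, so the profile shows the typical daily pattern across the whole survey period.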

Multiple Surveys and Comparison

Once we have quantified the number of calls and its error we are able to far more easily determine if a change in the number of calls has occurred.

By repeating the survey under the same conditions we are able to repeat the analysis and determine whether or not a step change in the number of calls greater than the error has occurred.

If the new number of calls exceeds the previous measurement plus its upper error limit, then we can be confident that an increase in activity of that species has been observed.
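The comparison rule can be written down in a few lines. In this sketch the function and parameter names are hypothetical, and the optional `new_lower_error` argument is an added assumption that makes the test more conservative by also allowing for the error on the new measurement.

```python
def increase_detected(previous_calls, previous_upper_error, new_calls, new_lower_error=0.0):
    """True if the new count clearly exceeds the previous count plus its upper error."""
    return (new_calls - new_lower_error) > (previous_calls + previous_upper_error)


# Previous survey: 900 calls with an upper error of 60 calls.
print(increase_detected(900, 60, 1000))                      # 1000 > 960
print(increase_detected(900, 60, 950))                       # 950 is within the error
print(increase_detected(900, 60, 1000, new_lower_error=50))  # 950 is not above 960
```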

Furthermore, since we can see what times of day the calls occur, we can also see if there has been a change in the pattern or behaviour of the species throughout the day.

Biodiversity Net Gain (BNG)

However, nature is not always so straightforward. We must be aware that two or more surveys of a single species are not always directly comparable, even if they are carried out at the same time of year. For example, differing weather conditions and temperatures before or during the survey may affect the number of vocalisations.

As a result, quantifying improvements in biodiversity requires us to look at many different species and at many different or extended time points. Crucially, our data can be used in biodiversity net gain calculations, which combine species richness (Part 1) and relative activity/abundance (Part 2) to enable clear comparisons of complex ecosystems.
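To show how richness and relative abundance can be combined into a single figure, the sketch below computes species richness and the Shannon diversity index, a standard measure of diversity, from per-species call counts. Using call counts as a stand-in for relative abundance is an assumption, and this is not the calculation used in any particular biodiversity net gain methodology; the function names and counts are illustrative.

```python
import math


def richness(call_counts):
    """Number of species with at least one verified call."""
    return sum(1 for n in call_counts.values() if n > 0)


def shannon_index(call_counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over species proportions p_i."""
    total = sum(call_counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (n / total) * math.log(n / total)
        for n in call_counts.values()
        if n > 0
    )


survey = {"Robin": 1200, "Wren": 800, "Tawny Owl": 12}
print(richness(survey), round(shannon_index(survey), 3))
```

The index is highest when calls are spread evenly across many species and falls towards zero when one species dominates.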

Furthermore, by combining this basic data with other data such as conservation status, we are able to build a multidimensional picture of the health and direction of an ecosystem and translate that into a value of biodiversity net gain [3].

Our analysis method (along with the appreciation of the errors involved in making these measurements) will form the bedrock of biodiversity measurement and characterisation. As we collect more and more recordings and data points we will be able to refine our accuracy and precision further still.

Conclusion

In order for individuals and organisations to trust our analysis process we are working hard to ensure that:

We correctly identify the species that have been recorded on our recording devices, and;
We accurately and precisely determine the activity/abundance of the identified species

As a result Carbon Rewild’s species identification process is a two-part process:

Part 1 – Species Detection – A binary classification of whether or not a species is present during the survey period followed by;

Part 2 – A Measurement of Activity/Abundance using the proxy of the number of calls of the verified species with a quantified measurement uncertainty

This hybrid approach separates the species identification from the measurement of activity/abundance, allowing an accurate determination of whether a species is present or not before going on to assess its level of activity. This ensures that there are multiple checks and balances in the process and that the AI is not solely responsible for the veracity of the measurement.

By combining the outcomes of Part 1 and Part 2 we can get to the most important outcome – measurement of biodiversity and biodiversity net gain. Ultimately, this enables us to build a multidimensional picture of the health and direction of an ecosystem and to support its continued improvement.

References

1. J. L. Harper, D. L. Hawksworth, Biodiversity Measurement and Estimation, 1994, Preface, D. L. Hawksworth (ed.), Royal Society and Chapman & Hall

2. A. E. Magurran, Measuring Biological Diversity, 2004, Blackwell Publishing

3. Wallacea Trust, Wallacea Trust Biodiversity Credit Methodology, 2022