Transferability of tsetse habitat models between different regions in Kenya and Rwanda

Abstract:

Accurate and reliable information on the distribution of tsetse habitats is crucial for the effective management of African trypanosomiasis in sub-Saharan Africa. However, large-scale surveillance of tsetse flies to develop distribution maps is impractical because of the vast areas infested and the limited resources available. To address this challenge, we evaluated whether tsetse habitat models developed for the wet and dry seasons in the intensively sampled Shimba Hills National Reserve in Kenya can be applied to two other regions in Kenya (Ruma National Park and Nguruman Conservancy) and one region in Rwanda (Akagera National Park). The models combined satellite-based estimates of vegetation greenness, land cover, and land surface temperature with tsetse occurrence data to predict habitat suitability. An independent dataset of tsetse occurrence was used to benchmark the performance of the transferred models. The performance of the transferred models was significantly influenced by the similarity in environmental conditions between the model’s development area and the transfer area. Where dissimilarity was high, as in Nguruman Conservancy during the dry season, model transfer was unsuccessful, with an F1-score of zero. In all other regions and seasons, the transferred models performed satisfactorily, with F1-scores exceeding 0.65. Nevertheless, site-specific models (F1-score > 0.8) outperformed the transferred models, indicating that models developed with data from each location can provide more accurate information on tsetse distribution. In conclusion, our study demonstrates that tsetse habitat models can be transferred with relatively good accuracy to seasons and regions that are environmentally similar to the model training area. Despite the higher accuracy of site-specific models, transferring models to similar sites remains a meaningful exercise in the absence of detailed surveillance data.
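
For readers unfamiliar with the F1-score used to benchmark the models, the following is a minimal sketch of how it can be computed from binary presence/absence labels. It is not the study's code: the function name and the example labels are hypothetical, and it assumes that habitat suitability predictions have already been converted to presence (1) or absence (0). It also shows why a transferred model that predicts no suitable habitat at any occupied site receives an F1-score of zero.

```python
def f1_score(observed, predicted):
    """F1-score for binary labels, where 1 = presence and 0 = absence."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # true positives
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false positives
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # false negatives
    if tp == 0:
        # No presence correctly predicted: precision and recall are zero
        # (or undefined), so the score is reported as zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight sampling sites (illustrative only).
observed = [1, 1, 1, 0, 0, 1, 0, 1]
transferred = [1, 1, 0, 0, 1, 1, 0, 1]  # one missed presence, one false alarm
all_absent = [0] * len(observed)        # model predicts no habitat anywhere

print(round(f1_score(observed, transferred), 2))
print(f1_score(observed, all_absent))
```

Because the F1-score is the harmonic mean of precision and recall, it ignores true absences and penalizes a model that misses occupied sites as much as one that over-predicts suitable habitat.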