...

Using Artificial Intelligence to Find the Debris of War

The HALO Trust

Equipping an NGO with a scalable ML-based solution to support its mission to track small arms and light weapons in war-stricken zones.


The HALO Trust is a non-governmental organisation that clears the world of landmines and explosives left behind by war. In the past 30 years, they have saved countless lives, created safe environments for children to grow up in, and given whole communities the opportunity to rebuild.

The debris of war is more than landmines and explosives. Its remains permeate communities for years, continuing to destroy lives and elongate the tail of conflict.

One of the most significant factors in making a difference is ridding the world of uncontrolled small arms and light weapons (SALW) within crisis zones and security-challenged areas. In Africa alone, it is estimated that there are over 100 million uncontrolled SALWs.

The effect of this is multi-faceted and not limited to continued fighting. People are displaced, fleeing their homes for fear of injury and death. The risk of gender-based violence (GBV) rises alarmingly, with 45.6% of women in Africa being subjected to GBV as a result of armed conflict. Finally, the prevalence of uncontrolled weapons erodes social cohesion and communal trust, tipping the balance towards violent confrontations every day.

What we did

Multi-threaded cloud application
BigQuery setup
Machine learning
Web scraping and text analysis

The Challenge

The HALO Trust seek out evidence of uncontrolled SALWs, but the manual process they used had a series of challenges and inefficiencies. They would search a wide variety of online sources for indications that arms had been either seized or stolen.

This process produced high-quality incident reports, but its scalability was limited by the heavy reliance on experts, the number of incidents and the geographical reach required.

Conflict is complex and ever-changing, and HALO knew it needed to change with it. Together, we set out to design data-driven ways to assist their experts in identifying arms diversion globally in a rapid and scalable way.

The Solution

Multi-threaded cloud application

A team of Infinite Lambda data engineers and data scientists explored ways to extract real-time information from multiple news and social media sources.

We developed a multi-threaded cloud application which could:

  • Crawl news websites and collect their content into a unified format;
  • Scale from a few websites to several hundreds effortlessly.
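
As an illustration of the two points above, here is a minimal sketch of a thread-pool crawler. It is not the application built for HALO: the function names (`fetch_html`, `crawl`), the record fields and the default of eight workers are all assumptions made for this example. Each site is fetched on its own worker thread and returned in one common shape, so adding more sites only means passing a longer list of URLs.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, List
from urllib.request import urlopen


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download one page as text; network errors propagate to the caller."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(urls: Iterable[str],
          fetch: Callable[[str], str] = fetch_html,
          max_workers: int = 8) -> List[Dict[str, object]]:
    """Fetch every URL on a thread pool and return records in a unified shape.

    The record layout (url / content / error) is hypothetical.
    """
    records: List[Dict[str, object]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each submitted job back to the URL it belongs to.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                records.append({"url": url, "content": future.result(), "error": None})
            except Exception as exc:  # one failing site should not stop the crawl
                records.append({"url": url, "content": None, "error": str(exc)})
    return records
```

Calling `crawl(["https://example.org/news"])` would return a list with one record per URL, in completion order rather than input order. Passing a custom `fetch` function makes it possible to swap in a different downloader or a stub for testing.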

The data engineering angle

We extracted data from numerous REST APIs and converted it into a standardised JSON format. This applied to both the data itself and the metadata describing the data sources and search terms.
All of the standardised JSON data was synced to a BigQuery table. This table was updated in real time whenever new data arrived from sources such as Twitter, NewsAPI and Google RSS.
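
To make the idea of a standardised format concrete, the sketch below maps raw records from two differently shaped sources onto one schema, with the source and search term kept as metadata. The unified field names, the `FIELD_MAPS` table and the raw field names are illustrative assumptions, not the schema used in the project, and the BigQuery loading step is left out.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict

# Hypothetical mappings: source name -> {unified field: field name in the raw record}.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "newsapi": {"title": "title", "text": "description",
                "url": "url", "published": "publishedAt"},
    "rss": {"title": "title", "text": "summary",
            "url": "link", "published": "published"},
}


def standardise(source: str, raw: Dict[str, Any], search_term: str) -> Dict[str, Any]:
    """Convert one raw record into the unified shape; missing fields become None."""
    mapping = FIELD_MAPS[source]
    record: Dict[str, Any] = {field: raw.get(raw_key) for field, raw_key in mapping.items()}
    # Metadata about where the record came from and which search produced it.
    record["metadata"] = {
        "source": source,
        "search_term": search_term,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    return record


def to_json_line(record: Dict[str, Any]) -> str:
    """Serialise a unified record as one JSON line, ready for loading into a table."""
    return json.dumps(record, sort_keys=True)
```

Because every source ends up with the same keys, downstream steps such as scoring or loading into a warehouse table only need to understand one layout.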

The data science angle

Taking advantage of several thousand manually collected, reviewed and categorised articles, we built a pipeline which could categorise and assign a relevance score to any new article given as an input.

To achieve this, we used lower-dimensional mathematical representations of the article texts, called embeddings. In essence, these translate the semantics of the articles into numerical form, which allowed us to perform mathematical operations on them. Thus, we could train classifiers to capture similarity between documents and different characteristics of the texts.
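
The sketch below shows the general shape of that idea using only the Python standard library. It is a toy stand-in and not the model used in the project: the `embed` function is a simple hashed bag-of-words vector rather than a learned embedding, the classifier is a nearest-centroid rule, and every name and the vector size `DIM` are assumptions made for this example. What it has in common with the real approach is that texts become vectors, and categories and scores come from arithmetic on those vectors.

```python
import hashlib
import math
import re
from typing import Dict, List, Tuple

DIM = 64  # toy vector size (assumption); learned embeddings are trained, not hashed


def embed(text: str) -> List[float]:
    """Toy stand-in for an embedding: hash each word into one of DIM buckets."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def train(labelled: Dict[str, List[str]]) -> Dict[str, List[float]]:
    """Build one centroid (mean vector) per category from labelled texts."""
    centroids = {}
    for label, texts in labelled.items():
        vectors = [embed(t) for t in texts]
        centroids[label] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return centroids


def classify(text: str, centroids: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the closest category and its similarity, used here as a relevance score."""
    vec = embed(text)
    scores = {label: cosine(vec, c) for label, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

A model would be built with something like `train({"seizure": [...], "other": [...]})`, where the lists hold previously categorised article texts, and new articles would then be passed to `classify`.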

The machine learning angle

We developed ML algorithms to further improve the accuracy and relevance of the results. Once the searches were complete, the raw data was fed through the ML engine to narrow the results down to relevant articles. Results were displayed in order of relevance, and the user could mark each one as relevant or not. This human feedback completed the learning loop and allowed the tool to become increasingly accurate.
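
A minimal sketch of that loop follows, assuming each article already has a model score. The `ScoredArticle` class and the three helper functions are hypothetical names for this example. Articles are shown in descending order of score, the reviewer's verdict is stored on the article, and the reviewed articles are collected as new labelled examples for the next round of training.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class ScoredArticle:
    url: str
    score: float                       # relevance score from the model, higher is better
    user_label: Optional[bool] = None  # stays None until a reviewer marks the article


def rank(articles: Iterable[ScoredArticle]) -> List[ScoredArticle]:
    """Order articles for display, most relevant first."""
    return sorted(articles, key=lambda a: a.score, reverse=True)


def record_feedback(article: ScoredArticle, relevant: bool) -> None:
    """Store the reviewer's verdict on one article."""
    article.user_label = relevant


def training_examples(articles: Iterable[ScoredArticle]) -> List[Tuple[str, bool]]:
    """Collect only the reviewed articles as (url, label) pairs for retraining."""
    return [(a.url, a.user_label) for a in articles if a.user_label is not None]
```

Unreviewed articles are deliberately left out of `training_examples`, so only verdicts a person actually gave would feed back into the model.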

The Result

Making data work for the humanitarian sector in an affordable and scalable way

The tool we created allowed the user to search multiple sources with multiple search terms and geographies relevant to SALWs. Moreover, the user could schedule automated searches across these sources as required.

Very early in the project, we were able to run a pipeline that scored and categorised thousands of articles within a few minutes, highlighting only the highly relevant ones.

This tool has the potential to seek out uncontrolled arms and assist HALO in reducing the number of SALWs worldwide, ultimately helping to save lives and livelihoods. The techniques we used blaze a trail towards new ways of making data work for the humanitarian sector in an affordable and scalable manner.

The use of machine learning and artificial intelligence has great potential to support the developing world where the lack of digital data hampers the delivery of humanitarian assistance and development. We ought to remain innovative to continue delivering value for money to our donors and beneficiaries.

Luan Jaupi, Head of ICT, HALO
