25.06.2026
15:03

Bristol Failure: AI models for assessing risk of crimes against children disabled due to fatal errors

The authorities in Bristol and the Avon and Somerset police force have been forced to stop using at least two artificial intelligence models designed to assess the risk of crimes against children. The reasons were catastrophically low accuracy and complete opacity: independent auditors could not verify the algorithms because they had access to neither the source code nor a list of variables.

How the system worked and why it failed

At the core of the system was the Think Family Database, launched by Bristol City Council in 2016. It combined police reports with social data, ranging from housing status and mental health issues to school truancy and free school meals. By some estimates, the database may have held records on nearly 500,000 residents. The data was collected without direct consent from citizens, under legal provisions for information sharing between government agencies.

A total of 23 machine learning models were built on this database, including models predicting theft, failure to appear in court, and the risk of domestic violence. The models designed to assess threats to children, however, proved to be the weakest. In addition to police and municipal data, they were fed anonymized information from the charity Barnardo's on 1,000 children who had already been victims of crimes. The final score was influenced by factors such as a child's status as needing help, chronic school truancy, and mental disorders.

As early as 2016, the police ethics committee warned of the risk of algorithmic bias arising from the chosen variables. A later audit by the consulting organization Social Finance confirmed the worst fears: the accuracy of the models was deemed the "weakest link," and their practical value was questionable. By the time of the audit, both child-risk models had already been deactivated.

Data problems and lack of oversight

Social Finance linked the decline in the models' quality to changes in the dataset. When the police tried to scale the system to the entire Avon and Somerset region, they could not agree on data sharing with all local councils. As a result, social indicators dropped out of the models, and the algorithms began to operate primarily on a "police core" of data, which made them even less reliable.

Bristol city services staff complained that vulnerable children were not appearing in the results. One report noted that minors who had recently been victims of crimes could receive a lower risk score than individuals involved in theft cases. Other employees openly stated they were unwilling to rely on the assessments due to the complete opacity of the methodology.

A separate audit by the company Eticas, based on 36,000 performance evaluations across 13 models, showed that the accuracy of positive predictions (precision) was extremely low for most of them. For example, a model designed to identify potential burglars had a precision below 10% for more than three years, meaning that over 90% of the people it flagged as risks were flagged incorrectly. The police explained this by stating that the model had not been implemented and that the evaluations were the result of an automatic check of a "static file."
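To make the arithmetic behind the burglary figure concrete: the "accuracy of positive predictions" is what is usually called precision, or positive predictive value, which is the share of flagged people for whom the flag turned out to be correct. The short Python sketch below shows the calculation. The counts are hypothetical and chosen only to illustrate a precision below 10%; they are not figures from the Eticas audit, and the function name is an illustrative assumption.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return the share of positive predictions (flags) that were correct."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no positive predictions to evaluate")
    return true_positives / flagged


# Hypothetical counts for illustration only (not taken from the audit):
# out of 100 people flagged as potential burglars, 8 flags were correct.
flagged_correctly = 8
flagged_incorrectly = 92

ppv = positive_predictive_value(flagged_correctly, flagged_incorrectly)
wrong_flag_share = 1 - ppv  # share of flagged people who were flagged incorrectly

print(f"precision: {ppv:.0%}, incorrect flags: {wrong_flag_share:.0%}")
```

Note that precision says nothing about the people the model did not flag, so a figure below 10% describes how unreliable the flags are, not how many real cases were missed.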

Context and my assessment

This incident occurs against the backdrop of the launch of the national PoliceAI center with a budget of £75 million, which aims to scale AI tools for 43 police forces in England and Wales. Notably, this center is led by the former chief constable of Avon and Somerset Police — the very region where this failure occurred.

My professional opinion: the Bristol story is a classic example of how rushing to deploy AI in critical areas such as child protection ends up discrediting the technology itself. The problem here is not "bad" AI but a systemic failure of data management: non-representative samples, a lack of transparency, and no quality control over the input variables. If PoliceAI does not learn these lessons, it risks scaling systemic errors, rather than efficiency, across the entire country.