
ADD Blogpost: From Controversial Experiments to Boring AI? AI in the Public Sector in Transition

The political communication about AI has changed, but why?

The ADD blog provides insight into the ADD project’s research across six university partners. Meet our researchers from Aalborg University, Aarhus University, Copenhagen Business School, Roskilde University, the University of Copenhagen, and the University of Southern Denmark. Read about their projects, activities, ideas, and thoughts—and gain a new perspective on the controversies and dilemmas we face in the digital age, along with ideas on how to strengthen digital democracy.

By Helene Friis Ratner, Associate Professor and co-PI in the ADD project, Aarhus University

The Danish Minister for Digitalisation, Caroline Stage (M), wants to promote “unsexy” and “super boring” AI “that works” instead of the legally and ethically complex AI projects that have filled the media. This is a different political communication about AI than what we heard from her predecessor, Marie Bjerre (V). In the summer of 2024, Bjerre urged us not to be “afraid to touch” AI, mentioning that it could potentially be implemented in one of the most complex, intrusive, and sensitive areas: forced out-of-home placements of children and young people. Unsurprisingly, that statement drew heavy criticism.

Stage’s approach, in contrast, signals a more cautious strategy in which the public administration harvests the low-hanging fruit, primarily in the administrative area. How should we understand this shift in political communication about AI? In this blog post, I will attempt to shed light on this question based on the research we have conducted in the ADD project at Aarhus University.

First-generation experiments with AI in the public sector

In the ADD project, we have examined a wide range of the first projects that attempted to develop algorithms based on machine learning for the public sector. Machine learning is a branch of artificial intelligence that finds patterns in large amounts of historical data and uses these, for example, to predict events or to categorize citizens. The public sector has experimented with such predictive algorithms across many different domains, e.g., to predict critical illnesses in acute patients in healthcare, to predict the risk of readmission for elderly citizens in social services, to match unemployed citizens with companies in employment services, and to forecast the need for municipal vehicles.
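For readers who want a concrete sense of what such a predictive algorithm is, here is a minimal, purely illustrative sketch in Python. The readmission scenario, column names, and data are invented for illustration; none of the Danish projects discussed here are based on this code.

```python
# Purely illustrative sketch: a predictive model of the kind described above,
# trained on (fictitious) historical records to flag elderly citizens at risk
# of hospital readmission. All column names and data are made up.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Fictitious historical data: one row per citizen, with an observed outcome.
history = pd.DataFrame({
    "age": [72, 85, 67, 90, 78, 81],
    "previous_admissions": [1, 4, 0, 3, 2, 5],
    "lives_alone": [0, 1, 0, 1, 1, 1],
    "readmitted_within_30_days": [0, 1, 0, 1, 0, 1],  # the event to predict
})

X = history.drop(columns="readmitted_within_30_days")
y = history["readmitted_within_30_days"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

# The model learns patterns in the historical data ...
model = LogisticRegression().fit(X_train, y_train)

# ... and is then used to assign a risk score to new, unseen citizens.
print(model.predict_proba(X_test)[:, 1])
```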

In the ADD project’s first phase (2021-25), we at Aarhus University followed all Danish AI projects in the area of vulnerable children and youth and studied the joint public AI signature projects. In this area, there have been four Danish experiments:

  1. Gladsaxe Municipality, aiming to predict children’s poor well-being before symptoms appeared, through extensive data integration (the municipality applied for exemption from data protection laws through the free municipality experiment).
  2. The research project “RISK”, investigating whether a decision support tool that predicts children’s risk of serious well-being issues could help caseworkers with legally required risk assessments of notifications.
  3. Two municipalities, each aiming to develop an email sorting program that could predict which notifications were most likely urgent, so that a caseworker could address these first. Neither municipality wished to introduce algorithms into the actual professional decision-making process (identifying vulnerable children or assessing notifications).

None of the four experiments are active today. While the first two were shut down or scaled back due to a lacking or uncertain legal basis, the last two did not deliver the necessary results. Furthermore, our research shows that these projects, which concern some of society’s most vulnerable citizens, faced intense public scrutiny. Over time, this led to an increased awareness among developers that artificial intelligence is not merely a “solution” but also involves many ethical dilemmas and calls for extra caution. In other words, Danish projects in this area have matured towards greater caution and a reduced role for the algorithms. This is clearly reflected in the evolution from the first project, Gladsaxe’s data-driven detection algorithm, to the two municipalities’ email sorting trials.

Therefore, it was also surprising when Bjerre, in the summer of 2024, mentioned forced out-of-home placements as an example of where we could be less hesitant. None of the Danish experiments have dealt with this; they have addressed preventive efforts and the handling of notifications. Out-of-home placement is a far more intrusive measure than anything the now-closed Danish projects attempted.

The public AI signature projects

We have also researched the Danish AI signature projects. These cover 40 projects in Danish regions and municipalities, initiated between 2020 and 2022 with funding of DKK 187 million, spanning health, administration, climate and environment, employment, as well as social and care services. We developed a taxonomy to categorize AI based on how intrusive it could potentially be in individual citizens’ lives and how far it automates decision-making processes.

Our research shows that predictive algorithms in the signature projects were intended to predict at three different levels:

Case level: Algorithms classify cases based on text or images, e.g., classification of email content for automatic email sorting.

Individual level: Citizen-targeted predictive algorithms, e.g., predicting a citizen’s best match with companies in employment services or the risk of readmission to hospital in elderly care. Here, algorithmic prediction is used as a profiling tool.

Organizational level: Here, the focus is typically on optimizing resources for the entire organization, e.g., managing transport fleets, route planning, or energy use in buildings.
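To make the taxonomy concrete, the sketch below shows how the three levels could be represented as a simple data structure. The levels come from our taxonomy, but the example projects, field names, and code are illustrative only and are not the project’s actual coding scheme.

```python
# Schematic sketch of the three prediction levels as a simple data structure.
from dataclasses import dataclass
from enum import Enum

class PredictionLevel(Enum):
    CASE = "case"                      # classifies cases, e.g. email sorting
    INDIVIDUAL = "individual"          # profiles citizens, e.g. readmission risk
    ORGANIZATIONAL = "organizational"  # optimizes resources, e.g. route planning

@dataclass
class AIProject:
    name: str
    level: PredictionLevel
    in_operation: bool

# Invented example entries, not the actual signature projects.
projects = [
    AIProject("Automatic sorting of incoming mail", PredictionLevel.CASE, True),
    AIProject("Readmission risk scores in elderly care", PredictionLevel.INDIVIDUAL, False),
    AIProject("Route planning for municipal vehicles", PredictionLevel.ORGANIZATIONAL, True),
]

# The pattern we observed: the more citizen-targeted the prediction,
# the less likely the project is to reach operation.
for p in projects:
    print(f"{p.level.value:>14}: {p.name} (in operation: {p.in_operation})")
```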

This division gave us insights into the diversity of AI experiments in the Danish public sector, but we could also see a correlation between the level of prediction and the projects’ prospects for implementation. Where citizen-targeted AI projects—with few exceptions—have been shut down, several AI projects operating at case and organizational levels are either in operation or on their way.

It is not surprising that citizen-targeted projects have been discontinued. Several were shown to lack the necessary legal basis, a broader trend also documented in the recently published database of Danish public AI projects. This is not to say that Denmark does not use citizen profiling. The method is used by the Danish Tax Agency and by Udbetaling Danmark (the authority responsible for the collection, disbursement, and control of a number of public benefits), which has led to criticism of disproportionate surveillance and a lack of transparency.

At the same time, our research into the signature projects shows that public experiments with AI span many domains—from climate and energy to health and employment—and that many different solutions are being tested within various administrative areas. Thus, public AI experimentation is far broader and more diverse than the few projects that reach the public debate through the media.

There is usually a good reason why citizen-targeted profiling projects attract both expert and media scrutiny. When predictive algorithms target people, social problems, and complex case processes, multiple dilemmas appear. Errors and algorithmic bias can have serious consequences for the affected citizens. At the same time, these projects are also very complex: legally, in terms of municipal IT infrastructure (where many different systems must be able to exchange data in real time), and in terms of data quality, as an algorithm is only as good as the data it is trained on.

In this light, it is not surprising that we are seeing a shift toward the so-called “boring” AI that “works.” We can see it as political and institutional maturation, reflecting a growing understanding that artificial intelligence in the public sector should be developed with both ethical consideration and legal robustness. Therefore, it is primarily administrative and organizational solutions, such as the automation of route planning, the case processing of building applications, and the handling of freedom of information requests, that are currently highlighted as exemplary.

What about the super boring AI?

It is well-known that the Danish AI signature projects were overtaken by technological developments. In November 2022, OpenAI launched ChatGPT, which can generate human-like text based on prompts. Instead of predictive models trained on public data and developed for specific, pre-defined areas, generative AI (GenAI) is of a different kind. It is not based on analyzing public historical data to predict future events, but on large language models trained on vast amounts of text and images harvested from the internet, aiming to understand and generate natural language. This means the technology is, by default, generic—it is not tailored to a specific task or sector but can be broadly and flexibly adapted.

Although GenAI also operates predictively by forecasting the next word in a sentence, it differs significantly from the predictive algorithms previously tested in Denmark in terms of technology, function, and use. Where predictive models in the public sector are typically trained on proprietary data and aim to optimize specific workflows (e.g., risk assessment of citizens, disease trajectory prediction, or case categorization), generative AI is characterized by broader applicability. Instead of classifying or predicting specific case matters, it generates text, summarizes documents, translates, answers questions, and assists in information retrieval. Generative AI is thus not tied to a narrow task but can be applied across domains and use cases. While predictive models operate in closed, data-driven systems with clearly defined outputs, generative AI works with open text production and interaction. This makes it more flexible but also more unpredictable and harder to control. It is well known that it can “hallucinate,” i.e., generate factually incorrect outputs, and that its outputs can be biased.
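As a concrete illustration of the “predict the next word” mechanism, the sketch below prompts a small open language model (GPT-2) through the Hugging Face transformers library. The prompt and setup are assumptions chosen for illustration; they are not part of any Danish public-sector system.

```python
# Minimal sketch of next-word prediction in generative AI, using the small
# open GPT-2 model via Hugging Face transformers.
# Assumes `pip install transformers torch`; the prompt is illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The nurse documented that the patient"

# The model repeatedly predicts the most likely next token and appends it,
# producing open-ended text rather than a fixed classification.
print(generator(prompt, max_new_tokens=20, do_sample=False)[0]["generated_text"])
```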

In the public sector, there are high hopes for using generative AI to (partially) automate documentation. A current example is speech-to-text technologies, where generative AI is used to transcribe and interpret audio by predicting the most likely text from spoken input. This means that a nurse, instead of manually entering observations, can dictate them, and the system automatically generates a draft for the medical record. Similarly, one can imagine solutions where conversations between employees and citizens are recorded, transcribed, and turned into draft case notes or meeting minutes. This use of AI does not immediately interfere with professional judgment in the same way as algorithmic decision support but aims to reduce the time employees spend on administrative work, time that can instead be used for direct interaction with citizens. The technology is therefore often cast as a means to ease workloads and mitigate labor shortages.
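As an illustration of this workflow, here is a short sketch using the open-source Whisper model to transcribe a dictated observation and wrap it in a draft note. The file name, the Danish language setting, and the draft format are hypothetical assumptions for illustration; actual systems in Danish healthcare may work quite differently.

```python
# Illustrative speech-to-text sketch using the open-source Whisper model
# (`pip install openai-whisper`). File name and draft format are hypothetical.
import whisper

model = whisper.load_model("base")

# Transcribe a dictated observation (hypothetical audio file, Danish speech).
result = model.transcribe("nurse_dictation.wav", language="da")
transcript = result["text"]

# A draft record entry could then be assembled from the transcript,
# to be reviewed and corrected by the professional before it is saved.
draft_note = f"DRAFT (auto-generated, requires review):\n{transcript}"
print(draft_note)
```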

But the question is whether these solutions actually “work” in practice and whether they are legally and organizationally mature for implementation. Generative AI, including speech recognition, often requires access to cloud infrastructure and models developed by American tech companies. This raises issues about data protection, especially if the systems are later fine-tuned using recordings or documents from Danish citizens and public employees. For example, is it permissible to use recordings of citizens’ and employees’ voices to retrain AI? And how robust is the technology in practice when it comes to handling dialects and pronunciation variations? Furthermore, language models often produce errors that require post-editing. If employees end up spending time correcting AI-generated documentation, the question arises: Is this truly an efficiency gain? These questions are currently being addressed in an ongoing PhD project.

Finally, it’s worth questioning the premise: that documentation is merely an administrative and time-wasting task. For many professionals, documentation is not just a record but a way of thinking and understanding practice—a form of meaning-making where observations, assessments, and priorities are recorded and reflected upon. Moreover, speech recognition and automation cannot necessarily capture the subtle elements that characterize the interaction between citizen and employee: the way a citizen enters a room, a change in tone of voice, smell, eye contact, or sudden silence. All these contribute to the tactile and embodied knowledge professionals use in their assessments. If we reduce documentation to a technical transfer from speech to text, we risk overlooking the important interplay between sensory impressions, professional judgment, and written reflection. This should be taken seriously when implementing new technologies.

Therefore, it is crucial to find the right balance between automation and professional autonomy. By involving both professionals and citizens in the development and implementation of AI solutions, it becomes more likely that the technology will support—rather than displace—judgment, experience, and relational knowledge. Autonomy can be strengthened through transparency, opt-out options, and clear boundaries for what AI should and should not be used for in specific tasks.

AI in the public sector is no longer just a matter of choosing the right technology or model but increasingly also a matter of global data infrastructures, organizational practices, and value-based priorities. The shift from ambitious but (often) controversial profiling projects to more practical and administrative applications reflects a growing recognition that AI is not a neutral tool, but a technology that participates in and shapes complex interactions between data infrastructures, legislation, professional practice, and citizens. “Boring AI” calls for thorough consideration, participatory practices, and reflection if we are to ensure that the technology actually works—technically, organizationally, and democratically.

Finally, we should not forget that “boring AI” is not necessarily (the only) answer to the political desire for efficiency. This was made abundantly clear in the public evaluation of the signature projects. The only project with a business case that showed savings equivalent to five full-time positions was Copenhagen Municipality’s project to automate case processing of building applications, which was primarily based on the somewhat older automation technology Robotic Process Automation (RPA). RPA is a technology that automates rule-based and repetitive tasks by mimicking human interaction with IT systems. While AI attempts to mimic human reasoning, RPA is more straightforward: it clicks, types, checks, and moves data. It is technologically simpler, but also more stable and predictable, and often sufficient to achieve significant efficiencies, especially in administrative workflows.
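To show how simple this kind of automation can be, here is a schematic sketch of rule-based screening in the spirit of RPA. The rules and field names are invented for illustration and do not describe Copenhagen Municipality’s actual solution.

```python
# Schematic contrast with machine learning: no training data, just
# deterministic rules applied to structured fields (invented for illustration).
def screen_building_application(application: dict) -> str:
    """Apply simple, rule-based checks to a building application."""
    required = ["applicant_name", "address", "floor_plan_attached"]
    if any(not application.get(field) for field in required):
        return "return to applicant: missing information"
    if application.get("building_height_m", 0) > 8.5:
        return "route to manual review: exceeds standard height limit"
    return "approve for automatic processing"

# Example run on a hypothetical application.
example = {
    "applicant_name": "A. Jensen",
    "address": "Eksempelvej 1",
    "floor_plan_attached": True,
    "building_height_m": 6.0,
}
print(screen_building_application(example))
```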

As my former colleague in the ADD project, Jakob Laage-Thomsen, has noted, there are even more “boring” solutions and lower-hanging fruit than “boring AI.” This should remind us that AI is not always the answer. Sometimes the best solution is neither generative nor predictive—it is simpler and perhaps developed back in the 1990s.

The “boring AI that works” does not come by itself. It requires responsibility and the involvement of citizens and public sector professionals. It also requires a thorough understanding of infrastructure (not least cloud issues), of technology, of organizational realities, and of democratic values, including the importance of citizen inclusion. The task for public and private actors is therefore not just to develop smarter solutions but to do so with care, patience, and attention to the reality in which they are to operate.