In 2025, the CENL “AI in Libraries” Network Group will again host webinars on various uses of Artificial Intelligence (AI) in national libraries.
For more information, please see below or contact the chair of the group, Jean-Philippe Moreux of the National Library of France, at jean-philippe.moreux@bnf.fr.
Alliance for Language Technologies EDIC
Abstract: The ALT-EDIC, the Alliance for Language Technologies, was proposed in December 2023 as one of the first EDICs. On 7 February 2024, the European Commission officially set up the ALT-EDIC. Coordinated by France, the ALT-EDIC counts seventeen Member States: Bulgaria, Croatia, Czechia, Denmark, Finland, France, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Netherlands, Poland, Slovenia, and Spain; one Region: Flanders; and eight observing Member States: Austria, Belgium, Cyprus, Estonia, Malta, Portugal, Romania, and Slovakia.
The role of ALT-EDIC is to create a common European data infrastructure and services for language technologies in order to strengthen Europe’s technological competitiveness while supporting its cultural diversity. ALT-EDIC’s primary action involves collecting and federating language and multimodal data from across the European Union and its Member States. The consolidation of this language data will enable ALT-EDIC to foster the development of innovative Large Language Models with robust multilingual and multimodal capabilities.
(Slides: https://drive.google.com/file/d/1HlpLmEg7eAxTr8ScRUxmPhiCzBQjd6Q_/view)
Bio: Mr Kähler holds degrees in mathematical sciences from the universities of Göttingen, Durham (UK) and Leipzig. After completing his studies, he specialized as a Data Scientist and Research Software Engineer. He worked at the Federal Institute for Quality Assurance and Transparency in Health Care (IQTIG) in Berlin and the Helmholtz Center for Environmental Science (UFZ) in Leipzig before joining the German National Library (DNB) in October 2021. Kähler is part of the Department for Automatic Indexing and Online Publications and is project lead for a DNB research project that investigates how recent advances in natural language processing and novel machine learning approaches can be applied to automated subject indexing.
Abstract: Automated subject indexing has become a prevalent topic in the library community. Subject indexing with large vocabularies is a complex problem that falls under the category of Extreme Multi-Label Classification (XMLC). The widely used Annif toolkit provides a stable framework for tackling this challenge, allowing for a modular approach to extending backend algorithms. However, the rapid development of new AI methods demands continuous research and updates to ensure state-of-the-art performance in library systems. In this talk, we present the results of our project, which aimed to investigate and compare various XMLC methods on a dataset of German scientific literature. We benchmarked several prominent XMLC approaches and compared them with our own LLM-based approach. Our evaluation consisted of both quantitative and qualitative assessments, providing insights into the strengths and weaknesses of each method. We also collected measurements of the resources used for training and inference. By presenting our findings, we aim to contribute to the development of more efficient and effective subject indexing solutions, ultimately enhancing the accuracy and reliability of information retrieval systems.
September 16th 2025, 2pm Paris time
Zoom: https://bnf-fr.zoom.us/meeting/register/jJ_X_ESITkmVFRIYCZZ6ZA
Bio: The speaker leads the KBR Data Science Lab, which is hosted in KBR's digitization department.
Abstract: The BelgicaPress and BelgicaPeriodicals projects are dedicated to the large-scale digitization of historical newspaper and periodical collections in Belgium. During this process, machine learning models are trained to automatically recognize the front pages of newspapers and periodicals. In parallel, explainable AI techniques are applied to help understand the inner workings of the trained ML models. In this talk, we will share relevant experiences, results and observations.
October 14th 2025, 10am Paris time
Zoom: https://bnf-fr.zoom.us/meeting/register/C-3slPhDRlGZ7Ihy0Y-C2g