We offer accredited university courses, as well as workshops, bootcamps, and online resources to enable French-speaking audiences to learn about AI safety.
Our research program aims to detect behavioral problems in models, to propose effective ways to monitor them, and to identify solutions to these problems.
We offer talks, hackathons, articles and interviews to raise awareness on the risks of AI and contribute to its responsible development.
Our research program aims to identify and correct the problematic behaviors of AI models by developing monitoring tools and relevant solutions. Our model analyses and evaluations will help inform the decision-making of their designers, and promote the innovation and industrialization of advanced AI safety techniques.
We aim to highlight current risks, but also to explore the challenges that future AI models could pose, so that our research facilitates responsible progress in the field.
Our first project aims to develop a scalable supervision system for agents based on LLMs. Current LLM agents are already showing various types of failure modes. By learning to detect them now using less advanced models, we can start iterating to create robust and scalable surveillance systems for future agents.
Creation of a comprehensive dataset of LLM agent traces containing unexpected behaviors in these agents, such as prompt injection, deception, and excessive autonomy. The dataset is divided to leave out behavioral classes intended for testing supervision systems.
Experimentation with different supervision architectures to monitor advanced LLM agents using less advanced models, in order to detect hidden anomalies without explicit prior knowledge of these behaviors.
Creation of an open-source tool that easily integrates into existing LLM agent architectures, facilitating feedback loops and testing robustness in real conditions through the collection of data from the community.
Despite massive investments in AI in recent years, the opportunities to train in AI safety remain very inadequate given the scale of the challenges we face. To fill this gap, we offer programs in various formats aimed at training researchers and engineers in the latest advances in the field.
We teach at ENS Paris and the Master MVA at ENS Paris-Saclay a course on AI safety, named ”Turing seminar”. This accredited course includes presenting articles, completing research projects, and occasionally organizing debates and discussions. Sessions are designed to enrich the educational experience and encourage dynamic and in-depth interaction with the topics.
These intensive 10-day bootcamps are designed for students very talented in math and computer science, from France and elsewhere, in order to strengthen their skills in machine learning and AI safety.
The objective is to make them aware of these themes through presentations and readings, to engage them in projects related to AI safety, and to encourage them to continue their careers in this essential but neglected field.
The capabilities of artificial intelligence are improving quickly, but safety aspects are lagging behind. It is therefore crucial to highlight the need for trustworthy AI research, since the state of the art is already insufficient for industrialization in many fields (health care, transport, defense, etc.), and the risks of future models are even more concerning. This is why, along our research and education work, we are raising awareness and disseminating information to the general public and AI-relevant actors.
At the interface between awareness and research, we organize hackathons focused on AI safety challenges. These events allow us to explore this field and develop solutions for safer AI. These hackathons come in various formats, some introductory, others more advanced and focused on research.
We also organize a variety of events (talks, round tables, workshops) addressing issues related to the progress of AI. The topics cover, among other things, current and future challenges, technical challenges, and governance.
We publish articles, reports, and summaries to inform researchers, decision-makers, and citizens on the evolution of AI.