Modeling and Malfunction

Marcello Vitali-Rosati

UQAM - Institut des sciences cognitives - March 18, 2026

… you don't know what you're talking about…

It is a rather curious fact in philosophy that the data which are undeniable to start with are always rather vague and ambiguous. You can, for instance, say: “There are a number of people in this room at this moment.” That is obviously in some sense undeniable. But when you come to try and define what this room is, and what it is for a person to be in a room, and how you are going to distinguish one person from another, and so forth, you find that what you have said is most fearfully vague and that you really do not know what you meant. Russell, Bertrand. 1986. The Philosophy of Logical Atomism and Other Essays, 1914-19. Edited by John Greer Slater. The Collected Papers of Bertrand Russell 8. G. Allen & Unwin.

…we almost never know what we are talking about…

Science is knowledge which we understand so well that we can teach it to a computer; and if we don’t fully understand something, it is an art to deal with it. Since the notion of an algorithm or a computer program provides us with an extremely useful test for the depth of our knowledge about any given subject, the process of going from an art to a science means that we learn how to automate something.

Knuth, Computer programming as an art

But how can we come to know it, even a little?

[…] the greater potential is for computers as modeling machines, not knowledge jukeboxes. McCarty, Willard. 2005. Humanities Computing. Paperback edition. Palgrave Macmillan.

So the answer is: the model!

But what is a model?

representational model
functional
physical

And if the model doesn't work…



Knowledge is a bug

Variatio in the Anthology

A research project with Dominic Forest, Yann Audin, William Bouchard, Elsa Bouchard, Mathilde Verstraete, and others

Defining a literary concept with an algorithm

a representation
a distance (relation)

Simple approaches

Representation: bag of words
Similarity measures:
- Cosine (angle between two vectors)
- Jaccard coefficient (ratio of the intersection of two sets to their union)
- Damerau-Levenshtein distance (number of word-level edits needed to turn one sequence into the other)
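
As a minimal sketch of the three measures listed above, the following Python uses only the standard library. The function names and the two sample phrases are hypothetical, invented for illustration and not taken from the project or the Anthology. The edit distance shown is the restricted "optimal string alignment" variant of Damerau-Levenshtein, applied to words rather than characters; the project may well use a different variant.

```python
from collections import Counter
from math import sqrt


def bag_of_words(text):
    """Lowercase, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard(a, b):
    """Shared vocabulary size divided by combined vocabulary size."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0
    return len(set_a & set_b) / len(union)


def osa_distance(s, t):
    """Word-level edit distance counting insertions, deletions,
    substitutions, and swaps of adjacent words (restricted
    Damerau-Levenshtein, also called optimal string alignment)."""
    d = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(len(s) + 1):
        d[i][0] = i
    for j in range(len(t) + 1):
        d[0][j] = j
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete a word
                d[i][j - 1] + 1,         # insert a word
                d[i - 1][j - 1] + cost,  # substitute a word
            )
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # swap neighbours
    return d[len(s)][len(t)]


# Hypothetical sample phrases: same words, different order.
first = "the rose fades quickly"
second = "the rose quickly fades"
print(cosine_similarity(bag_of_words(first), bag_of_words(second)))
print(jaccard(bag_of_words(first), bag_of_words(second)))
print(osa_distance(first.split(), second.split()))
```

On these two phrases the bag-of-words measures cannot see the reordering, while the edit distance registers it as a single swap: the choice of representation already decides what counts as a variation.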

Vectorize?

Representations:
- Word2vec
- GloVe
- and… BERT
Distances:
- cosine
- word mover’s distance
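
To make the two distances above concrete, here is a toy sketch in plain Python. Everything in it is an assumption made for illustration: the 2-D vectors are invented by hand (real ones would come from a trained model such as Word2vec, GloVe, or BERT), the names are hypothetical, and the second function is the relaxed word mover's distance, a cheaper lower bound on the full word mover's distance, which would require solving an optimal transport problem.

```python
from math import dist, sqrt

# Hypothetical 2-D "embeddings", invented by hand for illustration only.
TOY_VECTORS = {
    "rose": (0.9, 0.1),
    "flower": (0.8, 0.2),
    "fades": (0.1, 0.9),
    "wilts": (0.2, 0.8),
}


def mean_vector(words):
    """Represent a phrase as the average of its word vectors."""
    vectors = [TOY_VECTORS[w] for w in words]
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def cosine_similarity(u, v):
    """Cosine of the angle between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v)))


def relaxed_wmd(doc_a, doc_b):
    """Relaxed word mover's distance with uniform word weights.

    Each word travels to its nearest word in the other phrase; the
    larger of the two directions is returned. This is a lower bound
    on the full word mover's distance, not the distance itself.
    """
    def one_way(source, target):
        return sum(
            min(dist(TOY_VECTORS[w], TOY_VECTORS[x]) for x in target)
            for w in source
        ) / len(source)

    return max(one_way(doc_a, doc_b), one_way(doc_b, doc_a))


# Two phrases with no word in common.
first = ["rose", "fades"]
second = ["flower", "wilts"]
print(cosine_similarity(mean_vector(first), mean_vector(second)))
print(relaxed_wmd(first, second))
```

The two phrases share no word, so a bag-of-words comparison would find nothing in common, yet both measures here report them as close. How close depends entirely on the vectors, which in a real model are learned from a corpus rather than chosen by hand.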

Conclusions?

The difficulty is conceptual
The more “high-tech” methods remain relatively opaque
The data matter!

There is no technological magic… just a lot of misleading advertising