AI vs HI?

The “human privilege”: enjoying strawberries and cream

Arguments from Various Disabilities. These arguments take the form, “I grant you that you can make machines do all the things you have mentioned but you will never be able to make one to do X”. Numerous features X are suggested in this connexion. I offer a selection:
Be kind, resourceful, beautiful, friendly, have initiative, have a sense of humour, tell right from wrong, make mistakes, fall in love, enjoy strawberries and cream, make some one fall in love with it, learn from experience, use words properly, be the subject of its own thought, have as much diversity of behaviour as a man, do something really new. […]
No support is usually offered for these statements. I believe they are mostly founded on the principle of scientific induction. A man has seen thousands of machines in his lifetime. From what he sees of them he draws a number of general conclusions.

Turing, Computing Machinery and Intelligence, 1950

Everything that can be defined unambiguously can be modeled

You insist that there is something that a machine can’t do. If you will tell me precisely what it is that a machine cannot do, then I can always make a machine which will do just that.

John von Neumann

Questioning the definition of intelligence

How do we define intelligence?

How do we define the different forms of intelligent behavior?

Models for defining intelligence

  • define natural language
  • define specific concepts
  • define feelings
  • define consciousness
  • define creativity

What does a particular algorithm tell us about intelligent behavior?

  • Let’s avoid essentializing “the human”, “the machine”, or “intelligence”.

Creativity: the wrong question

Can a machine be creative?

Is ChatGPT creative?

Creativity: the right question

What is creativity?

How can we give a formal definition of creativity?

Temperature as creativity

\[\sigma(z_i) = \frac{e^{\beta z_{i}}}{\sum_{j=1}^K e^{\beta z_{j}}} \ \ \ for\ i=1,2,\dots,K\]

General (very simplified) principle of an LLM

Guessing the most probable next token

Very simplified example

  1. I remember the summer nights.
  2. I remember the warm summer nights.
  3. I remember the cool summer nights.
  4. I remember the laughter.
  5. I remember the joyful laughter.
  6. I remember the duck.
  7. I remember the friends from high school.
  8. I remember the friends from college.
  9. I remember the friends from work.
  10. I remember the friends from the park.
Counting how often each key word appears:

  • nights = 3
  • laughter = 2
  • duck = 1
  • friends = 4
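As a sketch, this counting could look like the following in Python (my illustration; the sentence list and key words come from the example above):

```python
# A minimal sketch: counting how often each key word occurs
# in the ten example sentences.
from collections import Counter

sentences = [
    "I remember the summer nights.",
    "I remember the warm summer nights.",
    "I remember the cool summer nights.",
    "I remember the laughter.",
    "I remember the joyful laughter.",
    "I remember the duck.",
    "I remember the friends from high school.",
    "I remember the friends from college.",
    "I remember the friends from work.",
    "I remember the friends from the park.",
]

key_words = {"nights", "laughter", "duck", "friends"}
counts = Counter()
for sentence in sentences:
    for word in sentence.rstrip(".").split():
        if word in key_words:
            counts[word] += 1

print(counts)  # Counter({'friends': 4, 'nights': 3, 'laughter': 2, 'duck': 1})
```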

Learning probability

An LLM is a model that learns the probability of a token \(e_t\) given the previous tokens; in this very simplified setting, just the previous token \(e_{t-1}\). That is:

\[P(e_t | e_{t-1})\]
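Under this simplification (a Markov chain over tokens, a standard fact rather than something specific to this walkthrough), the probability of a whole sentence factorizes into a chain of these conditional probabilities:

\[P(e_1, e_2, \dots, e_T) = P(e_1) \prod_{t=2}^{T} P(e_t | e_{t-1})\]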

Softmax!

\[\sigma(z_i) = \frac{e^{z_{i}}}{\sum_{j=1}^K e^{z_{j}}} \ \ \ for\ i=1,2,\dots,K\]

\[z_1 = 3 \qquad z_2 = 2 \qquad z_3 = 1 \qquad z_4 = 4\]

(these are our four counts: nights, laughter, duck, friends)

The softmax is the function \(\sigma(z_i)\).

We want:

\[\sigma(3) = ?\]

\[\sigma(2) = ?\]

\[\sigma(1) = ?\]

\[\sigma(4) = ?\]

The sum of these four values must be 1 (multiplying by 100 then gives the percentages).

\[\frac{e^{z_{i}}}{\sum_{j=1}^K e^{z_{j}}}\]

The numerator of the fraction is the exponential function, with base \(e\) (Euler’s number) and exponent \(z_i\), the score to which we are applying the function.

The denominator is the sum of the exponentials of all the scores we want to process.

In our case, the denominator will be:

\(e^3 + e^2 + e^1 + e^4 = 84.791024884\)

(\(e = 2.71828...\))

And so:

\(\sigma(3) = \frac{e^3}{84.791024884} = 0.24\)

\(\sigma(2) = \frac{e^2}{84.791024884} = 0.09\)

\(\sigma(1) = \frac{e^1}{84.791024884} = 0.03\)

\(\sigma(4) = \frac{e^4}{84.791024884} = 0.64\)
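As a sanity check, here is a small Python sketch of the same computation (my illustration, not part of the original slides):

```python
# A minimal sketch: the softmax of our four scores, computed by hand.
import math

scores = {"nights": 3, "laughter": 2, "duck": 1, "friends": 4}

# Denominator: the sum of the exponentials of all scores.
denom = sum(math.exp(z) for z in scores.values())
print(round(denom, 9))  # 84.791024884

# Softmax value for each score; the four values sum to 1.
probs = {word: math.exp(z) / denom for word, z in scores.items()}
for word, p in probs.items():
    print(f"{word}: {p:.2f}")
# nights: 0.24, laughter: 0.09, duck: 0.03, friends: 0.64
```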

We therefore have the following probabilities:

  • nights = 24%
  • laughter = 9%
  • duck = 3%
  • friends = 64%

Yeah… but what about creativity?

A mathematical twist

\[\sigma(z_i) = \frac{e^{\beta z_{i}}}{\sum_{j=1}^K e^{\beta z_{j}}} \ \ \ for\ i=1,2,\dots,K\]

Temperature?

\[T = \frac{1}{\beta}\]
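Substituting \(\beta = 1/T\) into the formula above gives the same softmax written directly in terms of the temperature:

\[\sigma(z_i) = \frac{e^{z_{i}/T}}{\sum_{j=1}^K e^{z_{j}/T}} \ \ \ for\ i=1,2,\dots,K\]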

In statistical physics, the higher the temperature, the more disordered the system.

The effect

If we increase the temperature (and thus decrease \(\beta\)), the differences between the probabilities shrink: the distribution flattens.

Example

Set the temperature to 5, so \(\beta = \frac{1}{5} = 0.2\).

Our four scores are scaled as follows:

  • nights: 3 x 0.2 = 0.6
  • laughter: 2 x 0.2 = 0.4
  • duck: 1 x 0.2 = 0.2
  • friends: 4 x 0.2 = 0.8

Our denominator will therefore be:

\(e^{0.6} + e^{0.4} + e^{0.2} + e^{0.8} = 6.760887185\)

Redoing the calculation with these scaled scores (intuitively, smaller exponents give values that are closer together, so the gaps shrink):

\(\sigma(3) = \frac{e^{0.6}}{6.760887185} = 0.27\)

\(\sigma(2) = \frac{e^{0.4}}{6.760887185} = 0.22\)

\(\sigma(1) = \frac{e^{0.2}}{6.760887185} = 0.18\)

\(\sigma(4) = \frac{e^{0.8}}{6.760887185} = 0.33\)

The new probabilities:

  • nights = 27%
  • laughter = 22%
  • duck = 18%
  • friends = 33%
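The same sketch as before, extended with a temperature parameter and a sampling step to pick the next word (again my illustration; the function and variable names are mine):

```python
# A minimal sketch: temperature-scaled softmax, plus sampling.
import math
import random

scores = {"nights": 3, "laughter": 2, "duck": 1, "friends": 4}

def softmax_with_temperature(scores, temperature):
    """Scale each score by beta = 1/temperature, then apply the softmax."""
    beta = 1.0 / temperature
    exps = {word: math.exp(beta * z) for word, z in scores.items()}
    denom = sum(exps.values())
    return {word: e / denom for word, e in exps.items()}

print(softmax_with_temperature(scores, 1))  # ~24%, 9%, 3%, 64%
print(softmax_with_temperature(scores, 5))  # ~27%, 22%, 18%, 33%

# Sampling: at high temperature, "duck" gets picked far more often.
probs = softmax_with_temperature(scores, 5)
next_word = random.choices(list(probs), weights=list(probs.values()))[0]
print(next_word)
```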

Cool demo

Completely new?

Smoothing

And new words? Also possible!
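One classic way to make the smoothing idea concrete is add-one (Laplace) smoothing; the sketch below is only an illustration (the unseen word “sea” is made up), but it shows the point: even a word with an observed count of 0 ends up with a nonzero probability.

```python
# A minimal sketch of add-one (Laplace) smoothing: every word in the
# vocabulary gets a nonzero probability, even with a count of 0.
counts = {"nights": 3, "laughter": 2, "duck": 1, "friends": 4, "sea": 0}

vocab_size = len(counts)       # 5 words
total = sum(counts.values())   # 10 observations

# Add 1 to every count before normalizing.
smoothed = {w: (c + 1) / (total + vocab_size) for w, c in counts.items()}
print({w: round(p, 2) for w, p in smoothed.items()})
# {'nights': 0.27, 'laughter': 0.2, 'duck': 0.13, 'friends': 0.33, 'sea': 0.07}
```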

What is “creativity” according to this formal definition?

A behavior that deviates from the most probable outcomes of a probability distribution

Is it “the right definition”?

Wrong question

But for me, creativity is more like…

GIVE THE F***(ORMAL) DEFINITION!