Talk: Language as a Dynamical System - Prof. Dr. Michael Spivey (California, Merced) - 2022

Details

Title: Talk: Language as a Dynamical System - Prof. Dr. Michael Spivey (California, Merced)
Author(s): InCognitus
Link(s): https://www.youtube.com/watch?v=uRV-EPdgu30

Rough Notes

Some nice experiments mentioned:

Give the user an instruction such as "Pick up the candle" in a gridworld of objects, with a candle 1 unit right of the origin and a candy 1 unit left of the origin; the names of the other objects do not sound like "candle". Eye-tracking reveals that as the user hears the first phonemes, they first look at the candy before fixating on the candle, yet afterwards they say they did not look at the candy. When the candy is replaced with a control object such as a cucumber, they look directly at the candle. Plotting the probability of fixation as a function of time from target onset, we see partial activation for the cohort object (e.g. the candy above) and practically no activation for the distractor objects (e.g. the cucumber above).
These results are replicated (after normalizing the activations to lie in \([0,1]\)) by the TRACE model of speech processing, a neural network that takes phoneme features in 10ms chunks. The features activate phoneme nodes, which then activate word nodes, and partially activated words send feedback that activates phonemes the model has not even heard yet.
Each node, at the word or phoneme level, stands for an underlying population of neurons that work together as a coherent representation: the population is partially activated at first and becomes more active if the appropriate input comes. A cartoon visualization is two populations (subsets) of neurons, one for "candy" and one for "candle", which intersect; when "candy" is spoken, the neurons for the shared onset "cand" activate first. If we happened to know these neurons and stuck electrodes into them, we would find a bunch of them with high spike rates for both words, since the populations overlap due to the similarity of the words. Over time, the spike rates increase until they match the activation pattern for a word like "candy".
From a dynamical systems perspective (supposing \(k\) neurons are being measured), this can be thought of as a trajectory in \(\mathbb{R}^k\) that settles into the attractor state for the word "candy", with the attractor state for "candle" close by. The trajectory first heads, in a sense, toward the middle of the two attractors, since at that point the user has only heard "cand". The speaker thinks that the dynamical systems perspective should not be thought of as a metaphor (replacing the box-and-arrow metaphors from before) but rather as a description, since it describes the trajectories of spike rates in experiments such as the one described here.
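
To make the "trajectory toward the middle, then into one attractor" picture concrete, here is a minimal toy sketch with \(k = 2\) units, one per word. It is not the TRACE model and not a model from the talk: the leaky mutual-inhibition dynamics, the inhibition strength `beta`, the step size `dt`, and the evidence values are all assumptions chosen for illustration.

```python
def simulate(evidence_schedule, beta=0.5, dt=0.05):
    """Euler-integrate two leaky word units that inhibit each other.

    evidence_schedule: list of (e_candy, e_candle) bottom-up inputs,
    one pair per time step (hypothetical values, not measured data).
    Returns the visited states [(a_candy, a_candle), ...].
    """
    a_candy, a_candle = 0.0, 0.0
    trajectory = [(a_candy, a_candle)]
    for e_candy, e_candle in evidence_schedule:
        # Each unit decays, is driven by its evidence, and is
        # suppressed in proportion to its competitor's activation.
        d_candy = -a_candy + e_candy - beta * a_candle
        d_candle = -a_candle + e_candle - beta * a_candy
        # Activations are clipped at 0 (no negative "spike rates").
        a_candy = max(0.0, a_candy + dt * d_candy)
        a_candle = max(0.0, a_candle + dt * d_candle)
        trajectory.append((a_candy, a_candle))
    return trajectory


# While only "cand" has been heard, both words are supported equally;
# afterwards the input favours "candy" alone.
ambiguous = [(1.0, 1.0)] * 200
resolved = [(1.0, 0.0)] * 400
trajectory = simulate(ambiguous + resolved)

state_after_cand = trajectory[200]
final_state = trajectory[-1]
print(state_after_cand, final_state)
```

During the ambiguous phase the state moves along the diagonal, with both words equally active; once the input disambiguates, it bends away and settles near the "candy" corner. In this sketch the location of the fixed point depends on the input, which is a simplification of what an attractor landscape over real spike rates would look like.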
A continuous analog of the above experiment instructs users to move the mouse pointer to the "beetle", where they are shown two objects: the "beetle" and either the cohort object "beaker" or some control object. Both objects are placed symmetrically at the top of the display, and the mouse pointer starts centered between them, a few units below. Comparing the actual trajectories shows that in the cohort condition the users first head toward the middle of the two objects, since the words "beetle" and "beaker" have similar first phonemes, while in the control condition this curvature is less visible. This gives a way to picture, in 2D, the state-space trajectory described for the previous experiment.
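
One common way to put a number on "how curved" such a pointer trajectory is: take the maximum perpendicular deviation of the path from the straight line between its start and end points. The sketch below assumes that measure (the notes do not say which measure the talk used), and the two paths are made-up coordinates for illustration, not data from the experiment.

```python
import math


def max_deviation(path):
    """Largest perpendicular distance of any point on `path`
    from the straight line joining its first and last points."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    return max(
        abs((x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)) / length
        for x, y in path
    )


# Hypothetical paths from the start (0, 0) to a target at (-1, 1).
control_path = [(0.0, 0.0), (-0.5, 0.5), (-1.0, 1.0)]              # straight
cohort_path = [(0.0, 0.0), (0.0, 0.5), (-0.5, 0.9), (-1.0, 1.0)]   # drifts up the middle first

control_dev = max_deviation(control_path)
cohort_dev = max_deviation(cohort_path)
print(control_dev, cohort_dev)
```

A path that first drifts up the middle between the two objects gets a larger deviation than one that heads straight for the target, which is the pattern described for the cohort versus control conditions.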
Moving to sentence-level incrementality, consider the following experiment, with the ambiguous sentence "Put the apple on the towel in the box" and the unambiguous sentence "Put the apple that's on the towel in the box". In the 1-referent context, where there are 2 towels (one with an apple on it), a pencil (a distractor) and a box, those who got the ambiguous instruction first look at the apple, then at the towel which does not have the apple on it (since the beginning, "Put the apple on the towel…", is ambiguous), and then put the apple in the box. In the 2-referent context, where the pencil is replaced by a napkin with an apple on it, users hearing "Put the apple…" look at both apples very quickly; on hearing "on the towel" from the ambiguous instruction they fixate on the apple on the towel, and on hearing "in the box" they put it in the box, rarely looking at the irrelevant towel. Hence in the 1-referent context there is a garden path effect, where the empty towel is treated as the destination of the apple. Adding 1 more apple to the visual context made the garden path effect go away.

Moving beyond experiments, consider the sentence A "cuts" B, where A,B could be:
- The lumberjack, wood.
- The pastry chef, cake.
- The butcher, meat.
- The boy, paper.
- Using a saw, the boy, wood.
- Using a saw, the surgeon, the bone.
- At the restaurant, the surgeon, the steak.
The word "cut" hence occupies a region in the state space which acts like an attractor state coming from A and a repeller state going to B, kind of like a bottleneck. This allows the word "cut" to carry even metaphorical meaning, e.g. "those words really cut into me". This dynamical systems perspective is in contrast to the static tree-based representation of "things".

Moving on to the dynamics of language across two people, there is the following experiment. 2 speakers, one male and one female, each tell a story, which is recorded. The two recordings are superimposed in such a way that you can see either speaker if you look just right, and this is shown to 2 listeners, one of whom is asked to concentrate on the male speaker and the other on the female speaker. We see that the EEG signal of the person listening to the male speaker is more correlated with the male speaker's EEG signal than with the female speaker's, and vice versa, since each listener is thinking about the events they have just listened to. Other than that, there are also experiments showing coupled eye movements and coupled postural sway between conversational partners. These are 3 examples in which the brain and body of 1 person in a conversation become correlated with the brain and body of the other person, in sway, eye movement and brain activity. The two behave a little bit like 1 system: a dynamical system coordinated so as to solve a puzzle, carry out a coherent conversation, finish each other's sentences once in a while, etc.
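
As a rough illustration of what "more correlated with the attended speaker" means, the sketch below computes a plain Pearson correlation between a listener signal and each speaker signal. Everything here is an assumption for illustration: the signals are synthetic sine waves rather than EEG, the 0.8/0.2 mixing weights are arbitrary, and the actual study may well have used a different analysis.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic stand-ins for the two speakers' signals (not real EEG).
n = 200
male = [math.sin(2 * math.pi * 3 * i / n) for i in range(n)]
female = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]

# Hypothetical listener attending to the male speaker: mostly follows
# the male signal, with a smaller contribution from the female signal.
listener = [0.8 * m + 0.2 * f for m, f in zip(male, female)]

r_male = pearson(listener, male)
r_female = pearson(listener, female)
print(r_male > r_female)
```

The comparison of interest is simply which of the two correlations is larger for a given listener.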