Architecture of a general network model for studying synaesthesia. The network consists of two interacting modalities. Each modality has a set of input neurons and a set of output neurons. There are feedforward connections within each modality but not between them. There are recurrent connections among all output neurons. The recurrent connections that link the two modalities are referred to as cross-talk interactions. The goal of the network is to optimize how the neurons of the combined output layer represent the combined input from both sets of input neurons.
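The connectivity described in this caption can be sketched as a set of boolean masks. This is only an illustration under stated assumptions: the function name, the per-modality sizes `n_in` and `n_out`, and the exclusion of self-connections are hypothetical choices that are not specified in the text.

```python
import numpy as np

def connectivity_masks(n_in, n_out):
    """Boolean masks for a two-modality network.

    n_in and n_out are hypothetical per-modality layer sizes.
    Rows index target (output) neurons, columns index source neurons.
    """
    # Modality label (0 or 1) of every input and every output neuron.
    modality_in = np.repeat([0, 1], n_in)
    modality_out = np.repeat([0, 1], n_out)

    # Feedforward connections exist only within a modality.
    feedforward = modality_out[:, None] == modality_in[None, :]

    # Recurrent connections exist among all output neurons.
    # Excluding self-connections is an assumption of this sketch.
    recurrent = ~np.eye(2 * n_out, dtype=bool)

    # Cross-talk: the recurrent connections that link the two modalities.
    cross_talk = recurrent & (modality_out[:, None] != modality_out[None, :])
    return feedforward, recurrent, cross_talk

feedforward, recurrent, cross_talk = connectivity_masks(n_in=3, n_out=2)
print(cross_talk.astype(int))
```

In a learning simulation, masks of this kind would typically multiply the weight matrices so that disallowed connections stay at zero.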
For the last twenty years, theories of synaesthesia have been dominated by two general models: disinhibited feedback from multi-sensory regions to uni-sensory regions, and cross-talk theories, which have emphasised the presence of atypical (and direct) structural connectivity between modalities. Whereas the former explanation has tended to be favoured for explaining acquired synaesthesia, the latter has dominated explanations of developmental synaesthesia. The approach taken in our computational model represents a significant departure from this status quo, and has generated novel insights. Our model repositions synaesthesia not as some quirk of aberrant connectivity but rather as a functional brain state that emerges, under certain conditions, as a consequence of optimising sensory information processing. In short, this model goes beyond others by offering an account not only of how synaesthesia emerges but also of why synaesthesia emerges. It offers a unifying account of acquired and developmental forms of synaesthesia insofar as it explains how the same outcome can emerge under different conditions within the same model.
Acquired synaesthesia is often associated with sensory deprivation due to damage to the sensory organs or pathways. Our model proposes that the same learning process that optimizes information representation naturally causes neurons in the deprived modality to enhance incoming inputs from intact modalities, leading to synaesthesia. To provide some intuition, we note that our model maximizes the output entropy of the network, which depends on two factors: one is the entropy of each single neuron, i.e. how variable the activity of single neurons is, and the other is the correlations among the neurons. Maximizing this entropy favours high single neuron entropy and low correlations among the neurons. The cross-talk connections induce correlations between the two modalities, which in general tend to reduce the output entropy. However, when one modality is deprived of input, it may be beneficial to have cross-talk connections from the intact modality to the deprived modality. The increase in the single neuron entropy due to the cross-talk connections can compensate for the higher correlations and result in a total increase of the output entropy. Loosely speaking, the deprived neurons seek other neuronal sources of variability and enhance their connections with them. This mechanism, which emerges naturally in our computational framework, can also be useful for modelling the changes in neural representation that take place in other conditions such as phantom limb.
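The trade-off between single neuron entropy and correlations can be written compactly with a standard information-theoretic identity. The notation is an assumption made here for illustration: $y = (y_1, \dots, y_N)$ denotes the activities of the output neurons, $H$ denotes entropy, and $I$ denotes the multi-information (total correlation), which is zero only when the neurons are statistically independent.

```latex
% Joint output entropy = sum of single-neuron entropies minus the
% multi-information, which measures all statistical dependencies.
H(y_1, \dots, y_N) \;=\; \sum_{i=1}^{N} H(y_i) \;-\; I(y_1; \dots; y_N),
\qquad
I(y_1; \dots; y_N) \;=\;
D_{\mathrm{KL}}\!\left( p(y_1, \dots, y_N) \,\Big\|\, \prod_{i=1}^{N} p(y_i) \right) \;\ge\; 0 .
```

Read this way, cross-talk into a deprived modality raises the first term for the deprived neurons while also raising $I$; the joint entropy increases only when the first effect outweighs the second.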
Although functional accounts for acquired synaesthesia have been proposed in the past, no comparable account has been put forward for developmental synaesthesia. Our model suggests that it arises from instability in the learning process due to high plasticity. It implies that synaesthetes have higher plasticity than non-synaesthetes or a relatively prolonged period of high plasticity during childhood. Later on, as plasticity in the relevant brain areas decreases, the evolved cross-talk connections become stable. In line with this idea, whole-genome studies link some forms of synaesthesia to genes involved in plasticity, which have higher expression during early childhood. Furthermore, developmental synaesthesia does not appear to be linked to sensory impairments and, if anything, is linked to increased perceptual sensitivities (notably within the concurrent modality). For instance, grapheme-colour synaesthetes show enhanced colour discrimination abilities. In the proposed model, the recurrent connections within the concurrent modality amplify both its direct inputs and the ones from the inducer modality. Thus, an association between synaesthesia and increased perceptual sensitivity is an emergent property of the model, at least under certain scenarios, and it is important to explore the extent to which the presence of synaesthesia (cross-modal sensitivity) necessarily goes hand-in-hand with changes in intra-modal sensitivity. In terms of the underlying neurobiological mechanisms, the increased amplification by the recurrent interactions in our model is consistent with findings that indicate increased excitability and elevated glutamate concentration in the relevant cortical areas in synaesthetes [37, 38].
Traditionally, synaesthesia has not been linked to theories of learning and memory because it has been considered to reflect an innate (in its developmental form) cross-wiring of the senses. This view has been challenged on several fronts [e.g. 39, 40]. Firstly, many of the stimuli that induce synaesthesia (e.g. graphemes) are themselves learned. Secondly, for some synaesthetes the particular associations have been influenced by childhood coloured letter sets. Moreover, some general cross-modal correspondences (e.g. between pitch and vertical positions) thought to reflect innate vestiges of synaesthesia have been shown to occur as statistical regularities in the environment. Finally, synaesthetes (at least for grapheme-colour synaesthesia) are known to have better acquisition of new memories, and this may be related to increased plasticity during learning. Future simulations of the model could use partially correlated inputs to the two modalities to model childhood exposure to coloured letter sets (they are not fully correlated given that most literacy exposure is with achromatic letters). It may well be the case that there is an interaction between learning rate (an innate parameter within the synaesthete brain) and these partial associations (in the environment), which explains why most people do not go on to develop synaesthesia after exposure to these stimuli.
An interesting hypothesis that emerges from this study regards the relationship between synaesthesia and the concept of critical brain dynamics [28, 42, 43]. The goal of the learning process in our model is to find the pattern of recurrent interactions that maximizes the sensitivity of the network to changes in its external inputs. In analogy to physical systems, in which the sensitivity (often termed susceptibility) to external inputs diverges near a critical point, here, as the network maximizes its sensitivity, it also tends to approach a critical point. This critical point represents the border between normal amplification of external inputs and a regime governed by attractor dynamics. In the context of sensory processing, the super-critical attractor phase can be thought of as hallucinations that reflect the learned pattern of interactions. A useful measure for identifying critical dynamics is the time it takes the recurrent network to reach steady-state. When close to critical points, many dynamical systems display the phenomenon of critical slowing down [28, 45]. Interestingly, in simulations of the complex model in which synaesthesia evolved, when the learning process approached the optimal pattern of interactions, the dynamics of the recurrent network became substantially slower (the number of iterations required to process each input sample until reaching steady-state was ~35000–45000 compared to ~1000–4000 at the beginning of the learning process). This observation suggests that, in the proposed model, networks that developed synaesthesia operate closer to a critical point than networks that did not develop synaesthesia. The prediction is that there may be evidence of the neural signatures of critical dynamics in synaesthetes [46, 47], particularly as synaesthesia is developing.
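Critical slowing down can be illustrated with a toy linear recurrent network. The sketch below is not the model used in the simulations: it assumes linear rate dynamics ds/dt = -s + x + K s integrated with Euler steps, a random symmetric interaction matrix K, and hypothetical values for the step size and tolerance. The critical point of this toy system is reached when the largest eigenvalue of K equals 1.

```python
import numpy as np

def scaled_symmetric_matrix(n, top_eigenvalue, seed=0):
    """Random symmetric matrix rescaled so its largest eigenvalue is top_eigenvalue."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    A = (A + A.T) / 2.0  # symmetric, so all eigenvalues are real
    return A * (top_eigenvalue / np.linalg.eigvalsh(A).max())

def iterations_to_steady_state(K, x, dt=0.1, tol=1e-6, max_iter=1_000_000):
    """Count Euler steps of ds/dt = -s + x + K s until the update falls below tol."""
    s = np.zeros_like(x)
    for n in range(1, max_iter + 1):
        ds = dt * (-s + x + K @ s)
        s = s + ds
        if np.max(np.abs(ds)) < tol:
            return n
    return max_iter  # did not converge within max_iter steps

# Fixed external input (hypothetical; any generic input vector would do).
x = np.random.default_rng(1).standard_normal(20)

# The closer the largest eigenvalue is to 1, the slower the convergence.
for top in (0.5, 0.9, 0.99):
    K = scaled_symmetric_matrix(20, top)
    print(top, iterations_to_steady_state(K, x))
```

In this toy system the slowest mode relaxes with a time constant proportional to 1 / (1 - largest eigenvalue), so the iteration count grows without bound as the critical point is approached, which qualitatively mirrors the slowdown reported above.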
In terms of its similarities to other models, our model resembles the direct cross-talk (or cross-activation) models proposed by others primarily to account for developmental forms of synaesthesia. Although the model represents a direct form of cross-talk, it is an open question as to whether the model would produce similar patterns if neurons from modalities 1 and 2 were not directly connected but were themselves both connected via a third pool of neurons that receives no direct input from 1 and 2. There is some evidence for both direct and indirect types of neural architecture in synaesthesia as assessed via fMRI effective connectivity. The addition of an interconnecting hub area in future modelling attempts would give the model top-down representations that could be adapted to the (Bayesian) predictive coding framework. Unlike the present (bottom-up) model, the predictive coding approach describes perception as top-down inference that is constrained and altered by sensory signals. A non-computationally explicit account of synaesthesia in terms of predictive coding has been articulated. Moreover, the kinds of learning algorithms employed in our model are compatible with this approach.
The gradient-based learning rules used in this study are not local and are thus expected to reflect the long-term evolution of the system rather than mimicking the moment-by-moment dynamics of real neural circuits. In addition, the neurons in the model are described by simplified rate dynamics which do not capture the complex dynamics of real neurons. An important direction for future modelling work would be the examination of more biologically realistic networks that also optimize information representation. The scenarios for the evolution of synaesthesia described in this study are very general and we believe that similar scenarios would also appear in more realistic networks.
In summary, these computational models permit new ways of thinking about synaesthesia both in terms of causal mechanisms and in terms of optimising perceptual function. They generate non-trivial outcomes (e.g. monotonic mappings not found in the input characteristics) and non-trivial predictions (e.g. relating to learning, unimodal perceptual sensitivity, hallucinatory tendencies).