Short term memory traces in neural networks

author: Surya Ganguli, Department of Applied Physics, Stanford University
published: Oct. 17, 2008, recorded: September 2008, views: 6535


Critical cognitive phenomena such as planning and decision making rely on the ability of the brain to hold information in working memory. Many proposals exist for the maintenance of such memories in persistent activity that arises from stable fixed point attractors in the dynamics of recurrent neural networks. However, such fixed points are incapable of storing temporal sequences of recent events. An alternative, relatively less explored paradigm is the storage of arbitrary temporal input sequences in the transient responses of a recurrent neural network. Such a paradigm raises a host of important questions. Are there any fundamental limits on the duration of such transient memory traces? How do these limits depend on the size of the network? What patterns of synaptic connections yield good performance on generic working memory tasks? To what extent do these traces degrade in the presence of noise? We use the theory of Fisher information to construct a novel measure of memory traces in neural networks. By combining Fisher information with dynamical systems theory, we find precise answers to the above questions for general linear neural networks. We prove that the temporal duration of a memory trace in any network is at most proportional to the number of neurons in the network. However, memory traces in generic recurrent networks have a short duration even when the number of neurons in the network is large. Networks that exhibit good working memory performance must have a (possibly hidden) feedforward architecture, such that the signal entering at the first layer is amplified as it propagates from one layer to the next. We prove that networks subject to a saturating nonlinearity can achieve memory traces whose duration is proportional to the square root of the number of neurons. These networks have a feedforward architecture with divergent connectivity. By spreading excitation across many neurons in each layer, such networks achieve signal amplification without saturating single neurons.
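
The contrast between generic recurrent networks and amplifying feedforward chains can be illustrated with a small numerical sketch. The abstract does not spell out the model or the measure, so the following are assumptions made for illustration: a discrete-time linear network x(n) = W x(n-1) + v s(n) + z(n) with white Gaussian noise z of variance eps per neuron, and a Fisher-information memory curve J(k) = v^T (W^k)^T C^{-1} W^k v, where C is the stationary noise covariance. The function name, the network sizes, and the gain values are hypothetical choices, not taken from the lecture.

```python
import numpy as np

def fisher_memory_curve(W, v, eps=1.0, k_max=50):
    """Assumed measure: J(k) = v^T (W^k)^T C^{-1} W^k v for k = 0..k_max-1.

    C is the stationary noise covariance of x(n) = W x(n-1) + noise,
    i.e. the solution of C = W C W^T + eps * I.
    """
    n = W.shape[0]
    # Solve the discrete Lyapunov equation via vectorization:
    # vec(C) = (W kron W) vec(C) + eps * vec(I).
    A = np.eye(n * n) - np.kron(W, W)
    C = np.linalg.solve(A, eps * np.eye(n).ravel()).reshape(n, n)
    C_inv = np.linalg.inv(C)
    J = np.empty(k_max)
    u = np.asarray(v, dtype=float).copy()  # u holds W^k v
    for k in range(k_max):
        J[k] = u @ C_inv @ u
        u = W @ u
    return J

n = 20
rng = np.random.default_rng(0)
v = np.zeros(n)
v[0] = 1.0  # input enters through the first neuron

# Normal (here: scaled orthogonal) recurrent network with decay 0.9 per step.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
J_normal = fisher_memory_curve(np.sqrt(0.9) * Q, v, k_max=200)

# Feedforward delay line: neuron i drives neuron i+1 with unit gain.
J_chain = fisher_memory_curve(np.eye(n, k=-1), v, k_max=n)

# Same chain, but the signal power is doubled at every layer.
J_amp = fisher_memory_curve(np.sqrt(2.0) * np.eye(n, k=-1), v, k_max=n)

print("normal network, total memory:", J_normal.sum())
print("delay line, total memory:", J_chain.sum())
print("amplifying chain, total memory:", J_amp.sum())
```

Under these assumptions, the total area under J(k) for the scaled orthogonal network should stay close to 1/eps regardless of n, whereas the amplifying chain accumulates a total that grows with the number of neurons while remaining below n/eps. This is only a toy check of the linear case; it does not model the saturating nonlinearity or the divergent connectivity discussed at the end of the abstract.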
