Learning without neurons in physical systems

Learning is traditionally studied in biological or computational systems. The power of learning frameworks in solving hard inverse problems provides an appealing case for the development of 'physical learning', in which physical systems adopt desirable properties on their own without computational design. It was recently realized that large classes of physical systems can physically learn through local learning rules, autonomously adapting their parameters in response to observed examples of use. We review recent work in the emerging field of physical learning, describing theoretical and experimental advances in areas ranging from molecular self-assembly to flow networks and mechanical materials. Physical learning machines provide multiple practical advantages over computer-designed ones; in particular, they do not require an accurate model of the system and can autonomously adapt to changing needs over time. As theoretical constructs, physical learning machines afford a novel perspective on how physical constraints modify abstract learning theory.


Physical learning
When no off-the-shelf solutions are readily available, systems are often rationally designed to provide solutions to our everyday problems. Designing such a system is known as solving an "inverse problem" since one attempts to find a system with a desired response to a perturbation, in contrast to the "forward problem" of predicting responses of a given system to perturbations. To solve inverse problems, one must sift through possible systems by varying design parameters or design degrees of freedom (d.o.f). Solutions are typically sought using centralized, top-down approaches, where a designer can access and modify the elements of a system. Numerous computational algorithms have been developed for designing a wide range of physical systems ranging from materials to robots, including simulated annealing and genetic algorithms, that typically employ standard computers and may involve physical prototypes (Fig. 1A). Inverse problems are usually much more difficult to solve compared to forward prediction problems, in particular because solutions are typically non-unique.
Learning is an established bottom-up framework for solving inverse problems (1) with specific advantages. Learning has been explored in the biological context of the brain and in artificial neural networks. However, learning is a more broadly applicable framework in which a system is changed incrementally in a way that enables adoption of a desired behavior. These incremental changes are defined as a function of the system's response to external stimuli. A learning process consists of two major parts: (a) evaluating the system's output for a given input, and (b) modifying the system based on that output. The system is first evaluated by applying stimuli from a training set. Then, the response of the system to the stimuli drives the modification of the learning d.o.f, such that subsequent application of the stimuli results in improved responses. This process is usually conducted iteratively until the system achieves satisfactory performance.
Here, we explore the emerging area of physical learning in materials. Typically, no computers are involved in this approach. Instead, a material is physically subjected to examples of the desired behavior. In response, elements of the material, the learning degrees of freedom (d.o.f), change as described by a system-dependent dynamical process called a 'learning rule' (Fig. 1A), underpinning the bottom-up, distributed nature of learning. A physical learning system thus has the following ingredients:
1. Physical d.o.f s that respond to external stimuli f by adopting a state or dynamic behavior s(f), a subset of which defines the desired behavior (i.e., output).
2. Learning d.o.f {wi} that shape the response s(f; {wi}) of the physical d.o.f to stimuli.
3. A learning rule that specifies how the learning d.o.f change in response to the physical state, $\frac{dw_i}{dt} = h\big(s(f; \{w_i\})\big)$. Here, h is an arbitrary non-linear function.
The quality of learning is evaluated by whether the system learns the desired behavior, often quantified by a learning cost function C(s(f; {wi})). Note that the dynamics of the physical system itself do not compute the cost function; instead, physical learning seeks to minimize this cost function through the learning rule and the choice of stimuli used for training.
Broadly, we can divide physical learning into categories based on the amount of user input during training: 1) Physical unsupervised learning (Fig. 1B), where a physical system self-adapts to instances of external stimuli with no supervisor intervention, typically by a Hebbian-like process (2), and thereby learns to represent some aspect of these stimuli; 2) Physical supervised learning (Fig. 1C), where an error signal is provided by a supervisor comparing the obtained and desired physical outcome, so systems can learn to exhibit specific responses for specific stimuli.

Figure 1. Physical learning vs computer-aided design. (A) Materials are often computationally designed for particular properties or responses, either entirely on a computer or through an iterative design-build-test process. In contrast, given external stimuli, in physical learning, materials autonomously modify their parameters to adopt desired properties or functions. Such autonomous learning machines modify themselves based on their response to stimuli according to physical 'learning rules', which can be classified by the level of supervision. (B) Physical unsupervised learning, e.g. molecular self-assembly with Hebbian-learned interactions (3), Hebbian growth (4) and directed aging in elastic networks (5) (credit to N. Pashine). (C) Physical supervised learning, e.g. thumbs up/down rules in creased sheets (6), contrastive learning in flow networks (7) and spike-timing-dependent plasticity in memristive neural nets, reproduced from (8) (CC BY 4.0).

Why learn using a physical system?
Physical learning blurs the boundary between structure and function; the physical elements responsible for material properties also perform the task of adapting those material properties. Such a merger is motivated by both practical and theoretical considerations. At a practical level, physical learning can lead to a new generation of materials capable of autonomous learning that offer several advantages over computer design. 1) While silicon-based computing and learning is incredibly powerful, a physical system that learns is more appropriate to perform tasks where the inputs and outputs are physical influences from or on the environment (e.g., forces, currents, molecule production) rather than symbolic information. 2) In this new framework, physical systems can learn from real examples of the desired behavior, so that the desired behaviors themselves do not need to be modeled on a computer. 3) Similarly, a detailed model of the physical system is not needed since the response of the real physical system, imperfections included, determines physical learning. In contrast, computer-aided design is only as good as the model of the material and the model of the desired behavior and environment in which the material will be used. 4) Physical learning can allow systems to continually learn new behaviors in situ as requirements change. 5) Physical learning may be appropriate for applications with space, time or energy constraints or where robustness due to the distributed nature of learning and information processing (rather than in one electronic processor) is important.
From a theoretical perspective, studying learning in physical systems may shed new light on the fundamental requirements and limitations of learning in the presence of physical constraints. How can natural systems evolve the ability to learn? A theory of physical learning would expand our current modular conception of learning and memory in biological systems; we often look for control units (e.g., the brain) separable from the parts being controlled (e.g., the muscle). For example, even in single-celled organisms, we tend to look for separate gene regulatory or protein interaction circuits that learn and make decisions, which are then carried out downstream by, say, physical self-assembly or cytoskeletal processes (9). But the downstream "muscle" itself can potentially learn and make decisions, as is known to be the case in organisms ranging from ciliates to fruit flies and bee hives (10,11,12,13). Physical learning could reveal the broader scope of such non-modular learning and information processing available for free in the "muscle" (14).
Finally, this approach can provide a framework to understand atypical disordered systems and how physical systems explore parameter space to arrive at such atypical points. While we have made much progress in studying typical disordered systems through random ensembles, many examples of disordered systems in nature are highly atypical, e.g., due to evolution or other natural processes. Physical learning provides one framework for such atypical phenomena as the origin of structures and architectures (e.g. hierarchies) in networks (15) and the prevalent low-dimensionality of physical responses (16,17).
The framework outlined here also builds upon ideas previously explored in materials science, where materials are subjected to some conditions (e.g., compression, temperature changes) with the goal of inducing internal rearrangements that enhance desired properties. For example, metals have been hardened by thermomechanical protocols (18) which reshape grain size and distribution in a way that improves strength, which in turn is part of a long tradition of using annealing and 'hot stamping' protocols (19). Similar ideas apply to processing polymers, in which the same polymer mixes can be 'trained' to rearrange molecules with different resultant mechanical properties by, e.g., different extrusion protocols (20).
The framework here goes beyond such processing techniques by expanding the space of behaviors training can be used for, including supervised learning for input-output responses and pattern recognition of spatial or temporal correlations in chemical and mechanical stimuli. The framework also significantly expands the range of learning mechanisms beyond molecular rearrangements to include growth and degradation, and active learning of molecular interactions. Many of the advances here are a conceptual synthesis between ideas in materials processing and ideas of learning in computer science. A closely related development is the idea of memory in materials (21), discussed later.

Challenge and opportunity: local learning rules
Machine learning algorithms implemented on a computer typically compute a cost function C({wi}), a global quantity that reflects performance on the task at hand, and strive to minimize it, e.g., through gradient descent,
$$\frac{dw_i}{dt} \propto -\frac{\partial C}{\partial w_i}.$$
Gradient descent procedures are efficiently implemented on computers (especially GPUs), e.g. in artificial neural networks through back-propagation (22). Note that such a process is highly non-local when wi are interpreted as parts of a physical system: the change dwi in a given element wi depends in principle on the entire neural network, including e.g., how a 'distant' neuron responded to stimuli.
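To make this non-locality concrete, the following minimal sketch (not taken from the works reviewed here) evaluates a gradient-descent update for a toy chain of springs in series. The chain, the cost function and the names `strains`, `cost` and `numerical_gradient` are illustrative assumptions. The cost depends only on the extension of the first spring, yet its gradient with respect to the stiffness of the last spring is non-zero, so a gradient-descent update of that spring would require information about a distant element.

```python
import numpy as np

def strains(k, total_extension=1.0):
    """Extension of each spring in a series chain held at a fixed total extension."""
    tension = total_extension / np.sum(1.0 / k)  # the same tension acts on every spring
    return tension / k

def cost(k, target=0.3):
    """Assumed global cost: squared error of the FIRST spring's extension."""
    return (strains(k)[0] - target) ** 2

def numerical_gradient(fn, k, eps=1e-6):
    """Central finite-difference estimate of d(fn)/dk_i for every spring i."""
    grad = np.zeros_like(k)
    for i in range(len(k)):
        step = np.zeros_like(k)
        step[i] = eps
        grad[i] = (fn(k + step) - fn(k - step)) / (2 * eps)
    return grad

k = np.array([1.0, 2.0, 4.0])   # stiffnesses play the role of the learning d.o.f w_i
grad = numerical_gradient(cost, k)
print(grad)  # the entry for the last spring is non-zero
```

In this toy setting, the update of the third spring depends on the error measured at the first one, which is exactly the kind of dependence a local physical process cannot access directly.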
Such highly non-local changes cannot be realized in physical learning without effectively mimicking a computer algorithm with physical components. Instead, we seek to understand to what extent physical systems can learn by exploiting typically local natural processes without any explicit cost function:
$$\frac{dw[x,t]}{dt} = h\big(s(f;\{w\})[x,t]\big). \tag{2}$$
Here, the change in a learning degree of freedom w[x, t] located at a point x, t in the physical system depends only on the state s[x, t] of the system at the same spacetime point (or in its close vicinity), and h is an arbitrary local function. Note, however, that the local state s(f; {w})[x, t] will generically depend on a stimulus f and learning d.o.f {w} of the entire system at distant points because of collective dynamics of the physical system; e.g., the way an elastic material deforms at point x due to a force f can depend on force components f(y) and on material properties at distant points y.
Thus, there is no explicit cost function or global optimization involved in local rules in Eq. 2. However, if the local learning rule h and stimuli f are chosen correctly, the system can nevertheless minimize a cost function as the collective physical dynamics during the response to stimuli can encode global information in the local state s(f, {w})[x, t].
Physical systems are in principle more constrained in their learning abilities than an in silico neural network (23). However, conceptually similar constraints also distinguish the brain from artificial neural networks; learning in the former is often constrained by locality (e.g., Hebb's rule (2) or spike-timing-dependent plasticity, STDP (24)). Nevertheless, biologically plausible learning rules, simulated on computers, have proven successful (25), suggesting that physical learning has potential despite locality constraints (26). Further, locality provides benefits as well: because learning does not rely on a central processor, it can be asynchronous, more robust and can scale better with system size (27,28).

Relationship to machine learning, neuromorphic computing and physical computation
There have been many threads of work related to physical learning over the previous years. The field of neuromorphic computation (29,30) is closely related to physical learning in that physical elements are modified so that the system adopts a desired computational ability. However, neuromorphic computers are often considered for a distinct objective: to compete with in silico machine learning (ML) in its domain of symbolic inputs and outputs. The physical learning machines explored here can deal with problems where the inputs and outputs are physical (e.g., forces, molecular assemblies). Neuromorphic computing and in silico ML algorithms do not directly compete in this realm since they require translation of stimuli to electronic or other symbolic inputs. More saliently, these methods cannot directly display any physical output, such as an elastic response or a self-assembled molecular structure, in response to a stimulus. The fields of physical computation (31) and molecular computation (32, 33) explore how to implement fixed computations with physical systems. The systems explored here build on these prior works by autonomously learning what computation (or more generally, physical behavior) needs to be carried out in the first place by physically experiencing examples of the desired behavior. Another closely related physical computation framework is the field of reservoir computing (34). Reservoir computing uses physical systems with such complex internal dynamics that no physical elements need to change during a learning process; instead, learning how to interface with a fixed physical system can effectively define inputs and outputs in a way that solves inverse problems. This interface can be learned using a computer as an output filter (35).
In this review, we strive to showcase recent experimental and theoretical advances in the emerging field of physical learning. The remainder of the review is organized as follows: In section 2, we feature examples of physical learning machines, emphasizing ways of experimentally realizing physical learning. In section 3, we discuss the physically realizable learning rules powering learning machines, subdivided into unsupervised and supervised learning rules. In section 4, we describe practical aspects inherent to realizing learning machines in experiments. In section 5 we highlight some theoretical learning concepts and how they manifest in the framework of physical learning. Finally, section 6 focuses on how the physical properties of a system are modified as it undergoes physical learning.

Elastic materials: networks and sheets
Disordered elastic media and their abstractions, ranging from spring networks to creased sheets, have received much interest for their potential to mechanically respond in nonstandard, desired ways (36,37,38). Recent works have explored the feasibility of physically learned behaviors in these systems (4,5,39,40,41). Here, the learning d.o.f wi generally involve either bond lengths or bond stiffnesses in elastic networks and crease stiffnesses in creased sheets. For example, auxetic materials (negative Poisson's ratio, Fig. 1B, right) were trained by imposing the desired global deformation (the stimulus f) on a material (Fig. 2A), which leads to a pattern of local strain across the material (the response s(f)). A learning process, typical of these works, involves bond stiffnesses wi that change according to the strain in that bond (i.e., locally) according to rules such as $\frac{dw_i}{dt} \sim -(\mathrm{strain}_i)^2$. Physical learning can also train non-linear features like bifurcated folding pathways in mechanical systems (Fig. 1C, left). Origami and kirigami sheets with disordered crease patterns have rich folding topologies (42), starting from 2D flat sheets with creases and slits, and ending with various folded shapes in response to given force patterns. While the crease geometry is usually fixed by fabrication, the bending stiffness of creases can serve as learning d.o.f wi and may change in response to folding strain. Theory and simulations have shown that folding pathways of such sheets can be trained for specific topologies with applications to dynamical control of folding (43), mitigation of misfolding pathways (44) and even classification of mechanical force patterns, analogous to neural networks (6) (Fig. 2B).
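As a minimal sketch of the strain-driven rule quoted above (stiffness decreasing in proportion to the squared strain of each bond), consider a toy chain of springs in series held at a fixed total extension. The chain geometry, the function names and the parameters `rate`, `steps` and `k_min` are illustrative assumptions, not the networks used in the cited works. Each spring softens using only its own strain, and the elastic energy of the imposed deformation decreases as a result.

```python
import numpy as np

def series_response(k, total_extension):
    """Extension (strain) of each spring in a series chain at fixed total extension."""
    tension = total_extension / np.sum(1.0 / k)   # the same tension in every spring
    return tension / k

def elastic_energy(k, total_extension):
    """Elastic energy stored in the chain at the imposed deformation."""
    strain = series_response(k, total_extension)
    return 0.5 * np.sum(k * strain**2)

def directed_aging(k, total_extension, rate=0.01, steps=50, k_min=1e-3):
    """Local rule dk_i/dt ~ -(strain_i)^2, applied while the chain is held extended."""
    k = k.copy()
    for _ in range(steps):
        strain = series_response(k, total_extension)
        k = np.maximum(k - rate * strain**2, k_min)  # keep stiffness positive
    return k

k0 = np.array([1.0, 2.0, 4.0])                 # initial bond stiffnesses (learning d.o.f)
k_trained = directed_aging(k0, total_extension=1.0)
e_before = elastic_energy(k0, 1.0)
e_after = elastic_energy(k_trained, 1.0)
print(e_before, e_after)                       # the trained chain resists the deformation less
```

The softest spring carries the largest strain and therefore ages fastest, which illustrates how a purely local rule can lower the energetic cost of a globally imposed deformation.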

Molecular systems and active matter
Molecular systems with specific interactions show behaviors that define the complexity of life, ranging from computation and neural network-like information processing (32, 45,46,47) to structural behaviors like self-assembly of complex and dynamic structures, targeted phase separation and active matter (48,49,50). While most work tends to assume fixed interactions, several works have explored ways of treating molecular interactions as learning d.o.f wi and thus learning complex molecular behaviors.
One broad learning mechanism involves creating new molecular species through ligation or polymerization (51,52) or by preferential multiplication of molecules that bind input molecules tightly and discarding of those that do not bind (53, 54) (Fig. 2C). These new species then mediate new interactions wi contingent on the history of the system (55,56). One recent example (3, 57) explores learning interactions needed for self-assembly using a Hebbian-inspired 'localized together, wired together' rule (Fig. 1B, left):
$$\frac{dw_{ij}}{dt} \sim s_i(x,t)\, s_j(x,t),$$
where si(x, t) is the concentration of species i at x, t and wij is the interaction strength between species i, j. That is, species i, j whose concentrations are high in the same place and time start interacting more strongly. The collective nucleation dynamics in such a system has been shown to be capable of pattern recognition by assembling different structures in response to different concentration patterns (57), with mathematical connections to Hopfield's associative memory (58).
Similar ideas of learning in multi-component phase separation (59,60), e.g., based on DNA nanostars (49), are likely to be realized in the coming years. All of these works suggest that inevitable physical processes (46) such as self-assembly, phase separation and nucleation can enable learning and perform pattern recognition, despite not being set up to mimic a neural network element by element (57) as in circuit approaches.
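The connection between the 'localized together, wired together' rule and Hopfield's associative memory can be illustrated with a minimal sketch. This is a standard Hopfield network rather than a model of nucleation: concentrations are reduced to +1 (high) or -1 (low), and the names `hebbian_interactions` and `recall` are hypothetical. Interactions are learned from co-occurrence in two training patterns, and a corrupted pattern then relaxes back to the stored one.

```python
import numpy as np

def hebbian_interactions(patterns):
    """Accumulate w_ij from co-occurrence of species i and j across training patterns."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for s in patterns:
        w += np.outer(s, s)       # pairs that are high (or low) together are strengthened
    np.fill_diagonal(w, 0.0)      # no self-interaction
    return w

def recall(w, s, sweeps=5):
    """Relax a state under the learned interactions (synchronous sign updates)."""
    s = s.copy()
    for _ in range(sweeps):
        s = np.where(w @ s >= 0, 1, -1)
    return s

# Two orthogonal toy 'concentration patterns' (+1 = high, -1 = low): an assumption
patterns = np.array([
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
])
w = hebbian_interactions(patterns)

corrupted = patterns[0].copy()
corrupted[0] *= -1                # distort one species in the first pattern
recovered = recall(w, corrupted)
print(np.array_equal(recovered, patterns[0]))
```

The learned interactions here play the role of wij; in the molecular systems discussed above, the analogue of recall is the assembly of a particular structure in response to a (possibly distorted) concentration pattern.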
Another important type of molecular learning involves growth. Molecular systems like hydrogels and DNA nanotubes can grow elements based on their current geometry (61,62) and can potentially continually learn multi-stability (4). Some molecular systems such as crystals with defects are capable of rudimentary evolution through repeated fracture and growth; this aspect raises the intriguing possibility of learning through evolution in nonbiological systems (63).
Learning in well-mixed molecular circuits (as opposed to the structural processes above) has a larger literature, ranging from bio-inspired associative learning (64) to recent progress towards molecular neural networks (65,47). Molecular systems can also exploit their intrinsic stochasticity needed e.g., for Boltzmann machines (66).
Active systems (e.g. self-propelled particles) can expend energy and perform actions on microscopic scales. Such systems are potentially capable of learning either by combining synthetic active matter with DNA-based elements (67) or through rearrangement of nematic or polar filaments as in natural systems such as the cytoskeleton (68,69). Related ideas have been explored through simulations (70,71,72), though truly autonomous learning has yet to be demonstrated on the molecular scale. Autonomous learning can be realized more readily in macroscopic robotic swarms (73,74) that demonstrate similar active matter phenomena. Similarly, tissue-level rearrangements through active adhesion and junction changes might also allow for training (75).

Flow networks
In both biological (e.g. vascular) and engineered (e.g. microfluidic) networks, transport of materials is often achieved via fluid flow. Properties of pipes in a flow network, such as their radii, conductance or capacitance, determine the network's ability to globally transport material from one point to another. While these properties could be optimized on a computer, many natural systems appear to adjust individual elements based on local feedback. For example, flow might couple to the mechanical properties of the pipes, constricting or expanding them to control flow conductance and pipe capacitance locally. Such adaptation might allow the network to control the distribution of cargo. For example, in Physarum polycephalum, the thickness of tubes controls the shape of the organism and allows it to move, forage and memorize features of its surroundings (11,76,79) (Fig. 2D).
Similarly, adaptive processes in other natural flow networks such as those in leaves and vasculature (80,81) are thought to work through local rules, as there is no central controller. Theoretical work on such flow networks has explored the ability to learn different behaviors through local rules (Fig. 1C, right), including ML-like classification of stimuli (7, 82).
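The kind of local feedback between flow and pipe conductance described above can be sketched for the smallest possible network: two parallel pipes sharing a fixed total flow. The specific rule used here (growth with the square of a pipe's own flow, linear decay otherwise) and the parameters `decay`, `dt` and `steps` are illustrative assumptions, loosely in the spirit of adaptation models rather than a reproduction of any cited one.

```python
import numpy as np

def adapt_parallel_pipes(conductance, total_flow=1.0, decay=1.0, dt=0.1, steps=2000):
    """Evolve the conductances of two parallel pipes under a flow-dependent local rule."""
    k = np.array(conductance, dtype=float)
    for _ in range(steps):
        pressure_drop = total_flow / np.sum(k)   # both pipes connect the same two nodes
        flow = k * pressure_drop                 # flow carried by each pipe
        k += dt * (flow**2 - decay * k)          # grow with own flow, decay otherwise
    return k

k_final = adapt_parallel_pipes([1.0, 0.8])
print(k_final)   # the initially weaker pipe is pruned, the other one is retained
```

Each pipe updates using only its own flow, yet the outcome is a network-level decision: the pipe that initially carries slightly less flow is progressively starved and pruned.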

Neuromorphic computing
A related but distinct thread of research is neuromorphic computation, which typically uses solid state, electronic or optical elements to construct networks, inspired by neuronal or computational learning (29,30). Promising implementations include crossbar arrays (83), spintronic tunnel junctions (84) and phase change materials (PCMs) in photonic systems (Fig. 2E). In these systems, elements might adapt their resistance (e.g., memristors, potentiometers) as part of learning.
While past neuromorphic systems tried to implement non-local gradient descent (85) or biological learning rules (86) using complex elements, a new generation of neuromorphic systems implements simpler local learning rules based on contrastive learning (87,88), allowing for regression and classification (27,28) (Fig. 2F). Neuromorphic computing seeks to compete in the traditional machine learning (ML) domain on energy consumption and robustness (89,90). In contrast, the spirit of this review is to explore learning as a metaphor for how the response of atypical disordered physical systems to physical stimuli can change in functional ways; we do not seek to solve problems in the traditional ML domain with physical systems.
A recent approach goes beyond engineered neural network-like architectures, taking the power of backpropagation to exploit the dynamics of complex physical systems. In this approach (78), physical parameters of a mechanical, electronic or optical system are updated by a backpropagation algorithm that observes the response of the system to physical stimuli. Such training effectively creates 'deep physical networks', i.e., physical systems capable of solving classification problems (Fig. 2G). Computer-aided backpropagation as in (78) is not a focus of this review; we focus instead on training that is potentially implementable through local rules in situ by physical dynamics.
On the other hand, there are similarities. The goal of the molecular and mechanical systems reviewed here and the approach of (78) is to exploit the intrinsic dynamics of physical systems for learning and computation on physical stimuli, rather than force physical systems to mimic neural network architectures element by element.

Reservoir computing: Training computational interfaces
Physical computation approaches often use a physical system as a computational resource, and opt to only train an interface with that system. In reservoir computing, a complex dynamical network takes an external input and performs a generic high-dimensional 'computation'. A filter is then trained using a computer on the output of the system, such that a desired result is obtained (34). Different types of physical systems have been used in this way, including electronic, fluidic, mechanical and biological systems, for a broad range of regression and classification problems.
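The division of labor in reservoir computing can be sketched in a few lines. Here a fixed random recurrent network simulated in NumPy stands in for the physical reservoir, which is an assumption made purely for illustration; the network size, input signal and one-step memory task are likewise hypothetical choices. The reservoir itself is never modified, and only a linear readout is fitted by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, n_steps, washout = 30, 500, 100

# Fixed 'reservoir': random recurrent weights, rescaled to spectral radius 0.9
W = rng.uniform(-1, 1, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
w_in = rng.uniform(-0.5, 0.5, n_res)

u = np.sin(0.2 * np.arange(n_steps))          # external input signal
x = np.zeros(n_res)
states = np.zeros((n_steps, n_res))
for t in range(n_steps):
    x = np.tanh(W @ x + w_in * u[t])          # reservoir dynamics, never trained
    states[t] = x

# Task (assumed): reproduce the input delayed by one step from the current state
target = np.roll(u, 1)                        # target[t] = u[t-1]; wrap-around is in the washout
X = np.hstack([states[washout:], np.ones((n_steps - washout, 1))])  # states plus a bias column
y = target[washout:]

readout, *_ = np.linalg.lstsq(X, y, rcond=None)   # the only trained part: a linear filter
mse = np.mean((X @ readout - y) ** 2)
print(mse, np.var(y))
```

All learning happens in `readout`, on a computer; this is the sense in which reservoir computing trains an interface to a fixed system rather than the system itself.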

PHYSICAL LEARNING RULES
In learning theory, learning problems are typically divided into conceptually distinct classes, unsupervised and supervised learning. A similar distinction is worth making in the materials context since the classes correspond to potentially different applications with different physical training protocols.

Unsupervised learning
In unsupervised learning, the learning degrees of freedom are modified directly as a response to observed signals. A completely unsupervised physical learning rule is of the form
$$\frac{dw_i}{dt} = h\big(s(f;\{w_j\})\big), \tag{3}$$
where s(f; {wj}) is the configuration the system adopts in response to an external stimulus f, assuming material parameters {wj}, and h is an arbitrary non-linear function.
Here, learning d.o.f wi adjust themselves based on the response s(f; {wj}). A paradigmatic example of such learning is directed aging in elastic systems. For example, elastic networks have been trained to be auxetic (5) by holding materials in the auxetic configuration as strained bonds soften according to a learning rule (Eq. 3). In this case, aging reduces the energy of a system held at its desired state, $\frac{dw_i}{dt} \sim -\frac{\partial}{\partial w_i} E(s; w_i)$, where E(s; wi) is the energy of the desired state s. Note that as the energy of elastic networks is a sum over individual bonds i, this is a local rule, where every bond i changes according to the stress (energy) it carries. Similarly, at the molecular scale, stabilizing elements can grow between parts of a material, stabilizing the configurations in which the material is held. For example, learning rules have been demonstrated experimentally in hydrogels using polymerases (61) and DNA nanotubes (62, 91) (Fig. 3A) and theoretically modeled in (4) to continually store multiple configurations in one mechanical system, without overwriting previous ones, and retrieve configurations based on partial prompts.
Unsupervised learning can create new molecular interactions (Fig. 3B). For example, proximity-based ligation (51,52) can create interaction-mediating molecules for species i, j by creating a ligated i − j molecule; further, ligation is naturally localized in space and time by mass-action kinetics, resulting in a Hebbian-like learning rule:
$$\frac{dw_{ij}}{dt} \sim s_i(x,t)\, s_j(x,t), \tag{4}$$
where wij is the interaction strength between molecular species i and j with concentrations si(x, t), sj(x, t). Thus, spatial and temporal proximity during learning leads to stronger or weaker binding interactions between different species of molecules. Equivalent mechanisms include activation of inactive particles, e.g., through strand displacement (51) or phosphorylation. Such learning was shown (3, 57, 94) to allow a system to learn multiple self-assembly behaviors and to recognize patterns through nucleation (see Fig. 4D).
Adaptation in flow networks has also been modeled (11,81) with unsupervised rules that depend only on the flow through edges, which is proportional to the pressure drop si − sj:
$$\frac{dw_{ij}}{dt} = h\big(w_{ij}(s_i - s_j)\big), \tag{5}$$
where wij is the conductance of an edge connecting nodes i and j. In P. polycephalum, the stimulus f might correspond to food sources. Edges that exhibit low effective conductance are pruned, as observed in simulations and experiments (11) (Fig. 3C).

[Figure 3: caption not recovered; surviving panel labels: "Unsupervised learning rules" (B, C), "Free Network", "Measure Outputs".]

Supervised learning
In machine learning (ML), supervised learning involves labeled training examples, e.g., labeled images of cats and dogs. In the context of physical learning, supervised learning corresponds to training a material to show desired specific responses for all input stimuli f . A typical supervised learning task is classification (e.g. the Iris dataset (95)), where a system is trained to produce a specific response sA(f ) for all stimuli f ∈ FA in class FA and a different response sB(f ) for stimuli f ∈ FB in class FB. Here, FA and FB are user-specified sets of stimuli and the challenge is two-fold: the physical learning process must identify correlations or features that separate the stimuli in FA and FB and further, must result in a material that responds only to these features in a specified way (Fig. 4A).
As in ML, one can evaluate learning performance by a cost function C({wi}) that is a measure of the number of stimuli f for which the system (with given {wi}) evokes an incorrect response. Then, the goal of physical supervised learning is for the material to naturally adapt wi such that the cost function C is minimized.
To achieve such a goal in physical systems, one needs a 'supervisor' who makes the learning process contingent in some way on the response s(f ) to stimuli f being 'right' or 'wrong' (as defined by the cost function).
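The simplest form of such supervision is a 'thumbs up/thumbs down' feedback in which the supervisor only decides the sign of the autonomous unsupervised rules described earlier (thumbs up/thumbs down rule):
$$\frac{dw_i}{dt} = \begin{cases} +h\big(s(f;\{w_j\})\big) & \text{if the response } s(f) \text{ is correct} \\ -h\big(s(f;\{w_j\})\big) & \text{otherwise} \end{cases} \tag{6}$$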
Such a rule was shown to be powerful enough to learn and classify subtle force correlations applied to disordered mechanical systems (6). In this theoretical study, spatial force patterns belonging to one of two sets FA, FB are applied to a disordered creased sheet; a supervisor decides whether the sheet is folded into a desired structure sA or sB for stimuli in FA or FB respectively. If the folded structure is correct for the stimuli, the supervisor lets creases soften according to their strain, say, by immersing the folded structure in one chemical environment. Otherwise, the supervisor immerses the folded structure in a different environment that stiffens creases based on strain. A simplified version of this local rule was realized by letting a liquid glue flow out of creases that significantly fold in a desired configuration in creased sheets (Fig. 3D).
Similar supervised rules can be implemented in molecular systems, e.g., by controlled strand displacement (65) (Fig. 3E). Through such mechanisms, e.g., the unsupervised molecular Hebbian learning rule in Eq. 4 can be modified so that interactions between co-localized molecules are either strengthened or weakened depending on the judgement of a supervisor.
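The logic of a thumbs up/thumbs down rule can be sketched with a deliberately simple toy model, which is not the creased-sheet system of (6). The 'material' here is assumed to respond to a two-component stimulus f with a binary output given by the sign of w·f, and the unsupervised rule is assumed to be Hebbian (output times input). The supervisor contributes only the sign of the update, depending on whether the response was the desired one; all names and numerical values are hypothetical.

```python
import numpy as np

def response(w, f):
    """Binary response of the toy system: +1 or -1 depending on the sign of w.f."""
    return 1 if np.dot(w, f) >= 0 else -1

def thumbs_up_down_step(w, f, desired, rate=0.1):
    """Apply the unsupervised rule with a sign chosen by the supervisor."""
    s = response(w, f)
    hebbian_change = rate * s * f           # assumed unsupervised local rule h(s, f)
    sign = 1.0 if s == desired else -1.0    # thumbs up keeps the rule, thumbs down flips it
    return w + sign * hebbian_change

# Two classes of stimuli F_A (desired response +1) and F_B (desired response -1)
stimuli = np.array([[1.0, 0.2], [0.8, -0.1], [-1.0, 0.1], [-0.7, -0.3]])
desired = np.array([1, 1, -1, -1])

w = np.array([-0.5, 0.5])                   # initial learning d.o.f, misclassifying some stimuli
for _ in range(20):                         # repeated presentation of the training set
    for f, d in zip(stimuli, desired):
        w = thumbs_up_down_step(w, f, d)

# Cost in the sense used above: number of stimuli that evoke an incorrect response
errors = sum(response(w, f) != d for f, d in zip(stimuli, desired))
print(w, errors)
```

The system never evaluates the cost itself; the count of incorrect responses is computed here only to check the outcome of training.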

Contrastive learning.
A more powerful supervised learning framework, requiring greater supervision but promising better results, is inspired by contrastive Hebbian learning (96). In contrastive learning, a system is trained by observing the contrast between its current response s(f) to an input and a nudged, 'improved' response ŝ(f). This nudging takes the form of additional (weak) forces or constraints, applied by the supervisor on the system, such that it better represents the desired configuration (lowering the cost function). For physical systems that are defined by an energy function E(s; wi), a simple contrastive learning rule takes the form
$$\frac{dw_i}{dt} \sim \frac{\partial}{\partial w_i}\left[E(s; w_i) - E(\hat{s}; w_i)\right].$$

Figure 4. Training physical learning machines for machine-learning-inspired tasks by mapping complex datasets to physical stimuli. (A) Data from the Iris flowers dataset (95) is mapped to physical stimuli (here, force patterns) that are then used to train a creased thin sheet to show distinct responses for distinct classes of stimuli. Here, the sheet adopts a heterogeneous crease stiffness profile through a local learning rule, and eventually correctly classifies the input stimuli through folding responses (6). (B,C) Similar classifications can be learned in flow and electric networks by mapping the Iris data to input pressures or voltages at some nodes, and reading out the pressures or voltages at some other output nodes, in (B) simulations and (C) experiments (27). (D) 2500-pixel images were mapped to concentration patterns of 2500 molecular species (grayscale value of pixel i = molecular concentration ai). Molecular interactions were trained by a Hebbian rule, so the molecular system classifies different (potentially distorted) stimuli by showing different self-assembly behaviors in the non-equilibrium nucleation-dominated regime of self-assembly (57).
While this approach is frequently used to train computational models (e.g., restricted Boltzmann machines (97)), it was shown to give rise to local, physically realizable learning rules (98,25). Furthermore, in certain limits contrastive learning closely approximates the gradient of a global cost function (99), suggesting that physical local learning is not fundamentally limited in performance compared to global computational ML. Contrastive learning and its derivative approaches were shown to successfully train (in silico) various physical models, including Hopfield nets (99), flow networks (87, 100) (Fig. 4B), mechanical spring networks (7), cell assemblies and photonic neural networks (101). Early experimental systems implementing contrastive learning in linear resistor and elastic networks (Fig. 3F,G) demonstrated the success of this approach (92,27,28), e.g., in classifying the Iris dataset (Fig. 4C). These experiments further hint at the potential scalability of physical contrastive learning to large networks and complex sets of tasks.
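As a minimal illustration of the contrastive idea described above, the following sketch trains a two-resistor voltage divider so that its output node reaches a target voltage. It is a toy under stated assumptions, not the protocol of any cited experiment: the energy is taken as E = ½ Σ_j k_j (Δv_j)², the nudge amplitude `eta`, the learning rate `alpha`, the 1/`eta` normalization of the update, and the clipping of conductances to positive values are all choices made here for illustration, as are the function names.

```python
def divider_output(k1, k2, v_in=1.0):
    """Free-state voltage at the node between k1 (to the source) and k2 (to ground)."""
    return v_in * k1 / (k1 + k2)


def train_contrastive(target, v_in=1.0, eta=0.1, alpha=0.5, steps=200):
    """Contrastive training of a two-resistor divider (illustrative assumptions only).

    eta   : nudge amplitude toward the target (assumed hyperparameter)
    alpha : learning rate (assumed hyperparameter)
    """
    k1, k2 = 1.0, 1.0  # learning d.o.f. (edge conductances)
    for _ in range(steps):
        # Free state: the network responds to the input alone.
        v_free = divider_output(k1, k2, v_in)
        # Nudged state: the supervisor pushes the output slightly toward the target.
        v_nudged = v_free + eta * (target - v_free)

        # Voltage drops across each edge in the two states.
        drop1_free, drop2_free = v_in - v_free, v_free
        drop1_nudged, drop2_nudged = v_in - v_nudged, v_nudged

        # With E = 1/2 * sum_j k_j * drop_j**2, the local update is
        # -d/dk_j [E_nudged - E_free] = 1/2 * (drop_free**2 - drop_nudged**2).
        dk1 = 0.5 * (drop1_free ** 2 - drop1_nudged ** 2)
        dk2 = 0.5 * (drop2_free ** 2 - drop2_nudged ** 2)

        # Each edge uses only its own voltage drops; keep conductances positive.
        k1 = max(k1 + (alpha / eta) * dk1, 1e-6)
        k2 = max(k2 + (alpha / eta) * dk2, 1e-6)
    return k1, k2


if __name__ == "__main__":
    k1, k2 = train_contrastive(target=0.8)
    print("learned conductances:", k1, k2)
    print("output voltage:", divider_output(k1, k2))
```

Each conductance is updated using only the voltage drops across its own edge in the free and nudged states, which is the sense in which the rule is local. In this toy the update vanishes exactly when the free output equals the target, since the nudged and free states then coincide.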
A plausible generalization to systems without an energy function (i.e., non-symmetric dynamical systems) but still at steady state is given by spike-timing-dependent plasticity (STDP (24)) (102), in which the update of wij, the strength of a 'synapse' (coupling) connecting 'neuron' j to 'neuron' i, depends on the node activations through a non-linear function h(·). This learning rule was shown computationally to enable learning in non-symmetric spiking recurrent neural networks (Fig. 3H) by continuously evaluating the activations of 'pre-synaptic' and 'post-synaptic' nodes (88). Such rules have been implemented in physical memristive substrates (103,104); e.g., the optical transmissivity of chalcogenide phase change materials (PCM) changes as a function of the number of pulses applied to them, affording an approximate implementation of STDP in optical synapses (93) (Fig. 3I).

Physical implementation of supervision
The principal difficulty in realizing physical learning machines is the requirement that the underlying physics allows for a useful local learning rule.
Many unsupervised rules explored here may be implemented by inevitable processes in natural systems. For example, directed aging in elastic systems (5,39,40) exploits inevitable aging processes in many materials. Similarly, growth-based rules in molecular systems naturally follow local geometry (4) while flow networks in many natural systems can expand or constrict according to local flow properties (11). Proximity-based ligation (51) allows for molecular interactions to increase based on spatial or temporal proximity, e.g., enabling learning of self-assembly behaviors by simply co-localizing particles in the desired arrangement during a training period (3,94).
Supervised learning places greater demands on the material since it requires the same physical system to employ a learning rule with either sign, e.g., strengthening or weakening interactions, depending on context. For example, thumbs-up-thumbs-down supervised learning in creased sheets (6) required the same elastic material to be capable of stiffening (thumbs down) or softening (thumbs up) due to strain, based on a thumbs-up-or-down 'signal' from the supervisor. Such sign switching might be achieved by the supervisor placing the system in different chemical environments (or at different temperatures (5)) based on whether the system shows the right or wrong response to a stimulus.
Stronger forms of supervision, such as contrastive rules, require more to realize their promise: in addition to the sign problem, contrastive rules require two copies of the system and the ability to change the learning d.o.f wi based on the quantitative difference between the two systems. While such duplication might be possible in some cases (27), a more broadly relevant implementation could apply the free and clamped conditions sequentially (99), if learning elements have a memory component. Another implementation might involve physical signals of multiple modalities to obtain the contrastive signal; e.g., in flow networks, tracer particles could be injected at the output and advected by the flow, carrying the error signal information to the learning d.o.f (82). Finally, approximations of idealized theoretical learning rules might suffice in real materials; e.g., correctly getting just the sign of the change in wi based on the change in energy might be sufficient for contrastive learning of wi (27, 105).
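The last point, that the sign of the change in wi alone may suffice, can be illustrated with a toy model. The sketch below is an assumption-laden example rather than a model of any cited experiment: it uses a two-resistor voltage divider, compares the energy stored on each edge in a free and a weakly nudged state, and moves each conductance by a fixed increment `step` in the direction given by the sign of that local comparison. The names `train_sign_only`, `eta`, and `step` are hypothetical.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def train_sign_only(target, v_in=1.0, eta=0.1, step=0.005, steps=500):
    """Sign-only contrastive updates on a two-resistor divider (toy assumptions).

    Each conductance changes by a fixed increment `step`; only the sign of the
    local free-vs-nudged comparison is used, never its magnitude.
    """
    k1, k2 = 1.0, 1.0
    for _ in range(steps):
        v_free = v_in * k1 / (k1 + k2)
        v_nudged = v_free + eta * (target - v_free)

        # Local comparison of squared voltage drops on each edge.
        s1 = sign((v_in - v_free) ** 2 - (v_in - v_nudged) ** 2)
        s2 = sign(v_free ** 2 - v_nudged ** 2)

        # Fixed-size update in the indicated direction; keep conductances positive.
        k1 = max(k1 + step * s1, step)
        k2 = max(k2 + step * s2, step)
    return k1, k2


if __name__ == "__main__":
    k1, k2 = train_sign_only(target=0.8)
    print("output voltage:", k1 / (k1 + k2))
```

Because the increments have fixed size, the output in this toy can only approach the target to within a resolution set by `step`, which is one simple way to picture the cost of discarding magnitude information.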

Locality in solid and liquid-like systems
While physical learning rules need to be local, the nature of the locality requirement varies across systems. Solid-like systems, e.g., elastic networks and sheets and flow networks, typically have fixed neighborhood geometry. Each learning element wi has a fixed location in space and time x, t and locality implies that wi can only change based on the state of the system s[x, t] in the vicinity.
Liquid-like systems have no fixed neighbors in space, since components typically diffuse freely, as in molecular systems, or re-arrange in a more limited manner, as in jammed packings (41). While training might seem difficult in such liquid-like systems without fixed neighbors, there are two ways forward. In molecular systems with many species of components (59,60,106), the learning d.o.f wij are interactions Jij between species i, j. The locality constraint dictates that Jij only change based on transient spatial or temporal coincidences of molecules i, j as in Eq. 4, e.g., as considered in proximity-based ligation and in Hebbian learning of self-assembly (3,57). Note that the resulting learned interaction graph, encoded by Jij, has no real-space interpretation and does not need to be embeddable in 3 dimensions.
A distinct challenge arises in liquid-like systems with only a few species or components, such as jammed sphere packings and actin or microtubule networks. Here, training often relies on the geometric arrangements, e.g., memories stored in cytoskeletal networks under shear (41,69). The further extension of learning protocols to such liquid-like systems without fixed neighbors is important to broaden the scope of physical learning to all scales.

Exploiting noise
Compared to computer algorithms, physical systems experience higher noise that might alter how they learn in response to external stimuli (107). Noise can affect both the physical response s(f) to stimuli f and the consequent change in learning d.o.f wi.
Such inevitable fluctuations in physical systems can be a blessing since several learning tasks require randomness. For example, molecular systems with finite numbers of molecules can naturally act as 'chemical Boltzmann machines' needed for the unsupervised learning of probability distributions (66). Similarly, Ref. (57) shows how the intrinsically stochastic nature of nucleation of crystals can solve complex pattern recognition problems, much the way stochastic local search algorithms solve satisfiability problems on a computer. Noise improves the robustness of encoded memories in disordered jammed packings (108). Noise during physical training might also be beneficial for generalization, as often found with stochastic gradient descent in in silico ML (109). The physical learning rules discussed earlier have been shown to be generally robust to noise, with an error floor (27,100). Potential mitigation strategies include modulating the learning rate and the magnitude of stimuli during training (Fig. 5A).
The distributed nature of physical learning also intrinsically makes it robust to other noise sources, like malfunctioning learning elements or damage. Similarly, unlike in silico ML, physical learning is generally asynchronous since different learning d.o.f wi are updated independently without a central clocked processor, which might be beneficial for discrete learning d.o.f (28).
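The error floor and the learning-rate mitigation mentioned above can be pictured with a deliberately simple model. The following sketch assumes a single learning d.o.f `w`, a linear response s = w f corrupted by Gaussian noise, and a local delta-rule update; none of these choices is taken from the cited works, and the function name `error_floor` and its parameters are hypothetical.

```python
import random


def error_floor(alpha, noise_std=0.1, steps=6000, burn_in=1000, seed=0):
    """RMS weight error at steady state for a noisy scalar learner (toy model).

    alpha     : learning rate (assumed hyperparameter)
    noise_std : standard deviation of noise on the physical response
    """
    rng = random.Random(seed)
    w, f, target_w = 0.0, 1.0, 1.0
    squared_error = 0.0
    for t in range(steps):
        # Noisy physical response to the stimulus f.
        response = w * f + rng.gauss(0.0, noise_std)
        # Local update driven by the mismatch between desired and observed response.
        w += alpha * (target_w * f - response) * f
        # Accumulate the error only after the initial transient.
        if t >= burn_in:
            squared_error += (w - target_w) ** 2
    return (squared_error / (steps - burn_in)) ** 0.5


if __name__ == "__main__":
    for alpha in (0.5, 0.05):
        print("learning rate", alpha, "-> RMS error", error_floor(alpha))
```

In this toy, the weight never settles exactly on its target because every update inherits noise from the response; lowering the learning rate averages over more noisy observations and lowers the floor, at the price of slower convergence.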

Material constraints: dynamic range of weights w, over-training
A significant constraint on learning is the dynamic range of the learning d.o.f w achievable in real materials without other compromises. For example, interactions in molecular systems can vary over a finite range (relative to kBT) before associations become irreversible. Similarly, in mechanical systems, the range of bond stiffnesses might be limited, e.g., to avoid fracture. Such constraints might be overcome to some extent by choice of materials, such as shape memory polymers (110,111), hydrogels and polycarbonates (112, 113). Existing works (3, 6, 100) have found that requirements for tasks studied so far are within experimentally available ranges. Another solution is through architecture, e.g., larger networks with more learning elements, each having a moderate dynamic range, might alleviate the need for larger dynamic ranges.
Some learning d.o.f wi, such as stiffnesses, are constrained to be positive, an issue discussed before for the brain (114). Solutions proposed in ML, like shifting the task to positive values (115), or decomposing negative weights as the difference of two positive weights (116), might have physical analogs as well. For example, positive molecular binding energies can be effectively shifted to negative values by adjusting entropic costs of binding (3).
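The decomposition of a signed weight into two positive ones can be sketched as follows. This is an illustrative toy, assuming a scalar linear map y = w x and a delta-rule update; the names `train_signed_weight`, `w_plus`, and `w_minus` are hypothetical, and the example does not correspond to a specific physical system.

```python
def train_signed_weight(target_w, alpha=0.1, steps=500, inputs=(0.5, 1.0, 1.5)):
    """Learn an effective weight w = w_plus - w_minus with both parts kept >= 0."""
    w_plus, w_minus = 0.5, 0.5  # two non-negative physical d.o.f.
    for t in range(steps):
        x = inputs[t % len(inputs)]
        y = (w_plus - w_minus) * x       # response with the effective weight
        g = (target_w * x - y) * x       # delta-rule signal for the effective weight
        # Push the two positive parts in opposite directions, clipping at zero.
        w_plus = max(0.0, w_plus + alpha * g)
        w_minus = max(0.0, w_minus - alpha * g)
    return w_plus, w_minus


if __name__ == "__main__":
    w_plus, w_minus = train_signed_weight(target_w=-2.0)
    print("w_plus =", w_plus, "w_minus =", w_minus, "effective =", w_plus - w_minus)
```

Here a negative effective weight is reached even though neither physical parameter ever becomes negative; the cost is a doubling of the number of learning d.o.f.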
Another failure mode for learning physical systems is overtraining (6,117). In materials that lose the variability of the learning d.o.f, overtraining can result in a network that can no longer adapt to new tasks (118).

Memory, learning and generalization
A closely related concept to learning is memory. In the materials context, memory has a longer history of research, especially in disordered mechanical systems, polymer melts and glasses (21). Examples include colloidal systems 'remembering' the amplitude of prior shear and glasses 'remembering' a temperature at which they were aged.
What is the relationship between learning and memory? On one hand, memory is a manifest requirement for learning; the learning d.o.f wi must encode a memory of past stimuli f. On the other hand, successful learning places additional demands on the nature of memory. Learning requires that the memory have a specific physical retrieval mechanism, namely in response to stimuli for which a functional response is sought. Further, learning often allows physical systems to generalize and show the correct response to stimuli never seen before (3,4,6,7,47), much the way an in silico neural network can generalize to novel cat images in a test set not seen during training. Such generalization requires selective memory for informative features in the stimuli; complete memory of all features of the training stimuli (over-fitting) would be counter-productive. In summary, learning requires selective and functional memory with a retrieval mechanism.

Learning out of equilibrium
Any learning process is associated with two distinct timescales: (1) a response timescale τresponse associated with the physical d.o.f responding to stimuli f, i.e., mapping inputs to outputs f → s(f, {w}), and (2) a learning timescale τlearn associated with updating the learning d.o.f wi based on the response, i.e., dwi/dt ∼ (1/τlearn) g(s(f, {w})). In silico ML has a clear separation between these timescales as the first process is typically fully completed before weights are updated.
In contrast, physical and biological systems might not have a clean separation between τlearn and τresponse; learning might effectively occur out-of-equilibrium and not be separable from the physical dynamics of stimuli-response. However, learning can be successful despite the lack of separation of timescales, as investigated for neuronal networks (119,120,121,122) and tested recently for physical learning with contrastive rules (100). Even learning rates comparable to the physical response rate (τlearn ∼ τresponse) have little effect on the trained performance, though higher rates might lead to oscillations (Fig. 5B).
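The interplay of the two timescales can be explored in a toy model. The sketch below is not the model of Ref. (100); it assumes a single physical d.o.f `s` that relaxes toward w f on the response timescale, while the learning d.o.f `w` is simultaneously updated on the learning timescale by the mismatch between a target and the current (not yet relaxed) response. The function name `simulate`, the explicit Euler integration, and all parameter values are assumptions made for illustration.

```python
def simulate(tau_learn, tau_response=1.0, f=1.0, target=1.0, dt=0.01, t_total=1000.0):
    """Integrate coupled response and learning dynamics with explicit Euler steps.

    ds/dt = (w * f - s) / tau_response      (physical response)
    dw/dt = (target - s) * f / tau_learn    (learning, using the current response)

    Returns the final response and the largest overshoot of s above the target.
    """
    s, w = 0.0, 0.0
    max_s = s
    for _ in range(int(t_total / dt)):
        ds = (w * f - s) / tau_response
        dw = (target - s) * f / tau_learn
        s += dt * ds
        w += dt * dw
        max_s = max(max_s, s)
    return s, max_s - target


if __name__ == "__main__":
    for tau_learn in (100.0, 1.0):
        final_s, overshoot = simulate(tau_learn)
        print("tau_learn =", tau_learn, "final s =", final_s, "overshoot =", overshoot)
```

In this linear toy the response converges to the target for both choices of the learning timescale, but when the learning timescale is comparable to the response timescale the approach is oscillatory, with the response overshooting the target before settling.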

Out of equilibrium effects: Time reversal symmetry, steady states, non-reciprocal interactions
Beyond the relative rate of learning and physical dynamics discussed above, there are two distinct notions of being out-of-equilibrium relevant for learning: (a) the non-equilibrium nature of the learning process, (b) the non-equilibrium nature of the underlying physical system (in the absence of learning).
(a) The learning process: Successful learning is almost always out-of-equilibrium since it is irreversible; learning starts with a random set of learning d.o.f and ends with a distinct set of learning d.o.f wi from which it is likely impossible to reconstruct the initial wi exactly. As Landauer pointed out (123), such a many-to-one procedure in the space of weights wi can be interpreted as erasure and must consume free energy. Currently, it is unclear exactly how many-to-one this process is for different materials and tasks. At a more practical level, the learning process is bound to be dissipative in nature. For example, the molecular versions might involve polymerization, ligation or strand displacement (65,51,54) while mechanical systems involve dissipative re-arrangements in foams and other complex materials (5). The theoretical and practical constraints on learning by dissipation deserve further study.
(b) The underlying physical system can be characterized as being at equilibrium or not without accounting for the learning process itself; many open questions remain on the distinction between the two cases. For example, equilibrium systems with time-reversal symmetry can typically make use of more highly supervised protocols such as the contrastive Hebbian rule and echo backpropagation (96,101). Some of the systems studied here (e.g., elastic and flow networks) are often studied at steady state and preserve time-reversal symmetry. Consequently, influences can propagate from output to input as easily as from input to output, enabling contrastive learning (Eq. 7). In systems without time-reversal symmetry (e.g., molecular systems that are not at steady state or are powered by molecular motors), a perturbation of the output may not affect the input. Such systems can still be trained in an unsupervised way or through the thumbs-up-thumbs-down supervision described earlier.
On the other hand, relaxing the constraint of being at equilibrium could potentially help learning. For example, the capacity for pattern recognition in molecular systems may be improved using non-equilibrium nucleation-dominated self-assembly (57), with associated trade-offs between the complexity of pattern recognition, accuracy, and speed of recognition (Fig. 5C). Similarly, non-reciprocal interactions (124) should increase the range of learnable behaviors; for example, dynamic phases in Kuramoto-like networks of coupled oscillators could potentially be learned (125). An open question is whether non-equilibrium behaviors are also more learnable; e.g., these systems may be more expressive (94) or show greater degeneracy of parameters for a given behavior.

Dynamic architectures, continual learning and forgetting
Neural networks often have a fixed architecture (e.g., a 2d convolutional network), with learning merely updating synaptic weights within that architecture. While some solid-state physical systems might be similarly constrained, physical systems that learn through growth or molecular interactions have a freedom in architecture not usually present in in silico ML. For example, growing networks of nanotubes or grown gels (62,91) or learned self-assembly (3) can change topology, geometry and even dimensionality during training. It is possible that learning can construct networks with architectures suited to performing the desired tasks.
Further, physical systems naturally 'forget' learned experiences, e.g., through degradation. Such erasure and forgetting can allow physical systems to naturally learn new tasks or functions without exceeding any capacity set by the number of d.o.f (4, 126).

Expressivity, capacity, and hidden nodes
A key property of any learning system is its 'expressivity', i.e., the complexity of input-output relationships that can be modeled with the available learning d.o.f (127). A system with higher expressivity might also be easier to train. In fact, the success of ML is often attributed to overparameterization (128). A related quantification is 'capacity' (58,129), e.g., related to the largest number of distinct behaviors that can be learned simultaneously.
As found for neural networks, both expressivity and capacity increase with the number of d.o.f for physical learning (3) (Fig. 5D). Work in specific systems like learned self-assembly (3,57,130) shows that frustrated interactions and dimensionality of these d.o.f can dramatically increase expressivity and capacity. However, we do not yet have broad principles for what controls expressivity and what kinds of physical interactions increase it.
Another way to increase expressivity and capacity is the introduction of hidden nodes that do not directly couple to input stimuli or the output behavior, as in restricted Boltzmann machines (97). In the materials context, similar ideas might apply, e.g., applying force patterns only to the boundary and not the bulk (131,132) or mapping inputs only to some molecules (47, 57), might allow learning of more complex behaviors.
Figure 6. Learning creates physical signatures in the substrate beyond the desired functionality. (A) Learning induces spatial heterogeneity in the physical substrate, here a self-folding sheet with stiff creases (6). (B) System connectivity and topology may significantly change, as learning effectively prunes many unused or counterproductive edges in a network (7). (C) Physical learning creates soft modes in a system, so that most forces couple preferentially to these few modes, reducing the effective response dimension D of the system. (D) Depending on the difficulty of the task, a learning system may trade off performance for energy. Here, a regression task performed by a flow network shows such a trade-off when a parameter r that regulates power consumption is varied.

PHYSICAL SIGNATURES OF PAST LEARNING
Physical learning leads to disorder as the learning d.o.f wi typically become heterogeneous during learning. However, while the usual approach to disordered systems averages over random ensembles (133), the learning process can lead to atypical instances of disorder. Here we discuss some signatures of learning in physical systems.
We note that many of these learning signatures are analogous to signatures of evolution and evolvability (134). In both cases, functional systems are arrived at through an iterative procedure in parameter space wi. Hence learned (and evolved) systems may be highly atypical in the context of 'random' ensembles typically studied in statistical physics; instead, these systems occupy functional regions accessible by the dynamical learning process. The requirement of accessibility results in features that may be incidental or irrelevant to the task being trained or selected for; such features would not be present in random systems or systems designed by a highly non-local algorithm in parameter space. Much like with evolution, these signatures can be used to examine if a natural disordered system is the product of physical learning.

Network geometry and topology
Heterogeneities: In solid-state systems such as elastic or flow networks, a learning process can leave profound signatures on the real space architecture. For example, like with synaptic weights in a neural network, learning inevitably leads to spatial heterogeneity in local elastic moduli or tube radii (Fig. 6A), up to complete pruning of unused edges (Fig. 6B). In liquid-state systems, the learning process can create atypical patterns of heterogeneous and promiscuous molecular interactions in chemical space; while a naive analysis might suggest a lack of function due to non-specificity (135,136), the learned interactions are actually structured so as to be functional as seen in learned and evolved systems (3,137). Studying learning in relatively simple physical models may offer new insights on when and why such atypical heterogeneities develop, including potentially network motifs or hierarchical structures found in natural networks (15,138,139).
Adaptability: Systems trained for multiple incompatible behaviors in an alternating sequence can discover rare but highly adaptive or 'mutable' networks for each task (140,141,142,143); such adaptive networks are able to switch from one behavior to another with far fewer changes in parameters wi than would be typical for generic designed networks that perform those tasks.

Network dynamics
Not-so-glassy landscapes: Generic frustrated many-body systems tend to have glassy landscapes with many randomly placed minima. However, learning can lead to exponentially fewer minima that are not randomly placed. For example, when molecular interactions are learned in a Hebbian way to self-assemble one of many structures, the number of resulting minima is exponentially smaller than naively expected (3). Further, when the minima do proliferate near capacity, they are not random; rather, they correspond to chimeric assemblies of structures at other minima, analogous to spurious states in Hopfield associative memory. Similarly, jammed packings (41) can become ultra-stable by aging in that state while disordered creased sheets (6,43,44,144), upon training, show exponentially fewer branches at the flat state bifurcation point than expected of a random system.
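The Hopfield analogy invoked above can be made concrete in a few lines. The sketch below is a standard Hopfield associative memory with Hebbian couplings, not a model of molecular self-assembly; the three orthogonal 16-unit patterns, the synchronous update scheme, and the function names are choices made here for illustration. It shows both retrieval of a stored pattern from a corrupted cue and the existence of a stable 'chimeric' state built from a mixture of the stored patterns.

```python
def hebbian_couplings(patterns):
    """Hebbian couplings J_ij = (1/N) * sum over patterns of p_i * p_j, with J_ii = 0."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    J[i][j] += p[i] * p[j] / n
    return J


def relax(J, state, max_sweeps=20):
    """Synchronously align every unit with its local field until nothing changes."""
    s = list(state)
    for _ in range(max_sweeps):
        new = []
        for i in range(len(s)):
            h = sum(J[i][j] * s[j] for j in range(len(s)))
            # Keep the old value if the local field is exactly zero.
            new.append(1 if h > 0 else -1 if h < 0 else s[i])
        if new == s:
            break
        s = new
    return s


# Three mutually orthogonal patterns on 16 units (illustrative choice).
p1 = [1] * 8 + [-1] * 8
p2 = ([1] * 4 + [-1] * 4) * 2
p3 = ([1] * 2 + [-1] * 2) * 4
J = hebbian_couplings([p1, p2, p3])

# A stored pattern with one unit flipped is used as the retrieval cue.
corrupted = [-p1[0]] + p1[1:]

# Majority-vote mixture of the three stored patterns.
chimera = [1 if a + b + c > 0 else -1 for a, b, c in zip(p1, p2, p3)]

if __name__ == "__main__":
    print("retrieved p1 from corrupted cue:", relax(J, corrupted) == p1)
    print("chimera is a fixed point:", relax(J, chimera) == chimera)
```

The mixture state is not one of the stored patterns, yet the learned couplings stabilize it. This is the sense in which spurious minima created by Hebbian learning are structured combinations of the learned states rather than randomly placed.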
Soft modes: Trained systems often show 'soft' modes even if training did not explicitly seek such softness (16,17,145,146). Soft modes correspond to normal modes with low energy eigenvalue in equilibrium systems, or more generally, a small Lyapunov exponent. Consequently, trained systems respond more strongly to random forces than a random system; responses to random forces are often low dimensional along these soft modes (147,148) (Fig. 6C). Furthermore, the energy required in flow and resistor networks to actuate desired behaviors is often reduced by supervised learning (Fig. 6D).

CONCLUSION
In this review we discussed the general notions of autonomous physical learning machines, the physical constraints that affect them, and how some of them can be overcome. While clearly inspired by neuroscience and machine learning, we attempted to convey that physical learning is separate and unique: learning machines are physical, like biological learning networks, but can learn to solve inverse problems and produce responses that have no analogy in biology or machine learning. Moreover, treating learning as a physical phenomenon encourages research on fundamental questions on physically realizable learning models, how learning manifests as a collective behavior in systems with simple constituents, and how learning induces physical changes in these systems. We believe that such questions underlie a fundamental physical theory of learning, and that their answers may be independent of the specific details of implementation of the learning machine. Physical learning, both theoretical and especially experimental, requires multidisciplinary approaches. We hope recent and upcoming research in this field will interest not only condensed matter physicists, but computer scientists and neuroscientists as well.

DISCLOSURE STATEMENT
The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.