Neural Mechanisms That Make Perceptual Decisions Flexible


INTRODUCTION
A hallmark of our intelligence is the ability to make appropriate decisions under different circumstances. We gather information from the external world, combine it with internal knowledge, and commit to mental propositions and plans of action: decisions that shape our lives. Among the experimental paradigms developed to uncover the neural mechanisms of decision making, sensory psychophysics has played a key role in shaping our current understanding (1). A common finding has been that many sensory discriminations (e.g., judging the direction of moving dots) are based on the accumulation of sensory evidence toward a decision bound (1-3). Models built on this key computation have been remarkably successful, offering a unified explanation for various aspects of behavior, including choice, response time, and confidence (4).
However, in the experiments that gave rise to these models, subjects often repeated the same decision-making process in a stationary task environment with stable stimulus-action associations and rules. This stationarity removed the need to determine appropriate stimulus features, actions, and rules for each decision, substantially reducing the complexity of the decision-making process. Such stationarity rarely occurs in our daily lives, where consecutive decisions often differ from each other. Even for the same sensory inputs, our actions can vary greatly depending on context. For example, when we run into a colleague in a hallway, we may stop to chat, but not during a fire drill. Decision making in a nonstationary environment often has a hierarchical structure, where we must first infer the relevant inputs, actions, payoffs, task solutions, and decision policies before acting on the inputs. We can implement these context-dependent hierarchical decisions rapidly and effortlessly; thus, there should be neural mechanisms that flexibly alter behavior without rewiring the decision-making neural circuits. How can decision-making models be extended to account for such context dependency?
A growing body of research has now begun to explore flexible decision making, but empirical and theoretical results are diverse and often hard to connect. Many studies have examined how decision-making processes unfold under different task rules. Task rules could be modified in a number of ways. For example, one can change relevant sensory features, the association between stimuli and actions, the reward associated with different choices, and so on. Adjustments of behavior to any of these manipulations are attributed to context-dependent computations, but the computations could be vastly different, and they may not share the same neural underpinnings.
The first section of this review introduces perceptual decisions, where the rich history of past studies and the maturity of existing empirical and theoretical frameworks provide an ideal test bed for investigating context-dependent decisions. After introducing the neural mechanisms of perceptual decisions, we explain the diversity of flexible behavior associated with these decisions. We suggest that past studies tapped into different forms of flexibility, shedding light on the diverse brain regions and neural responses observed in those studies. The next section focuses on a key observation that, even for a single task, a network of brain regions is engaged during decision making. We contrast these observations with a currently dominant doctrine that the prefrontal cortex (PFC) operates as a central hub that controls flexible behavior (5,6). In the last section, we briefly discuss how network models have been designed to implement flexibility.
We end this introduction with two notes to clarify the scope of this review. First, although our review focuses on perceptual decision making, the line is blurred between perceptual decisions and other kinds. For example, experiments on value-based decision making or reinforcement learning typically involve sensory components (e.g., options with different values are presented as visual cues) and use similar task structures (e.g., binary decisions). Furthermore, much of the existing literature on cognitive control uses perceptual tasks such as the Wisconsin Card Sorting Test (WCST) to examine subjects' capacity to flexibly switch behavior. We believe our core takeaways apply to those tasks too, although we do not deeply explore them here. Second, our main interest is in flexible mechanisms that allow rapid behavioral adjustment, but context-dependent behavior can also be studied across blocks, sessions, subjects, or different training levels. By summarizing these broad studies, we demonstrate the diversity of flexible computations and the distributedness of the neural activity associated with them.

PERCEPTUAL DECISION MAKING IN A STATIONARY CONTEXT
Substantial progress has been made in understanding the computational and neural mechanisms of perceptual decision making. For thorough reviews of this progress, see References 2-4. Here, we summarize the key findings, focusing on those most relevant to context-dependent decision making.
In typical perceptual decision-making tasks, subjects discriminate sensory stimuli (e.g., motion toward right or left), reporting their choice with specific motor actions (e.g., reaching right or left). The stimuli vary across trials, usually along a parametrically controlled dimension, which enables precise manipulation of task difficulty. For example, in the motion direction discrimination task with random dots (Figure 1a), the percentage of dots moving coherently together (motion coherence) determines the difficulty of the trial (7,8). Motion direction and coherence vary randomly across trials, but subjects typically perform hundreds of similar discriminations consecutively, applying the same computations and task rules on every trial.
This stationary setting provides ample data to address a key question: How are decisions made in the face of uncertainty about stimulus difficulty and noise in sensory inputs and sensory neural responses? An optimal solution would be to integrate all available sensory information, that is, to accumulate independent pieces of evidence over time for temporally extended stimuli (Figure 1b) and to integrate evidence across space or feature for spatially extended or multifeature stimuli (9,10). When there is a cost associated with gathering information (e.g., maintaining fixation or sustaining attention), the reward rate is maximized by continuing the evidence accumulation until a time-dependent decision criterion or bound is reached (11-13) (Figure 1b). This accumulation-to-bound process can be further adjusted to accommodate factors such as prior probabilities of the stimuli and expected rewards of different choices to make reward-maximizing decisions (14-18).
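The accumulation-to-bound idea can be illustrated with a minimal simulation. The sketch below is a toy under stated assumptions, not the model fit in the cited studies: it uses a constant symmetric bound rather than the time-dependent bound mentioned above, Gaussian noise, and hypothetical names and values (`drift` stands in for stimulus strength; `bound`, `dt`, `noise`, and `max_t` are illustrative).

```python
import math
import random


def simulate_trial(drift, bound, rng, dt=0.005, noise=1.0, max_t=10.0):
    """Accumulate noisy evidence until +bound or -bound is crossed.

    Returns (choice, decision_time); choice is +1 or -1.
    """
    x, t = 0.0, 0.0
    step_sd = noise * math.sqrt(dt)  # noise scales with sqrt(dt)
    while t < max_t:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
        if x >= bound:
            return 1, t
        if x <= -bound:
            return -1, t
    # No bound reached in time: report the sign of the accumulated evidence.
    return (1 if x >= 0 else -1), t


def summarize(drift, bound, n_trials=500, seed=0):
    """Return (proportion of +1 choices, mean decision time)."""
    rng = random.Random(seed)
    results = [simulate_trial(drift, bound, rng) for _ in range(n_trials)]
    p_plus = sum(choice == 1 for choice, _ in results) / n_trials
    mean_time = sum(t for _, t in results) / n_trials
    return p_plus, mean_time


# Hypothetical weak and strong stimuli with the same bound.
print("weak:", summarize(drift=0.2, bound=1.0))
print("strong:", summarize(drift=2.0, bound=1.0))
```

In this toy, stronger evidence should produce both more accurate and faster decisions, which is the qualitative pattern that bounded accumulation is meant to capture.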
Empirical results indicate a close match between the bounded accumulation model and the neural computations that underlie behavior. The model accurately fits and can even predict the distribution of choice, reaction time, and confidence (22-25). Moreover, dynamics of neural responses in multiple brain regions resemble the evidence accumulation process (8,26-32). Shadlen & Newsome (33,34) were the first to identify neurons in the monkey lateral intraparietal (LIP) area that gradually increase their firing rates when the motion stimulus supports the target in the response field of neurons (Figure 1c). Critically, these neural responses reach a common level immediately before the decision, offering a neural signature for committing to a choice through reaching the decision bound (8). Since those first studies, similar neural signatures have been identified in a variety of motor-planning regions, including frontal eye fields (28), prearcuate gyrus (15,35), dorsolateral prefrontal cortex (29,36), caudate (27), superior colliculus (26), and motor cortex (37-39) (Figure 1c). Further support for the involvement of motor-planning regions in bounded accumulation comes from visual search tasks, where subjects find a target in the midst of distractors (40), and from stop signal tasks, where subjects are instructed to generate a motor movement but later cued to withhold the movement in a random portion of trials (41).
Motivated by these empirical results, neural circuit models have been proposed for the implementation of the bounded accumulation process (19-21,40,42-45). These models typically comprise neural pools that accumulate evidence for their preferred choice through a mixture of self-excitation, mutual inhibition, or feedforward inhibition (Figure 1d). Sensory inputs bias the competition until one pool wins, signifying a commitment to a choice and triggering a behavioral response. Bounded accumulation and its circuit models have also been adopted beyond perceptual tasks, especially for value-based decisions (46,47) and memory-guided decisions (48,49).
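The competition between pools can be sketched with two rectified rate units. This is a deliberately reduced illustration, not any of the published circuit models: the linear dynamics, the Euler integration, and every parameter (`w_self`, `w_inh`, `base_input`, `gain`, `threshold`, `tau`) are assumptions chosen only so that self-excitation and mutual inhibition produce a winner.

```python
import math
import random


def race_two_pools(coherence, rng=None, noise_sd=0.0, w_self=0.9, w_inh=0.3,
                   base_input=1.0, gain=1.0, threshold=6.0,
                   tau=0.1, dt=0.001, max_t=5.0):
    """Two rate units with self-excitation and mutual inhibition.

    Returns (winner, time): winner is 0 or 1, or None if neither pool
    reached threshold before max_t.
    """
    # Sensory input favors pool 0 for positive coherence, pool 1 otherwise.
    inputs = (base_input + gain * coherence, base_input - gain * coherence)
    r = [0.0, 0.0]
    t = 0.0
    while t < max_t:
        new_r = []
        for i in (0, 1):
            other = r[1 - i]
            drive = -r[i] + w_self * r[i] - w_inh * other + inputs[i]
            noise = 0.0
            if rng is not None:
                noise = rng.gauss(0.0, noise_sd) * math.sqrt(dt)
            # Euler step; firing rates are rectified at zero.
            new_r.append(max(0.0, r[i] + dt / tau * drive + noise))
        r = new_r
        t += dt
        if max(r) >= threshold:
            return (0 if r[0] >= r[1] else 1), t
    return None, t


print(race_two_pools(0.5))    # input favors pool 0
print(race_two_pools(-0.5))   # input favors pool 1
print(race_two_pools(0.5, rng=random.Random(0), noise_sd=0.5))
```

With these illustrative settings, mutual inhibition amplifies any input difference until one pool suppresses the other and crosses the threshold, while perfectly balanced noiseless input never leads to a commitment.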

DIVERSITY OF FLEXIBLE DECISIONS AND THEIR NEURAL CORRELATES
Making perceptual decisions in realistic, nonstationary environments requires processes beyond those studied in stationary task settings (Figure 2a). Before gathering sensory evidence and using it to form action plans, decision makers must first determine the context and set up the decision-making process accordingly. They should decide on the relevant sensory information, relevant actions, expected payoffs, fruitful solutions (e.g., integrate or differentiate), and decision policies (e.g., how much evidence to accumulate). This hierarchical decision-making structure, in which decisions are made about how to make a decision, enables flexible adjustment of behavior in a manner appropriate for different task contexts. In this section, we summarize recent studies that examined these adjustment mechanisms through comparing behavior and neural responses across multiple tasks.

Figure 2. Diverse forms of context dependency in perceptual decision making. (a) Decision making is hierarchical in nature. Even for the simplest decision, the brain first chooses relevant stimuli, actions, solutions, and policies. (b) Adjustments in decision policy are often explained as changes in the parameters of decision-making models. Adjustable parameters of the bounded accumulation model are shown in red. (c) Post-error slowing is an example of policy adjustment. Slower reaction times after error trials (left) are explained by reduced sensory sensitivity and lower urgency. Correspondingly, the buildup activity of neurons in the lateral intraparietal area decreases after error trials (right). Panel adapted with permission from Reference 50; copyright 2016 Elsevier. (d) Common task designs that test flexibility in stimulus-action mapping: (i) change in relevant sensory modality or feature, (ii) change in effectors (e.g., saccade and reach; top) or reversal of stimulus-action mapping (bottom), and (iii) change in categorization boundary. (e) Flexibility to adopt different solutions. For example, the same stimulus could be integrated, differentiated, matched to a template, and so on (top). More complex tasks that involve hierarchical inference offer a rich basis for studying the flexibility of strategy (bottom). Panel adapted with permission from Reference 51; copyright 2021 Ariel Zylberberg. Abbreviations: Coh, coherence of moving dots; L, left; R, right; T_in, target in neuron response field; T_out, target outside neuron response field.
A key challenge in summarizing these studies is the number of ways in which a decision maker can adjust decision-making processes. Different forms of flexibility possibly engage different neural mechanisms, but any change in behavior or neural activity across tasks is labeled as flexible or context-dependent in the literature. To highlight this diversity, we classify flexibility into three types (Figure 2). First, at the highest level, a decision maker could be flexible in their solutions for the task. For example, they can choose to integrate, differentiate, or even ignore sensory inputs (flexibility in task solution). Second, for a given solution, a decision maker could flexibly adjust the decision policy. This includes, for example, how much evidence to accumulate or whether to favor one option over others (flexibility in decision policy). Finally, a decision maker can flexibly associate different sensory inputs with different actions (flexibility in stimulus-action mapping).

Flexibility in Decision Policy
We first discuss adjustments in decision policy (Figure 2b) as successful demonstrations of how flexible choice behavior could be described mechanistically using the existing theoretical framework for perceptual decision making.
In experimental settings, changes in decision policy can be induced through a variety of task manipulations, including changes in prior probabilities of stimuli (16,52), costs and rewards associated with each choice (18,53-55), variability in stimulus (56), spatiotemporal effects of sensory signals on decisions (57), urgency to respond (58), or even parameters as subtle as the duration of intertrial intervals (59) and timing of reward delivery (60). Furthermore, the history of choices and feedback influences subsequent decision policies (15,17,50,61). These manipulations, which can happen at various time scales ranging from trial-to-trial to session-to-session, impact different aspects of decisions, including choice biases, accuracy, reaction time, and decision confidence.
Consider the adjustment of decision policy upon receipt of feedback in previous trials. Post-error slowing is a commonly observed effect in which subjects lengthen their reaction times following error feedback (see left side of Figure 2c). Purcell & Kiani (50) used evidence accumulation models to identify the neural underpinnings of these behavioral changes. Post-error slowing in the direction discrimination task (Figure 1a) arises from reduced sensitivity (lower drift rate) and increased decision bound (decreased urgency). Increased bound compensates for reduced sensitivity to maintain the overall accuracy but at the expense of longer reaction times. Consistent with these modeling insights, LIP neural activity following an error shows reduced dependence on stimulus strength, manifested as decreased buildup rates (see right side of Figure 2c) as predicted by lower sensitivity. Moreover, urgency signals calculated from LIP responses are reduced, indicating an effective increase in decision bound by an amount matching the model predictions. Similarly successful applications of decision-making models can be found in studies of speed-accuracy trade-off (60,62), prior probability effects (16,52), and reward bias (53,54). Thus, the modeling framework developed for a stationary setting naturally extends to certain forms of flexible computations.
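The trade-off between sensitivity and bound described above can be made concrete with the textbook closed-form expressions for an unbiased diffusion between symmetric, constant bounds. This is a sketch under those simplifying assumptions; the parameter values are hypothetical and are not the fitted values from Reference 50, and the constant bound stands in for the urgency-based description used there.

```python
import math


def ddm_accuracy(drift, bound, noise=1.0):
    """P(correct) for a diffusion starting midway between bounds at +/-bound."""
    return 1.0 / (1.0 + math.exp(-2.0 * drift * bound / noise ** 2))


def ddm_mean_decision_time(drift, bound, noise=1.0):
    """Mean time to reach either bound (excludes sensory and motor delays)."""
    return (bound / drift) * math.tanh(drift * bound / noise ** 2)


# Hypothetical values: after an error, drift is lower and the bound is higher.
baseline = {"drift": 1.5, "bound": 1.0}
post_error = {"drift": 1.2, "bound": 1.25}

for label, params in (("baseline", baseline), ("post-error", post_error)):
    print(label, ddm_accuracy(**params), ddm_mean_decision_time(**params))
```

Because accuracy in this simplified model depends only on the product of drift and bound, raising the bound in proportion to the loss of sensitivity leaves accuracy unchanged while lengthening the mean decision time.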
The key strength of this approach is a mechanistic understanding through quantitative comparison between models and data in multiple task conditions. For behavioral data, these models explain changes in choice, reaction time, confidence, and psychophysical kernels (63). Remarkably, model parameters could be fit to one aspect of behavior (e.g., reaction time) and then used to generate accurate predictions about the other aspects (e.g., choice) (22,24,25,44). Neural data could also be examined in the same framework. For example, Hanks et al. (60) used neural data to estimate changes in model parameters between two speed-accuracy regimes and then confirmed that the neurally inferred parameters account for the observed behavioral changes.
Despite the success of this approach, there remain key unanswered questions. Most notably, we do not know what neural mechanisms control the necessity, type, and magnitude of policy adjustments. Multiple brain regions have been suggested to play a role in setting the decision bound, most prominently the basal ganglia (54,58,64) and possibly the superior colliculus (65). When human subjects are urged to respond quickly, striatum activity is enhanced compared to the condition where accuracy is prioritized (58). Deep brain stimulation of the subthalamic nucleus reduces the decision threshold and triggers faster responses (64). Choice biases induced by reward imbalance are also dependent on the striatum. Doi et al. (54) found that the activity of caudate neurons is modulated based on the association between the target in the neuron response field and reward such that microstimulation of the caudate could enhance the monkey's choice bias. In the cortex, the posterior parietal cortex (PPC) (61) and PFC (15) encode past stimuli and feedback, shaping history-dependent behavioral biases. Network models that consider subcortical structures may give us clues for how decision policy is adjusted in the evidence accumulation circuits (66).

Flexibility in Stimulus-Action Mapping
At the core of perceptual tasks is the mapping of sensory stimuli onto actions (Figure 2d). However, flexible stimulus-action mapping is rarely a focus of the existing decision-making models, as many of them presume a fixed stimulus-action association and instead focus on the integration of evidence (Figure 1b,d). Is the selection of appropriate mapping for a particular decision independent of the decision formation itself? Empirical findings summarized below suggest that they are rather tightly intermingled in the brain.

Selection of relevant sensory features.
To perform a perceptual task, the decision maker must first decide what sensory information to base the decision on. A common experimental design to understand this process involves comparing neural responses when subjects generate the same set of actions in response to different sensory stimuli (31,67-69). For example, Raposo et al. (70) trained rats to choose one of the two nose ports based on the frequency of either visual flashes or auditory clicks. Another task variant uses a multifeature stimulus space and instructs subjects to discriminate the same stimuli based on different features (36,69,71-76) (Figure 2d, subpanel i). For example, Mante et al. (36) trained monkeys to report either the dominant color (red versus green) or dominant motion (left versus right) of colorful random dot kinematograms. Because the two sensory features could support opposing choices, subjects should ideally base their decisions only on the relevant stimulus feature in each task.
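The logic of such a context-dependent task can be written down in a few lines. The sketch below describes only the task contingency and an idealized observer that sums samples of the relevant feature; it is not a claim about how the brain or any published network model performs the selection. The signed coherence convention, the context labels, and all function names are hypothetical.

```python
import random


def relevant_feature(motion_coh, color_coh, context):
    """Pick the feature that the current context designates as relevant."""
    if context == "motion":
        return motion_coh
    if context == "color":
        return color_coh
    raise ValueError(f"unknown context: {context!r}")


def correct_choice(motion_coh, color_coh, context):
    """Rewarded choice (+1 or -1) is the sign of the relevant feature."""
    return 1 if relevant_feature(motion_coh, color_coh, context) > 0 else -1


def ideal_observer(motion_coh, color_coh, context, rng,
                   n_samples=200, noise=1.0):
    """Sum noisy samples of the relevant feature only; ignore the other."""
    mean = relevant_feature(motion_coh, color_coh, context)
    total = sum(rng.gauss(mean, noise) for _ in range(n_samples))
    return 1 if total > 0 else -1


# The same stimulus (motion favors +1, color favors -1) in two contexts.
rng = random.Random(0)
for context in ("motion", "color"):
    print(context, correct_choice(0.5, -0.5, context),
          ideal_observer(0.5, -0.5, context, rng))
```

The example stimulus carries conflicting features, so the rewarded choice flips with context even though the sensory input is identical.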
One important question is whether the selection of sensory features happens independently of the evidence accumulation process. If so, we would expect modality-general integration signals that reflect evidence accumulation regardless of the source of information. Such a task-invariant implementation permits identical readouts of the decision variable across tasks. In support of modality-general mechanisms, human electroencephalography studies have found neural signals reflecting decision formation across sensory modalities (31,77,78). For example, centroparietal positivity reflects accumulated sensory evidence for motion discrimination, contrast detection, and auditory detection tasks (31,77). In addition, functional magnetic resonance imaging studies suggest that some regions in the prefrontal and parietal areas may reflect the decision variable regardless of sensory stimuli (79).
By contrast, electrophysiological studies of spiking activity have found that the encoding of the decision variable is task specific. Okazawa et al. (69) report a qualitative difference in the firing rates of LIP neurons during motion discrimination and face discrimination tasks. During face discrimination, LIP firing rates are lower for the strongest stimuli supporting saccade to the target in the response field, the opposite of the motion discrimination task. The LIP population represents the accumulation of evidence in both tasks. However, neural responses form a curved manifold that rotates and shifts depending on tasks (Figure 3b). The rotation of the manifold occurs even when monkeys switch between categorization of the identity or expression of the same face. An earlier study by Stoet & Snyder (72) also found that LIP neurons discharged differently for the same reach choice between color and orientation categorizations of the same stimuli.
Mechanistic accounts of such task-specific encoding of the decision variable remain elusive. These variable representations imply that optimal readout of the decision variable has to be task dependent. Curved manifolds appear to be ubiquitous across brain regions including the medial and lateral frontal cortices (69) and likely the basal ganglia (27) and the superior colliculus (26), but task dependence of the manifold should still be investigated in all these regions. One possibility is that some of these regions, unlike the LIP, include task-invariant integrator mechanisms. Alternatively, all regions may include task-dependent representations and collectively coordinate decision making through task-dependent readouts. Examination of single-cell activity is needed because electroencephalography and functional magnetic resonance imaging, which record aggregated activity of brain tissue, may lack enough resolution to separate distinct neural responses within a circuit.
What is the mechanism for the selection of relevant sensory information? It has long been considered that the PFC is central to flexible sensory selection. For example, the WCST has been used to demonstrate behavioral inflexibility in patients with prefrontal damage (80). Lesion studies in monkeys trained on WCST-analog tasks suggest differential contributions of PFC subregions to the update and maintenance of task rules or evaluation of outcomes (80,81).
Focusing on selective sensory routing, Mante et al. (36) attempted to provide a mechanistic model. During color and motion discrimination, they found that the same neural population in the frontal eye fields encodes both color and motion information but changes its response dynamics such that only the task-relevant information is integrated along the choice axis. These response patterns resemble the activity of a recurrent neural network trained to perform the same task. While this is an appealing idea, the selection process may not entirely depend on a local circuitry within the PFC. Using a task similar to that of Mante et al., Siegel et al. (74) showed that task context information is represented not only in PFC but also in LIP and then propagates to visual areas. Kamigaki et al. (82) also showed that neurons in the PPC modulate their firing rates when monkeys switch task rules in a similar experiment. As mentioned above, the encoding of the decision variable in motor-planning areas depends on relevant sensory dimensions (69), implying that neural mechanisms for sensory selection and evidence accumulation for action are intermingled.
There is also an important, long-standing question of whether a part of the selection process happens in sensory circuits (75,83,84). Attention modulates responses of sensory neurons selective for different locations or features (84). However, the role of attention in gating relevant features in decision making remains debated. Sasaki & Uka (83) found virtually no change in middle temporal (MT) area neural responses when monkeys discriminated the direction or disparity of random-dot stimuli. By contrast, Rodgers & DeWeese (75) found large differences in A1 activity when rats switched between auditory localization and pitch discrimination tasks (Figure 3e). Furthermore, selection between visual and auditory signals could be happening at the level of the thalamus (68). Wimmer et al. (68) trained mice to switch between using visual or auditory features of a bimodal stimulus. Neural activity of the visual thalamic reticular nucleus reflected task contexts, and disruption of activity affected decision making based on the relevant modality. The extent to which the differences in results across these experiments depend on species or tasks should be determined by future experiments.
In summary, several key questions remain unanswered or only partially answered. Although many studies implicate the PFC in the selection of sensory features, other cortical and subcortical regions also appear to engage in the process. Moreover, task-specific encoding of the decision variable suggests that sensory selection and accumulation of evidence may not be independent processes. Addressing the outstanding questions would be aided by expanding the existing decision-making models to incorporate flexible sensory selection, which would provide a quantitative framework for hypothesis testing in future experiments.

Selection of appropriate motor outputs.
Besides sensory selection, the decision maker should also choose appropriate motor actions. As in sensory selection, a core question is whether the brain uses the same accumulators regardless of motor actions. This question can be answered with tasks in which subjects report the outcome of the same sensory discrimination through different motor actions (78,79,85-87). For example, de Lafuente et al. (85) trained monkeys to report the direction of random dots either with a saccadic eye movement or by reaching one of the two targets (see top of Figure 2d, subpanel ii).
Comparison of neural responses across different actions has demonstrated specificity for action modality. As explained earlier, the discovery of evidence accumulation signals in motor-planning brain areas (Figure 1c) supports an intentional framework in which potential motor intentions compete to form a decision (88-90). Consistent with this framework, when monkeys report their choices with reaching movements, the medial intraparietal area, a reach-related region of the parietal cortex, strongly reflects decision formation, whereas in the saccade task, medial intraparietal neural responses attenuate (85) (Figure 3a). Overall, the neural populations reflecting decision formation and motor planning are not fully aligned in the neural population state space (69), but the same neural population encodes both the decision formation and action plans.
What if the decision maker is not informed of available action options prior to decision making? The intentional framework predicts that in such cases decisions are formed at a more abstract level beyond motor circuits (88). Multiple tasks have been developed to test this with monkeys, but experimental control is challenging and outcomes remain controversial. One task design is to obscure target locations and reveal them only after the stimulus offset (91). Alternatively, subjects could be instructed to select a target based on color (e.g., left/right motion corresponds to red/green target), but the target colors are revealed only after the decision period (92-94). As expected, the inability to form an actionable motor plan alters neural responses in motor-planning regions (92,93). However, the interpretation of results is complicated by the possibility that subjects might still form provisional action plans (88) during the stimulus-viewing period and then adjust those plans in response to the appearance of the targets. Recently, Shushruth et al. (95) tackled this issue with a similar task but trained animals to avoid the formation of provisional action plans. Curiously, they found that LIP encoded decision formation only after the target positions were revealed, effectively representing the accumulation of sensory memory rather than ongoing sensory evidence.
Perhaps the best controlled task design for studying stimulus-action mapping is to simply reverse the associations between the stimuli and actions (96-101). For example, the left and right directions in a motion discrimination task could be associated with left and right saccades, respectively, in one task (prosaccade task) and with right and left saccades in another [antisaccade task (96)] (see bottom of Figure 2d, subpanel ii). Match-to-sample tasks have a similar structure, as subjects should switch responses to the same test stimulus depending on the match with the sample stimulus (102).
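The reversal design amounts to swapping a lookup table while leaving the perceptual judgment untouched. The following sketch encodes only that task contingency; the rule labels and function name are hypothetical, and nothing here is meant to describe the neural implementation.

```python
# Rule-dependent mapping from perceived motion direction to saccade direction.
MAPPINGS = {
    "prosaccade": {"left": "left", "right": "right"},
    "antisaccade": {"left": "right", "right": "left"},
}


def select_action(perceived_direction, rule):
    """Return the saccade direction required under the given rule."""
    return MAPPINGS[rule][perceived_direction]


# The same percept leads to opposite actions under the two rules.
for rule in MAPPINGS:
    print(rule, select_action("left", rule))
```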
As in the selection of sensory information, classical cognitive control theories attribute the role of such behavioral switching to the PFC. Reversal learning tasks have been extensively used to study cognitive flexibility, including in clinical patients and lesioned animal models (103). Human patients with prefrontal lesions exhibit perseverative behavior and thus fail to reverse their behavior in response to rule changes (103). Similarly, rodents and monkeys with orbitofrontal cortex lesions show reversal learning deficits (103,104). Asaad et al. (98) recorded from PFC while monkeys switched associations between object stimuli and saccade targets, finding that PFC neurons encode flexible transformation from stimuli to actions.
However, responses in motor-planning areas suggest a more intricate scenario. If the PFC is fully responsible for routing sensory evidence to appropriate motor plans, neurons in motor areas would merely reflect signals supporting their preferred actions. However, experimental observations refute this prediction. For example, Wu et al. (102) examined the premotor cortex of mice performing odor match-to-sample, finding that premotor neurons encode the sample odor before the presentation of the test stimulus, when no action plan is supposed to be formed (Figure 3d). Duan et al. (97) trained rats to switch the association between light direction and nose pokes based on task cues. Some neurons in the deep layer of the superior colliculus maintained task context regardless of the light direction or nose poke direction (Figure 3f). Muhammad et al. (105) showed that, when monkeys were cued to respond to either a matching or nonmatching stimulus in a match-to-sample task, the premotor cortex encoded the task rule even earlier than PFC did.
Overall, we see multiplexed neural signals reflecting decision formation, selection of actions, and maintenance of rules in diverse brain regions. These findings suggest that, as for sensory selection, the selection of motor outputs is a process inextricably tied to decision formation.

Changes in readout along the same sensory dimension.
Decision makers could also alter stimulus-action mapping by shifting their criterion for categorizing stimuli into action plans (106-111) (Figure 2d, subpanel iii). For example, Liu et al. (107) trained mice to report if a tone frequency exceeded a criterion level, with the level being 10 kHz in some blocks and 20 kHz in others. Therefore, successful performance depended on adjusting the decision criterion rather than gating different sensory features or modalities.
The behavioral outcome of this criterion shift seems similar to that observed when subjects adjust their decision policy to favor one option over others. For example, by changing the probability of choices (16,52,109) or reward magnitudes associated with each choice (53,54), the subjects' decision criterion for the same sensory judgment can shift. These changes can be accounted for by parameters in decision-making models such as a dynamic bias in drift rate (16) or starting point of the accumulation process (52). In theory, even when subjects are explicitly instructed to shift the decision boundary through supervised or reinforcement feedback, a similar decision-making mechanism can operate.
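How a starting-point offset or a drift bias shifts choices can be illustrated with the standard first-passage probability of a diffusion between two constant bounds. This is a simplified sketch: the cited studies describe a dynamic drift bias, whereas the bias here is a constant added to the drift, and all parameter names and values are hypothetical.

```python
import math


def p_upper(drift, bound, start=0.0, noise=1.0):
    """P(reach +bound before -bound) for a diffusion starting at `start`."""
    if not -bound < start < bound:
        raise ValueError("start must lie strictly between the bounds")
    k = 2.0 * drift / noise ** 2
    if abs(k) < 1e-12:
        # Zero drift: probability is linear in the starting point.
        return (start + bound) / (2.0 * bound)
    return ((1.0 - math.exp(-k * (start + bound)))
            / (1.0 - math.exp(-2.0 * k * bound)))


# Choice probabilities across hypothetical signed stimulus strengths.
for drift in (-1.0, -0.5, 0.0, 0.5, 1.0):
    unbiased = p_upper(drift, bound=1.0)
    start_bias = p_upper(drift, bound=1.0, start=0.3)   # shifted start
    drift_bias = p_upper(drift + 0.3, bound=1.0)        # constant drift bias
    print(drift, round(unbiased, 3), round(start_bias, 3), round(drift_bias, 3))
```

Both manipulations move the stimulus strength at which the two choices are equally likely, which is why a criterion shift and a policy bias can look alike in choice data alone.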
However, empirical findings suggest that the effect is not limited to decision-making circuits. Rather, sensory neural responses also reflect criterion shifts. For example, between the high and low tone boundaries in auditory tone discrimination, neural selectivity of the mouse auditory cortex shifts toward the discriminating boundary (112, but see 111). Also, noise correlations in sensory areas are influenced by category boundaries. Bondy et al. (106) showed that, in orientation categorization with two different category boundaries, monkey V1 neurons whose preferred orientations match the stimuli in the same category have higher noise correlations (Figure 3c). These task-dependent modulations likely arise from interactions between sensory and decision-making brain areas (42,106,113-115). Thus, in circuit models, the interaction between sensory and decision-making processes should be taken into account to implement the criterion shift.

Flexible Adoption of Task Solutions
Not only can decision makers adjust the parameters of a computation (e.g., evidence accumulation) for decision making (see Section 3.1), but they can also change the computation itself. While evidence accumulation can be an optimal solution for discrimination or categorization of stable sensory information (116,117), different solutions could become appropriate in other task conditions (see top of Figure 2e). For example, when the task is to compare the magnitudes of two stimuli, then an appropriate solution is to subtract inputs rather than integrate them (118,119). Similarly, an optimal solution for a detection task against a stable background is differentiation, not integration. How can the brain flexibly adopt different task solutions?
Responses in higher cortical areas likely reflect the chosen solutions, but we are still far from understanding the principles underlying the process. One key fact is that the brain uses common resources to perform different computations and thus likely employs coding schemes that generalize over different tasks (120-123). The population neural code of the PFC and hippocampus has geometries suitable for task generalizations (120). Recurrent neural networks trained to perform multiple tasks show compositional codes that reflect computations shared across tasks (121). Also, circuit models could be designed such that they perform different computations such as maintenance and comparison of sensory inputs through small changes in parameters (119). These could be the ingredients for implementing multiple task solutions.
At the same time, differences in task solutions can also shape how sensory neurons respond or contribute to behavior (124-126). For example, Koida & Komatsu (124) trained monkeys to report the category of a color patch (e.g., red or green) in one task and to report the match of the color with a reference color in another task, two tasks with distinct solutions. They found that color-selective neurons in the inferior temporal cortex strongly modulate their activity in a task-dependent manner, with some neurons almost abolishing their responses in one task (124). In another study, Chowdhury & DeAngelis (125) showed that the involvement of MT neurons in coarse disparity discrimination depends on whether monkeys were trained to perform fine disparity discrimination beforehand. These results suggest that the brain may dynamically modulate or select sensory activity for a given task solution. But the nature of such interaction between sensory activity and task solutions remains largely unexplored.
In richer task designs, subjects may employ multiple solutions for the same task context. This can happen when high task complexity requires subjects to decompose the task into multiple subtasks, for example, in a tree-like (hierarchical) task structure, where subjects should make multiple binary decisions to reach a goal (51,127-129) (see bottom of Figure 2e). The human neuroimaging literature suggests that the PFC forms a hierarchical structure along the rostrocaudal axis, where more rostral regions are involved in making higher decisions in a decision tree (130). Relatedly, there is a gradient for plasticity in the lateral PFC of monkeys, with higher plasticity in the more anterior regions (131).
Another scenario where subjects may employ multiple solutions arises in less constrained experimental designs with multiple task goals. For example, Yang et al. (132) trained monkeys to play a complex computer game where multiple items are associated with reward and punishment and each requires different actions. They showed that monkeys flexibly switch across distinct behavioral strategies. Modeling studies suggest that rodent behavior could be decomposed into distinct behavioral states even during simple perceptual tasks (133). But models with increased complexity can be overparameterized, requiring careful handling (1,134). Quantitative and yet parsimonious decision-making models, such as those developed for stationary settings (Figure 1), are much needed to provide a significant advance in these topics.

Mechanisms of Deciding to Switch Task Rules
Thus far, we have mainly asked how perceptual decisions are formed under different task rules and how these rules are implemented in the brain. However, there is an equally important question common to any form of flexibility: How does the brain decide to switch task rules? In the real world, we are usually not instructed to follow predefined task rules but rather choose them ourselves based on internal knowledge or external cues, e.g., outcomes of past decisions (Figure 2a). As briefly mentioned earlier, tasks like the WCST or reversal learning have been extensively used to study how subjects switch rules based on feedback (80,81,135,136), but they are rarely associated with the existing frameworks of perceptual decision making. A unique challenge for perceptual decisions is that a decision maker should adopt correct rules and perform correct perceptual discrimination at the same time. Because failure in either leads to error, there is the ambiguity of whether errors arise from a wrong solution or erroneous perceptual discrimination.
Multiple recent studies have approached this question and started to form mechanistic models of rule switching (137,138,191). Purcell & Kiani (137) designed two task contexts in a motion direction discrimination task, where subjects should select either upper or lower direction targets depending on the hidden context (Figure 4a). To perform the task, subjects must correctly infer both the current context and the motion direction. Negative feedback on a trial with higher certainty about a perceptual decision is evidence for a context change. Subjects accumulate their perceptual certainty and feedback over trials and switch their context choice when the accumulated switch evidence reaches a bound (Figure 4b). Using a similar task structure, Sarafyazd & Jazayeri (138) revealed neural correlates of such an inference process in the anterior cingulate cortex. These findings not only reveal close interactions across high and low levels of hierarchical decision-making processes but also indicate an intriguing possibility that common computational motifs such as evidence accumulation account for multiple levels of decision making.
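The across-trial accumulation of switch evidence described above can be sketched as follows. This is a simplified illustration, not the model fitted in Reference 137: the additive update, the reset after positive feedback, the reset after a switch, and the bound value are all assumptions made for the example.

```python
def update_switch_evidence(evidence, certainty, correct, bound=1.0):
    """Update across-trial switch evidence after one trial's feedback.

    certainty: the subject's certainty (0 to 1) in the perceptual choice.
    correct:   True for positive feedback, False for negative feedback.
    Returns (new_evidence, switch), where switch is True when the bound is reached.
    """
    if correct:
        # Assumption: positive feedback supports the current context and resets evidence.
        return 0.0, False
    # An error made with high certainty is stronger evidence for a context change.
    evidence += certainty
    if evidence >= bound:
        return 0.0, True  # switch context and start accumulating afresh
    return evidence, False

def run_session(trials, bound=1.0):
    """trials: sequence of (certainty, correct) pairs. Returns indices of switch trials."""
    evidence, switches = 0.0, []
    for index, (certainty, correct) in enumerate(trials):
        evidence, switch = update_switch_evidence(evidence, certainty, correct, bound)
        if switch:
            switches.append(index)
    return switches
```

For example, `run_session([(0.9, False), (0.9, False)])` switches on the second error, whereas `run_session([(0.3, False)] * 4)` needs four low-certainty errors before the bound is reached.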

DISTRIBUTED CIRCUITS FOR FLEXIBLE DECISION MAKING
In the previous section, we highlighted context-dependent signals in multiple brain regions (Figure 3). Here, we discuss the implications of these findings. While the dominant perspective in the field is that a centralized module (i.e., the PFC or a frontoparietal network) enables flexible behavior (Figure 5a), empirical evidence suggests an alternative: a distributed network as the neural substrate for flexible behavior (Figure 5b).

Centralized or Distributed Mechanisms?
The PFC has long been thought of as a central region that enables cognitive flexibility. The PFC receives inputs from various sensory modalities and limbic regions and in turn projects to these regions as well as motor-planning areas. These connections make the PFC an ideal hub for flexible stimulus-action mapping. As recent human neuroimaging studies have shown close coordination between the PFC and PPC (139,140), the special status of the PFC has also been extended to the PPC. The resulting frontoparietal network, which can change its functional connectivity with other brain areas during the adoption of different task rules, is theorized to underlie cognitive flexibility and control (6).
Figure 5 Schematic of network architectures proposed to explain flexible perceptual decision making. (a) A prevalent theory is that the prefrontal cortex or a frontoparietal network operates as a central hub that gates relevant sensory information and generates appropriate motor plans. Its internal circuits flexibly adjust computations applied to sensory inputs, sending final decisions to appropriate motor regions for execution. (b) A distributed architecture consistent with recent experimental findings. There are no distinct central controls, but multiple brain regions have the capacity to maintain task contexts and flexibly modulate behavior. Dynamics of activity in these brain regions flexibly form decisions. There are gradients within the network such that regions deeper in the sensorimotor hierarchy play more prominent roles in creating flexible behavior.
By contrast, studies on perceptual decision making have long suggested a distributed architecture for implementation of neural computations (4,90). As explained earlier, sensory and motor areas reflect decision formation beyond momentary sensory and motor signals (85,88,90,106,141-143) (Figure 3a,c). Although this does not deny domain-general mechanisms for decision making (31,77,78,144), the prevalence of neural signals encoding the decision variable (Figure 1c) indicates that many brain structures form an interconnected network for decision making.
How do we compare the centralized and distributed perspectives in the study of flexible perceptual decision making? Empirical findings summarized thus far indicate that a variety of brain regions change their activity depending on task rules. Centralized theories would claim that such modulations are driven by the central unit and meant to facilitate efficient routing of information through the central unit. However, patterns of neural responses observed in sensory or motor brain regions are difficult to interpret based on this idea.
First, sensory and motor areas directly encode task contexts (72,75,97,101,105,124) (Figure 3e,f) as opposed to mere modulation of sensory/motor responses by context. These contextual signals cannot be explained as attention-like enhancement of task-relevant information. For example, neurons encode task rules even outside the epoch where relevant sensory or motor events are happening (75,97). While such contextual signals could be maintained through inputs from the PFC, it is unclear why such inputs should be received when not needed. Perhaps these signals modulate computations within sensory and motor areas in a context-dependent manner, counter to the core idea of a centralized module for implementing flexibility.
Second, these areas also reflect signals that do not directly pertain to their roles in sensory and motor processing (69,70,102) (Figure 3b,d). For example, encoding of the decision variable in an oculomotor area is modulated by what sensory features are distinguished (69), or a premotor area reflects sample stimuli during a match-to-sample task (102). These findings indicate that motor-planning areas do not merely receive abstract decision variables from an amodal central module capable of flexible decision making. Rather, they could be part of the network that flexibly transforms sensory signals into action plans (143,145,146) (Figure 5b).
In this distributed network, there is no clear parcellation of brain regions into those responsible for flexible control and others (145,147). However, there are gradients across brain structures, where some areas have more capacity to maintain task rules, flexibly adjust computations, and exert influence over other regions. Therefore, higher-order sensory and motor-planning cortical regions and subcortical areas that have access to both sensory and motor information could play significant roles in flexible behavior, and even early sensory areas can reflect context-dependent behavior, albeit to a lesser extent. The PFC lies at the top of this distributed network as the most flexible circuitry but does not play a unique role as a control area (147).
Multiple brain structures could be responsible for different forms of flexibility in the distributed network (Section 3), but the extent of their involvement varies depending on the task. This perspective might reconcile diverse theories of perceptual decision making such as those advocating active inference by sensory neurons (115,148) or the intentional framework that emphasizes selection at motor stages (88,90). Distributed architecture is also sensible from an evolutionary perspective (145) as the development of the PFC is phylogenetically recent (149), whereas flexibility in behavior is a common requirement for many organisms (150).

How Should We Study Distributed Networks?
How can we study computations distributed in a brain-wide network? To study such processes, it would be more fruitful to focus on common computational principles across the network rather than trying to pinpoint brain regions responsible for individual operations.
Studies that aim to infer the role of individual areas in the network through causal manipulations face strong challenges (151,152). If there were a central hub for flexible decision making, inactivating this hub region would disrupt all forms of flexible behavior. But when multiple brain regions interact in a distributed network to make decisions, perturbing a single region may yield a wide range of results, including no effect, transient behavioral deficits, and lasting deficits (143,152,153). Indeed, recent studies have found that inactivating saccade-selective LIP neurons has limited effects on perceptual decision making (154,155). More precisely, choice biases appear transiently after inactivation and disappear within a session (156). There are also mixed results on the inactivation effects of rodent PPC on perceptual decision making (157-159). The inactivation of a more peripheral structure, the superior colliculus, shows stronger effects (160). It is possible that more peripheral brain regions tend to be bottlenecks while higher areas form more robust networks. Therefore, the strength of perturbation effects alone is inadequate to determine the extent of involvement of the perturbed brain region in the studied behavior. More meaningful conclusions could be made by multipronged studies that combine subtle perturbations with electrophysiological recordings from other nodes of the network during an array of tasks, each carefully tailored to investigate an aspect of flexible behavior.
Understanding a distributed network is greatly aided by adopting population-level analyses within and across the network. In distributed networks, even neurons in modality-specific areas may show mixed selectivity for sensory, motor, and contextual information (161-163). These complex representations would be difficult to probe through conventional analysis methods that do not appreciate the diversity of neural responses (e.g., averaging activity across the population) (69,70). Emerging analysis techniques such as targeted or unsupervised dimensionality reduction (164,165) and geometric data visualization (166,167) may provide better insights into the representations afforded by each brain area (69). Furthermore, understanding the distributed network would be impossible without discovering the principles of communications across brain areas. Using the language of population neural activity and communication subspace offers a practical path forward (168).
Finally, we would like to emphasize the importance of recognizing the properties and requirements of different task designs (Figure 2). Besides identifying brain regions responsible for flexible behavior in a specific task design, it would become important to examine what forms of flexibility are related to each brain region and what forms are not. Furthermore, specific details of behavioral tasks, such as how animals are trained, how often tasks switch, and how task contexts are cued, can be critical for interpreting behavioral and neural data. Advances in the field rely not only on novel experimental techniques but also on better understanding and design of behavioral tasks (1,169).

IMPLEMENTING FLEXIBILITY IN NETWORK MODELS
In this last section, we briefly summarize circuit motifs used in existing computational models of perceptual decision making and flexible control. As mentioned in Section 2, existing circuit models formulate decision making as competition across neural modules encoding different actions. These models readily account for some flexible aspects of decision policy (Figure 2b). For example, tweaking the strength of self-excitation of action-selective modules alters the speed-accuracy trade-off of decisions (20,170) (Figure 6a). However, they do not address flexible adjustment in stimulus-action mapping.
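The effect of self-excitation on decision speed can be explored with a toy simulation of two competing action-selective units. This is a linearized sketch with no firing-rate rectification, not the attractor models of References 20 and 170; the unit dynamics, the rule that a choice is made when one unit leads the other by a fixed margin, and all parameter values are assumptions chosen for illustration.

```python
import random

def compete(w_self, w_inh=1.0, inputs=(1.1, 0.9), margin=1.0,
            noise_sd=0.5, dt=0.01, max_steps=10000, rng=None):
    """Simulate two mutually inhibiting units driven by noisy inputs.

    w_self: strength of self-excitation of each unit.
    w_inh:  strength of inhibition each unit receives from the other.
    A choice is made when one unit leads the other by `margin` (assumption).
    Returns (choice, decision_time); choice is 1, 2, or 0 if undecided.
    """
    rng = rng or random.Random()
    x1 = x2 = 0.0
    noise_scale = noise_sd * dt ** 0.5
    for step in range(1, max_steps + 1):
        # Leak, self-excitation, cross-inhibition, and input for each unit
        dx1 = (-x1 + w_self * x1 - w_inh * x2 + inputs[0]) * dt
        dx2 = (-x2 + w_self * x2 - w_inh * x1 + inputs[1]) * dt
        x1 += dx1 + noise_scale * rng.gauss(0.0, 1.0)
        x2 += dx2 + noise_scale * rng.gauss(0.0, 1.0)
        if abs(x1 - x2) >= margin:
            return (1 if x1 > x2 else 2), step * dt
    return 0, max_steps * dt

def summarize(w_self, n_trials=500, seed=1):
    """Return (accuracy, mean decision time); unit 1 receives the stronger input."""
    rng = random.Random(seed)
    results = [compete(w_self, rng=rng) for _ in range(n_trials)]
    accuracy = sum(choice == 1 for choice, _ in results) / n_trials
    mean_time = sum(time for _, time in results) / n_trials
    return accuracy, mean_time
```

Comparing `summarize(0.0)` with `summarize(0.8)` shows the qualitative pattern: in this sketch, stronger self-excitation amplifies early differences between the units and shortens decision times, which is the kind of parameter change that trades accuracy for speed.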
Traditionally, flexible sensory-action mapping is addressed by models of cognitive control (171-175). These models often include control modules that influence sensory and motor modules (Figure 6b). Different control modules are responsible for different tasks. Through external contextual signals and mutual inhibition, one control module becomes active, gating information flow from sensory to motor modules. The gating could be implemented through a range of mechanisms (171,176-178) such as a thresholding operation that triggers outputs only when inputs from both the contextual and sensory signals are present (171). Also, models could be designed to learn novel sensory-action associations through rapid synaptic plasticity (173,174). Overall, these models are powerful but usually operate within the narrow scope they are designed for and depend on careful handcrafting by model designers to properly associate task inputs with outputs.
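The thresholding form of gating can be written down in a few lines. The sketch below is a hand-built illustration rather than any published model: the winner-take-all rule stands in for mutual inhibition between control modules, sensory evidence is assumed to be a signed value between -1 and 1, and the function names and threshold are hypothetical.

```python
def active_control_module(cue_color, cue_motion):
    """Winner-take-all stand-in for mutual inhibition between two control modules."""
    return "color" if cue_color >= cue_motion else "motion"

def gate(sensory, control, threshold=1.0):
    """Pass the signed sensory signal only when sensory and control drive are both present.

    With |sensory| <= 1 and control in {0, 1}, the summed drive exceeds the
    threshold only if the control module is active and the sensory input is nonzero.
    """
    return sensory if abs(sensory) + control > threshold else 0.0

def choose_action(color_evidence, motion_evidence, cue_color, cue_motion):
    """Route the task-relevant evidence to the action stage; returns 'A1', 'A2', or None."""
    module = active_control_module(cue_color, cue_motion)
    drive = (gate(color_evidence, 1.0 if module == "color" else 0.0)
             + gate(motion_evidence, 1.0 if module == "motion" else 0.0))
    if drive == 0.0:
        return None  # no gated evidence reached the action stage
    return "A1" if drive > 0 else "A2"
```

With identical sensory inputs, `choose_action(0.8, -0.6, 1.0, 0.0)` and `choose_action(0.8, -0.6, 0.0, 1.0)` select different actions because the contextual cue changes which feature passes the gate. The need to wire each cue to its feature by hand illustrates the handcrafting noted above.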
Recent developments in artificial neural networks have introduced an alternative route to engineer circuit models for cognition (36,121). Without handcrafting the detailed circuit architecture and computations, a fully connected recurrent neural network (RNN) could be trained to solve diverse tasks similar to those used in human or animal studies as long as proper objective functions and learning rules are implemented (36,121,179-181) (Figure 6c). For example, to switch between color and motion discrimination (Figure 2d, subpanel i), an RNN that receives color, motion, and contextual signals can be trained to generate correct binary choices for each task (36). Using a similar approach, Yang et al. (121) successfully trained an RNN to perform 20 perceptual tasks, including context-dependent decisions. The computations happening within RNNs could be partly inferred through analyzing their neural response dynamics (182,183). The geometry of population activity provides another fruitful method for understanding RNNs and their connection with neural responses recorded from the brain (36,180,181), with the caveat that similar geometric responses may also arise from different network architectures (184).
RNNs appear to be particularly well suited for describing centralized mechanisms (Figure 5a). Several architectural properties are worth discussing. First, many existing RNNs assume a uniform network architecture composed of randomly connected units with similar physiological properties and thus lack the concept of neuron type diversity and hierarchical structures. Second, sensory inputs and motor outputs of RNNs are highly stylized. For example, the context-dependent RNN (Figure 6c) has to deal with only one-dimensional inputs of color and motion information and generate a binary output. This already circumvents the challenge of decoding meaningful information from high-dimensional sensory representations (185) and generating dynamic motor control signals inextricably tied to decision formation in the real brain (186). Third, many RNNs receive contextual signals, but such signals presume knowledge about the number and type of contexts. Such knowledge is unlikely to be readily available to the real brain operating in complex environments. Finally, typical RNNs are unencumbered with biophysical constraints of the real brain that might affect population response properties. For example, one-dimensional signals may be better encoded on a curved manifold than on a linear manifold in a network if the network needs to reduce the number of spikes due to metabolic costs (69,187).
Incorporating the distributed neural architecture of the brain in artificial neural networks could lead to a better understanding of brain-like neural computations (42,143,188,189). Such models would have a hierarchical architecture comprising multiple interconnected neural modules (Figure 5b). For example, Pinto et al. (143) modeled RNNs composed of input and output modules to account for task-dependent effects of perturbation they observed experimentally (Figure 6d). Multimodule circuits also explain immunity to the perturbation of individual circuits (152,153). Furthermore, modeling a large-scale cortical network accounts for the emergence of a hierarchical structure, where higher areas have longer timescales and stronger control over network connectivity (190). We expect that further expansion of theories that embrace the distributed brain architecture will provide more insights into interpreting diverse experimental observations highlighted in this review.

CONCLUDING REMARKS
Advances in theoretical and experimental studies of perceptual decision making have brought an understanding of cognitive flexibility within our grasp. In particular, existing behavioral and neural models can successfully account for various forms of flexible adjustments in decision policy (Figure 2b,c). However, it is much less understood how stimulus-action mapping (Figure 2d) and task solutions (Figure 2e) are implemented in a context-dependent manner. Empirical results suggest that the same neurons that encode decision formation often show activity reflecting changes in these task rules (Figure 3). This implies a potential link between evidence accumulation and flexible control mechanisms. We hope to see future efforts to integrate context-dependent computations into decision-making models.
A notable neurophysiological observation is the distributed nature of context-dependent neural responses (Figure 3). Neural activity reflecting maintenance, change, or implementation of task rules abounds outside the PFC, including the PPC, sensory, motor, and subcortical brain areas. Although mixed selectivity for diverse behavioral variables is particularly highlighted in recent literature on rodent models (161-163), past primate studies have also repeatedly reported context-dependent signals across brain regions, as summarized above. These observations invite us to focus on computational principles governing the whole network rather than testing roles for individual brain regions. Apart from perceptual decision making, context-dependent computations are essential in sensory, motor, memory, and many other brain functions. We expect that advances on this topic would open a window for understanding cognition and intelligence in general (4).

Figure 1
Figure 1 Perceptual decision making in a stationary context. (a) Design of a typical visual discrimination task. Subjects report the net direction of random dot motion (L or R) by making a saccadic eye movement. The percentage of coherently moving dots varies across trials. Neural recording is typically made from neurons whose RF (gray) overlaps with one of the targets. (b) A bounded accumulation model accounts for the behavior. The model accumulates noisy sensory evidence to form a DV and commits to a choice when the DV reaches a bound. (c) Neural activity similar to the DV can be found in diverse brain regions involved in oculomotor control. For example, neurons in the LIP increase their firing rates when the stimulus supports a saccade to the target in their RFs. Right panel adapted with permission from Reference 8; copyright 2002 Society for Neuroscience. (d, left) Circuit models. In bistable attractor dynamics models (19,20), two pools of neurons, each supporting one of the choices, compete until one pool dominates. (d, right) In probabilistic population codes, networks that integrate neural activity can perform optimal evidence accumulation. Right panel adapted with permission from Reference 21; copyright 2008 Elsevier. Abbreviations: dlPFC, dorsolateral prefrontal cortex; DV, decision variable; FEF, frontal eye field; L, left; LIP, lateral intraparietal area; MT, middle temporal area; R, right; RF, response field; Tin, target in neuron RF; Tout, target outside neuron RF.

Figure 3
Figure 3 Diverse brain regions are implicated in context-dependent decision making. Dark red dots on the brain image indicate the site recorded in the study highlighted in each panel. We depict a monkey brain for illustrative purposes, but several results are from rodents. (a) Medial intraparietal neurons encode the decision variable with different strengths when monkeys report their decisions through reaching or eye movements. Panel adapted with permission from Reference 85; copyright 2015 Society for Neuroscience. (b) Lateral intraparietal neurons encode the decision variable for their preferred saccade targets along curved population response manifolds that are distinct for motion and face discrimination tasks. Panel adapted with permission from Reference 69; copyright 2021 Elsevier. (c) Monkey V1 neurons show distinct patterns of noise correlations depending on the category boundary in an orientation discrimination task. Panel adapted with permission from Reference 106; copyright 2018 Springer Nature. (d) When mice report if two sequentially presented odors (sample and test stimuli) match, premotor neurons encode sample odor during the delay period before any motor plan can be made. Panel adapted with permission from Reference 102; copyright 2020 Elsevier. (e) A1 neurons encode task rules before stimulus presentation when rats report either the location or pitch of the same auditory stimulus. Panel adapted with permission from Reference 75; copyright 2014 Elsevier. (f) Superior colliculus neurons encode task rules before stimulus presentation when rats switch between pro- (orienting toward a stimulus) and anti- (orienting away) stimulus-action associations. Panel adapted with permission from Reference 97; copyright 2021 Springer Nature. Abbreviations: Stim, stimulus; Tin, target in neuron response field; Tout, target outside neuron response field.

Figure 4
Figure 4 Mechanisms of deciding to switch task rules proposed by Purcell & Kiani (137). (a) Subjects report motion direction using either the upper or lower pair of direction targets depending on the context. (b) The context is not cued and must be inferred from feedback. Subjects switch context after errors (open circles; color indicates motion coherence) depending on the history of feedback and motion coherence on the previous trials (top). Behavior could be modeled as the accumulation of switch evidence to a bound (bottom). Switch evidence is formed by combining feedback and the certainty of the previous trial. Adapted with permission from Reference 137.

Figure 6
Figure 6 Example circuit motifs proposed to model decision making and its flexible adjustments. S1 and S2 are inputs from sensory neurons selective to the discriminated stimuli (e.g., left and right motion directions). A1 and A2 are drives for the two actions (e.g., leftward and rightward saccade). (a) When a decision is made through competition of choice-selective neural modules, changes in self-excitation (red plus signs) alter speed-accuracy trade-off. (b) To achieve flexible stimulus-action mapping, a control module could switch the routing of sensory information depending on contextual signals. (c) More recent models implement similar computations by training recurrent neural networks. (d) Embracing the distributed nature of neural processing and interactions in the actual brain can yield mechanistic models that better explain the neural responses. Multimodule recurrent neural networks are a fruitful step in that direction.