Getting closer to the goal by being less capable

A decentralized system of heterogeneous components is more efficient when they are less sophisticated.


INTRODUCTION
Tragic events, such as the killing of a pedestrian by a driverless Uber in Tempe, Arizona on 18 March 2018, can shake public and political enthusiasm for driverless vehicles and planes (1). The need for accurate, safe, and efficient navigation is the key hurdle for these systems (2–11). Centralization has many advantages. Examples include aggregating the local information processing of a system's N ≫ 1 components into a centralized unit such as a brain or a central processing unit, aggregating the local sensing of N ≫ 1 autonomous sensors into a centralized detector such as an eye, and aggregating the local action of N ≫ 1 autonomous actuators into a centralized machine such as a leg. For instance, a human relies on a centralized processor (brain), sensors (eyes), and actuators (legs) to walk in a straight line from A to B.
However, centralization has the disadvantage of making an entity vulnerable to damage or targeted attack: for example, an external hack that shuts down a cyberphysical system, a brain disease that leaves someone unable to move properly, eye loss that leaves them blind, the inevitable crash of a conventional car or plane following loss of its pilots or drivers, or the crippling of major areas of U.S. socioeconomic life when there is a federal shutdown. Centralized structures can also be disadvantageous in terms of the cost (e.g., energy or money) needed to maintain them, the need for localized heat dissipation, or the substantial communications bandwidth required to continually transport information into and out of the central unit (12). Delays in this information transfer can even produce dangerous system-level behavior (5,9).

DECENTRALIZED NAVIGATIONAL MODEL
We consider a generic decentralized entity (Fig. 1A, bottom) inspired by the Drosophila larva (12–26) (Fig. 1A, top), which is able to execute exploratory routines (20) as a result of the aggregate actions of abdominal and thoracic network segments that balance the tasks of runs and turns, acting like autonomous agents (i.e., klinotaxis) (21–28). Within the framework of biologically inspired algorithms, our approach lies within the category of foraging algorithms originally inspired by the chemotactic behavior (i.e., odor-driven foraging associated with food, habitat, and survival) exhibited by decentralized bacterial colonies such as Escherichia coli (29). In an engineering context of navigational algorithms, our model follows the principle of proportional navigation, which aims at preserving and maximizing vehicle-target alignment (30). Our goal is not to provide an anatomically or physiologically precise description but rather to study the dynamics of a minimal, generic abstraction. The entity comprises an arbitrary number N of individual components or members, which we refer to as agents, since each has a limited ability to receive information from the outside (i.e., sensor), to store and process this information, and to act on it (i.e., actuator) by trying to push the system to the left or right.
Each agent's capability is characterized by the integer m, since each agent can store the history of the past m outcomes for the entity, with each outcome being 0 or 1. The ability for even simple materials to have such a memory is well known, e.g., shape memory materials (31,32). Following the aggregate action of all the N agents (Fig. 2), the outcome is 1 if the entity becomes more aligned with the target and 0 if it becomes less aligned. There are hence 2^m possible histories of the past m outcomes, e.g., for m = 2, these are 00, 01, 10, and 11 (33,34). Each agent stores the same history at any given time step (e.g., 01) but has two possible actions that it can then take: to try to rotate the system an angle δ clockwise (action −1) or counterclockwise (action +1). Hence, there is a look-up table (Fig. 1B) in which each row is effectively an information-processing algorithm (referred to as a "strategy") for predicting the next best action given one of the 2^m possible history inputs, meaning that there are 2^(2^m) possible strategies. Each agent is initially assigned s strategies randomly from the space of 2^(2^m) possible strategies, thereby introducing a heterogeneity in agents' actions for the same history bit-string, which is again a known feature of smart materials. By allowing each agent to use the best performing among its s > 1 strategies at any given time step, and scoring all strategies based on their previous predicted actions, each agent has some adaptability. The agents' heterogeneity is also consistent with Drosophila larvae in that different individual segments contract or extend on either side to adjust the orientation (Fig. 1A) and thus generate motion to the left or right and also, in a more abstract sense, with the different pieces of the neuronal network. Different neurons in Drosophila can be functionally different (13–15,20).
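The look-up table and strategy selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the dictionary layout, the +1/−1 scoring rule, and the tie-breaking (the first best-scoring strategy is used) are all assumptions made for the example.

```python
import random

def history_index(history):
    """Map a tuple of m outcome bits (oldest first) to an integer in [0, 2**m)."""
    index = 0
    for bit in history:
        index = (index << 1) | bit
    return index

def random_strategy(m, rng):
    """A strategy lists one action (-1 or +1) for each of the 2**m histories."""
    return [rng.choice((-1, 1)) for _ in range(2 ** m)]

def make_agents(n_agents, m, s, rng):
    """Give every agent s random strategies, each with a running score of 0."""
    return [
        {"strategies": [random_strategy(m, rng) for _ in range(s)],
         "scores": [0] * s}
        for _ in range(n_agents)
    ]

def choose_action(agent, history):
    """Play the action of the best-scoring strategy (assumed tie-break: first)."""
    scores = agent["scores"]
    best = max(range(len(scores)), key=scores.__getitem__)
    return agent["strategies"][best][history_index(history)]

def update_scores(agent, history, winning_action):
    """Assumed scoring rule: +1 if a strategy predicted the winner, else -1."""
    h = history_index(history)
    for k, strategy in enumerate(agent["strategies"]):
        agent["scores"][k] += 1 if strategy[h] == winning_action else -1

m, s = 2, 2
print(2 ** m, 2 ** (2 ** m))  # 4 histories and 16 possible strategies for m = 2
```

Because every strategy is scored at every step, including the ones not played, an agent can switch to a better strategy without any central coordination.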
At each time step, the system rotates under the aggregate action of the N agents and moves a straight distance ℓ in the new direction. This space need not be geographical: It could be an abstract space spanned by performance variables in the more abstract setting of an entity of N individual humans in an organization or company with no top-down control, aiming for some particular collective target. Denoting the number of agents taking action ±1 at time step t as n_{±1}[t], the entity's overall rotation is then (n_{+1}[t] − n_{−1}[t])δ. If all agents take the same action, then for δ = π/N the system rotates 180°. The winning action is the one that improves the system-target alignment (see Fig. 2). Figure 1B demonstrates the similarity between the trajectories generated by our generic model (bottom) and those of living Drosophila larvae in our experiments (top). The larvae naturally seek out temperatures that maximize their growth rate (≈ 24°C) (23). They crawl atop a 22 cm by 22 cm agar substrate while experiencing a one-dimensional thermal gradient ranging from 17.5° to 22°C (i.e., positive thermotaxis) (16,17,26). The trajectories are captured by high-pixel density video cameras taking 15 frames per second. By choosing the target to be far away, at (0, 10^4 ℓ), the model mimics the one-dimensional gradient of the experiment.
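As a rough illustration of the motion update, the sketch below rotates the heading by the net vote times the rotation angle and then advances one step of length ℓ. The function name `step`, its argument order, and the convention that counterclockwise rotation increases the heading angle are assumptions made for this example.

```python
import math

def step(x, y, theta, actions, delta, ell=1.0):
    """Rotate by (n_plus - n_minus) * delta, then move a distance ell."""
    n_plus = sum(1 for a in actions if a == +1)    # counterclockwise votes
    n_minus = sum(1 for a in actions if a == -1)   # clockwise votes
    theta_new = theta + (n_plus - n_minus) * delta
    x_new = x + ell * math.cos(theta_new)
    y_new = y + ell * math.sin(theta_new)
    return x_new, y_new, theta_new

# With N = 4 agents and delta = pi / N, a unanimous vote turns the system by 180 degrees.
print(step(0.0, 0.0, 0.0, [+1, +1, +1, +1], math.pi / 4))
```

A split vote (equal numbers of +1 and −1) leaves the heading unchanged, which is why an odd N guarantees a nonzero rotation at every step.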
To demonstrate the robustness of our main findings to changes in the model, Figs. 2 to 4 present results using two distinct implementations. Figure 2 defines the physical quantities (Fig. 2A) that determine the motion updates (Fig. 2B) and hence the distance d from the target after a given time (Fig. 2C). At time t, the position is (x_t, y_t) in the plane. The target is fixed at (x_T, y_T). The system moves with a velocity vector v_t, which makes an angle θ_t with the horizontal. The vector T_t, pointing from the system to the target, determines the ideal trajectory. Hence, the winning action is the one that better aligns the vectors v and T. Our first implementation (model version 1) compares the alignment of these vectors before and after the rotation at the same position in space. Let Ω_{t−1} be the angle between v_{t−1} and T_{t−1} at time t − 1. Once each of the agents independently decides on an action, the aggregate of these actions rotates the system. Before it moves in the new direction, the new velocity vector makes an angle Ω′ with the target vector. The difference between these two angles, ΔΩ = Ω′ − Ω_{t−1}, is used to discern whether a specific action is a winning one. Both angles are always taken as positive so that the ideal action would make Ω′ smaller than Ω_{t−1} (i.e., better alignment after rotation). If ΔΩ > 0, then the action that made Ω′ larger is the losing action. If ΔΩ < 0, then the action that made Ω′ smaller is the winning action. If ΔΩ = 0 (i.e., direction unchanged) and Ω_{t−1} ≠ 0 (and thus Ω′ ≠ 0), then the action that would have made Ω′ smaller is the winning action. Last, if ΔΩ = 0 and Ω_{t−1} = Ω′ = 0 (i.e., already aligned), then both actions win. The second implementation (model version 2) looks at the instantaneous alignment of v and T at the current point in space and time. This is attained through the bearing angle β_t, which opens from T [...]
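The decision rule of the first implementation can be written out as a small function. This is a sketch under stated assumptions: the names `signed_offset` and `winning_actions` are hypothetical, "the action that made the angle smaller or larger" is read as the direction of the net rotation, and a numerical tolerance `tol` stands in for exact equality of angles.

```python
import math

def signed_offset(theta, position, target):
    """Signed angle from heading theta to the target vector (counterclockwise > 0)."""
    tx, ty = target[0] - position[0], target[1] - position[1]
    diff = math.atan2(ty, tx) - theta
    return math.atan2(math.sin(diff), math.cos(diff))  # wrapped to [-pi, pi]

def winning_actions(theta_old, net_rotation, position, target, tol=1e-12):
    """Return the set of winning actions (+1 counterclockwise, -1 clockwise)."""
    before = signed_offset(theta_old, position, target)
    after = signed_offset(theta_old + net_rotation, position, target)
    omega_old, omega_new = abs(before), abs(after)  # both angles taken positive
    d_omega = omega_new - omega_old
    if abs(d_omega) <= tol:
        if omega_old <= tol:
            return {-1, +1}  # already aligned: both actions win
        # Unchanged but misaligned: the action that would reduce the angle wins.
        return {+1} if before > 0 else {-1}
    turned = +1 if net_rotation > 0 else -1  # direction the majority pushed
    # Alignment improved: the majority action wins; otherwise the opposite one.
    return {turned} if d_omega < 0 else {-turned}

# Heading east with the target due north: counterclockwise (+1) wins either way.
print(winning_actions(0.0, 0.1, (0.0, 0.0), (0.0, 10.0)))
print(winning_actions(0.0, -0.1, (0.0, 0.0), (0.0, 10.0)))
```

Returning a set keeps the "both actions win" case explicit, so a caller can score strategies with a simple membership test.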
Bottom: Schematic of our multiagent, decentralized model that determines the system's direction of motion toward a desired target. Heterogeneity of the agents mimics heterogeneity in regions of the larva body wall that contribute to crawling, heterogeneity of member components (e.g., paddlers) in some locomotive device (e.g., canoe), or, more abstractly, heterogeneity of members of some human group or organization lacking top-down control. Each agent acts as a limited sensor that captures the current bit of common information; as a limited information processor that stores the most recent m bits of common information (i.e., history) in memory and decides which action (±1) to take based on its best-performing algorithm (strategy) among the s that it has; and as a limited actuator that tries at each time step to rotate the system an angle δ clockwise (action −1) or counterclockwise (action +1). Which action it takes depends on the current history at that time step (i.e., the common bit-string of length m) and its s algorithms (i.e., strategies), which form a look-up table for the action. The s strategies are assigned randomly to each agent at the beginning, hence introducing intrinsic agent heterogeneity.

RESULTS
Efficiency
Figure 2C shows that the entity is remarkably efficient in reaching its target when its individual components are neither too capable nor too incapable, i.e., m has an intermediate value. It shows the ratio of the final system-target separation d to the initial separation d_0 (see inset), as a function of each agent's capability m for model version 1 (black symbols) and model version 2 (blue symbols). We set the total simulation time equal to n time steps with n = d_0/ℓ, so the system would reach the target at t = n (i.e., d/d_0 = 0) if the alignment along the path was always perfect. For both model variants, there is a maximum in efficiency (i.e., minimum in d/d_0), which is highly robust to changes in the distance of the target, to damage to the N components, and to their substitution or loss. This finding of a switch between increasing and decreasing efficiency with increasing m offers an answer to the question of when and why evolution in natural systems switched between decentralized (e.g., larvae) and centralized (e.g., higher species) designs. By contrast, if the entity was executing a purely random walk, then the value of d/d_0 would be of order 1 for all m and hence off the vertical scale.
The counterintuitive conclusion from Fig. 2 that increasing the capability m of the individual components (agents) eventually decreases the overall efficiency can be explained and described mathematically by considering the correlations in the strategy pool via an extension of the crowd-anticrowd theory (33–36). If m is large, then the number of possible strategies, 2^(2^m), is large, and hence the probability that any two agents hold and use the same strategy at any given time is small. Hence, the agents will tend to act independently. This means that, as m increases, the trajectory of the entity becomes increasingly like a random walk, and hence the ability to reach a given target decreases, i.e., the deviation d after a given number of time steps n will increase as m → ∞. By contrast, when m is small, the number of possible strategies is small, and hence the probability that any two agents hold and use the same strategy at any given time is large. The agents are now strongly correlated and so effectively act as a crowd. Just as in a canoe, if all the occupants suddenly paddle on the same side, then the entity will turn through large angles at each time step. These large fluctuations in angle (Fig. 3) make the entity waste time steps by successively overcorrecting previous overcorrections (33–36). Hence, as m decreases, the entity's ability to reach a given target decreases, i.e., the deviation d after a given number of time steps n will increase as m → 0. Figure 3 demonstrates the accuracy of our analytic crowd-anticrowd theory in quantifying the fluctuations in the orientation angle at each time step as a function of each agent's capability m.
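The counting behind this argument is easy to check numerically. The sketch below assumes that strategies are drawn uniformly and independently from the full strategy space, so the chance that two independently drawn strategies coincide is one over the size of that space; the function names are hypothetical.

```python
def strategy_space_size(m):
    """Number of distinct strategies for memory m: 2**(2**m)."""
    return 2 ** (2 ** m)

def same_strategy_probability(m):
    """Chance that two independent, uniformly drawn strategies coincide."""
    return 1 / strategy_space_size(m)

for m in range(1, 6):
    print(m, strategy_space_size(m), same_strategy_probability(m))
```

The strategy space grows doubly exponentially, from 4 strategies at m = 1 to 2^32 at m = 5, so crowding is strong only for small m and agents become effectively uncorrelated as m grows.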

Theoretical approach
We have derived an accurate mathematical formula for d/d_0 (pink line in Fig. 2C) by generalizing the crowd-anticrowd theory (35,36). The formula holds up to second order in the angle covariance and fourth order in the angle fluctuations and involves the covariance sum C = Σ_{j=1}^{n−1} cov[{θ_i}, {θ_{i+j}}], which can be calculated from simulation data but is well approximated by an algebraic form proportional to m^α (see the Supplementary Materials). For the limiting case of n → ∞ (i.e., the target is far away), the accuracy ratio reduces to d/d_0 ≈ σ_θ^2/√12, shown as the green curve in Fig. 2C. The angular fluctuations over time, σ_θ, are given by Eq. 2 (see the Supplementary Materials), which depends on the fluctuations at each time step, σ_CA^2; the formula for σ_CA^2 is derived in the Supplementary Materials, and its accuracy is shown explicitly in Fig. 3. For large m, the full form of σ_θ in Eq. 2 (Fig. 3, orange curve) shows good agreement. For small to intermediate m, the analytic formula agrees well with the simulation data and captures the optimal point at m = 5. This allows us to provide an approximate formula (Eq. 3) for the optimum capability m_o of a system's components or members such that the entity will come as close as possible to its desired target; the desired m_o is the solution to this equation. This estimate is general for binary agent resource models and can be used as a first approximation of the optimum capability for specific implementations. While this uses the lower-bound analytic expression in Fig. 3, the corresponding result for the upper bound differs trivially by an extra factor of √2 (see the Supplementary Materials).

Noise effects
Figure 4A shows that our main conclusions, and in particular the appearance of an optimal agent capability m_o, are remarkably robust to noise. This noise is introduced by a probability q that the actual outcome 0 or 1 gets corrupted before it is fed back to the agents. The depth of the d/d_0 minimum, and hence the strength of the efficiency maximum, increases as the noise level q increases, while the value of the optimal capability m_o is insensitive to q, suggesting that Eq. 3 is universal. This effect highlights the greater capacity for error correction of a system that effectively balances crowd and anticrowd behavior (i.e., m = m_o). For smaller agent capability (m < m_o), the sharp zigzag-like trajectory produced by overcorrection, where the crowd outnumbers the anticrowd, is made curvier and wider by the random information. This makes the system waste many more time steps to find a favorable alignment with the target (see the Supplementary Materials). Similarly, as m grows beyond m_o, the randomness tends to destroy the small crowds formed due to the large strategy space, and the decision-making operation more quickly approaches that of a random process. Having agents receive the information of the winning group incorrectly with probability q at any given time step mimics disturbances in signal reception or processing, or a defective sensor, whether biological or synthetic.

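The noise mechanism can be sketched as a bit flip applied to the outcome before it enters the shared history. Reading "corrupted" as "flipped" is an assumption, as are the function names and the choice to store the history as a tuple with the oldest bit first.

```python
import random

def corrupt(outcome, q, rng):
    """Flip the binary outcome with probability q before agents see it."""
    return 1 - outcome if rng.random() < q else outcome

def update_history(history, outcome, q, rng):
    """Append the (possibly corrupted) outcome and drop the oldest bit."""
    return history[1:] + (corrupt(outcome, q, rng),)

rng = random.Random(42)
history = (0, 1, 1)
history = update_history(history, 1, 0.1, rng)
print(history)
```

Under this reading, q = 0 recovers the noiseless model and q = 0.5 makes the fed-back bit statistically independent of the true outcome.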
Comparison with larval klinotaxis
While Fig. 1B shows visual commonality between Drosophila larvae trajectories and those of our model, we now make this comparison more quantitative. First, we consider the dependence of the turning rate φ (i.e., φ ≡ dθ/dt) on the bearing angle β (Fig. 2A). Empirically observed φ versus β measurements for chemotaxis dynamics (e.g., odor-driven) in Drosophila larvae and the nematode Caenorhabditis elegans display a sinusoidal dependence (18,25). This finding supports the hypothesis that a proportional navigation method (30), which tends to keep the line-of-sight angle fixed along the path, is likely used by these simple organisms when performing exploratory routines (19). As shown in Fig. 4B, our model version 1 captures the empirical sinusoidal dependence and magnitude well (25). Although we use m = 5 and q = 0.5, a sinusoidal pattern also arises for many other parameter choices [see the Supplementary Materials and (37)]. The noise level q tends to control the amplitude of the pattern, with smaller amplitudes associated with higher levels of randomness. This effect is consistent with observations in larvae, which show that they make turning decisions on a stimulus gradient against the optimal conditions about 40 to 45% of the time (16,26). The organisms do not immediately turn around when heading toward bad conditions; they eventually turn away from improving conditions. This is actually beneficial in a realistic ecological context, because having randomness on top of purposeful motion is important for avoiding "traps" and/or finding improved conditions that could be on the other side of bad conditions (16,26). This empirically testable prediction of our model may help address open questions regarding how animals adapt to or exploit randomness.
Second, we look at the curvature κ at equally spaced time steps for 200 individual trajectories of duration 10^4 steps, for different model parameters, and for the Drosophila larvae undergoing positive thermotaxis. Figure 4C shows the resulting distributions, where the points in the experimental path are taken to be 5 s apart, while for the model, they are at 10-time step intervals. Calculations for single-step spacing yield similar conclusions (see the Supplementary Materials). As shown, the peak of the distribution tends to shift to smaller values when the memory grows (Fig. 4C, left panel), since the trajectories tend to present more turns, as shown in Fig. 1B and fig. S1. We found a favorable comparison between the distributions of curvature values of the organism and our model for some specific parameters. The right panel of Fig. 4C shows an example for m = 3 and randomness of q = 0.45 that is in agreement with the observations of the organism's behavior. To further analyze the relationship between the distributions, we carry out a Kolmogorov-Smirnov (KS) goodness-of-fit test. The comparison shown in Fig. 4C yields an average P value of 0.3621. Overall, we found that a combination of noise values between 0.45 and 0.5 and agent capabilities of 3 and 4 provides consistently high P values (see the Supplementary Materials). This is surprising given the high degree of randomness in both the model and the organism. By contrast, in a comparison with a null model where agents flip a coin to decide between actions ±1, the test gives a P value of 10^−13. Additional test results are presented in the Supplementary Materials.
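The text does not state how curvature is computed from the sampled points. One common discrete estimate, used here purely as an assumption, is the Menger curvature: the reciprocal radius of the circle through three consecutive sampled points. The function names and the `spacing` parameter (10 for the model's 10-time step intervals) are hypothetical.

```python
import math

def menger_curvature(p0, p1, p2):
    """Curvature of the circle through three points: 4 * area / (a * b * c)."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    cross = (x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0)  # twice the signed area
    a = math.dist(p0, p1)
    b = math.dist(p1, p2)
    c = math.dist(p0, p2)
    if a == 0 or b == 0 or c == 0:
        return 0.0  # degenerate triple (repeated point)
    return 2 * abs(cross) / (a * b * c)

def curvatures(points, spacing):
    """Curvature at every interior point of the path sampled every `spacing` steps."""
    sampled = points[::spacing]
    return [menger_curvature(*sampled[i:i + 3]) for i in range(len(sampled) - 2)]

# Three points on the unit circle give curvature 1 (up to rounding).
print(menger_curvature((1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)))
```

The resulting list of curvature values per trajectory is what would be pooled into a distribution before any goodness-of-fit comparison.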

DISCUSSION
We acknowledge the limitations of our quantitative analysis. Our goal is not a rigorous comparison, because we do not claim to propose a detailed model of the larva. Rather, our model is minimal and should be seen very much as a favorable prototype. Its value comes precisely from the fact that, despite its simplicity, it is able to produce nontrivial trajectories that are consistent with those performed by the organism. Individual trajectories, in both the model and the organism, are built from a large amount of randomness, and hence, we consider an aggregated statistic appropriate for comparing them. Our assumptions regarding the time and distance units (i.e., assuming that ℓ is comparable to millimeters and that time steps are comparable to seconds) are representative and used only as an illustrative tool. The main conclusions of this paper do not directly depend on these choices.
Last, we note that, because the trajectories of the decentralized entity are ultimately determined by the heterogeneity of the agents in our model, it is conceivable that our findings may shed some light on managing movement in individuals with significant limitations in coordination due to central nervous system injury or disease (38–41). Although entirely speculative at this stage, it is possible that, instead of trying to re-establish central control, one might be able to use the understanding of how the ecology of strategies held by the agents affects the range of movements to manage the ecology of disparate nerve and muscle elements in such a way that uncontrolled movements (i.e., d/d_0) are reduced, with the potential advantage that lower-capability nerve and muscle elements (i.e., intermediate m) might work better (38–41). Similarly, our results can guide the development of new-generation ingestible sensors to monitor and diagnose gastrointestinal health (42). Likewise, our results may help spark new ideas about the performance of autonomous vehicles having simpler components in a decentralized design, as in our larva model. This follows earlier suggestions that future system designs might usefully learn from nature's own evolutionary solutions [see, for example, (43) and (3)]. Organisms such as Drosophila larvae harmonize the tasks of movement and turning through the collective output of individual segments of the body acting as sensors and actuators, as mimicked by our model. Similarly, there is conceivably a real-world need for vehicles that can arrive "close enough" to some target, as in this paper, without needing very precise actions at each instant in time; hence, the vehicle could, following the results in this paper, be designed using simpler components than might otherwise be imagined.
Although the lack of accuracy would not be appropriate for regular vehicle traffic, it might be acceptable for remote vehicle terrain exploration in hostile environments, where the overriding feature is to avoid centralized control, e.g., on the bottom of a deep ocean where central control may not be possible or in a conflict setting where a central control might be hacked or destroyed (44).

Experimental design
Trajectories of the Drosophila larvae were captured with high-pixel density video cameras at a frame rate of 15 frames per second. The organisms were placed atop a 22 cm by 22 cm agar substrate while experiencing a one-dimensional thermal gradient ranging from 17.5° to 22°C.

Statistical analysis
The KS goodness-of-fit analysis that we used to obtain our results for the P values between the curvature distribution of the experiments and our model was carried out through the standard statistical package available in Wolfram Mathematica version 11.2. To compare the curvature distributions, we took 1000 random samples of sizes 500 and 1000 from each distribution and calculated the KS P value for each of them. The reported average P values are the average over all the tests of the individual random samples.
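The resampling procedure can be sketched in pure Python. This does not reproduce the Mathematica routine: it assumes a two-sample KS test, the standard asymptotic Kolmogorov distribution for the P value, and sampling without replacement. All function names are hypothetical.

```python
import math
import random

def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def kolmogorov_sf(lam, terms=100):
    """Asymptotic tail probability 2 * sum_j (-1)**(j-1) * exp(-2 j^2 lam^2)."""
    if lam < 0.2:
        return 1.0  # the series converges slowly here; the value is ~1
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

def ks_p_value(a, b):
    """Asymptotic two-sample KS P value."""
    n_eff = len(a) * len(b) / (len(a) + len(b))
    return kolmogorov_sf(math.sqrt(n_eff) * ks_statistic(a, b))

def average_p_value(dist_a, dist_b, sample_size, n_resamples, rng):
    """Average the KS P value over repeated random subsamples of both data sets."""
    total = 0.0
    for _ in range(n_resamples):
        sample_a = rng.sample(dist_a, sample_size)
        sample_b = rng.sample(dist_b, sample_size)
        total += ks_p_value(sample_a, sample_b)
    return total / n_resamples
```

With the settings described above, one would call `average_p_value` with `n_resamples=1000` and `sample_size` set to 500 or 1000, passing the two lists of curvature values.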

SUPPLEMENTARY MATERIALS
Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/2/eaau5902/DC1
Section S1. Trajectory model
Section S2. Trajectory model: External field
Section S3. Turning rate, noise, and curvature
Section S4. Theoretical approach
Section S5. Crowd-anticrowd theory
Fig. S1. Portion of trajectories in runs for different values of m, q, and δ for a target at (0, 10^4 ℓ), an initial location at (0, 0), and an initial direction θ = π/2.
Fig. S2. Complete trajectories for a close reaching target.
Fig. S6. Sum of covariance (C and C^2) calculated from simulation data as a function of memory m.
Fig. S7. Schematic representation of the matrix Y for m = 2 and s = 2 in the Reduced Strategy Space (RSS).
Fig. S8. Crowd-anticrowd theory against numerical simulations for the model organism as a function of the agent capability m, for N = 101 agents, s = 2, and δ = π/2N.
Table S1. Statistical similarity between the curvature distribution of the larva organism and our model 1 using the KS test.