Research Article | MATERIALS SCIENCE

Ionic decision-maker created as novel, solid-state devices


Science Advances  07 Sep 2018:
Vol. 4, no. 9, eaau2057
DOI: 10.1126/sciadv.aau2057

Abstract

Decision-making is being performed frequently in areas of computation to obtain better performance in a wide variety of current intelligent activities. In practical terms, this decision-making must adapt to dynamic changes in environmental conditions. However, because of limited computational resources, adaptive decision-making is generally difficult to achieve using conventional computers. The ionic decision-maker reported here, which uses electrochemical phenomena, has excellent dynamic adaptabilities, as demonstrated by its ability to solve multiarmed bandit problems (MBPs) in which a gambler given a choice of slot machines must select the appropriate machines to play so as to maximize the total reward in a series of trials. Furthermore, our ionic decision-maker successfully solves dynamic competitive MBPs, which cause serious loss due to the collision of selfish users in communication networks. The technique used in our devices offers a shift toward decision-making using the motion of ions, an approach that could find myriad applications in computer science and technology, including artificial intelligence.

INTRODUCTION

Decision-making based on complex data processing is indispensable to intelligent life, enabling it to adapt to dynamic environmental change and survive. The more sophisticated decision-making abilities, including mutual concession for overall optimization, make human beings different from other creatures; they are what makes us human. Computer science and technology is used to emulate decision-making abilities: to automatically sense environmental changes and make decisions about how to behave in various situations (for example, information and communications, manufacturing, financial trading, and entertainment) (1–6). Decision-making in applications is usually emulated using conventional computers composed of a central processing unit (CPU), memory, and algorithms (software programs). While CPU-based computing has proved useful for emulating human intelligence, it has one critical drawback: The amount of computational resources needed for handling the rapidly increasing amount of information is growing exponentially. The conventional approach to handling information is thus reaching its limit (7–18). The ionic decision-maker we have developed overcomes the limitations of conventional computers by creating a paradigm shift toward “decision-making using the motion of ions.”

Conventional computing is designed for “Turing machines” in which versatility (flexibility) is achieved through the digitization of information. While this digitization allowed for substantial development in digital computers, it simultaneously lost an intrinsic efficiency of nature that is achieved by the coupling of information processing (that is, decision-making) with underlying physical laws. This efficiency is found in various natural phenomena (for example, feeding in amoeba and phototropism in sunflowers) (7–14). The question arises of how this natural efficiency can be regained and applied to our technology.

Here, we report an ionic decision-maker that could significantly reduce the computational cost of decision-making in computer science and technology. Its decision-making ability is achieved by electrochemical phenomena including ionic transport and redox reactions (19–22). A principle with tug-of-war (TOW) dynamics has been exploited to solve dynamic multiarmed bandit problems (DMBPs) (14–18, 23–26), which are long-standing mathematical problems that have been applied to deep learning and related technologies (1–6, 27–31). The excellent solvability and adaptability of our devices are advantageous for dealing with the temporal dynamics of practical problems.

Our ionic decision-maker demonstrates the ability to solve competitive MBPs, which suffer serious loss of rewards (for example, throughput in communication networks) due to the Nash equilibrium (NE), which is the natural consequence for a group of independent selfish users (7, 30, 31). Excellent throughput has been demonstrated by overall optimization of the selections made by users in a communication network. Advantageous performance, particularly adaptability, is achieved by the unique and inherent voltage-charge relationship of electrochemical cells, which works as a forgetting parameter that usually requires huge computational resources in conventional computational approaches. The ionic decision-maker creates a new research field of “materials decision-making” in which the intrinsic properties of materials are used to make decisions not only for large-scale computations of human behavior but also for developing autonomous intelligent chips for mobile applications (for example, communication networking and medical diagnosis).

RESULTS AND DISCUSSION

Theoretical background and experimental setup

Both human beings and computers solve various problems by making decisions regarding subsequent actions. The problems, in many cases, can be interpreted as an MBP with stochastic events (1–6, 27–29). The MBP is a problem in which a gambler at a row of slot machines has to empirically decide which machines to play to maximize the total reward (coins) in a series of trials (30, 31), as illustrated in Fig. 1A (1). Here, we discuss the MBP in a scenario where a user of busy communication channels has to select appropriate channels from among the available channels to transmit his or her data at maximum efficiency. The situation can be discussed on the basis of the channel model proposed by Lai et al. (1, 2), which is shown in Fig. 1A (2). In this model, the user can select only a single channel at any given time. Suppose that at a certain moment (t), the user selects either channel A or B, which are available (open) with probabilities PA and PB and unavailable (occupied) with probabilities 1 – PA and 1 – PB, respectively. The user does not know the values of PA and PB a priori. With a probability of PA (that is, when channel A is available), one packet is transmitted (accepted); this situation is hereafter referred to as “transmitted.” The packet will not be transmitted (rejected) with a probability of 1 – PA. This situation is referred to as “blocked.” A series of selections results in a table with “selected (transmitted),” “selected (blocked),” or “not selected,” such as shown in Fig. 1B.

Fig. 1 Theoretical background and experimental setup of ionic decision-maker.

(A) 1: Original MBP, in which a gambler attempts to select a slot machine. 2: MBP in the channel model, in which a communication network user attempts to select an available channel. (B) Example of the results for users 1 and 2. (C) Illustration of the charge-conserving TOW principle used. (D) Illustration of the experimental setup using a two-terminal electrochemical cell, potentio/galvanostat, and a random number generator. The illustration is simplified. Details of the setup are described in Materials and Methods.

It should be emphasized that the original MBP and the channel model are equivalent, as is indicated by the comparison of the two in Fig. 1A (1 and 2). The slot machine in the MBP corresponds to a channel in the communication network. Obtaining a coin corresponds to transmission of a packet. Therefore, we can discuss the MBP based on the channel model without loss of generality.
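The channel model above can be sketched in a few lines of Python. This is only an illustration of the Bernoulli behavior described in the text, not the authors' software; the function names `transmit` and `run_fixed_choice` and the `seed` parameter are hypothetical.

```python
import random

def transmit(p_open, rng):
    """Return True ("transmitted") with probability p_open, else False ("blocked")."""
    # rng.random() is uniform on [0, 1), so the comparison succeeds with probability p_open.
    return rng.random() < p_open

def run_fixed_choice(p_open, n_selections, seed=0):
    """Count transmitted packets when the same channel is selected every time."""
    rng = random.Random(seed)  # hypothetical seed, used only for reproducibility
    return sum(transmit(p_open, rng) for _ in range(n_selections))
```

Replacing "channel open" with "slot machine pays a coin" leaves the code unchanged, which is the equivalence stated above.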

The user of the communication network described above must decide which of channels A and B should be selected so as to maximize the efficiency of transmitting data packets. The ionic decision-maker developed in the present study is efficiently used for this decision-making. For this purpose, we use the TOW principle that resembles the TOW game in which two persons, A and B, pull against each other at opposite ends of a rigid rope (7, 14, 25). Physically, the TOW principle consists of two key points, that is, the conservation of some physical quantity (which corresponds to the length of the abovementioned rigid rope) and a certain stochastic fluctuation (corresponding to the fluctuation of the horizontal position of the rigid rope, which will be discussed in detail later) (2, 7, 14).

Figure 1C illustrates a strategy based on the TOW principle, in which we assume that the rigid rope makes decisions as indicated by the yellow bar (14–18, 23–26). Here, we use an electrical charge applied to electrochemical cells to represent the rigid rope. Our principle is thus charge-conserving TOW dynamics, in which the sum of the total charges passing through the electrodes, Qi (electrode i corresponds to channel i), is conserved at zero (that is, QA + QB = 0, due to +Q − Q = 0) during current application. The electrical potential of channel i, Ei, is a variable used to evaluate which channel is profitable to select. It is increased or decreased in a timely manner by a current application to the cell, on the basis of the results of stochastic events (for example, transmitted or blocked). For example, the user selects channel A when EA is larger than EB and vice versa. Under the condition PA > PB, EA becomes stably larger than EB after several selections. Here, the variation in Ei appears as a variation in cell voltage V, which is caused by the application of a constant current (I) for a limited time (t).
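A minimal sketch of the charge-conserving update may make the "rigid rope" picture concrete. It assumes that a transmitted packet pulls the charge toward the selected channel and a blocked packet pushes it toward the other one; this mirrored rule for channel B, the function name `tow_step`, and the unit step `dq` are illustrative assumptions, not taken from the paper.

```python
def tow_step(q_a, chose_a, transmitted, dq=1.0):
    """One charge-conserving tug-of-war update.

    q_a         -- charge passed through electrode A so far (arbitrary units)
    chose_a     -- True if channel A was selected, False for channel B
    transmitted -- True if the packet was transmitted, False if blocked
    Returns (q_a, q_b) with q_a + q_b == 0 by construction.
    """
    # Assumption: a reward on A and a block on B both favor channel A.
    favours_a = (chose_a == transmitted)
    q_a = q_a + dq if favours_a else q_a - dq
    return q_a, -q_a  # electrode B always receives the opposite charge
```

Because the second return value is defined as the negative of the first, QA + QB = 0 holds after every step, which is the conservation condition stated above.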

Figure 1D illustrates our experimental setup. The setup consists of a two-electrode electrochemical cell, a potentio/galvanostat, and a random number generator to emulate the transmission of packets as stochastic phenomena. The stochasticity in Fig. 1 (C and D), that is, +Q or −Q, is externally inserted by the random number generator, whereas stochasticity is an internal property for the examples in Fig. 1A (1 and 2). The cell includes a Nafion H+-conducting polymer electrolyte (in which the H+ is highly mobile, while the electrons are immobile) and platinum thin-film electrodes (32–34). The electrical potential of electrode A (EA) with respect to electrode B (EB) is denoted as V (see Fig. 1D). In the initial state, H+ is homogeneously distributed in Nafion, which causes the complete symmetry of electrodes A and B, leading to zero V. However, the V starts to deviate from zero because electrochemical reactions taking place during current application to the cell break the symmetry. Note that the electrochemical reactions mainly consume Q applied to the cell during the current application. The resultant modulation of proton and gas concentration in the Nafion is thus the origin of V variation.

Adaptive operation of ionic decision-maker for DMBPs

DMBPs can be efficiently solved with our ionic decision-maker. The operation principle of the device illustrated in Fig. 1D is described here for the case in which PA and PB discussed above are 0.6 and 0.4, respectively. The dynamic change of the measured voltage V during operation is shown in Fig. 2A. First, the sign of V (positive or negative) is determined in the open-circuit condition (indicated by black small arrows). Although V should ideally be zero in the initial state, in reality, it slightly deviates from zero due to different adsorption/structure at the electrode/electrolyte interfaces (even when the same metal is used for both electrodes). Therefore, we observe positive or negative V even in the first V measurement. When V is positive (negative), a random number between 0 and 1 is generated to emulate that channel A (B) with PA (PB) is selected. With a PA of 0.6, a random number smaller (larger) than 0.6 corresponds to transmission (block) of a packet. In accordance with the results [transmitted or blocked; as illustrated in Fig. 1, A (2) and B], a constant current of 500 nA, which corresponds to +Q in the TOW principle (shown in Fig. 1C), or −500 nA, which corresponds to −Q, is applied to electrode A for 500 ms [that is, Q (=I·t) = 500 nA·500 ms]. The circuit is then opened, and V is measured for 500 ms (indicated by green large arrows). The sequence of these steps (from a black arrow to the next black arrow) is defined as one selection of the channel in the communication network. As shown in Fig. 2A, the repeated selections resulted in a digital-like V variation, which is used to calculate the correct selection rate as explained later. This behavior corresponds to a concentration polarization (generating voltage in the Nafion) by modulation of the concentration distribution of protons and gas in the Nafion due to the electrochemical reactions (1) and (2) shown below.

(1) [equation image not recovered]
(2) [equation image not recovered]
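The control logic of a single selection can be sketched as follows. This is a hedged illustration, not the custom software used in the experiments: the names `pulse_charge` and `select_and_drive` are hypothetical, and the sign convention for channel B (a transmitted packet on B gives a negative current at electrode A) is an assumption based on the "vice versa" wording in Materials and Methods.

```python
import random

CURRENT_A = 500e-9  # 500 nA, from the text
PULSE_S = 0.5       # 500 ms, from the text

def pulse_charge(current_a=CURRENT_A, duration_s=PULSE_S):
    """Charge delivered by one pulse, Q = I * t (250 nC for the values above)."""
    return current_a * duration_s

def select_and_drive(v_measured, p_a, p_b, rng):
    """Emulate one selection: read the sign of V, draw a random number, pick the pulse sign.

    Returns (selected channel, transmitted?, current applied to electrode A in amperes).
    """
    chose_a = v_measured > 0                      # positive V selects channel A
    p_selected = p_a if chose_a else p_b
    transmitted = rng.random() < p_selected       # smaller than P means "transmitted"
    # Assumption: outcomes that favor channel A drive a positive current at electrode A.
    favours_a = (chose_a == transmitted)
    current = CURRENT_A if favours_a else -CURRENT_A
    return ("A" if chose_a else "B"), transmitted, current
```

With the values in the text, each pulse carries 500 nA × 500 ms = 250 nC, and only its sign changes from one selection to the next.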

Fig. 2 Adaptive operation of ionic decision-maker for DMBPs.

(A) Variation in cell voltage during experiment. (B) Illustration of variation in CSR versus number of selections. (C) Top: Variation in CSR of ionic decision-maker against Pi inversions occurring every 200 selections. CSR starts at close to 0.5, corresponding to completely random selection. Although CSR reached 1 within 100 selections with initial conditions (0.9, 0.1), (0.8, 0.2), and (0.7, 0.3), it did not exceed 0.95 with (0.6, 0.4) even within 200 selections because of relatively close probabilities. Bottom: Variation in number of packets.

Because it is an electrochemical cell consuming Q by the electrochemical reactions, a deviation from an ideal V behavior of a capacitor, as indicated by the dashed lines in Fig. 2A, is significant. Such a deviation from an ideal capacitor, in which Q is completely preserved after current application, in contrast to the present case, plays an important role for achieving excellent adaptability, as will be discussed in the later section.

We investigated the variation in correct selection rate (CSR) of our ionic decision-maker device. Correct selection means that the channel with the highest probability (Pi) was selected regardless of whether a packet was transmitted or not, that is, the voltage takes a positive (negative) value at the selection under the PA > (<) PB condition. The CSR is automatically calculated on the basis of V variation with time that is similar to Fig. 2A. We performed 800 consecutive selections, with each selection being repeated for 100 cycles. We calculated CSR by dividing the number of cycles in which a correct selection was made, N, by the total number of cycles (100 in Fig. 2B), C (that is, CSR = N/C). Details of the experimental conditions are described in Materials and Methods. Figure 2B schematically illustrates the variation in CSR. CSR gradually increased with the number of selections because V tended to take a positive value after repetition of selections under the PA > PB condition due to accumulation of protons near electrode A (14–18, 23–26).
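The definition CSR = N/C can be written out directly. The sketch below assumes the measured voltages are stored as one list per cycle; the function name and the data layout are hypothetical and chosen only for illustration.

```python
def correct_selection_rate(voltages_per_cycle, a_is_better):
    """Compute CSR = N / C for every selection index.

    voltages_per_cycle -- list of cycles, each a list of measured V (one value per selection)
    a_is_better        -- list of bools, True where PA > PB at that selection
    """
    n_cycles = len(voltages_per_cycle)  # C in the text
    csr = []
    for s, a_best in enumerate(a_is_better):
        # A selection is correct when the sign of V points at the better channel,
        # whether or not the packet was transmitted.
        n_correct = sum(1 for cycle in voltages_per_cycle if (cycle[s] > 0) == a_best)
        csr.append(n_correct / n_cycles)  # N / C
    return csr
```

For example, with four cycles and three selections, `correct_selection_rate([[0.1, 0.2, -0.1], [-0.1, 0.2, -0.3], [0.3, 0.1, 0.2], [0.2, 0.4, -0.2]], [True, True, False])` gives `[0.75, 1.0, 0.75]`.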

To evaluate adaptability against environmental change, which is a particularly important property for practical applications, we inverted Pi [for example, from (PA, PB) = (0.6, 0.4) to (0.4, 0.6)] at every 200th selection. CSR dropped immediately after each Pi inversion because the correct V had changed to negative; the choice of channel A (that is, positive V) was no longer the correct decision.

The variation in CSR of our two-electrode electrochemical cell against Pi inversions given at every 200th selection is shown in Fig. 2C (top), starting from various combinations of PA and PB. The CSR steeply increased toward 1.0 after Pi inversions, indicating that our ionic decision-maker is highly adaptable. This behavior agrees well with the illustration in Fig. 2B and the theoretical calculations shown in fig. S2, indicating that the TOW principle works properly in our device. The quick and adaptive behavior demonstrates that our ionic decision-maker can efficiently solve DMBPs.

Figure 2C (bottom) shows the variation in the number of packets, which is the number of packets transmitted from the beginning to a specific selection. Given that V is positive (negative) at a selection, one packet is added for PA (PB), whereas a packet is not added for 1 – PA (1 – PB). As the larger probability (that is, PA for the first 200 selections) decreases from 0.9 to 0.6, the slope gradually decreases. This is quite reasonable because the slope is close to the expected value of the transmitted packets for the correct channel at a selection. The slope was about 0.9 when PA was 0.9 and PA > PB (PA·1 = 0.9).
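The statement about the slope can be checked with a short calculation. As a sketch, assume an idealized user who always selects the channel whose probability is listed; the expected number of packets then grows by that probability at each selection. The function name is hypothetical.

```python
from itertools import accumulate

def expected_cumulative_packets(p_selected):
    """Expected running total of transmitted packets.

    p_selected -- probability of the channel chosen at each selection;
                  each selection adds one packet with that probability.
    """
    return list(accumulate(p_selected))
```

For 200 selections of a channel with PA = 0.9, the expected total is 180 packets, that is, a slope of 0.9 packets per selection; with PA = 0.6, the slope falls to 0.6.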

One may ask “Is a simple capacitor enough to solve DMBPs because it also satisfies the condition of +Q − Q = 0 during current application?” The answer is “no.” The difference is caused by an inherent nonlinearity observed in the V-Q relationship of the cell. It functions as a forgetting parameter, α (<1), in the TOW principle and should thus be termed “built-in α” (see figs. S2 and S3 for details).
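Why forgetting matters for adaptability can be illustrated with a toy accumulator. The update rule x ← αx ± Δ used here is an assumed, simplified stand-in for the TOW dynamics (the paper's own treatment is in figs. S2 and S3), and α = 0.9 is an arbitrary illustrative value, not a measured property of the cell.

```python
def steps_to_flip(alpha, n_pre, dq=1.0):
    """Count opposing pulses needed to reverse the sign of an accumulator.

    alpha -- forgetting parameter; alpha = 1 mimics an ideal capacitor (no forgetting)
    n_pre -- number of identical pulses applied before the environment inverts
    """
    x = 0.0
    for _ in range(n_pre):      # history accumulated before the inversion
        x = alpha * x + dq
    steps = 0
    while x >= 0:               # opposing pulses after the inversion
        x = alpha * x - dq
        steps += 1
    return steps
```

In this toy model, after 200 identical pulses an ideal accumulator (α = 1) needs 201 opposing pulses to change sign, whereas α = 0.9 needs only 7, because the stored value saturates near Δ/(1 − α) instead of growing without bound.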

Ionic decision-maker for competitive DMBPs

An ionic decision-maker can be used to solve more complex and practical problems that are difficult to solve from a mathematical viewpoint, that is, competitive DMBPs (18). Let us consider two network users who attempt to select an available channel in the network, as illustrated in Fig. 3A. As long as they select different channels, the Pi is the same as those for the single user cases illustrated in Figs. 1 and 2. However, if they select the same channel, a collision occurs, and Pi is evenly split between them (that is, Pi/2), leading to a significant decrease in the total number of transmitted packets for both users (16–18). This situation is summarized in the payoff matrix shown in Table 1.
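The even-split collision rule can be expressed as a small payoff function. This is a sketch of the rule as described in the text; the function name and the example probabilities are illustrative and are not the entries of Table 1.

```python
def expected_packets(choice_1, choice_2, probs):
    """Expected packets per selection for users 1 and 2.

    probs -- mapping from channel name to its open probability Pi
    """
    if choice_1 == choice_2:
        # Collision: the probability of the shared channel is split evenly.
        shared = probs[choice_1] / 2
        return shared, shared
    return probs[choice_1], probs[choice_2]
```

With illustrative values `probs = {"A": 0.6, "B": 0.4}`, selecting different channels yields `(0.6, 0.4)`, a total of 1.0, whereas both selecting channel A yields `(0.3, 0.3)`, a total of only 0.6.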

Fig. 3 Theoretical background and experimental setup for competitive DMBPs.

(A) Examples of two situations in which two network users attempt to use different channels or the same channel in the network (that is, competitive DMBP). (B) Illustration of ionic decision-maker composed of two electrochemical cells (devices 1 and 2), each with three electrodes, which are connected to potentio/galvanostat via a switch matrix.

Packet loss is a serious problem for communication network systems with a limited number of channels and many users (1, 2). Avoiding packet loss is difficult because it has a mathematical origin: NE, which is the natural consequence for a group of independent users who attempt to use the best channel in a self-centered manner. This problem is much more complicated in reality because Pi dynamically fluctuates. Therefore, it is a major challenge to solve competitive DMBPs and achieve overall optimization, which is referred to here as “social maximum (SM)”. Given two users and three channels, SM is defined as the situation in which the users select channels with the highest and second highest Pi, resulting in maximization of the total number of packets for all users. Note that SM differs from the situation in which the number of packets for each user is maximized, as illustrated by the two SM situations in Table 1. In the two situations, the number of packets for user 1 or 2 is 0.4, which is not the maximum value.
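The gap between NE and SM can be made explicit by enumerating all joint choices for two users. The sketch below assumes the even-split collision rule and, purely for illustration, the probabilities (0.9, 0.4, 0.2) used later for Fig. 4; the function names are hypothetical, and only pure-strategy equilibria are considered.

```python
from itertools import product

def payoffs(c1, c2, probs):
    """Expected packets for users 1 and 2; a shared channel is split evenly."""
    if c1 == c2:
        return probs[c1] / 2, probs[c1] / 2
    return probs[c1], probs[c2]

def social_maximum(probs):
    """Joint choice with the largest total expected packets, and that total."""
    best = max(product(probs, repeat=2), key=lambda c: sum(payoffs(c[0], c[1], probs)))
    return best, sum(payoffs(best[0], best[1], probs))

def pure_nash_equilibria(probs):
    """Joint choices from which neither user gains by switching channel alone."""
    equilibria = []
    for c1, c2 in product(probs, repeat=2):
        u1, u2 = payoffs(c1, c2, probs)
        user1_ok = all(payoffs(d, c2, probs)[0] <= u1 for d in probs)
        user2_ok = all(payoffs(c1, d, probs)[1] <= u2 for d in probs)
        if user1_ok and user2_ok:
            equilibria.append((c1, c2))
    return equilibria
```

For `probs = {"A": 0.9, "B": 0.4, "C": 0.2}`, the only pure-strategy equilibrium is both users on channel A (0.45 packets each, 0.9 in total), while the social maximum places the users on channels A and B for a total of 1.3. The user on channel B then receives 0.4, less than the 0.45 obtained by colliding on A, which is why selfish users do not reach SM unaided.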

We solve competitive DMBPs by using an extended form of ionic decision-maker. As illustrated in Fig. 3B, an ionic decision-maker is implemented using two electrochemical cells, each with three electrodes (A, B, and C). We define each cell, including a potentio/galvanostat and a random number generator, as devices 1 and 2, corresponding to users 1 and 2, respectively. The devices are connected in series so that a positive current to electrode A in device 1 is equivalent to a negative current to electrode A in device 2. The strong interaction between the devices due to sharing of the applied current means that the selection of a specific channel (for example, channel A) by device 1 makes selection of the same channel difficult for device 2. The theoretical background for the solvability is described elsewhere (18).

Figure 4A shows the variation in selection rates for each channel (that is, A, B, and C rates, as defined in the figure) for both devices. The initial Pi (PA, PB, and PC) were set to (0.9, 0.4, and 0.2) (please see figs. S4 and S5 for the experimental details). At the beginning of the operation, the A, B, and C rates for both devices were close to 0.33 (that is, 1/3), which corresponds to a random distribution. While the A and B rates increased to 0.5, the C rate decreased to zero due to the lowest PC (0.2). The asymptotic approaches of the A and B rates to 0.5 mean that both devices selected channel A (with the highest Pi) in about half the cycles (10) while, in the remaining cycles (10), they selected channel B (with the second highest Pi). This was caused by the interaction between the devices, as mentioned above. The A, B, and C rates reach different values due to the different Pi assignments (0.4, 0.2, and 0.9) after the 100th selection. A and B rates reached only 0.5, which is half the saturation value observed in Fig. 2. This indicates that the devices made mutual concessions in choosing channels A and B. This selection continued in the subsequent selections following the Pi assignment changes. This behavior is completely different from that shown in Fig. 2, which shows that the correct (highest Pi) channels were always selected so that CSR (for example, a rate when PA is the highest) reached 1.0 (see section S5). The result shown in Fig. 4A evidences that the devices achieved SM.

Fig. 4 Adaptive operation of ionic decision-maker for competitive DMBPs.

(A) Variation in selection rates for channels for two devices measured over 20 cycles for averaging. Initial probabilities of channels (PA, PB, and PC) were set to (0.9, 0.4, and 0.2), and the assignment was changed at every 100th selection. (B) Variation in total number of packets for devices 1 and 2 in two operation modes (combined devices and independent devices). SM and NE show theoretical limits of SM (pink dashed curve) and NE (gray dashed curve).

The red curve in Fig. 4B shows the variation in the total number of packets for users 1 and 2. The combined devices (connected in series as shown in Fig. 3B) achieved performance very close to the theoretical limit of SM (pink dashed curve); this achievement is possible only in the virtual case where users 1 and 2 perfectly select the highest and second highest Pi channels for all selections. The slight deviation from theory is due to imperfect adaptability of the ionic decision-maker, which is, at least in part, inevitable because all MBP solvers, including algorithms and devices, need to explore incorrect channels to grasp their probability of being correct before making a decision.

For comparison, we also examined the performance of independent devices (two devices operated independently) under the same conditions. The total number of packets is indicated by the blue curve in Fig. 4B. The total number of packets for the independent device case was significantly lower than that for the combined device case because the devices independently sought the highest Pi, leading to the NE state. The performance was even below the theoretical limit of NE (gray dashed curve); this limit, which is itself poor because of NE, can be reached only when devices 1 and 2 perfectly select the highest Pi channels for all selections. The comparison shows that the ionic decision-maker with the combined devices is advantageous for solving competitive DMBPs.

While each device emulates selfish seeking during operation, their interaction enables mutual concessions to achieve SM. The processing in ionic decision-maker differs completely from that in conventional computers, in which elements and their interactions are fully calculated using huge computational resources. Ionic decision-makers thus offer new insight into a novel class of non–von Neumann architecture computing (7).

At present, the device cannot operate in a closed (for example, encapsulated) environment because it operates using electrode reactions (1) and (2), which include exchange of mass with the surroundings. The limitation may be overcome by using other electrode reactions without exchange of mass with the surroundings. Moreover, further downscaling is possible by replacing the Nafion with thin-film electrolytes.

CONCLUSION

The ionic decision-maker introduced in this article was developed on the basis of the solid-state ionic principle. Its excellent solvability and adaptability for DMBPs, including competitive ones, are achieved by ionic transport and the resultant electrochemical phenomena. This was the first physical implementation of ionic decision-maker in a device with decision-making ability. An ionic decision-maker was particularly effective for solving competitive DMBPs, which are normally difficult because of the NE. A conserving physical object and built-in α, the two main characteristics of the TOW principle, were successfully emulated by the intrinsic property of ionic charge: strictly conserved during application (that is, instantaneous conservation) but somewhat volatile (timewise volatility) in electrochemical systems.

Optimization in a society through competition is similar to decision-making in an individual through dilemmas. Ionic decision-makers can thus be applied to human behaviors, from microscopic to macroscopic. Furthermore, the combination of ionic decision-makers with artificial synaptic plasticity, which was recently found in ionic devices, should be useful for emulating complex personality formation in humans beyond short-term plasticity and long-term potentiation in a single synapse (35, 36). The ionic motion in all-solid-state devices is useful not only for large-scale emulation of collective behavior (Web advertising, financial trading, etc.) but also for developing autonomous intelligent chips for mobile applications (communication networking, medical diagnosis, etc.).

MATERIALS AND METHODS

Fabrication of electrochemical cells for an ionic decision-maker

The procedure used for fabricating electrochemical cells for an ionic decision-maker is illustrated in fig. S1. A Nafion dispersion solution (5%; DuPont DE521) was mixed with 20 weight % platinum-loaded carbon (Vulcan XC-72). The solution was cast onto the surface of a Teflon sheet and dried in air at room temperature for about 6 hours. The resultant thin film (Nafion film with Pt-C catalyst) was transferred and hot-pressed at 413 K onto the surface of a Nafion H+-conducting membrane (DuPont N-117) with a thickness of 183 μm. A Pt thin film with a thickness of 80 nm was deposited onto the surface of the Nafion with a shadow mask by using electron beam deposition. The gap between the Pt electrodes was 100 μm.

Operation of an ionic decision-maker for DMBPs with one device and two channels

The ionic decision-maker device was placed in a manual probe system and connected using two tungsten probes, as shown in Fig. 1D. The electrochemical cell was connected to a potentio/galvanostat (CompactStat, Ivium Technologies), which was controlled using custom software designed to generate a random number and initiate a current application at positive or negative 500 nA for 500 ms. A random number was generated to emulate stochastic events using various values of Pi. For example, a random number from 0 to 1 was generated, and if it was smaller than Pi, it was interpreted as transmitted, and a positive 500 nA was applied for 500 ms. If the generated number was larger than Pi, it was interpreted as blocked, and a negative 500 nA was applied for 500 ms.

The operation was composed of 800 selections, with each selection being repeated for 100 cycles. Each cycle started with a short circuit of the two electrodes for 200 s to refresh the electrochemical cell (refreshment stage). The circuit was opened, and the voltage was measured to identify the sign of the voltage (positive or negative). When the voltage was positive, channel A was selected. The selection was emulated by random number generation with PA and the subsequent current application, as described above, and vice versa. After current application, the circuit was opened for 500 ms to measure the voltage. Each unit of operation was a selection. After every 200 repetitions of the selections, the PA and PB values were inverted to emulate environmental changes. After 800 consecutive selections with three inversions (that is, one cycle), the operation moved to the refreshment stage of the next cycle. One hundred cycles were performed for each PA/PB combination.

SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/4/9/eaau2057/DC1

Section S1. Fabrication of electrochemical cells for ionic decision-maker

Section S2. Comparison with CPU-based computation using mathematical algorithms and dielectric capacitor

Section S3. Operation mechanism of ionic decision-maker

Section S4. Theoretical expectation of decision-making behavior with two devices

Section S5. Two operation modes of ionic decision-maker for competitive MBPs with two devices and three channels

Fig. S1. Fabrication of electrochemical cells for ionic decision-maker.

Fig. S2. Comparison with CPU-based computation using mathematical algorithms and dielectric capacitor.

Fig. S3. Operation mechanism and built-in “α” of ionic decision-maker.

Fig. S4. Expected behavior of two devices and corresponding variation in selection rates for each channel for both devices.

Fig. S5. Two operation modes of ionic decision-maker for competitive MBPs with two devices and three channels.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.

REFERENCES AND NOTES

Acknowledgments: Funding: The authors acknowledge that they received no funding in support of this research. Author contributions: T. Tsuchiya, T. Tsuruoka, S.-J.K., K.T., and M.A. conceived the idea for the study. T. Tsuchiya, T. Tsuruoka, and S.-J.K. designed the experiments. T. Tsuchiya and T. Tsuruoka wrote the paper. T. Tsuchiya carried out the experiments. T. Tsuchiya and S.-J.K. analyzed the data and carried out the theoretical calculation. All authors discussed the results and commented on the manuscript. K.T. and M.A. directed the projects. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.