## Abstract

Materials development often confronts a dilemma as it needs to satisfy multifunctional, often conflicting, demands. For example, thermoelectric conversion requires high electrical conductivity, a high Seebeck coefficient, and low thermal conductivity, despite the fact that these three properties are normally closely correlated. Nanostructuring techniques have been shown to break the correlations to some extent; however, optimal design has been a major challenge due to the extraordinarily large degrees of freedom in the structures. By taking graphene nanoribbons (GNRs) as a representative thermoelectric material, we carried out structural optimization by alternating multifunctional (phonon and electron) transport calculations and Bayesian optimization to resolve the trade-off. As a result, we have achieved multifunctional structural optimization with an efficiency more than five times that achieved by random search. The obtained GNRs with optimized antidots significantly enhance the thermoelectric figure of merit by up to 11 times that of the pristine GNR. Knowledge of the optimal structure further provides new physical insights that independent tuning of electron and phonon transport properties can be realized by making use of zigzag edge states and aperiodic nanostructuring. The demonstrated optimization framework is also useful for other multifunctional problems in various applications.

## INTRODUCTION

Materials informatics (MI) is an emerging approach that aims to accelerate materials development by combining material science and data science (*1*). The core of the approach is to identify “best” materials by bridging the gap between the data in hand and the optimal structure and/or composition using unrecognized complex correlation in the data instead of (or in addition to) relying on insights driven by physical/chemical principles. MI has been applied to the design of novel materials such as cathode materials for lithium-ion batteries (*2*), nitride semiconductors composed of earth-abundant materials (*3*), piezoelectric materials (*4*), and thermoelectric materials (*5*–*9*). All of these studies have aimed to realize high-throughput screening of the best materials from the pool of stoichiometric compounds.

Another but more ambitious course of MI aims to create nanostructures by identifying optimal geometry that maximizes the target property. There has been research on nanoparticles embedded in matrix to modulate heat conduction (*10*), solid-solid interfaces to identify energetically stable structures (*11*, *12*), and multicore structures of plasmonic nanowires to control optical scattering and cloaking effects (*13*). Recently, the method has been extended for designing atomic-scale structures with minimum/maximum lattice thermal conductance (*14*, *15*). There, to reduce the number of candidates for which property calculations are performed, atomic-scale phonon transport calculations and an informatics-based optimization technique are alternately conducted. The work has shown that such an approach can considerably accelerate nanostructure design for transport properties, which is not only limited to phonons but also applicable to any other quasi-particles.

Now, the next nontrivial challenge is to extend such an approach to optimize the materials for multifunctional properties. In many applications, a material needs to satisfy multifunctional demands that are often in trade-off relations, such as thermal resistance versus mechanical strength in thermal insulators (*16*), and thermal conductance versus mechanical compliance in thermal interface materials (*17*). An extreme example in this context is thermoelectric materials, where optimization requires resolution of the conflicting demands of Seebeck coefficient, electrical conductivity, and thermal conductivity. Here, the dimensionless figure of merit is expressed as *ZT* = *R*_{th}*S*^{2}*T*/*R*_{el}, where *R*_{el/th} is the electronic/thermal resistance, *S* is the Seebeck coefficient, and *T* is the temperature.

Thermoelectric materials have been widely studied, particularly with increasing demands to power ubiquitous sensors and transmitters without a complex system and frequent maintenance for the coming trillion sensor era (*18*). This has led to the enhancement of *ZT* particularly by using nanostructuring techniques, such as in low-dimensional materials (*19*), superlattices (*20*, *21*), nanocrystals or nanocomposites (*22*–*24*), and nanoporous structures (*25*, *26*). However, the development of nanostructures requires much time and effort because of its huge parameter space, much larger than that of single-crystal compounds; thus, further acceleration of the development process requires a goal, that is, knowledge of optimal structures, to guide the research.

Here, structural optimization is carried out by alternating multifunctional (phonon and electron) transport calculations and Bayesian optimization. Bayesian optimization should be particularly useful for optimizing thermoelectric structures, as the function for *ZT* is highly uncertain because of the complex correlations among nanostructures, thermal properties, and electrical properties, and evaluation of *ZT* that requires calculations of both thermal and electrical properties is expensive. We take graphene nanoribbons (GNRs) as the representative thermoelectric materials. Carbon nanomaterials, including GNRs, are promising thermoelectric materials (*27*–*30*) due to high Seebeck coefficient (*31*), high electron mobility (*32*), and mechanical flexibility (*33*–*35*), although their thermal conductivity is high (*36*, *37*). To enhance thermoelectric properties of GNRs, different types of nanostructures have been proposed, such as vacancies (*38*, *39*), edge disorders (*40*, *41*), nanoporous structures (*42*, *43*), chevron-type GNRs (*44*), and GNRs composed of different edges and widths (*44*, *45*). It is also known that GNRs have different electronic properties depending on their edge type: armchair or zigzag edge. Armchair GNRs with a well-defined width are intrinsically semiconducting and thus have a high Seebeck coefficient (up to a few millivolts per kelvin) (*46*, *47*). On the other hand, because of the edge state, which is a topologically stable electronic state (*48*, *49*), pristine zigzag GNRs are metallic and thus have a low Seebeck coefficient regardless of their width (*30*). We use zigzag GNRs in the present study, expecting that the presence of the edge state gives larger tunability of electrical and thermal transport.

This paper is organized as follows. In the first part, we demonstrate the effectiveness of the optimization method for thermoelectric nanostructures using periodically nanostructured GNRs. In the second part, we optimize the structure of antidot GNRs, which have recently been fabricated in experiments using the ion beam technique (*50*, *51*), and show how the optimization helps identify high figure-of-merit structures and understand pathways to separately control electron and phonon transport.

## RESULTS AND DISCUSSION

### Nanostructured thermoelectric GNRs

We consider two different models as shown in Fig. 1: periodically nanostructured GNR (model A) and antidot GNR (model B). A pristine GNR structure is defined by the indices (*N*_{x}, *N*_{y}), where *N*_{x} and *N*_{y} are the numbers of carbon chains along the axial direction and the direction perpendicular to the axis, respectively. Model A (Fig. 1A) is composed of a periodic structure of a (4, 6) GNR from which *m* atoms are removed. In this analysis, we assume that the structural stability is ensured if every carbon atom in the GNR structure belongs to at least one complete hexagonal lattice and no hexagonal lattice is isolated. In the case of model A, to validate the effectiveness of the Bayesian technique for structural optimization of multiple transport properties, the system is deliberately set small enough that we are able to calculate *ZT* of all the candidates. While thermoelectric properties vary with the chemical potential (μ), the representative values of thermoelectric properties were evaluated at the “peak chemical potential” (μ_{peak}) that gives maximum *ZT* within a realistic chemical potential range, −1.0 ≤ μ ≤ 1.0 eV. To perform optimization based on machine learning, the representations identifying each candidate, called “descriptors” or “features,” play an important role, although it is still controversial how to decide them (*52*). In this analysis, we use the binary flag describing the state of each hexagonal lattice as a descriptor: “1” and “0” for the complete and defective hexagonal lattice, respectively, as shown in Fig. 1A. The set of the binary flag descriptors for each structure is defined as **g**_{A} = {1, 0, …}.

For a system that is more realizable in experiments, we consider antidot structures, model B, shown in Fig. 1B. Model B is composed of an antidot nanostructured region with multiple sections of (6, 8) GNR connected with semi-infinite pristine GNRs with a width of *N*_{y} = 8. Each section in the nanostructured region is either a pristine structure or an antidot structure with a pore (diameter *d* = 2.9 Å) in the center of the section. The binary flag descriptor for model B is defined to be 0 and 1 for the pristine structure and the antidot structure, respectively. Similar to the case of model A, the set of the binary flag descriptors for model B is defined as **g**_{B} = {1, 1, …}. For model B, we mainly study systems with 16 sections that give 32,896 candidates, which is so large that the direct simulations of all the candidates would exceed the computational resource, but reasonably large for the Bayesian optimization.

### Bayesian-based multifunctional structural optimization

Before discussing the optimization, we first highlight the sensitivity of *ZT* to GNR structures. Figure 2A shows *Z*_{ave}*T* (circle), *Z*_{max}*T* (triangle), and *Z*_{min}*T* (inverse triangle) of model A with different numbers of vacancies (*m* = 0 to 10), where *Z*_{ave}*T*, *Z*_{max}*T*, and *Z*_{min}*T* denote average, maximum, and minimum *ZT*, respectively, of all the candidates with the same number of vacancies (*m*). While *ZT* of the pristine GNR with no vacancies (*m* = 0), *Z*_{0}*T*, is relatively small (*Z*_{0}*T* = 0.048), introduction of vacancies enhances *ZT* on average. Note here that vacancies do not always enhance *ZT* but can degrade *ZT* from that of the pristine case. The large difference between *Z*_{max}*T* and *Z*_{0}*T* or *Z*_{ave}*T* shows that the thermoelectric performance of GNR is highly sensitive to the structure, and thus, it is an appropriate system to target with the optimization technique. Here, we consider not only the absolute *ZT* value but also its increment against *m*, because improving *ZT* with fewer vacancies is beneficial in terms of mechanical stability. Here, the average rate of increase in *ZT* with respect to *m*, namely, the gradient of dashed lines in Fig. 2A, is defined as Δ(*ZT*)/Δ*m*. The best structure also stands out in this context as, for example, Δ(*Z*_{max}*T*)/Δ*m* = 0.086 is three times higher than Δ(*Z*_{ave}*T*)/Δ*m* = 0.029 in the range of 1 ≤ *m* ≤ 10.

The multifunctional structural optimization for model A is performed with the Bayesian optimization, and its efficiency is compared with random search for different *m* (*m* = 6 to 10). For each *m*, *N*_{Bayes} and *N* _{rand} are evaluated, where *N*_{Bayes/rand} denotes the average number of calculations needed until, for the first time, a structure that belongs to the top *k*% (*k* = 0.0 to 2.0%) of all the candidates is found. For the Bayesian search, the values for 100 optimization runs are averaged, while, for the random search, *N*_{rand} is obtained with the hypergeometric distribution: *N*_{rand} = (*N*_{total} + 1)/(*kN*_{total} + 1), with *N*_{total} being the number of all the candidates. Note that we omit the structures with translational and reversal symmetries from the candidates to save the optimization time.

The Bayesian search accelerates the exploration of high-*ZT* structures (*N*_{Bayes}/*N*_{rand} < 1) in all cases as shown in Fig. 2B, except for a singular point for the best case of *m* = 6. In most cases, the top 0.5% of the structures can be found by the Bayesian search with half the calculations for the random search (*N*_{Bayes}/*N*_{rand} < 0.5), and in the best case, which corresponds to the search of the best structure for *m* = 7, the Bayesian search requires merely 18% calculations of that with the random search (*N*_{Bayes}/*N*_{rand} = 0.18). As for the dependence of efficiency on *k* and *m*, the efficiency of the Bayesian search increases with decreasing *k*, while we do not observe a clear dependence on *m*. The *k* dependence indicates that Bayesian optimization functions are more efficient under more limited conditions. The absence of the *m* dependence indicates that efficiency is independent of *N*_{total}, and thus, the comparable efficiency is expected in case of larger GNR systems. We note that the singular case for *m* = 6 is due to strong dependence of electronic properties on atomic geometries; low *ZT* structures that are similar (closer in descriptor space) to the optimal structure make the prediction of high-*ZT* structures difficult. Such large structural sensitivity may have intriguing physics behind it; however, here, we focus on the general optimization problem.

To gain quantitative insight into the efficiency of optimization process, we compare the optimization efficiency of *et al*. (*14*) obtained 3.8 times higher efficiency (*N*_{Bayes}/*N*_{total} = 0.0089) than the present study (*N*_{Bayes}/*N*_{total} = 0.034). Note that, for the sake of comparison with the study of Ju *et al*. (*14*), we performed additional optimization calculations without considering the reversal symmetry and used *N*_{Bayes}/*N*_{total} as a criterion, where *N*_{Bayes}/*N*_{total} is obtained by averaging results of 10 and 100 optimization runs for the study of Ju *et al*. (*14*) and this study, respectively. Lower efficiency in the present study should occur because the optimal structure obtained by Ju *et al*. (*14*) is simpler, namely, a superlattice with smooth interfaces. The successful results of optimizations for different properties with different models of the current study and the study of Ju *et al*. (*14*) indicate that Bayesian optimization is versatile for various structural optimization problems. The current study gives a rough indication for the efficiency of Bayesian optimization to be expected for other problems.

As it is natural that optimization efficiency depends on the class of structures, the more important comparison here is the optimization efficiency of a single property versus that of multiple properties for the same system (GNR). To this end, we compare the optimization efficiency of two different cases: optimization for *ZT* (multifunction). The optimization for *ZT*. Here, it is interesting that optimization for *P*, which is composed of *S* and *R*_{el}, can be slower than that for *ZT*, as shown in fig. S1. This is probably because *ZT* of model A is more sensitive to *P*, and the presence of *ZT* makes the optimization process faster. For *m* = 7, optimal structure with maximum *ZT* has the highest

### Observation of optimal structures

To probe the relationship between geometrical structure and thermoelectric properties, we observe the optimized structures in detail. Figure 2 (C and D) shows GNR structures with *m* = 4 and 10 that have the largest power factor *P* (= *S*^{2}/*R*_{el}) and *R*_{th}, respectively, in all the candidates for *m* = 1 to 10. As for the *P*-optimized structure (Fig. 2C), vacancies are introduced over the entire area of GNR, except for the hexagonal lattices along the edge. The structural optimization leads to strong flattening of electronic bands around energy levels of the edge state (highlighted by bold lines in Fig. 2C), and eventually, band gaps are generated. Nanostructuring in middle areas of zigzag GNRs, therefore, leads to phonon scattering without a significant change of the edge state and enhances the thermoelectric performance. Antidot GNRs (model B), which will be used in the next section, are expected to benefit from this trend. As for the *R*_{th}-optimized structure (Fig. 2D), which also gives the highest *ZT*, the GNR has a complicated labyrinthian shape. In such a structure, phonons are significantly localized and their group velocities are reduced, as seen in the longitudinal acoustic mode highlighted by bold lines in Fig. 2D, for example.

The physical insights into the correlation between structure and thermoelectric properties described above can be fed back to formulate more efficient descriptors, although for the exploration of compound materials for specific physical properties good descriptors are somewhat nonintuitive (*52*). For instance, Ju *et al.* (*14*) reduced the number of target structures by using the knowledge learned from precalculations of smaller systems. We have also explored this possibility by using the insight that the labyrinthian shape tends to give high *R*_{th} and adding the topological representation as a descriptor. As a result, an additional descriptor representing the labyrinthian shape was found to accelerate all the cases of the optimization for *ZT*. The details are described in section S3.

### Optimization of antidot GNR structures

Although the labyrinthian shape leads to highest *ZT* structures for model A, GNR structures with the edges remained should still be preferable because they are thermodynamically more stable and, moreover, available in experiments (*50*). We therefore apply Bayesian optimization to search for the antidot GNR (model B) with highest *ZT*. In this analysis, we define the structure obtained after calculating and processing 3000 structures with Bayesian optimization as the optimized one. It is confirmed that the *ZT* is converged during the optimization process; the highest *ZT* does not change during the last 2469 calculations. Figure 3A compares thermoelectric properties—*ZT*, *P*, and *R*_{th}—of the pristine structure (black), the periodic antidot structure (blue), and the optimal structure (red), the last two of which are shown in Fig. 3B. The periodic antidot structure here means that antidots are introduced in all the 16 sections. As shown in the figure, the optimal structure has an aperiodic array of antidots. These aperiodic optimal structures have also been obtained for superlattices (*14*). The optimal arrangement of antidots increases *ZT* by 11 times (from *ZT*_{0} = 0.044 to *ZT* = 0.46), with the increase in *P* and *R*_{th} by 2.9 and 3.6 times, respectively. Simply arranging the antidot periodically increases *ZT* (*42*), in our case by 5.0 times, compared with the pristine structure, yet the remaining 2.1 times does require the optimization. Note that although the *ZT* of a periodic antidot GNR was calculated to reach 1.2 in the study of Yan *et al.* (*42*) using a simple tight-binding approximation with the nearest-neighbor π-orbital and without structural relaxation, when calculating the same structure using the current method with transfer integrals of 1s orbital of hydrogen and 2s2p_{x/y/z} orbitals of carbon with a longer cutoff length (= 3.7 Å) and structural relaxation, we obtain *ZT* = 0.28. The current results show that optimization of the arrangement of antidots can effectively improve thermal and electrical properties simultaneously. Furthermore, compared with additionally performed analyses for armchair GNRs with a similar width (*N*_{y} = 6), *ZT* of the optimal zigzag GNR is 1.2 (3.3) times as high as that of the optimal (pristine) armchair GNR (see section S4). This result indicates that the edge state is successfully used to independently tune thermal and electrical properties and that the potential of zigzag GNRs as thermoelectric materials is higher than that of armchair GNRs. Note that, while the total number of sections (16) was chosen to be too large for an exhaustive search and reasonable size for the Bayesian optimization as discussed earlier, the optimal *ZT* increases with the number of sections because the structural degrees of freedom increase, which was also checked by performing the optimization for 8 sections.

To gain insights into the improvement of the thermoelectric performance of zigzag GNRs, we investigate their phonon and electron transport properties in detail. Figure 3C shows the phonon transmission functions Θ_{ph}(ω) of the pristine (black), periodic (blue), and optimal aperiodic (red) structures. Θ_{ph}(ω) decreases, and hence *N*_{dot}) because of the generation of Bloch states in the nanostructured region (*53*). The dependence of *N*_{dot} shown in fig. S4 indicates that *N*_{dot} = 16 reaches 99% of *53*). Optimization can further increase _{ph}(ω) over a wide energy range despite the increase in the degree of freedom of atomic vibrations. For the modulation of *14*). The former effect becomes stronger with increasing number of antidots, while the latter effect becomes stronger with increasing length of the homogeneous region, that is, the region with a sequence of the same type of sections (with or without an antidot). Therefore, it is reasonable to suppose that the spatial distribution of smaller and larger antidots-distance provides tunability of the balance between the above two effects, although this physical knowledge alone does not explain why this particular distribution is optimal; thus, machine learning was needed to find the optimal structure.

The electron transmission functions Θ_{el}(*E*) of representative structures are compared in Fig. 3D. In finite periodic structures, the peaks of Θ_{el}(*E*) corresponding to resonant states appear and split by increasing the number of antidots, as shown in fig. S5A (*54*). Θ_{el}(*E*) of finite periodic structures finally approaches that of periodic antidot structures, while the residual electrical resistance remains at boundaries between the periodic antidot region and the semi-infinite pristine region, as shown in fig. S5B. Unlike pristine GNRs that have a low Seebeck coefficient due to the absence of the band gap, periodic antidot GNRs have a higher Seebeck coefficient because of the presence of transport gaps (energy gaps in electron transmission), corresponding to energy gaps of an infinite periodic antidot GNR. The introduction of periodic antidots, therefore, can enhance the thermoelectric performance (Fig. 3A). Thermoelectric performance is further enhanced by taking advantage of the edge state seen for Θ_{el}(*E*) in −0.078 eV ≤ *E* ≤ 0.048 eV. Utilization of the edge state requires suppression of the electron transmission of resonant states with energy near that of the edge state, as discussed in section S6. In general, any electron states are localized, and their transmission is suppressed in infinite random potential fields (*41*, *55*). Instead, the optimal antidot arrangement can suppress Θ_{el}(*E*) of target electron states, namely, resonant peaks in this case. In Fig. 3E, density of states (DOS) and Θ_{el}(*E*) around the energy corresponding to μ_{peak} show that although resonant states exist near the edge state in both the periodic and optimal structures, Θ_{el}(*E*) of the optimal structure is strongly suppressed.

For deeper comprehension of the transmission of the resonant states, local DOS (LDOS) at the resonant energy is mapped onto each atom in Fig. 3F, where the top/bottom contours denote the LDOS of the finite periodic/optimal structure. The indices of resonant state in Fig. 3F correspond to those in Fig. 3E. LDOS in the finite periodic structure spreads over the whole nanostructured region, while the optimal aperiodic structure leads to the localization of states in limited areas. It is intuitively comprehensible that the widely spreading states, which can be regarded as Bloch states, contribute to electron transport, while the strong localization generates the region with extremely low DOS and suppresses the electron transport, as shown in Fig. 3E.

### Impact of aperiodicity

The above discussions raise the question: How important is the exact arrangement of antidots to control thermoelectric properties? To answer this question, we calculate the distribution of thermoelectric properties for structures with the same numbers of antidots and homogeneous regions. Since the number of homogeneous regions is an indication of their lengths, following the above discussion, the two numbers represent (i) surface scattering effect and (ii) interference effect. The optimal structure has 10 antidots and 7 homogeneous regions (Fig. 3B). Figure 4 shows the distributions of *P* (blue), *R*_{th} (red) of structures with the same numbers as that of the optimal structure. These three properties are normalized by those of the pristine GNR, and the data are extracted from 359 structures calculated during the optimization process. The result shows that, on average, *R*_{th} (*R*_{th} (*R*_{th} (*P* makes the arrangement of antidots more important; the SD of *P* is 0.23, which is 2.1 (3.9) times larger than the SD of *R*_{th} (*P* can increase by 3.0 times, the improvement could be reduced to 1.3 times without accurate arrangement of antidots. The above discussion strongly suggests that the control of nanostructures is required for enhancing electron transport and thus for thermoelectric performance. As for the overall uniqueness of the optima, there are other structures that exhibit *ZT* inferior to the optimal structure but not by much; as shown in fig. S9, there are 24 structures with *ZT* > 0.95*Z*_{max}*T*. These structures have narrow ranges of numbers of antidots (8 to 12) and homogeneous regions (5 to 7). This result shows that these numbers can be a good indicator function for high-*ZT* structures, while we still need the structural optimization because of the fluctuation of thermoelectric properties within structures having the same numbers. It is worth noting that such uniqueness of the optima varies for different models and problems. In cases of model A, there are structures with *ZT* > 0.95*Z*_{max}*T* for *m* = 6, 8, and 9, but there are no other structures with *ZT* > 0.95*Z*_{max}*T* than the optimal structure for *m* = 7 and 10.

## CONCLUSIONS

In summary, we have shown that the Bayesian optimization technique accelerates structural optimization including multitransport properties, namely, *R*_{th}, *R*_{el}, and *S*. The analysis with the periodically nanostructured GNR reveals the following design strategies for thermoelectric nanostructures: (i) The introduction of defects into the entire region of GNR except for the hexagonal lattices along an edge can improve the power factor, and (ii) the labyrinthian shape can increase the thermal resistance. Using the former strategy, we show that introduction of antidots can significantly enhance the thermoelectric performance of zigzag GNRs up to 11 (1.2) times larger than that of the pristine zigzag (optimal armchair) GNR. Conversely, the careful arrangement of antidots is indispensable for keeping optimized electron states in the nanostructured region; otherwise, the electron properties fluctuate widely.

The state-of-the-art ion beam technique enables the fabrication of GNRs/antidots with the width/diameter down to 5 nm/1.3 nm (*50*, *51*). While the combination of these techniques may still be challenging at present, the fabrication of GNR structures proposed in this study should become possible in the near future. Furthermore, we may be able to fabricate arrayed antidot GNR structures combining nanomesh structures (*56*) with the above fabrication techniques; because neck regions in nanomesh structures can be regarded as arrayed GNRs, they will become arrayed antidot GNRs if antidots are introduced into the neck regions.

The demonstrated framework enables us to optimize structures without any previous knowledge of the relation between the structures and properties, and with intuitive descriptors to represent the structures. The Bayesian optimization method is expected to be useful for searches of other geometries and physical properties as long as the number of candidates is processable and the properties are calculable.

## METHODS

### Analysis method

Before calculation of thermoelectric properties, a relaxation calculation was performed until the maximum atomic force became less than 0.01 meV/Å. To obtain thermoelectric properties (*R*_{th}, *S*, *R*_{el}, and the power factor *P* = *S*^{2}/*R*_{el}), we calculated the phonon and electron transmission functions, defined by Θ_{ph}(ω) with the frequency ω and Θ_{el}(*E*) with the energy *E*, respectively, using Green’s function method with Landauer formalism (*27*, *42*, *44*–*46*, *57*, *58*) implemented in an Atomistix ToolKit package (see section S1) (*59*). Here, *R*_{th}^{−1} = _{peak}) within a realistic range, −1.0 eV ≤ μ ≤ 1.0 eV, that gives maximum *ZT*. For model A, the optimized Tersoff potential (*60*) and the π-orbital transfer integral that decreases exponentially with increasing C–C bond length (*61*) were used to calculate phonon and electron transport, respectively. For model B, to perform more realistic calculations, the effects of terminal hydrogen atoms were treated explicitly. The optimized Tersoff potential was applied to C–C interatomic interactions, and the Brenner potential (*62*) modified for different properties including thermal properties (*63*) was applied to C–H and H–H interatomic interactions. The transfer integral based on the density functional theory was used to represent the interaction among 1s orbital of H and 2s and 2p_{x/y/z} orbitals of C (*64*). All the calculations were performed under *T* = 300 K. It is noted that the typical system size in this study (~30 nm) is much smaller than the electron mean free path in pristine graphene or GNRs (~1 μm) at room temperature (*65*, *66*), and thus, electron scattering by phonon was ignored.

### Bayesian optimization

Bayesian optimization has an advantage over empirical optimization in its ability to determine required data and a regression curve by machine learning without any previous knowledge of the model. We used a Bayesian linear regression model for the prediction of *ZT***x** is the vector of the descriptors of GNR structure, ϕ is the feature map including *l* basis functions, **w** is the *l*-dimensional vector of the weight, and ε is the noise subject to Gaussian distribution with the mean of zero and the covariance σ^{2}. Bayesian optimization was performed by repeating the following process: (i) training the Bayesian model with a data set, *n* is the number of calculated structures; (ii) selecting a candidate for the next exact calculation with the trained model; (iii) performing the exact calculation for the selected candidate; and (iv) adding the result to the data set *D* for a new training. To obtain ϕ, we adopted the random feature map, which approximates the mapping induced by the Gaussian kernel with the width η (*67*) and set *l* = 5000. The Bayesian linear model with the random feature map approximates the Gaussian process, which has merit in the flexible expressivity by nonparametric regression. The Gaussian process was widely used for materials design with the Bayesian optimization (*11*, *14*). The hyperparameters σ^{2} and η were automatically tuned every 20 optimization runs with the type II likelihood maximization (*68*). The next point for the exact calculation was determined following Thompson sampling (*69*). The above optimization process was conducted using the open-source library common Bayesian optimization (COMBO) (*70*). More detailed explanations of the Bayesian optimization method can be found in the studies of Ju *et al.* (*14*) and Ueno *et al.* (*70*).

## SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/4/6/eaar4192/DC1

section S1. Analysis method: Green’s function approach

section S2. Efficiency of single-functional optimizations

section S3. Acceleration of the Bayesian optimization

section S4. Structural optimization of antidot armchair GNRs

section S5. Phonon and electron transport in finite periodic structures

section S6. Effects of resonant states on thermoelectric properties

section S7. Resonant states in one-dimensional tight-binding chains

section S8. Statistical errors of the efficiency of the Bayesian optimization

section S9. Uniqueness of the optimal structures

fig. S1. Comparison of optimization efficiency of different functionals.

fig. S2. Acceleration of the Bayesian optimization using a topological descriptor: The mean shortest path.

fig. S3. Structural optimization of antidot armchair GNR (AGNR).

fig. S4. Phonon thermal resistance in finite periodic structures.

fig. S5. Electron transport in finite periodic antidot structures.

fig. S6. Effects of resonant states on thermoelectric properties.

fig. S7. Electron transport properties of one-dimensional tight-binding chains.

fig. S8. Efficiency of the Bayesian optimization.

fig. S9. Model B structures that exhibit *ZT* > 0.95*Z*_{max}*T*.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is **not** for commercial advantage and provided the original work is properly cited.

## REFERENCES AND NOTES

**Acknowledgments:**We would like to thank K. Tsuda, T. Kodama, and T. Shiga for helpful discussions.

**Funding:**This work was financially supported, in part, by the “Materials research by Information Integration” Initiative (MI

^{2}I), Core Research for Evolutional Science and Technology grant no. JPMJCR16Q5 from the Japan Science and Technology Agency, and the Japan Society for the Promotion of Science (KAKENHI, grant no. 16H04274).

**Author contributions:**J.S. directed and designed the project. M.Y. carried out the numerical simulations. All authors contributed to the discussions and wrote the paper.

**Competing interests:**The authors declare that they have no competing interests.

**Data and materials availability:**All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

- Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).