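Optimizing the performance of thermoelectric materials has long attracted attention. First-principles calculations are widely applied to thermoelectric materials, both to analyze the underlying mechanisms and to screen for promising high-performance candidates. In recent years, data-driven machine learning methods have also been introduced to the thermoelectric field to accelerate the search for new materials. The general workflow consists of data collection, machine learning, selection of validation samples, and computational verification. In most studies, the machine learning model performs well on the known dataset, but its reliability beyond the known data is not verified. Yet when searching for new materials, the model's extrapolation ability is crucial. Weak extrapolation can generally be improved by expanding the dataset, but adding a large number of samples is expensive. Active learning is a framework that updates the machine learning model through external verification, aiming to maximize the model's extrapolation ability with as few verification samples as possible.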
Figure 1: Search space of diamond-like thermoelectric materials and the active learning framework
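The gap between accuracy on known data and reliability outside it can be illustrated with a toy example. The sketch below is not the paper's data or model: `target` is a hypothetical stand-in for an expensive property calculation, and a least-squares line stands in for the machine learning model. It only shows why an error measured inside the training range says little about the error in a new region.

```python
# Toy illustration (not the paper's data): a model that fits the known
# region well can still fail badly outside the range it was trained on.

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def rmse(model, xs, ys):
    """Root-mean-square error of the fitted line on the given points."""
    a, b = model
    return (sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

def target(x):
    # Hypothetical stand-in for an expensive property calculation.
    return x ** 2

train_x = [i / 10 for i in range(11)]           # known region: 0.0 .. 1.0
inside_x = [i / 10 + 0.05 for i in range(10)]   # unseen points, same region
outside_x = [2 + i / 10 for i in range(11)]     # unseen points, new region

model = fit_line(train_x, [target(x) for x in train_x])
rmse_inside = rmse(model, inside_x, [target(x) for x in inside_x])
rmse_outside = rmse(model, outside_x, [target(x) for x in outside_x])
print(f"in-range RMSE: {rmse_inside:.3f}, out-of-range RMSE: {rmse_outside:.3f}")
```

The out-of-range error is far larger than the in-range error even though the held-out points in both sets were unseen during fitting, which is the situation active learning is meant to address.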
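Prof. Jiong Yang of the Materials Genome Institute at Shanghai University, Prof. Wenqing Zhang of the Department of Physics at Southern University of Science and Technology, and colleagues built a high-accuracy extrapolation model. Starting from the power factors of 158 diamond-like thermoelectric materials obtained in earlier high-throughput calculations, they combined machine learning and first-principles calculations within an active learning framework. The framework comprises a database, a module for machine learning and validation-sample selection, and a computational verification module (Figure 1). The strategy used to select validation samples strongly affects both the accuracy and the efficiency of active learning. Among the strategies tried, the "query by committee" strategy, which uses the disagreement among several machine learning algorithms as the criterion for selecting validation samples, produced the model with the strongest extrapolation ability. Analysis of the power factors of all compounds in the search space shows that pnictides, as well as chalcogenides containing vacancies and elements with small atomic radii, may have large power factors (Figure 2). The active learning framework is not limited to thermoelectric materials; it can also be applied to other functional materials and is significant for accelerating the discovery of high-performance materials.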
Figure 2: New thermoelectric materials with high p-type power factors predicted from the extrapolation results
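The loop of database, committee-based selection, and verification described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the committee here consists of three toy regressors (two nearest-neighbour averages and a least-squares line) rather than the machine learning algorithms used in the paper, the descriptor is a single number, and `oracle` is a hypothetical stand-in for the first-principles calculation of a power factor.

```python
import statistics

def oracle(x):
    # Hypothetical stand-in for a first-principles power factor calculation.
    return x ** 2

def knn_model(k):
    """Return a trainer that builds a k-nearest-neighbour mean predictor."""
    def train(data):
        def predict(x):
            nearest = sorted(data, key=lambda p: abs(p[0] - x))[:k]
            return sum(y for _, y in nearest) / len(nearest)
        return predict
    return train

def linear_model(data):
    """Train an ordinary least-squares line on (x, y) pairs."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data)
    a = sum((x - mx) * (y - my) for x, y in data) / sxx
    return lambda x: a * (x - mx) + my

# Toy committee (an assumption for illustration, not the paper's algorithms).
COMMITTEE = [knn_model(1), knn_model(3), linear_model]

def select_by_committee(database, pool):
    """Pick the unverified candidate on which the committee disagrees most."""
    members = [train(database) for train in COMMITTEE]
    return max(pool, key=lambda x: statistics.pstdev([m(x) for m in members]))

def active_learning(database, pool, rounds):
    """Repeat: select one candidate, verify it, add the result to the database."""
    database, pool = list(database), list(pool)
    for _ in range(rounds):
        pick = select_by_committee(database, pool)
        pool.remove(pick)
        database.append((pick, oracle(pick)))  # "computational verification"
    return database, pool

initial = [(x / 10, oracle(x / 10)) for x in range(11)]  # known region 0.0 .. 1.0
candidates = [1 + x / 10 for x in range(1, 21)]          # search space 1.1 .. 3.0
database, remaining = active_learning(initial, candidates, rounds=5)
print([round(x, 1) for x, _ in database[len(initial):]])
```

Selecting by disagreement spends the costly verification step on the candidates where the current models are least consistent with each other, which tend to lie farthest from the known data. In practice one would replace the toy committee with real regressors and retrain the final model on the enlarged database.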
The paper was recently published in npj Computational Materials 6: 171 (2020). The English title and abstract follow; the paper PDF is freely accessible.
Active learning for the power factor prediction in diamond-like thermoelectric materials
Ye Sheng, Yasong Wu, Jiong Yang, Wencong, Pierre Villars & Wenqing Zhang 
The Materials Genome Initiative requires the crossing of material calculations, machine learning, and experiments to accelerate the material development process. In recent years, data-based methods have been applied to the thermoelectric field, mostly on the transport properties. In this work, we combined data-driven machine learning and first-principles automated calculations into an active learning loop, in order to predict the p-type power factors (PFs) of diamond-like pnictides and chalcogenides. Our active learning loop contains two procedures (1) based on a high-throughput theoretical database, machine learning methods are employed to select potential candidates and (2) computational verification is applied to these candidates about their transport properties. The verification data will be added into the database to improve the extrapolation abilities of the machine learning models. Different strategies of selecting candidates have been tested, finally the Gradient Boosting Regression model of Query by Committee strategy has the highest extrapolation accuracy (the Pearson R = 0.95 on untrained systems). Based on the prediction from the machine learning models, binary pnictides, vacancy, and small atom-containing chalcogenides are predicted to have large PFs. The bonding analysis reveals that the alterations of anionic bonding networks due to small atoms are beneficial to the PFs in these compounds.