Article in this issue: Nature Chemistry, published online

The team of Kevin Maik Jablonka at Friedrich Schiller University Jena, Germany, recently presented a framework for evaluating the chemical knowledge and reasoning abilities of large language models against the expertise of chemists. The study was published in Nature Chemistry on 20 May 2025.

Large language models (LLMs) have attracted widespread interest for their ability to process human language and perform tasks on which they have not been explicitly trained. However, there is only a limited systematic understanding of the chemical capabilities of LLMs, an understanding that would be required to improve models and mitigate potential harm.

The team introduced ChemBench, an automated framework for evaluating the chemical knowledge and reasoning abilities of state-of-the-art LLMs against the expertise of chemists. They curated more than 2,700 question-answer pairs, evaluated leading open- and closed-source LLMs, and found that the best models, on average, outperformed the best human chemists in the study. However, the models struggle with some basic tasks and provide overconfident predictions.

These findings reveal the impressive chemical capabilities of LLMs while emphasizing the need for further research to improve their safety and usefulness. The team also suggests adapting chemistry education and shows the value of benchmarking frameworks for evaluating LLMs in specific domains.

Appendix: English original

Title: A framework for evaluating the chemical knowledge and reasoning abilities of large language models against the expertise of chemists

Author: Mirza, Adrian; Alampara, Nawaf; Kunchapu, Sreekanth; Ríos-García, Martiño; Emoekabu, Benedict; Krishnan, Aswanth; Gupta, Tanya; Schilling-Wilhelmi, Mara; Okereke, Macjonathan; Aneesh, Anagha; Asgari, Mehrdad; Eberhardt, Juliane; Elahi, Amir Mohammad; Elbeheiry, Hani M.; Gil, María Victoria; Glaubitz, Christina; Greiner, Maximilian; Holick, Caroline T.; Hoffmann, Tim; Ibrahim, Abdelrahman; Klepsch, Lea C.; Köster, Yannik; Kreth, Fabian Alexander; Meyer, Jakob; Miret, Santiago; Peschel, Jan Matthias; Ringleb, Michael; Roesner, Nicole C.; Schreiber, Johanna; Schubert, Ulrich S.; Stafast, Leanne M.; Wonanke, A. D. Dinga; Pieler, Michael; Schwaller, Philippe; Jablonka, Kevin Maik

IssueVolume: 2025-05-20

Abstract: Large language models (LLMs) have gained widespread interest owing to their ability to process human language and perform tasks on which they have not been explicitly trained. However, we possess only a limited systematic understanding of the chemical capabilities of LLMs, which would be required to improve models and mitigate potential harm. Here we introduce ChemBench, an automated framework for evaluating the chemical knowledge and reasoning abilities of state-of-the-art LLMs against the expertise of chemists. We curated more than 2,700 question-answer pairs, evaluated leading open- and closed-source LLMs and found that the best models, on average, outperformed the best human chemists in our study. However, the models struggle with some basic tasks and provide overconfident predictions. These findings reveal LLMs' impressive chemical capabilities while emphasizing the need for further research to improve their safety and usefulness. They also suggest adapting chemistry education and show the value of benchmarking frameworks for evaluating LLMs in specific domains.

DOI: 10.1038/s41557-025-01815-x

Source: https://www.nature.com/articles/s41557-025-01815-x

Journal information

Nature Chemistry: founded in 2009 and published by Springer Nature. Latest IF: 24.274

Official website: https://www.nature.com/nchem/

Submission link: https://mts-nchem.nature.com/cgi-bin/main.plex