Predicting TDP1 Inhibitory Activity! Accelerating Drug Discovery with ChemBERTa 🚀

Published: 2025/12/3 20:42:22

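Super summary: Using an AI model called ChemBERTa, the author found a way to efficiently discover TDP1 inhibitors (important for cancer treatment!). It's a chance for IT companies to get involved, too 💖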

✨ Gyaru-Style Sparkle Points ✨

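● AI helping to find TDP1 inhibitors, the key 🔑 to cancer treatment? That's so cutting-edge ✨
● It predicts inhibitory activity directly from SMILES (a notation for representing molecular structure)! Efficient, right? 😍
● A chance for IT companies to enter the drug discovery field! New business opportunities could be opening up 💖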

Fine-Tuning ChemBERTa for Predicting Inhibitory Activity Against TDP1 Using Deep Learning

Baichuan Zeng

Predicting the inhibitory potency of small molecules against Tyrosyl-DNA Phosphodiesterase 1 (TDP1), a key target in overcoming cancer chemoresistance, remains a critical challenge in early drug discovery. We present a deep learning framework for the quantitative regression of pIC50 values from Simplified Molecular Input Line Entry System (SMILES) strings using fine-tuned variants of ChemBERTa, a pre-trained chemical language model. Leveraging a large-scale consensus dataset of 177,092 compounds, we systematically evaluate two pre-training strategies, Masked Language Modeling (MLM) and Masked Token Regression (MTR), under stratified data splits and sample weighting to address severe activity imbalance, in which only 2.1% of compounds are active. Our approach outperforms a Random Predictor baseline in both regression accuracy and virtual screening utility and performs competitively with Random Forest, achieving an enrichment factor (EF@1%) of 17.4 and a Precision@1% of 37.4 among top-ranked predictions. The resulting model, validated through ablation and hyperparameter studies, provides a ready-to-deploy tool for prioritizing TDP1 inhibitors for experimental testing. By enabling pIC50 prediction directly from SMILES, without 3D structural information, this work demonstrates the potential of chemical transformers to accelerate target-specific drug discovery.
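The two screening metrics quoted in the abstract, EF@1% and Precision@1%, can be illustrated with a short sketch. The abstract does not give the paper's exact definitions, so this assumes the common convention: rank compounds by predicted score, take the top 1%, and compare the fraction of actives there with the fraction in the whole set. The function name `top_fraction_metrics` and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np


def top_fraction_metrics(is_active, scores, fraction=0.01):
    """Return (precision, enrichment_factor) for the top `fraction` of compounds.

    Assumed convention (not confirmed by the paper):
      precision = actives in top-k / k
      EF        = precision / overall active rate
    """
    is_active = np.asarray(is_active, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n = is_active.size
    k = max(1, int(round(n * fraction)))  # number of top-ranked compounds kept

    # Rank by predicted score, highest first; a stable sort keeps ties deterministic.
    top_idx = np.argsort(-scores, kind="stable")[:k]

    precision = float(is_active[top_idx].mean())
    base_rate = float(is_active.mean())
    ef = precision / base_rate if base_rate > 0 else float("nan")
    return precision, ef


if __name__ == "__main__":
    # Toy data: 1,000 compounds, 20 of them active (2%).
    labels = np.zeros(1000, dtype=bool)
    labels[:20] = True
    preds = np.zeros(1000)
    preds[:5] = 1.0      # five actives ranked at the very top
    preds[20:25] = 0.9   # five inactives ranked right behind them
    precision, ef = top_fraction_metrics(labels, preds, fraction=0.01)
    print(f"Precision@1% = {precision:.2f}, EF@1% = {ef:.1f}")
```

Under this convention, an enrichment factor well above 1 means the top-ranked list contains actives at many times the rate expected from random selection, which is why the metric matters when only about 2% of the compounds are active.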

cs / cs.LG / cs.AI
