
Designing antimicrobial peptides against infectious diseases
#Machine Learning #Ligand-Based Drug Design #Peptide Design #Infectious Diseases #Antimicrobial Resistance
Context
Antimicrobial peptides (AMPs) are a promising class of therapeutic agents against antibiotic-resistant infections. However, many candidate peptides suffer from hemolytic toxicity and limited bioavailability, and the predictive models used to design them computationally often perform poorly. First as a principal investigator and now as Founder of Ingenie Bio, I developed and refined a suite of machine learning pipelines to design safe and structurally diverse peptides targeting infectious pathogens.
Client/Partner Type
Academic Research (with biotech translation relevance)
Challenge
Traditional AMP discovery pipelines struggle with:
Poor generalization to novel sequences
Bias toward α-helical peptides
Inadequate representation of sequence-structure-function relationships
Lack of outlier and uncertainty detection in predictive models
The overarching goal was to develop ML strategies that not only predict biological activity and safety but also define reliable applicability domains and account for structural diversity, both of which are key to real-world infectious disease targeting.
Solution
Over a multi-project program spanning three major publications, we developed a comprehensive, multi-stage framework:
(1) Toxicity-Aware Classification with Outlier Detection
Developed ML models to predict hemolytic toxicity, a major barrier to AMP clinical use
Trained 14 classifiers (e.g., gradient boosting (GBC), support vector machines (SVM), XGBoost) on physicochemical descriptors computed from curated datasets
Integrated 9 outlier detection methods (e.g., local outlier factor (LOF), Mahalanobis distance) to define the applicability domain
Used models to screen 3,000+ natural AMPs and generate 500+ low-toxicity peptide designs
Plisson, F.*; Ramírez-Sánchez, O. & Martínez-Hernández, C. Machine learning-guided discovery and design of non-hemolytic peptides. Scientific Reports 2020, 10, 16581. [DOI] [Github]
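The pairing of a toxicity classifier with an outlier detector can be sketched as follows. This is a minimal illustration under stated assumptions, not the published pipeline: the descriptor matrix is synthetic random data standing in for physicochemical descriptors, the labels are artificial, and the names `X_train`, `X_query`, and `screen` are hypothetical. It uses scikit-learn's `GradientBoostingClassifier` and `LocalOutlierFactor` in novelty mode, so that a prediction is trusted only when the query falls inside the region covered by the training data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a descriptor matrix (rows = peptides, columns = descriptors).
X_train = rng.normal(size=(200, 5))
# Artificial labels for illustration only: 1 = hemolytic, 0 = non-hemolytic.
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 0).astype(int)

# Scale descriptors so that distance-based outlier detection is not dominated by one column.
scaler = StandardScaler().fit(X_train)
X_scaled = scaler.transform(X_train)

clf = GradientBoostingClassifier(random_state=0).fit(X_scaled, y_train)
# novelty=True allows the fitted detector to score unseen samples via predict().
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_scaled)


def screen(X_query):
    """Return (predicted_label, inside_domain) arrays for query descriptors."""
    Xq = scaler.transform(X_query)
    labels = clf.predict(Xq)
    inside = lof.predict(Xq) == 1  # LOF returns +1 for inliers and -1 for outliers
    return labels, inside


X_query = np.vstack([
    rng.normal(size=(3, 5)),   # resembles the training distribution
    np.full((1, 5), 15.0),     # far outside the training distribution
])
labels, inside = screen(X_query)
for label, ok in zip(labels, inside):
    print(f"predicted class {label}, inside applicability domain: {bool(ok)}")
```

In this sketch the classifier still returns a label for the last query, but the detector marks it as outside the applicability domain, which is the signal to treat that prediction as unreliable.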
(2) Structure-Aware ML Modelling
Applied secondary structure prediction (PEP2D) to 5,800+ AMPs from GRAMPA
Visualized fold space with ternary plots (Helix-Strand-Coil)
Identified structural biases in public datasets and AMP design models
Combined structural projections with outlier detection to guide more diverse and robust peptide design
Aguilera-Puga, M. d. C. & Plisson, F.*. Structure-aware machine learning strategies for antimicrobial peptide discovery. Scientific Reports 2024, 14, 11995. [DOI] [Github]
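The helix-strand-coil projection can be illustrated with a short sketch. It assumes that the secondary structure prediction is available as a per-residue three-state string using the letters H (helix), E (strand), and C (coil); the function names `ss_fractions` and `ternary_xy` and the corner layout of the triangle are hypothetical choices made for this example. Each peptide is reduced to three fractions that sum to one, which are then mapped to a point inside an equilateral triangle.

```python
import math
from collections import Counter


def ss_fractions(ss):
    """Return the (helix, strand, coil) fractions of a three-state H/E/C string."""
    if not ss:
        raise ValueError("empty secondary-structure string")
    counts = Counter(ss)
    unknown = set(counts) - set("HEC")
    if unknown:
        raise ValueError(f"unexpected state symbols: {sorted(unknown)}")
    n = len(ss)
    return counts["H"] / n, counts["E"] / n, counts["C"] / n


def ternary_xy(helix, strand, coil):
    """Map three fractions to 2D coordinates inside an equilateral triangle.

    Assumed corner layout: helix at (0, 0), strand at (1, 0),
    coil at (0.5, sqrt(3) / 2).
    """
    total = helix + strand + coil
    if total <= 0:
        raise ValueError("fractions must sum to a positive value")
    e, c = strand / total, coil / total
    # Barycentric combination of the three corners; the helix corner sits at the origin.
    x = e + 0.5 * c
    y = (math.sqrt(3) / 2) * c
    return x, y


helix, strand, coil = ss_fractions("HHHHEECCCC")
x, y = ternary_xy(helix, strand, coil)
print(f"H={helix:.2f} E={strand:.2f} C={coil:.2f} -> x={x:.3f}, y={y:.3f}")
```

Plotting these coordinates for a whole dataset shows at a glance whether the points crowd one corner, for example the helix corner, which is the kind of fold bias described above.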
(3) Protocolisation and Knowledge Transfer
Published a peer-reviewed protocol that researchers can follow to build and evaluate their own AMP classifiers
Covered data curation, feature engineering, model training, validation, and applicability domain analysis
Positioned the approach as a standardised framework for safer and more effective AMP discovery
Aguilera-Puga, M. d. C.; Cancelarich, N. L.; Marani, M. M.; de la Fuente-Nuñez, C.* & Plisson, F.*. Accelerating the discovery and design of antimicrobial peptides with artificial intelligence. Methods in Molecular Biology 2024, 2714, 329-352. [DOI]
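The first two stages covered by the protocol, data curation and feature engineering, can be sketched in plain Python. This is a simplified illustration rather than the protocol's own code: the length bounds, the three descriptors, and the names `curate` and `describe` are assumptions made for this example. Net charge is approximated by counting K and R as +1 and D and E as -1, ignoring histidine and the termini, and hydropathy uses the Kyte-Doolittle scale.

```python
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")

# Kyte-Doolittle hydropathy scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def curate(sequences, min_len=5, max_len=50):
    """Normalise case, drop duplicates, and keep canonical sequences within length bounds."""
    seen, kept = set(), []
    for seq in sequences:
        s = seq.strip().upper()
        if s in seen or not (min_len <= len(s) <= max_len):
            continue
        if set(s) <= CANONICAL:  # reject ambiguous or non-canonical residues
            seen.add(s)
            kept.append(s)
    return kept


def describe(seq):
    """Compute three simple descriptors for a curated sequence."""
    # Crude net charge: +1 per K/R, -1 per D/E (His and termini ignored).
    charge = sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")
    hydropathy = sum(KD[a] for a in seq) / len(seq)
    return {"length": len(seq), "net_charge": charge, "mean_hydropathy": hydropathy}


raw = [
    "GIGKFLHSAKKFGKAFVGEIMNS",
    "gigkflhsakkfgkafvgeimns",  # duplicate after case normalisation
    "KKXBKKLL",                 # contains non-canonical symbols
    "KLAKLAKKLAKLAK",
]
features = {s: describe(s) for s in curate(raw)}
for seq, desc in features.items():
    print(seq, desc)
```

The resulting dictionary can be turned into a descriptor matrix for model training and validation. A dedicated descriptor library would provide a far richer feature set than these three values.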
Outcome
Discovered 34 natural AMPs with low hemolytic potential from public data
Designed 507 de novo peptides with minimised hemolytic risk
Defined new structural guidelines and modeling practices to mitigate taxonomic and fold bias
Developed a reproducible, extensible modeling pipeline for use in peptide drug discovery
Published two articles in Scientific Reports and a protocol chapter in Springer's Methods in Molecular Biology series
Related articles
Robles-Loaiza, A. A.; Pinos-Tamayo, E. A.; Mendes, B.; Ortega-Pila, J. A.; Proaño-Bolaños, C.; Plisson, F.; Teixeira, C.; Gomes, P.; Almeida, J. R.*. Traditional and Computational Screening of Non-Toxic Peptides and Approaches to Improving Selectivity. Pharmaceuticals 2022, 15, 323. [DOI]
Bajorath, J.; Chávez-Hernández, A. L.; Duran-Frigola, M.; Fernández-de Gortari, E.; Gasteiger, J.; López-López, E.; Maggiora, G. M.; Medina-Franco, J. L.*; Méndez-Lucio, O.; Mestres, J.; Miranda-Quintana, R. A.; Oprea, T. I.; Plisson, F.; Prieto-Martínez, F. D.; Rodríguez-Pérez, R.; Rondón-Villarreal, P.; Saldívar-Gonzalez, F. I.; Sánchez-Cruz, N.; Valli, M. Chemoinformatics and artificial intelligence colloquium: progress and challenges to develop bioactive compounds. Journal of Cheminformatics 2022, 14, 82. [DOI]
Robles-Ramirez, O.; Osuna, J. G.; Plisson, F.; Barrientos-Salcedo, C.*. Antimicrobial Peptides in Livestock: A Review with a One Health Approach. Frontiers in Cellular and Infection Microbiology 2024, 14. [DOI]
Tools/Expertise Used
Python (Scikit-learn, modlAMP, PyOD), Gradient Boosting, Outlier Detection, Feature Engineering, Sequence Analysis, Secondary Structure Prediction (PEP2D), AlphaFold2.
N.B. This work laid the foundation for Ingenie Bio’s service offerings in ML-guided prediction and generation for protein design.