peptide.lab

Antibiotic resistance kills an estimated 1.27 million people per year. peptide.lab uses distributed browser computing to discover novel antimicrobial peptides (AMPs). Volunteer browsers run ESM-2, a 150-million-parameter protein language model, to screen candidate sequences for predicted antimicrobial activity. Top candidates will be published openly and offered to wet labs for experimental validation. The peptide sequence space is effectively infinite. The search runs as long as there are volunteers.

PHASE 1

embed known peptides

PHASE 2

train classifier

ONGOING

screen novel candidates

sequences analyzed

AMP candidates

active nodes

compute hours

seq / min

volunteer network

no nodes connected. become a volunteer →

top AMP candidates discovered

candidates appear here once screening begins

Every browser that joins accelerates the search. No installs. No personal data collected.

volunteer your browser

about peptide.lab

peptide.lab is an independent research initiative built on a simple premise: meaningful scientific discovery no longer requires a PhD, a university affiliation, or a seven-figure grant. The tools that were locked behind institutional walls for decades are now open source. The data is public. The compute is distributed. What used to take a lab of 20 now takes a laptop and a willingness to learn.

We are not a pharmaceutical company. We are not a university lab. We are independent researchers using open source machine learning models, public biological databases, and volunteer computing to explore the antimicrobial peptide sequence space. Our work is transparent, our methods are reproducible, and our results belong to the public.

The antibiotic resistance crisis is not waiting for traditional drug development pipelines to catch up. An estimated 1.27 million people die every year from drug-resistant infections. The WHO calls it one of the top ten global public health threats. Most large pharmaceutical companies have abandoned antibiotic R&D because it is not profitable. The gap between what is needed and what is being done is enormous.

peptide.lab exists in that gap. We believe that small, focused, open research efforts can produce candidates worth testing. We believe that distributing the computational burden across volunteer browsers is a legitimate and scalable approach to sequence screening. And we believe that when we find promising candidates, the right thing to do is publish them openly and connect with wet labs that can validate them.

This is citizen science in the most literal sense. Every browser that contributes compute is a collaborator. Every candidate we find belongs to everyone. No patents. No paywalls. No gatekeeping.

how it works

01. data

We start with a curated, balanced dataset of 906 peptide sequences: 453 validated antimicrobial peptides from APD3 and 453 non-antimicrobial controls from UniProt/Swiss-Prot, stratified 80/20 into train and test sets.
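A stratified split can be sketched in a few lines. This is a minimal illustration, not the project's actual pipeline: the function name, the seed, and the choice to round each class's test share down are assumptions. Rounding down per class (90 of 453) does reproduce the 726/180 split listed under technical details.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Assumption: floor the per-class test size (453 * 0.2 -> 90).
        n_test = int(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)

# 453 AMPs (label 1) and 453 controls (label 0), as described above.
labels = [1] * 453 + [0] * 453
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 726 180
```

Splitting per class keeps both sets balanced, so the test set holds 90 AMPs and 90 controls.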

02. embed

Volunteer browsers run ESM-2 (150M parameters, Meta AI, Science 2023) to generate 640-dimensional embedding vectors for each sequence. These embeddings encode structural and functional properties learned from 250 million proteins.
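A protein language model produces one vector per residue, so some pooling step is needed to get a single fixed-length vector per sequence. The sketch below assumes mean pooling, which is a common choice but is not confirmed by anything on this page. The toy input uses 4 dimensions in place of 640, and `mean_pool` is a hypothetical helper name.

```python
def mean_pool(residue_vectors):
    """Average per-residue vectors into one fixed-length sequence embedding."""
    if not residue_vectors:
        raise ValueError("need at least one residue vector")
    dim = len(residue_vectors[0])
    if any(len(v) != dim for v in residue_vectors):
        raise ValueError("all residue vectors must share one dimension")
    n = len(residue_vectors)
    return [sum(v[d] for v in residue_vectors) / n for d in range(dim)]

# Toy stand-in for model output: 3 residues x 4 dimensions
# (the real model would give 640 dimensions per residue).
toy_output = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [2.0, 3.0, 2.0, 2.0],
]
embedding = mean_pool(toy_output)
print(embedding)  # [2.0, 1.0, 2.0, 2.0]
```

Whatever the peptide length, the pooled vector has the same dimension, which is what lets a single classifier score sequences of different lengths.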

03. classify

A classifier trained on the embedded dataset predicts whether a novel sequence is likely antimicrobial. The model is trained on the server and distributed to browsers for local inference.
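Local inference with a logistic regression model (the classifier type named under technical details) amounts to a weighted sum followed by a sigmoid. The sketch below is illustrative only: the function name is hypothetical, and the 3-dimensional weights are made up to keep the arithmetic readable. A real weight vector would have 640 entries.

```python
import math

def amp_probability(embedding, weights, bias):
    """Logistic regression: sigmoid of the weighted sum of embedding features."""
    if len(embedding) != len(weights):
        raise ValueError("embedding and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, embedding))
    # Numerically stable sigmoid for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical weights and bias, for illustration only.
weights = [0.5, -1.0, 2.0]
bias = -0.5
p = amp_probability([1.0, 1.0, 0.5], weights, bias)
print(p)  # 0.5, since the weighted sum is exactly 0
```

Because the model is just one weight vector and a bias, distributing it to browsers costs a few kilobytes, and each prediction is a single dot product.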

04. explore

Novel candidates are generated through guided mutation and recombination of known AMP scaffolds. The exploration adapts: high-scoring sequences become new scaffolds, focusing the search on productive regions of sequence space.
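The loop described above (mutate, recombine, score, promote the best to scaffolds) can be sketched as a small evolutionary search. Everything here is an assumption made for illustration: the mutation rate, single-point crossover, pool sizes, and the toy scoring function, which stands in for the classifier. The two starting sequences are made-up strings, not real AMPs.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq, rng, rate=0.1):
    """Replace each residue with a random amino acid with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )

def recombine(a, b, rng):
    """Single-point crossover: a prefix of `a` joined to a suffix of `b`."""
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]

def explore(scaffolds, score, rounds=5, children=20, keep=4, seed=0):
    """Each round, breed children from the pool and keep the top scorers."""
    rng = random.Random(seed)
    pool = list(scaffolds)
    for _ in range(rounds):
        candidates = set(pool)  # current scaffolds stay in the running
        for _ in range(children):
            a, b = rng.choice(pool), rng.choice(pool)
            candidates.add(mutate(recombine(a, b, rng), rng))
        # High scorers become the next round's scaffolds (ties broken by sequence).
        pool = sorted(candidates, key=lambda s: (-score(s), s))[:keep]
    return pool

def toy_score(seq):
    """Stand-in for the classifier: fraction of K/R residues."""
    return sum(aa in "KR" for aa in seq) / len(seq)

scaffolds = ["GLLKKAGELLKKAG", "AKKLLDGLAKKLLG"]  # made-up toy sequences
best = explore(scaffolds, toy_score)
```

Keeping the current scaffolds among the candidates means the best score in the pool can never drop from one round to the next.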

05. screen

Each candidate is evaluated for net charge, amphipathicity, protease stability, predicted toxicity, and overall antimicrobial probability. The sequence space is finite but far too large to enumerate: a 20-mer alone has 20²⁰ ≈ 10²⁶ possible sequences. The search never ends.
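Two of the quantities in this step are simple enough to sketch directly. The net charge below is a deliberately crude approximation (+1 per lysine or arginine, -1 per aspartate or glutamate, ignoring histidine and the termini) and is an assumption, not necessarily the formula the project uses. The second function just checks the 20-mer count. Both function names are hypothetical.

```python
def net_charge(seq):
    """Approximate net charge at neutral pH: +1 per K/R, -1 per D/E."""
    return sum((aa in "KR") - (aa in "DE") for aa in seq)

def sequence_space(length, alphabet_size=20):
    """Number of distinct sequences of a given length."""
    return alphabet_size ** length

print(net_charge("KKRDE"))              # 1  (three positive, two negative)
print(f"{sequence_space(20):.3e}")      # 1.049e+26
```

At one million sequences per minute, covering 20-mers alone would take on the order of 10²⁰ minutes, which is why the search has to be guided rather than exhaustive.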

06. validate

Top candidates will be published as a bioRxiv preprint with full property profiles. We will actively seek wet lab partners to test candidates against MRSA, E. coli, and other WHO priority pathogens.

why independent research matters

The tools of modern computational biology have been democratized. ESM-2 is open source. AlphaFold is open source. AutoDock Vina is free. The entire UniProt database is publicly accessible. PubMed is free to read. The infrastructure that made computational drug discovery a billion-dollar institutional endeavor is now available to anyone with an internet connection.

This matters because institutional science has blind spots. Grant cycles favor incremental work on established targets. Pharmaceutical companies optimize for return on investment, not for global health impact. Antibiotic development has been deprioritized industry-wide despite being one of the most urgent problems in medicine.

Independent researchers, citizen scientists, and open source communities can fill gaps that institutions won't. Foldit (University of Washington, 2008) showed that non-experts can solve real protein structure problems. Folding@home demonstrated that distributed computing produces publishable results. Galaxy Zoo showed that volunteers can classify galaxies as well as trained astronomers.

peptide.lab follows this tradition. We are not replacing academic research. We are supplementing it. The candidates we identify computationally still require experimental validation by trained scientists in equipped laboratories. Our role is to narrow the search space from billions of possibilities to hundreds of promising leads. That is a genuine and meaningful contribution.

technical details

model

ESM-2 150M
esm2_t30_150M_UR50D. 150 million parameters. 640-dimensional embeddings. Runs in-browser via Transformers.js (ONNX/WebAssembly). Published: Lin et al., Science 379.6637 (2023).

training data

APD3 + UniProt
453 validated AMPs (APD3, Wang et al. 2016). 453 non-antimicrobial controls (Swiss-Prot reviewed entries). Balanced, deduplicated, stratified 80/20 train/test split (726/180).

classifier

logistic regression
Trained server-side on crowdsourced ESM-2 embeddings. Weights distributed to browsers for local inference during screening. Retrained periodically as embedding coverage increases.
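Server-side training of a logistic regression can be sketched with plain batch gradient descent. This is an illustration under stated assumptions, not the project's training code: the optimizer, learning rate, epoch count, and the absence of regularization are all guesses, and the four 2-dimensional points stand in for 640-dimensional embeddings.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy, linearly separable data: label 1 = AMP, label 0 = control.
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.0]]
y = [0, 0, 1, 1]
w, b = train_logreg(X, y)
probs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]
```

The trained `w` and `b` are the only things a browser needs for local inference, which is what makes periodic retraining and redistribution cheap.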

disclosures

Computational predictions only. All results are computational predictions that have not been experimentally validated. "Candidate" means a sequence with promising predicted properties, not a confirmed antimicrobial agent.

Independent research. peptide.lab is not affiliated with any university, pharmaceutical company, or government agency. This work has not undergone institutional peer review.

Open science. All code, data, methods, and results are open source. No patents will be filed on volunteer-discovered candidates. Discoveries belong to the scientific commons.

Volunteer privacy. No personal data is collected from volunteer browsers. Only peptide sequences and their computed numerical embeddings are transmitted and stored.