Part III: Synthetic Biology and the Digital Path to Clinic

Chapter 9: From In Silico to Clinic — Ex Vivo, Clinical Simulation, and Efficacy Validation

Written: 2026-05-12 Last updated: 2026-05-12

Why this chapter

Chapters 4 through 8 built up an AI-driven design pipeline — strain triage, AlphaFold-grade target docking, microbiome-skin interaction modeling, DBTL strain engineering, generative and Bayesian formulation. By the end of Chapter 8 a candidate microbiome cosmetic active exists on paper, with predicted target binding, predicted producibility, and predicted formulation behavior. None of those predictions has yet touched human skin. This chapter is where they do.

The honest stake is not technical. It is evidentiary. The pharma industry has one published end-to-end example of an AI-designed small molecule clearing a randomized clinical trial — Insilico's rentosertib reaching Phase 2a with a mean forced vital capacity (FVC) change of +98.4 mL versus −20.3 mL placebo over 12 weeks ^[23]; the same program's earlier Nature Biotechnology writeup documents target identification through IND in roughly 30 months ^[22]. The cosmetic industry has zero equivalent published clinical readouts of AI-designed actives as of May 2026 ^[11]. The Korean COSMAX–Dankook axis has the most rigorous publicly available cosmetic-microbiome clinical work in any single corpus — EPI-7 ferment filtrate as an anti-aging postbiotic ^[14] and the Streptococcus-spermidine pipeline ^[13] — but neither was designed by an AI model. The validation infrastructure that closes this gap is what this chapter walks.

Three quantitative anchors for this chapter

1. Pharma template (rentosertib Phase 2a): 60 mg QD arm mean FVC +98.4 mL vs placebo −20.3 mL at 12 weeks, dose-dependent trend, the first clinical efficacy readout for an AI-designed active ^[23]. The cosmetic industry has no analog yet.
2. Korean cosmetic clinical reference (EPI-7): prospective topical postbiotic study with wrinkle, elasticity, hydration, and transepidermal water loss (TEWL) endpoints, Korean adult cohort, Yonsei–Dankook–COSMAX–HuNBiome consortium ^[14]. The single most rigorous Korean cosmetic-microbiome publication in the corpus.
3. L'Oréal-co-authored systematic review (Haykal 2025): 74 included studies, 22 RCTs at Level I evidence, zero AI-designed cosmetic actives reaching a publication-grade clinical readout ^[11]. The gap is named by L'Oréal's own authors.

9.1 The validation gap — why most AI cosmetic claims do not end in trials

Cosmetic R&D and pharma R&D operate under different evidentiary regimes, and the differences explain almost everything that follows. A new drug must clear a randomized, blinded, prospectively-registered, statistically-powered clinical trial to enter the US, EU, or Korean market. A new cosmetic active does not — substantiation requires enough evidence for the marketing claim and the safety dossier, both of which are evaluated by the firm's regulatory affairs team and audited only when challenged. The asymmetry is not unethical; it reflects the lower risk profile of topical, leave-on cosmetic products versus systemic drug exposure. But it has the predictable consequence that the cosmetic industry's incentive to publish peer-reviewed efficacy trials is weak. A successful AI-designed cosmetic active immediately becomes proprietary IP and a marketing differentiator — the firm files patents and writes claim language, it does not write New England Journal of Medicine articles.

The result is exactly what L'Oréal's own systematic review ^[11] documents: 74 included studies on cosmetogenomics AI, 22 of which reached Level I evidence (randomized controlled trials), and not a single example of an AI-designed cosmetic active brought all the way to peer-reviewed clinical readout. The Atallah 2025 review of bioengineered microbiome cosmetics reaches the same conclusion through a different lens — most cited efficacy claims for engineered actives "rely on company press releases, not peer-reviewed clinical trials" ^[3]. The Di Guardo 2025 cosmetic-AI review notes that cosmetic-prebiotic AI is "barely covered because public data is scant" ^[6]. Three independent reviews from three different angles converge on the same gap.

This is Gap 1 of the book, named in the analysis layer and load-bearing in three chapters: Chapter 5 (the AI design toolkit is ready), this chapter (the validation infrastructure exists), and Chapter 12 (the proposed publication template). The gap is not technical. It is economic and cultural. Every framework discussed in the remainder of this chapter — ex vivo models, clinical simulation, digital biomarkers, the EPI-7 study, the spermidine pipeline — is independently usable today. What is missing is the integrated end-to-end publication.

The cosmetic industry has been here before. The first generation of fermentation cosmetics (Pitera, Sulwhasoo, Galactomyces ferment filtrate) was marketed for decades before molecular mechanism papers caught up — ^[24] gave Galactomyces ferment filtrate its first AHR-agonist mechanism roughly 30 years after Pitera's commercial launch; ^[9] added the NRF2 pillar another seven years later. The historical pattern is that mechanism follows marketing by decades. The AI inflection is supposed to invert that pattern — design with mechanism in hand, validate, publish, then claim. The cosmetic industry has not yet inverted it. This chapter is the playbook for inversion.

Figure 9.1 — Validation rigor ladder. Y-axis evidentiary rigor, X-axis example studies. Visualizes the gap between current cosmetic AI actives and the pharma template. illustration by author (Gemini assisted)

9.2 Ex vivo skin models — what they replicate and what they cannot

The cleanest substitute for human skin that is not human skin is a reconstructed skin model — keratinocytes cultured into a stratified, differentiated epidermal sheet. Two commercial platforms dominate. EpiSkin (L'Oréal-affiliated) and EpiDerm (MatTek, since the 2007 standardization wave) are reconstructed human epidermis (RHE) tissues used worldwide for irritation, corrosion, sensitization, and increasingly efficacy assays. Both are validated under OECD test guidelines (TG 431 for corrosion, TG 439 for irritation), which means a cosmetic firm can use them in lieu of animal tests for regulatory submissions in the EU (where animal testing for cosmetics has been banned since 2013), Korea, and increasingly in the US. The OECD validation is a measurement-system credential, not an efficacy credential — but it created the supply chain that downstream efficacy work piggybacks on.

Above RHE in biological complexity sit full-thickness skin models (T-Skin, Phenion FT, EpiDerm-FT) that include a dermal compartment with fibroblasts beneath the keratinocyte layer. These support readouts that pure epidermal models cannot — collagen synthesis, dermal-epidermal junction integrity, anti-aging endpoints relevant to AlphaFold-derived targets like collagen-stabilizing peptides (Chapter 5). The catch is that full-thickness models cost roughly 3–5× more per insert, take longer to mature (2–3 weeks vs 1 week for RHE), and have inter-batch variability that demands larger replicate counts to detect modest effects. This is exactly the economic friction that pushes most AI-designed cosmetic candidates into RHE rather than full-thickness, which in turn limits the depth of mechanism that ex vivo work can defend.

The frontier is microfluidic skin-on-chip with deliberate microbial colonization. The 2025 Nature Communications hydrogel-scaffold skin microecology model ^[19] is the published exemplar: a multi-species microbial community stably maintained on a skin-mimetic scaffold for systematic testing of cosmetic actives against native commensals. The Verma 2024 organoid biofilm review ^[26] explicitly frames skin-on-a-chip with organoid microbiome integration as "required for next-gen cosmetic efficacy testing," which is also where Chapter 6's digital twin models are most naturally grounded. The limit as of 2026: skin-on-chip is still a research tool, not an industrial QC tool, and the supply chain is fragmented (each lab grows its own chips, calibration protocols are not standardized, and no OECD-equivalent validation exists yet). For an AI-designed cosmetic active in 2026, skin-on-chip is the deepest ex vivo validation realistically achievable, but it does not by itself produce a regulatory-grade readout.

What none of these models replicates is full immunological context. Ex vivo models lack functional adaptive immunity, systemic neuroendocrine input, and the chronic homeostatic feedback loops between barrier function, microbiome composition, and immune surveillance that ^[4] established as foundational. An AI-predicted anti-inflammatory active that scores well on an EpiDerm IL-6 assay can still fail in human use because the model cannot capture the human Th17–Treg balance that shapes the in vivo response. Ex vivo is necessary, not sufficient, and the gap between "necessary" and "sufficient" is exactly what in silico clinical simulation and human trials are supposed to bridge.

L'Oréal's own R&I internal review acknowledges this honestly: their analytical toolbox lists 16S, shotgun, metabolomics, and ex vivo organotypic skin, with explicit acknowledgment of "low biomass, batch effects, sample collection variability" as the persistent reproducibility burden ^[8]. Industry insiders read this paper as the cleanest available admission that even the best-resourced cosmetic firms run into the same ex vivo limitations academic labs do.

Figure 9.2 — Ex vivo skin model lineup: (a) RHE (EpiSkin/EpiDerm), (b) full-thickness (T-Skin/Phenion-FT), (c) microfluidic skin-on-chip with microbial colonization. Readouts per model annotated. illustration by author (Gemini assisted)

9.3 In silico clinical simulation — PBPK and QSP, transplanted from pharma

Pharma has been doing in silico clinical trials in earnest for over a decade. The technology stack that matters has two layers. Physiologically-based pharmacokinetics (PBPK) simulates how a compound distributes through compartments — gut, blood, organ-specific tissue — under mechanistic ADME (absorption, distribution, metabolism, excretion) parameters. Quantitative systems pharmacology (QSP) sits above PBPK and adds disease-mechanism models — cytokine networks, receptor occupancy dynamics, biomarker progression — so a candidate's predicted plasma concentration translates into a predicted clinical endpoint. The FDA accepts PBPK and QSP submissions in drug regulatory filings; the EMA does the same. For oral and injectable drugs the workflow is mature.

The transplant to topical cosmetics is partial. Topical PBPK exists: since 1992 the Potts–Guy equation has given cosmetic chemists a simple regression for predicting permeability through the stratum corneum from molecular weight and logP, and modern ML adds nonlinear vehicle and excipient effects (Chapter 8). What is thin in cosmetic-specific literature is QSP for cosmetic endpoints such as wrinkle depth, elasticity, and hydration. The reason is that cosmetic endpoints are not biomarker-anchored the way drug endpoints are. A pharma QSP model for idiopathic pulmonary fibrosis (IPF) takes "FVC change" as the endpoint and has a mechanistic chain from drug → TNIK inhibition → fibroblast proliferation → collagen deposition → measured FVC. A cosmetic QSP model for "wrinkle improvement" lacks a mechanistic chain of equivalent rigor; wrinkle morphology is a composite of dermal collagen quality, dermal–epidermal junction integrity, hydration, and superficial barrier topology, each of which has its own incomplete mechanistic model.
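The Potts–Guy relation mentioned above can be written out as a short calculation. The sketch below uses the commonly quoted form of the 1992 regression, log10 Kp (cm/s) = −6.3 + 0.71·logP − 0.0061·MW. The function names and the two example molecules are hypothetical and serve only to show the trend: small lipophilic molecules are predicted to permeate far faster than large hydrophilic ones. Vehicle and excipient effects are ignored, which is the part the ML extensions are meant to cover.

```python
def potts_guy_log_kp(log_kow: float, mol_weight: float) -> float:
    """log10 of the stratum corneum permeability coefficient Kp, in cm/s.

    log_kow is the octanol-water partition coefficient (logP);
    mol_weight is the molecular weight in g/mol.
    """
    return -6.3 + 0.71 * log_kow - 0.0061 * mol_weight


def kp_cm_per_hour(log_kow: float, mol_weight: float) -> float:
    """Permeability coefficient converted from cm/s to cm/h."""
    return 10 ** potts_guy_log_kp(log_kow, mol_weight) * 3600.0


# Hypothetical candidates for illustration; not taken from the chapter's references.
candidates = {
    "small lipophilic active": {"log_kow": 2.0, "mol_weight": 200.0},
    "large hydrophilic peptide": {"log_kow": -1.0, "mol_weight": 800.0},
}

for name, props in candidates.items():
    log_kp = potts_guy_log_kp(**props)
    kp_h = kp_cm_per_hour(**props)
    print(f"{name}: log10 Kp = {log_kp:.2f} (cm/s), Kp = {kp_h:.2e} cm/h")
```

The molecular-weight term dominates for peptide-sized molecules, which is one reason peptide candidates from a structure-prediction pipeline usually need a delivery strategy before a topical efficacy claim is plausible.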

The cosmetic-industry workaround is virtual cohort simulation, which Chapter 8 examined for formulation. The same architecture serves clinical-readout simulation. Unilever's disclosed 2,500-subject AI virtual cohort ^[25] pre-screens formulas before bench validation; conceptually the same engine can pre-screen efficacy outcomes before human trial. The unsolved problem is calibration. A virtual cohort can match a real cohort's mean response distribution by construction (it is trained on the real cohort), but it cannot independently validate that generalization to a new ethnicity, age band, or skin type works. The peer-reviewed cosmetic literature does not yet contain a published virtual-cohort validation study that meets pharma QSP standards.

The transfer template, when the cosmetic industry chooses to deploy it, looks like this. (i) Identify a cosmetic endpoint with a defensible mechanistic chain — anti-aging via collagen quality is the most mature candidate, with ^[13] providing the spermidine-collagen mechanism baseline. (ii) Build a QSP-style model linking the candidate active to the endpoint through measurable intermediates. (iii) Calibrate against a small real human cohort with the same instrumentation that the eventual trial will use. (iv) Pre-screen formulation × dosing variants in silico, advance the survivors. (v) Run a registered prospective clinical study and publish. The EPI-7 study below is the closest existing approximation to steps (iii)–(v), though it did not formally use a virtual cohort for pre-screening.
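Step (ii) can be made concrete with a toy model. The sketch below is a minimal turnover (indirect-response) model of the kind used in pharma PK/PD: an Emax curve converts local concentration into target engagement, and engagement stimulates the synthesis rate of a normalized "collagen index". Every name and parameter value here (`ec50`, `max_stimulation`, `k_deg`, the index itself) is a hypothetical placeholder and not a calibrated value from any cited study; a real model would need the step (iii) calibration against measured intermediates.

```python
def target_engagement(conc: float, ec50: float) -> float:
    """Emax-type saturation curve: 0 at zero concentration, 0.5 at conc == ec50."""
    return conc / (ec50 + conc)


def simulate_collagen_index(conc, ec50, max_stimulation, k_deg, weeks, dt=0.01):
    """Euler integration of d(level)/dt = k_syn * (1 + stim) - k_deg * level.

    The index is normalized so that the untreated steady state equals 1.0.
    Rates are per week; all parameters are hypothetical.
    """
    baseline = 1.0
    k_syn = k_deg * baseline  # synthesis rate that holds the untreated baseline
    stim = max_stimulation * target_engagement(conc, ec50)
    level = baseline
    for _ in range(int(round(weeks / dt))):
        level += dt * (k_syn * (1.0 + stim) - k_deg * level)
    return level


# Hypothetical dose sweep over 12 weeks (concentration in units of ec50).
for conc in (0.0, 0.5, 1.0, 5.0):
    index = simulate_collagen_index(conc, ec50=1.0, max_stimulation=0.3,
                                    k_deg=0.2, weeks=12)
    print(f"conc = {conc:>4}: collagen index at week 12 = {index:.3f}")
```

The useful property of even a toy like this is that it exposes which intermediates must be measured: without an independent estimate of the turnover rate `k_deg`, a 12-week readout cannot distinguish a weak active from a slow tissue.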

9.4 Digital biomarkers — what counts as a measurement

The validation chain is only as credible as the measurement. Cosmetic clinical studies have historically used a small set of standardized instruments: cutometer (skin elasticity), corneometer (hydration via dielectric constant), tewameter (transepidermal water loss), sebumeter (sebum quantity), VISIA imaging (high-resolution photography with spectral channels). The methodological burden is real — inter-instrument variability, operator effects, environmental dependence (the room must be 22°C and 40–50% relative humidity for hydration readings to be comparable). Cosmetic clinical trial design has internalized this burden for decades. The EPI-7 study used precisely this instrument set ^[14].

The frontier is AI-augmented digital biomarkers. Three threads matter. First, smartphone-based skin imaging AI. L'Oréal Skin Genius and the more recent Cell BioPrint platform deploy convolutional networks on smartphone-quality images to predict skin attributes that previously required clinic instruments — pore visibility, redness, pigmentation distribution, even biological skin age. The Skin Genius diagnostic ships through retail and consumer applications and exceeds 100 million inferences cumulatively ^[17]; the system was extended to 28 countries through partnerships with Modiface and chain retail ^[16]. The selling point is not that smartphone AI replaces a cutometer — it doesn't — but that it puts a consistent longitudinal instrument in every consumer's hand, enabling at-home outcome tracking across weeks and months that clinic visits cannot match.

Second, dermoscopy AI. The dermatology community has built convolutional and now Transformer-based classifiers for melanoma, basal cell carcinoma, and inflammatory skin disease over the past decade. The cosmetic industry has begun adopting the same architectures for non-medical endpoints — wrinkle depth segmentation, sebaceous-pore mapping, redness localization. The fly in the ointment is dataset bias: most published dermoscopy AI has been trained on lighter Fitzpatrick skin types, and the L'Oréal-co-authored Haykal review explicitly flags "limited geographic diversity, darker phototypes underrepresented" as a structural reproducibility failure mode in cosmetic AI ^[11]. A digital biomarker that works for Fitzpatrick I–III subjects and fails on IV–VI is, in regulatory and ethical terms, not a digital biomarker yet.
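A phototype-stratified error report is the simplest check for the bias described above. The sketch below is one possible shape for it, under stated assumptions: the records, the grouping into two Fitzpatrick bands, and the tolerance threshold are all hypothetical, and mean absolute error against a clinic instrument is only one of several reasonable agreement metrics.

```python
from collections import defaultdict


def mae_by_group(records):
    """Mean absolute error per group.

    records: iterable of (group_label, predicted_value, measured_value).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, predicted, measured in records:
        sums[group] += abs(predicted - measured)
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sums}


def performance_gap(mae):
    """Difference between the worst and best group-level error."""
    return max(mae.values()) - min(mae.values())


# Hypothetical paired readings: (phototype band, AI prediction, instrument value).
records = [
    ("I-III", 10.0, 11.0),
    ("I-III", 20.0, 19.0),
    ("IV-VI", 10.0, 14.0),
    ("IV-VI", 20.0, 18.0),
]

mae = mae_by_group(records)
gap = performance_gap(mae)
tolerance = 1.0  # hypothetical acceptance threshold, in instrument units
print(mae, "gap:", gap, "within tolerance:", gap <= tolerance)
```

Reporting the per-group table rather than a pooled error is the design choice that matters: a pooled figure can look acceptable while the under-represented group carries most of the error.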

Third, automated TEWL and sebum sensors. Wearable patches and benchtop devices have lowered measurement cost and increased measurement frequency. The implication for clinical trial design is that endpoints once measured at three time points (baseline, week 4, week 12) can now be measured continuously, and AI summary statistics replace single readings. This is the same statistical efficiency gain that fitness wearables brought to nutrition trials — ZOE's PREDICT cohort exploits continuous glucose monitoring this way ^[2], and the cosmetic translation is direct. None of this has yet been deployed at scale for a cosmetic AI-designed-active trial because, again, no such trial has been published.
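The efficiency gain from dense sampling can be illustrated with the standard error of a fitted trend. For an ordinary least-squares slope with independent, equal-variance noise, the standard error is the noise SD divided by the square root of the sum of squared deviations of the time points from their mean. The sketch below compares the three-visit schedule with one reading per day over the same 12 weeks. The noise level is a hypothetical placeholder, and the independence assumption is optimistic: real wearable readings are autocorrelated and often noisier per reading, so the actual gain would be smaller.

```python
import math


def slope_standard_error(times_weeks, noise_sd):
    """Standard error of an OLS slope for independent, equal-variance noise."""
    mean_t = sum(times_weeks) / len(times_weeks)
    sxx = sum((t - mean_t) ** 2 for t in times_weeks)
    return noise_sd / math.sqrt(sxx)


three_visits = [0.0, 4.0, 12.0]                     # baseline, week 4, week 12
daily_readings = [day / 7.0 for day in range(85)]   # one reading per day, days 0..84

noise_sd = 1.0  # hypothetical per-reading noise, arbitrary units
se_sparse = slope_standard_error(three_visits, noise_sd)
se_dense = slope_standard_error(daily_readings, noise_sd)
print(f"SE of slope, three visits: {se_sparse:.4f}")
print(f"SE of slope, daily readings: {se_dense:.4f}")
print(f"ratio (three visits / daily): {se_sparse / se_dense:.2f}")
```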

The synthesis matters: the measurement infrastructure to publish an AI-cosmetic-active clinical trial in 2026 exists. Cutometer, corneometer, tewameter, dermoscopy AI, smartphone imaging, automated wearable biomarkers, ex vivo confirmatory assays — all of these are individually validated. What is missing is the coordinated study.

Figure 9.3 — Digital biomarker landscape: classical clinical instruments (cutometer, corneometer, tewameter, sebumeter, VISIA) vs AI-augmented digital biomarkers (Skin Genius, Cell BioPrint, dermoscopy AI, wearables). illustration by author (Gemini assisted)

9.5 EPI-7 — the Korean cosmetic clinical reference

The 2023 EPI-7 study ^[14] is the most rigorously executed Korean cosmetic-microbiome clinical readout in the corpus and a useful template for what an AI-cosmetic-active trial could look like at minimum rigor. The substance under test was a postbiotic ferment filtrate derived from Epidermidibacterium keratini, a species in a novel skin-derived genus discovered by the same Yonsei–Dankook–COSMAX BTI–HuNBiome consortium that ran the trial. The collaboration is structurally interesting in itself: Yonsei University Dermatology provided clinical expertise (lead authors Ju Hee Lee and Young In Lee), Dankook University the genomics analytics (Kyudong Han), COSMAX BTI the manufacturing and strain bank, and HuNBiome the formulation development.

The trial design is straightforward: a prospective topical study on a Korean adult cohort with anti-aging endpoints measured by the standard instrument set — wrinkle parameters via VISIA imaging and skin-replica analysis, elasticity by cutometer, hydration by corneometer, transepidermal water loss by tewameter. The reported outcome was significant improvement versus baseline across multiple endpoints. The trial's principal limitation is what every cosmetic study at this tier of investment carries: open-label, single-arm (no placebo control), and short duration (8–12 weeks typical). These are honest constraints — a placebo-controlled trial doubles cost and triples enrollment burden, and most cosmetic actives do not generate the commercial expectation that justifies the additional investment. EPI-7 is at the high end of what the Korean cosmetic-microbiome industry typically publishes.
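The enrollment trade-off between a single-arm and a placebo-controlled design can be sketched with the usual normal-approximation sample-size formulas. The sketch assumes a two-sided test, a standardized effect size (mean change divided by the SD of change scores), and the same effect size in both designs; all of these are illustrative assumptions, and the numbers are not an attempt to reproduce the cost and enrollment multipliers quoted above. In practice the effect versus placebo is often smaller than the effect versus baseline, because the vehicle itself moves hydration and TEWL, which widens the gap further.

```python
import math
from statistics import NormalDist


def _z_sum(alpha: float, power: float) -> float:
    """Sum of the two-sided critical value and the power quantile."""
    nd = NormalDist()
    return nd.inv_cdf(1.0 - alpha / 2.0) + nd.inv_cdf(power)


def n_single_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Subjects needed to detect a change from baseline (one-sample z approximation)."""
    return math.ceil((_z_sum(alpha, power) / effect_size) ** 2)


def n_per_arm_two_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Subjects per arm for active vs placebo (two-sample z approximation)."""
    return math.ceil(2.0 * (_z_sum(alpha, power) / effect_size) ** 2)


effect_size = 0.5  # hypothetical standardized effect
single = n_single_arm(effect_size)
per_arm = n_per_arm_two_arm(effect_size)
print(f"single-arm: {single} subjects")
print(f"placebo-controlled: {per_arm} per arm, {2 * per_arm} total")
```

A t-based calculation would add a few subjects to each figure, and neither number includes a drop-out allowance.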

What makes EPI-7 the right reference for the AI-cosmetic-active publication question is that the underlying ferment filtrate was not AI-designed. The strain was discovered by traditional microbial isolation and taxonomy, characterized by classical microbiology, fermented by standard methods, and tested in a standard cosmetic clinical protocol. The trial is the validation chain in the absence of AI. The book's argument is that an AI-designed active should aim for at least this rigor — the same instrument set, the same statistical analysis, the same publication venue, the same author consortium structure — and ideally exceed it with a placebo-controlled arm. The technology stack to do so has been ready since 2024.

A subtlety: EPI-7 also implicitly demonstrates the value of strain-level work. The genus Epidermidibacterium and its species E. keratini did not exist in published taxonomy before the Korean consortium described them; the trial is downstream of original strain discovery, in vitro mechanism work, and proprietary fermentation development that took multiple years. This is exactly the "AI predicts, culture validates" asymmetry of seed.md and Chapter 3: the AI prediction layer cannot bootstrap from nothing; it needs a strain corpus and a measurement infrastructure to predict against. EPI-7's clinical success traces back to a culture collection that AI did not produce.

Figure 9.4 — EPI-7 trial design: strain discovery (Epidermidibacterium keratini) → postbiotic preparation (COSMAX BTI + HuNBiome) → Yonsei prospective open-label clinical readout. illustration by author (Gemini assisted)

9.6 The COSMAX–Dankook spermidine pipeline — what good looks like

If EPI-7 is the clinical-readout template, the ^[13] spermidine study is the end-to-end pipeline template. It is the cleanest published example of the full predict–design–validate chain executed within a single Korean industry-academic consortium (COSMAX BTI + GIST + Genome and Company + KIST + Kyung Hee University), and it is published in Communications Biology — a Nature-family open-access venue with rigorous peer review.

The pipeline. (i) Strain discovery via metagenomics: 16S sequencing of a Korean skin cohort identified Streptococcus species (S. pneumoniae, S. infantis, S. thermophilus) as enriched on younger, more elastic skin. The discovery layer is correlational metagenomics, the same architecture the book's Chapter 4 frames as AI-amenable, here executed by classical statistical methods. (ii) Metabolite identification via GC-MS metabolomics: spermidine emerged as the Streptococcus-secreted metabolite candidate plausibly responsible for the elasticity association. (iii) Mechanism validation in ex vivo cell systems: aged keratinocytes and fibroblasts treated with spermidine showed upregulated collagen and lipid synthesis at the transcriptome level. (iv) Topical formulation: spermidine was incorporated into a topical formulation for human application. (v) Clinical readout: the spermidine-containing formulation improved elasticity (cutometer), hydration (corneometer), and desquamation in a clinical study.

This is a five-step chain from a 16S correlation to a measured clinical endpoint, published in 2021, by a Korean consortium, with the same instrument set the EPI-7 trial used two years later. The authors are honest about limitations: correlational metagenomics does not prove causation in vivo, the clinical phase was open-label without placebo, and mechanism isolation between spermidine alone and the full Streptococcus secretome is incomplete. Each of these is exactly the limitation an AI-driven version of the pipeline could address: AI can pre-screen candidate metabolites at scale to argue spermidine is the dominant active (mechanism isolation); AI virtual-cohort pre-screening can shrink the recruitment burden of a placebo-controlled arm (statistical power); AlphaFold-grade target docking can argue which host targets spermidine engages (mechanism). The 2021 pipeline did not need AI to be published; an AI-augmented 2027 version would publish more rigorously on the same scaffold.

This is what the book's seed.md means by "AI predicts, culture validates." The Kim 2021 pipeline is the operational template for cosmetic-microbiome efficacy publication; the AI layer is additive, not substitutive.

Figure 9.5 — Spermidine pipeline in five panels: (1) 16S cohort → Streptococcus correlation, (2) GC-MS metabolomics → spermidine identification, (3) ex vivo assay, (4) topical formulation, (5) clinical readout. illustration by author (Gemini assisted)

9.7 The pharma transfer template — rentosertib as the operational target

The transfer template the cosmetic industry has not yet replicated is Insilico Medicine's rentosertib (ISM001-055) program. The Nature Biotechnology writeup ^[22] documents the program from target identification through investigational new drug (IND) filing in approximately 30 months: Insilico's PandaOmics platform identified TNIK as a novel target for idiopathic pulmonary fibrosis (IPF), Chemistry42 generative chemistry designed the small molecule, in vitro and in vivo validation followed, IND-enabling toxicology cleared, and Phase 0/1 reported. The follow-up Nature Medicine paper ^[23] reports the Phase 2a randomized double-blind result: 60 mg QD arm mean FVC +98.4 mL versus placebo −20.3 mL at 12 weeks (a between-arm difference of 118.7 mL), dose-dependent trend, safety profile supporting Phase 3 progression. In 2025, Insilico listed on the Hong Kong Stock Exchange in the largest HK biotech IPO of the year [Insilico, 2025-HKEX], capitalizing the Phase 3 expansion and 30+ program pipeline.

What does the cosmetic-industry analog look like, given that cosmetic actives do not go through FDA IND? The end-state publication would be a Communications Biology or British Journal of Dermatology paper with the following shape. (a) AlphaFold-grade target prediction (Chapter 5) producing a microbiome-derived metabolite or peptide candidate against a defined cosmetic-relevant host target (collagen-stabilizing, MMP-inhibiting, AHR-modulating). (b) Synthetic biology production at scale (Chapter 7), allowing dose-controlled topical formulation. (c) Ex vivo mechanism validation in RHE or full-thickness models with target engagement biomarkers measured directly. (d) Randomized, double-blind, placebo-controlled topical clinical trial with standard cosmetic instrumentation (cutometer, corneometer, tewameter, VISIA) and trial registration. (e) Honest reporting of effect size, drop-out, and adverse events. The technology to do this exists. The economic incentive to do it does not yet exist for any of the major firms.

The cosmetic-industry candidates most plausibly first to publish such a trial are not Big-4 Western majors. They are: a Korean consortium expanding the EPI-7 or spermidine pipelines with AI design at the front end; a synbio startup with a defined-mechanism active and a marketing strategy that rewards peer-reviewed evidence (Geltor's PrimaColl GRAS letter ^[7] is the ingestible analog, the topical equivalent is not yet filed); or a regulatory-pressured EU firm responding to anticipated EU Cosmetic Products Regulation (CPR) tightening on AI-derived claims ^[6]. The book's prediction (Gap 1 in the analysis layer) is that the first published example will appear by 2027–2028 from a Korean or mid-tier EU source rather than a Big-4 major.

The reason this matters for Korean R&D planners is competitive: the publication-grade clinical readout is a defensible market differentiator that marketing-grade claims cannot match. A Korean firm that publishes the first AI-designed cosmetic active clinical trial owns a permanent slice of the field's history. The cost of doing so — relative to the cost of the marketing alternative — is the question Chapter 12 returns to in the planning blueprint.

9.8 The Gap — zero AlphaFold-derived cosmetic actives in clinic

This section names the central tension of the book as cleanly as possible. The AI-design toolkit reached commodity status in 2024: AlphaFold 3 ^[1], Boltz-1 ^[27], Chai-1 ^[5], ESM3 ^[10], RoseTTAFold-AllAtom ^[15], and the ESMFold-derived Metagenomic Atlas of >617M predicted structures ^[18] together commodify protein-structure and protein-interaction prediction for the cosmetic active universe. Synthetic biology produces engineered actives at scale (Chapter 7). Ex vivo and clinical-instrument infrastructure exists (Sections 9.2 and 9.4). Korean industrial-academic consortia have published the cleanest cosmetic-microbiome clinical work in the corpus (Sections 9.5 and 9.6). The pharma template demonstrates that an AI-designed active can clear randomized trials (Section 9.7).

Yet as of May 2026, the published cosmetic literature contains zero peer-reviewed clinical trials of an AlphaFold-derived (or generally AI-designed) cosmetic active. The Haykal 2025 PRISMA review confirms this absence from inside the L'Oréal R&I author network ^[11]. The Atallah 2025 bioengineered microbiome cosmetics review notes most efficacy claims rely on press releases ^[3]. The Di Guardo 2025 cosmetic-AI review names AI-derived efficacy claim regulatory frameworks as missing ^[6]. Three independent reviews, one consistent finding.

The implication for Chapter 12 is operational. The book's contribution is not just to describe this gap but to specify what would close it: a minimum viable cosmetic AI-active publication template (endpoint definition, sample size with realistic effect sizes, blinding strategy, instrumentation choice, statistical analysis plan, pre-registration venue). Chapter 12 returns to this as one of three actionable blueprints for Korean R&D planners and mid-tier global firms.

The gap is also, candidly, an opportunity vector. A firm that publishes first owns the field's historical anchor. The cost — one well-designed RCT, peer-reviewed publication, instrumentation cost roughly $500K–$1M depending on cohort size and length — is small relative to the marketing budget the same firm spends on a single campaign. The reason no one has done it is not cost. It is cultural inertia about whether cosmetic R&D should publish at this rigor at all. The book's argument is that the firms that change this culture first will define the next decade of cosmetic R&D's external credibility.

9.9 Regulatory considerations — a brief preview

A full regulatory treatment is Chapter 12's responsibility. Three pointers matter for this chapter's argument.

FDA cosmetic vs OTC drug boundary. In the US, a cosmetic is defined by intended use under FD&C Act Section 201(i) (articles for "cleansing, beautifying, promoting attractiveness, or altering the appearance") and faces no pre-market approval requirement. An OTC drug (sunscreen, anti-dandruff shampoo, anti-acne treatment) must comply with a monograph or undergo NDA approval. AI-designed cosmetic actives sit in the cosmetic category by default, which is why no clinical trial is required. But if an active begins to make claims that approach therapeutic territory (acne, atopic dermatitis, eczema), it crosses the boundary into drug regulation, where clinical evidence becomes mandatory. The Gallo lab's ShA9 S. hominis program ^[20] explicitly chose the FDA IND drug pathway, not the cosmetic track, for autologous bacteriotherapy against atopic dermatitis, and that strategic choice carried clinical evidence requirements that cosmetic-track competitors do not face.

Korean MFDS functional cosmetic claims. South Korea has an intermediate regulatory tier — functional cosmetics — that requires efficacy substantiation for specific claims (whitening, anti-aging, sunscreen) above ordinary cosmetic substantiation but below drug substantiation. The Korean instrument set used in EPI-7 ^[14] and the spermidine study ^[13] is calibrated to MFDS functional cosmetic substantiation requirements. This is one reason Korean cosmetic-microbiome clinical work is published at higher rigor than Western cosmetic equivalents — MFDS substantiation is a forcing function, not an option.

EU Cosmetic Products Regulation (CPR). The EU's 2009 CPR plus the Safety Assessment requirement under Annex I creates the strongest pre-market substantiation regime among cosmetic frameworks globally. Substantiation must be documented in the Product Information File. AI-derived efficacy claims fall into the same documentation requirement as traditional claims, but no EU CPR guidance yet exists for how AI predictions enter the substantiation chain. ^[6] anticipates EU CPR guidance by 2027. The Geltor PrimaColl GRAS letter ^[7] is the closest existing US precedent for an AI-or-synbio-derived active receiving regulatory recognition, but it is ingestible-specific.

Chapter 12 develops each of these in operational detail, including a proposed reproducibility checklist and a publication template specifically designed for AI-derived cosmetic actives in 2027–2030.

9.10 Open Questions

Validation rigor escalation — Will the first published AI-designed cosmetic active clinical trial come from a Korean consortium (extending EPI-7 / spermidine with AI front-end), a synbio startup (post-Geltor topical equivalent), or a regulatory-pressured EU firm? The book's working prediction is Korea or mid-tier EU by 2027–2028; the alternative is Big-4 internal data never reaching peer review.
Digital biomarker standardization — Can smartphone-based skin imaging AI (Skin Genius, Cell BioPrint) be standardized across phototypes and devices to function as a regulatory-grade longitudinal biomarker? The current limitation is Fitzpatrick I–III over-representation ^[11], which is solvable but requires deliberate cohort expansion.
Microbiome longitudinal data infrastructure — How long can ex vivo skin-on-chip with microbial colonization realistically run? Two weeks? Three months? The longer the stable colonization window, the more clinically relevant the readouts become — but no current commercial chip exceeds a few weeks of stable multi-species community maintenance.
Virtual cohort calibration for clinical efficacy — Can Unilever's AI virtual cohort architecture ^[25] be calibrated to a level that survives a regulatory audit? No published validation study yet exists; this is the single most consequential methodology question for cosmetic clinical simulation.
Functional cosmetic vs cosmeceutical regulatory boundary — As AI-designed actives approach therapeutic effect sizes, will regulators tighten the cosmetic/drug boundary? The Gallo S. hominis program suggests the field will fragment: efficacy-grade actives migrate to drug or functional cosmetic pathways, while marketing-grade actives stay in the ordinary cosmetic category. The line will move; the question is in which direction and at what speed.
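
One way to make the virtual cohort calibration question above concrete is to ask whether a model's stated prediction intervals cover the observed clinical outcomes at the rate they claim. The sketch below is a minimal illustration under stated assumptions, not a description of Unilever's architecture or of any regulator's audit criterion: the `Prediction` record, the 90% nominal level, the 5-point tolerance, and every number in `example_cohort` are hypothetical and invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One subject: predicted interval for an endpoint and the observed value.

    Units are hypothetical (for example, percent change in TEWL at week 4).
    """
    lower: float     # lower bound of the model's nominal 90% interval
    upper: float     # upper bound of the model's nominal 90% interval
    observed: float  # outcome measured in the real cohort


def interval_coverage(preds):
    """Fraction of observed outcomes that fall inside their predicted interval."""
    if not preds:
        raise ValueError("need at least one prediction")
    hits = sum(1 for p in preds if p.lower <= p.observed <= p.upper)
    return hits / len(preds)


def is_calibrated(preds, nominal=0.90, tolerance=0.05):
    """True if empirical coverage is within `tolerance` of the nominal level.

    Both thresholds are assumptions chosen for illustration only.
    """
    return abs(interval_coverage(preds) - nominal) <= tolerance


# Invented illustrative values, not data from any study.
example_cohort = [
    Prediction(lower=-12.0, upper=-4.0, observed=-8.0),   # inside
    Prediction(lower=-10.0, upper=-2.0, observed=-1.0),   # outside
    Prediction(lower=-15.0, upper=-5.0, observed=-9.5),   # inside
    Prediction(lower=-9.0, upper=-1.0, observed=-3.0),    # inside
]

if __name__ == "__main__":
    print("empirical coverage:", interval_coverage(example_cohort))
    print("within tolerance of nominal 90%:", is_calibrated(example_cohort))
```

A real calibration study would need far larger cohorts, stratification by phototype and device, and a pre-registered acceptance threshold; the point here is only that "calibrated" can be stated as a checkable property rather than a qualitative claim.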

References

Abramson, J., Adler, J., Dunger, J. et al. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500.
Asnicar, F., Berry, S. E., Valdes, A. M. et al. (2021). Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nature Medicine 27, 321–332.
Atallah, R., Ahmed, A., Shams, M. et al. (2025). Bioengineered Skin Microbiome — Cosmetic Ingredient Landscape and Regulatory Considerations. Cosmetics 12(5):205.
Belkaid, Y. and Tamoutounour, S. (2014). The compartmentalized and systemic control of tissue immunity by commensals. Nature Immunology 15, 646–653.
Chai Discovery (2024). Chai-1: Decoding the Molecular Interactions of Life. Technical report, Sep 2024.
Di Guardo, A., Trovato, F., Cantisani, C. et al. (2025). Artificial Intelligence in Cosmetic Formulation: Predictive Modeling for Safety, Tolerability, and Regulatory Perspectives. Cosmetics 12(4):157.
Geltor (2025). Geltor PrimaColl receives FDA GRAS "no questions" letter. Geltor press, October 2025.
Gueniche, A., Perin, O., Bouslimani, A. et al. (2022). Advances in Microbiome-Derived Solutions and Methodologies Are Founding a New Era in Skin Health and Care. Pathogens 11(2):121.
Hashimoto-Hachiya, A., Furue, M., Tsuji, G. (2022). Galactomyces Ferment Filtrate Potentiates an Anti-Inflammaging System in Keratinocytes. Journal of Clinical Medicine 11(22):6691.
Hayes, T., Rao, R., Akin, H. et al. (2025). Simulating 500 million years of evolution with a language model (ESM3). Science 387, 850–858.
Haykal, D., Flament, F., Amar, D. et al. (2025). Cosmetogenomics unveiled: a systematic review of AI, genomics, and the future of personalized skincare. Frontiers in Artificial Intelligence 8:1660356.
Insilico Medicine (2025). Insilico Medicine lists on Hong Kong Stock Exchange: largest HK biotech IPO of 2025. Insilico press, 2025.
Kim, G., Kim, M., Kim, M. et al. (2021). Spermidine-induced recovery of human dermal structure and barrier function by skin microbiome. Communications Biology 4:231.
Kim, J., Lee, Y. I., Mun, S. et al. (2023). Efficacy and Safety of Epidermidibacterium Keratini EPI-7 Derived Postbiotics in Skin Aging: A Prospective Clinical Study. International Journal of Molecular Sciences 24(5):4634.
Krishna, R., Wang, J., Ahern, W. et al. (2024). Generalized biomolecular modeling and design with RoseTTAFold All-Atom. Science 384, eadl2528.
L'Oréal (2024). L'Oréal Beauty Tech leadership at VivaTech 2024: AI Skin Genius, Modiface, microbiome direction. L'Oréal press, May 2024.
L'Oréal R&I (2024). The Future of Cosmetics Is Playing Out In The Microbiome. L'Oréal R&I editorial, 2024.
Lin, Z., Akin, H., Rao, R. et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130.
Microecology consortium authors (2025). Hydrogel-based scaffold in vitro skin microecology model supporting stable multi-species microbiota. Nature Communications.
Nakatsuji, T., Chen, T. H., Narala, S. et al. (2017). Antimicrobials from human skin commensal bacteria protect against Staphylococcus aureus and are deficient in atopic dermatitis. Science Translational Medicine 9, eaah4680.
Nakatsuji, T., Hata, T. R., Tong, Y. et al. (2021). Development of a human skin commensal microbe for bacteriotherapy of atopic dermatitis. Nature Medicine 27, 700–709.
Ren, F., Aliper, A., Chen, J. et al. (2024). A small-molecule TNIK inhibitor (ISM001-055 / rentosertib) discovered via end-to-end generative AI from target identification to Phase 1. Nature Biotechnology.
Ren, F., Zhavoronkov, A. et al. (2025). A generative AI-discovered TNIK inhibitor for idiopathic pulmonary fibrosis: a randomized phase 2a trial. Nature Medicine, May 2025.
Takei, K., Mitoma, C., Hashimoto-Hachiya, A. et al. (2015). Galactomyces Ferment Filtrate Prevents T-Helper 2-Mediated Reduction of Filaggrin in an Aryl Hydrocarbon Receptor-Dependent Manner. Journal of Cutaneous Pathology.
Unilever Beauty & Wellbeing R&D (2025). How Unilever's pioneering skin microbiome research is shaping product innovation. Unilever news, 2025.
Verma, A., Singh, V. et al. (2024). Organoid and biofilm models for skin and lung microbial infection research. npj Biofilms and Microbiomes.
Wohlwend, J., Corso, G., Passaro, S. et al. (2024). Boltz-1 — Democratizing Biomolecular Interaction Modeling. bioRxiv preprint, Nov 2024.