Part III: Synthetic Biology and the Digital Path to Clinic

Chapter 8: AI Formulation Design — Synergy and Stability Prediction

Written: 2026-05-12 Last updated: 2026-05-12

Why this chapter

Up to Chapter 7 the story has been about ingredients — single molecules, peptides, postbiotics, engineered strains. A finished cosmetic is never a single ingredient. It is a 20–60 component emulsion in which surfactants, polymers, lipids, actives, preservatives, fragrance, and pH modifiers interact in ways that decide synergy, physical stability, sensory texture, percutaneous permeability, and regulatory acceptability — simultaneously. Brief A asked the sharpest version of this question: "Generative AI and ML are predicting the optimal synergistic formula out of tens of thousands of microbial combinations — is that real?" The honest answer is: the combinatorial impossibility is real, the toolkit is largely transplanted from pharma and chemistry, and the peer-reviewed cosmetic case studies are still few enough to count on one hand. This chapter walks that gap.

A useful contrast: Chapter 5 covered AlphaFold-class protein structure prediction, where the candidate is a single sequence — one molecule, one folded structure. Formulation prediction is the opposite end of the same scientific spectrum — many molecules, one interacting system, where the readout is emergent (stability, sensory feel) rather than reducible to a single physical observable. The math gets harder; the data gets thinner; the regulatory acceptance is foggier. That is exactly why this chapter is where AI-cosmetic enthusiasm collides with reality.

Three quantitative anchors for this chapter

  1. Combinatorial wall: a 60-ingredient cosmetic shortlist with ~10 concentration levels has on the order of $10^{60}$ candidate formulas. That is physically un-enumerable, so enumeration was never the strategy.
  2. Operational KPIs claimed by industry: Unilever discloses consumer-insight 60% faster, formulation cycles 5–6 → 1–2, claims generation 75% faster with its AI virtual cohort of 2,500 simulated subjects [14]. POND'S ships an in-store 60-minute microbiome diagnostic. None of these has been externally audited (Gap 15).
  3. Peer-reviewed cosmetic AI-formulation precedent: the Unilever × IBM Carrieri 2021 paper [1] remains the foundational example of methodologically transparent AI on a cosmetic-microbiome formulation question, and as of May 2026 has very few public sequels of comparable rigor.

8.1 The formulation problem — why enumeration was never the strategy

A finished skincare emulsion contains roughly four blocks: (i) a base (water, oils, emulsifiers, thickeners — typically 10–20 components), (ii) actives (peptides, vitamins, postbiotics, antioxidants — 3–10 components), (iii) a preservation system (preservatives, chelators, pH adjusters — 4–8 components), and (iv) sensory/marketing (fragrance, color, pearlescent agents — 3–10 components). The total of 20–60 ingredients per stock-keeping unit is the industry-standard reality [4].

A naive enumeration breaks down immediately. Suppose 60 candidate ingredients, each with ~10 plausible concentration levels — that is $10^{60}$ candidate formulas before considering process variables (homogenization shear, phase order, fill temperature) or packaging (UV exposure, oxygen permeability of the cap). Cosmetic R&D was never enumerating; it was navigating a known small region of the design space with formulator intuition. AI's promise is to widen the navigable region without breaking the budget — not to enumerate.
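The arithmetic behind the combinatorial wall can be checked in a few lines. The sketch below uses the chapter's 60 ingredients and 10 levels; the throughput of one million formulas per second is a hypothetical figure chosen only for illustration.

```python
# Size of a naive formulation design space: every ingredient takes one of
# `levels` concentration levels, so the count is levels ** ingredients.
ingredients = 60
levels = 10
n_formulas = levels ** ingredients  # 10**60

# Hypothetical evaluation throughput (an assumption, not a measured figure).
formulas_per_second = 1_000_000
seconds_per_year = 365 * 24 * 3600
years_to_enumerate = n_formulas / (formulas_per_second * seconds_per_year)

print(f"candidate formulas: {n_formulas:.1e}")
print(f"years to enumerate: {years_to_enumerate:.1e}")
```

Even at that generous throughput the enumeration time is on the order of $10^{46}$ years, which is why the practical question is navigation rather than enumeration.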

Three constraints make navigation hard. First, non-linear interactions: a peptide active that is stable in surfactant A may degrade in surfactant B by a factor of ten via a mechanism (micelle partitioning, oxidation catalysis at the interface) that is impossible to predict from the active and surfactant in isolation. Second, emergent endpoints: physical stability over 24 months is not the sum of individual ingredient stabilities; sensory "non-greasy feel" is not the sum of ingredient lipophilicity scores. Most cosmetic endpoints emerge from the system. Third, the validation cost is high: a real-time stability study runs 3–6 months at multiple temperatures (4°C, 25°C, 40°C, 50°C); accelerated studies are imperfect proxies. Every model error costs a cycle.

The reason AI matters here is not that it removes the validation step. It is that it lets a formulator burn fewer cycles before the first design that survives 40°C/4-week accelerated stability and a 30-person sensory panel.

Figure 8.1 — Combinatorial design surface: small 'intuition-navigable' region vs wider 'AI-navigable' region, with a budget-constrained validation channel below. illustration by author (Gemini assisted)

8.2 Generative models for formulation — VAE, diffusion, and the latent space of mixtures

The dominant academic frame for generative formulation borrows directly from chemistry. Variational autoencoders (VAEs) trained on SMILES strings or molecular graphs map a chemical universe into a continuous latent space where smooth interpolation between known molecules generates plausible neighbors. Diffusion models generalize this: they iteratively denoise a random latent vector under a conditioning signal (target property, scaffold, ingredient class) to produce a candidate. Generative models of this family have produced peer-reviewed drug candidates: GENTRL [17], a reinforcement-learning generative model, yielded a DDR1 inhibitor, and its end-to-end successor pipeline finally produced a clinical readout six years later [6] (Chapter 4).

The cosmetic adaptation is less mature. The unit of generation is not a single SMILES — it is a mixture with concentrations, often expressed as a percentage vector summing to 100. Mixture-space generative models exist in the chemical-engineering literature, but they require structured training data of the form (ingredient list, concentration vector, measured property). That data is precisely the data cosmetic firms guard most tightly. The same explains why public formulation benchmarks are rare. The asymmetry mirrors Chapter 4's metabolite-public / strain-private split, except worse — formula × property is treated as both R&D IP and marketing-claim IP simultaneously.
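To make the data shape concrete, here is a minimal sketch of one training record of the form (ingredient list, concentration vector, measured property), with the sum-to-100 constraint enforced. The class name, field names, ingredient names, and the property value are all hypothetical; the random mixture is drawn uniformly from the simplex by normalizing exponential draws.

```python
from dataclasses import dataclass
import random

@dataclass(frozen=True)
class FormulaRecord:
    """One training example: (ingredient list, concentration vector, property)."""
    ingredients: tuple[str, ...]
    percent_w_w: tuple[float, ...]   # concentrations in % w/w
    measured_property: float         # e.g. a stability score (hypothetical)

    def __post_init__(self):
        if len(self.ingredients) != len(self.percent_w_w):
            raise ValueError("one concentration per ingredient is required")
        if any(p < 0 for p in self.percent_w_w):
            raise ValueError("concentrations must be non-negative")
        if abs(sum(self.percent_w_w) - 100.0) > 1e-6:
            raise ValueError("concentrations must sum to 100")

def random_mixture(n, rng):
    """Draw a uniform random point on the simplex, scaled to sum to 100."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return tuple(100.0 * d / total for d in draws)

rng = random.Random(0)
names = ("water", "glycerin", "emulsifier", "oil")
record = FormulaRecord(names, random_mixture(len(names), rng), measured_property=0.8)
print(record)
```

A mixture-space generative model would be trained on many such records; the validation step matters because a model that emits vectors not summing to 100 is proposing formulas that cannot be batched.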

A useful counter-pattern: rather than generating the full formula, generate substitution candidates under constraint. Given an existing formula and a "replace fragrance X with a lower-allergen alternative preserving sensory profile" task, the search space collapses from $10^{60}$ to dozens; VAE/diffusion-style latent interpolation works well at that scale. This is closer to how cosmetic AI-formulation tools are actually used internally, and how indie-brand SaaS products like Potion AI position themselves [12].
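The substitution task can be sketched as a constrained nearest-neighbor search. This is a deliberately simpler stand-in for latent interpolation, not a VAE or diffusion model: the three-dimensional sensory descriptor vectors, the allergen scores, and the ingredient names are hypothetical.

```python
import math

# Hypothetical fragrance candidates: (sensory descriptor vector, allergen score).
candidates = {
    "fragrance_X": ((0.9, 0.2, 0.5), 0.8),   # the ingredient to replace
    "alt_A": ((0.8, 0.3, 0.5), 0.2),
    "alt_B": ((0.1, 0.9, 0.4), 0.1),
    "alt_C": ((0.9, 0.2, 0.6), 0.7),
}

def substitutes(target, max_allergen, pool):
    """Rank alternatives by sensory distance to `target`, keeping only
    those whose allergen score is at or below `max_allergen`."""
    target_vec, _ = pool[target]
    ranked = []
    for name, (vec, allergen) in pool.items():
        if name == target or allergen > max_allergen:
            continue
        ranked.append((math.dist(target_vec, vec), name))
    return [name for _, name in sorted(ranked)]

print(substitutes("fragrance_X", max_allergen=0.3, pool=candidates))
# prints ['alt_A', 'alt_B']
```

In a learned latent space the descriptor vectors would be model embeddings rather than hand-written tuples, but the shape of the problem (filter by constraint, rank by similarity) stays the same.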

The bridge to Chapter 5: latent-space optimization in formulation looks structurally similar to latent-space protein design. The difference is that protein latent spaces are anchored by structure prediction (AlphaFold supplies a computable structure score to optimize against); formulation latent spaces have no equivalent anchor. Stability and sensory readouts require physical experiments. This is the deepest reason cosmetic formulation AI lags drug discovery AI by roughly five years.


8.3 Bayesian optimization — the dominant working method

Where generative models are the popular frame, Bayesian optimization (BO) is the working frame for most cosmetic AI-formulation projects. The fit is structural. BO assumes (i) experiments are expensive — a cosmetic batch takes weeks and a stability panel takes months; (ii) the response surface is smooth enough to model with a Gaussian process or tree ensemble; (iii) each new experiment should be chosen to maximize expected information gain or expected improvement under an acquisition function. All three assumptions match cosmetic R&D economics far better than they match cheap-and-fast in silico screening.

The operating loop runs: build an initial design of experiments (DOE) over the input space → measure responses → fit a surrogate model → propose the next batch via an acquisition function → repeat. Five rounds of 8–16 formulas per round typically suffice to triangulate a multi-property optimum, a round count comparable to the 5–6 formulation-cycle baseline that Unilever reports collapsing to 1–2 under its AI workflow [14].
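The loop can be sketched in pure Python under heavy simplifications: one input dimension, one formula per round instead of 8–16, a zero-mean Gaussian process with a fixed RBF length scale, and expected improvement as the acquisition function. The function `run_experiment` is a hypothetical stand-in for a wet-lab stability score; nothing here reflects any firm's actual pipeline.

```python
import math

def rbf(a, b, length=0.2):
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(x_new, a) for a in xs]
    alpha = solve(K, ys)
    v = solve(K, k_star)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v)), 1e-12)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf

def run_experiment(x):
    """Hypothetical stand-in for a wet-lab stability score; peak at x = 0.7."""
    return math.exp(-((x - 0.7) ** 2) / 0.02)

grid = [i / 100 for i in range(101)]      # candidate concentration fractions
xs = [0.1, 0.5, 0.9]                      # initial design of experiments
ys = [run_experiment(x) for x in xs]
for _ in range(5):                        # five BO rounds, one formula each
    best = max(ys)
    scores = [(expected_improvement(*gp_posterior(xs, ys, x), best), x)
              for x in grid if x not in xs]
    _, x_next = max(scores)
    xs.append(x_next)
    ys.append(run_experiment(x_next))

best_x = xs[ys.index(max(ys))]
print(f"best x = {best_x:.2f}, score = {max(ys):.3f}")
```

The `noise` term on the kernel diagonal is where a measured replicate variance would enter; a production loop would also fit the length scale rather than fixing it.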

The cosmetic-specific subtlety is multi-objective BO. A real-world goal is rarely a single scalar — it is "maximize 4-week stability and minimize allergen content and hit a target viscosity range and keep cost-of-goods below $X/kg." Pareto-front Bayesian optimization handles this directly, but only if the practitioner is honest about which objective gets traded against which. Cosmetic R&D has historically resolved this trade with an experienced formulator's intuition; making the trade explicit in an acquisition function is itself a cultural shift more than a technical one.
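A Pareto front is simply the set of candidates that no other candidate beats on every objective at once. The sketch below filters a handful of hypothetical formulas; the scores are invented, and objectives to be minimized (allergen content, cost) are negated so that higher is always better.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one. All objectives are 'higher is better'."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep every candidate that no other candidate dominates."""
    return {name: p for name, p in points.items()
            if not any(dominates(q, p) for other, q in points.items() if other != name)}

# Hypothetical candidates: (stability score, -allergen content, -cost per kg).
formulas = {
    "F1": (0.90, -0.30, -12.0),
    "F2": (0.85, -0.10, -14.0),
    "F3": (0.80, -0.35, -15.0),   # dominated by F1 on all three objectives
    "F4": (0.95, -0.40, -11.0),
}
print(sorted(pareto_front(formulas)))
# prints ['F1', 'F2', 'F4']
```

Choosing among F1, F2, and F4 is exactly the trade the text describes: the filter removes the indefensible option but leaves the value judgment to the formulator.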

A reproducibility caveat that the microbiome-ML guidelines of [10] emphasize carries over here: a BO loop is only as trustworthy as its experimental noise model. If the same formula measured twice produces a 20% variance on a stability score, modeling that response surface as a smooth Gaussian process will produce confident-looking nonsense (Gap 11). Honest BO requires honest noise estimates, which require duplicate experimentation, which in turn doubles the budget at the exact moment the formulator was hoping AI would halve it.
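One inexpensive way to obtain a noise estimate from duplicates is the pooled standard deviation over replicate pairs. The sketch below assumes each formula was measured exactly twice; the formula labels and scores are hypothetical.

```python
import math

# Hypothetical duplicate measurements of a stability score per formula.
duplicates = {
    "F1": (0.80, 0.84),
    "F2": (0.60, 0.50),
    "F3": (0.92, 0.90),
}

def pooled_noise_std(pairs):
    """Pooled standard deviation from duplicate pairs:
    sigma^2 = sum(d_i^2) / (2 n), where d_i is the within-pair difference."""
    diffs_sq = [(a - b) ** 2 for a, b in pairs.values()]
    return math.sqrt(sum(diffs_sq) / (2 * len(diffs_sq)))

sigma = pooled_noise_std(duplicates)
print(f"estimated measurement noise (std): {sigma:.4f}")
```

The squared estimate is the quantity that would go on the diagonal of a Gaussian-process kernel matrix as the observation-noise term, so that the surrogate stops treating replicate scatter as real structure.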

Figure 8.2 — Bayesian optimization workflow for a cosmetic formula: initial DOE → surrogate fit → acquisition → wet-lab → loop; Pareto front for multi-objective trade-off. illustration by author (Gemini assisted)

8.4 Carrieri 2021 — the published methodological backbone

The most-cited peer-reviewed cosmetic-microbiome AI paper applicable to formulation thinking is the IBM × Unilever collaboration of [1]. Although its surface topic is prediction (does microbiome composition predict hydration, age, menopause, smoking?), its methodological backbone is exactly what an AI-driven formulation pipeline needs: (i) prospective cohort sampling at a scale a single firm can sustain, (ii) gradient-boosted tree models, which are robust at the 50–500 sample scale at which most cosmetic studies sit, (iii) SHAP (Shapley value) attribution to make each prediction explainable at the per-feature level (per taxon in the paper, per ingredient in a formulation model).

The paper sampled 62 healthy women in Canada with 16S V1–V2 sequencing of leg-skin sites at multiple time points, validated on a UK cohort, and reported menopause-status AUC ~0.85, hydration AUC ~0.7–0.8, smoking AUC ~0.75. Three taxa (Cutibacterium, Streptococcus, Anaerococcus) recurrently drove predictions, with SHAP attribution traceable to each individual classification. The methodological choice that matters for formulation: gradient-boosted trees + SHAP, not a deep network — because at n ≈ 50 a deep network overfits and a SHAP-attributed gradient-booster gives R&D teams a feature-importance map they can argue about in a formulation meeting.

The same pipeline shape transfers directly to formulation: train a gradient-boosted regressor of stability/sensory/permeability on a few hundred internal formulas, surface per-ingredient SHAP contributions, let formulators steer. The reason this pattern dominates is not theoretical superiority — it is that the regulatory and marketing-claim layer of cosmetic R&D requires interpretable feature attribution. A claim like "reduces fine lines by 30%" backed by "a deep model predicts so" rarely survives legal review; one backed by "ingredients X and Y dominate the predicted effect with SHAP scores 0.4 and 0.3, traceable to literature mechanism Z" survives it more often [4].
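What a per-ingredient attribution means can be shown with exact Shapley values on a toy predictor. This is not the SHAP library and not the code of [1]: it enumerates all feature orderings, which is feasible only for a handful of features, and the model `toy_stability` with its coefficients and ingredient names is hypothetical.

```python
from itertools import permutations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all
    feature orderings. Features not yet added keep their baseline values."""
    names = list(x)
    phi = {n: 0.0 for n in names}
    for order in permutations(names):
        current = dict(baseline)
        prev = model(current)
        for n in order:
            current[n] = x[n]
            new = model(current)
            phi[n] += new - prev
            prev = new
    n_orders = factorial(len(names))
    return {n: v / n_orders for n, v in phi.items()}

def toy_stability(f):
    """Hypothetical predictor: additive terms plus one interaction."""
    return (0.5 + 0.4 * f["chelator"] - 0.3 * f["fragrance"]
            + 0.2 * f["chelator"] * f["polymer"])

x = {"chelator": 1.0, "fragrance": 1.0, "polymer": 1.0}
baseline = {"chelator": 0.0, "fragrance": 0.0, "polymer": 0.0}
phi = shapley_values(toy_stability, x, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
```

The interaction term is split evenly between chelator and polymer, and the attributions sum to the difference between the prediction and the baseline prediction. That additivity is the property that lets a formulation meeting argue about ingredients one at a time.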

Two limits the paper itself surfaces, both relevant to formulation transfer: (i) the cohort was female-only and leg-only, so generalization to facial sites and male skin is unverified, and a formulation that works on the leg often fails on the face; (ii) SHAP attribution is correlational, not interventional: it tells the formulator which ingredient is associated with a predicted property change, not which ingredient causes it. Intervention experiments still require wet-lab work (Chapter 9).


8.5 L'Oréal × IBM and the broader cosmetic-formulation platform landscape

L'Oréal's formulation-AI strategy has been less peer-reviewed and more strategic than Unilever's. The 2024 VivaTech disclosure framed Beauty Tech as a layered stack — AI Skin Genius diagnostic for consumer-facing personalization, Modiface AR for visualization, and a microbiome research line accelerated by the Lactobio acquisition [8]. The R&I editorial "The Future of Cosmetics Is Playing Out In The Microbiome" [9] explicitly names AI-driven microbiome formulation as the next strategic frontier without disclosing methodology. The Haykal 2025 PRISMA-style systematic review of AI cosmetogenomics — itself co-authored with L'Oréal R&I — confirms that no peer-reviewed clinical readout of an AI-designed cosmetic active has been published as of 2025 [5]. This is Gap 1 of the book, and Chapter 8 inherits it: AI formulation activity is intense, but the audit trail visible to outsiders is thin.

L'Oréal's partnership pattern leans on IBM-class research capability without committing to a co-authored, peer-reviewed publication series the way Unilever did with Carrieri et al. [1]. The 2024 acquisition of Lactobio brought ~10,000 isolated strains plus efficacy data into L'Oréal R&I; combined with Modjoul / Modiface device telemetry, this is the data substrate on which downstream formulation AI is being trained, but the architecture is internal IP [8]. The competitive logic mirrors Chapter 4's buy-the-data observation: model architectures get commoditized by academic AI, strain × phenotype × formula × clinical-mapping data does not, and L'Oréal's bet is heaviest on the data side.

The broader landscape splits cleanly. Internal-stack majors: Unilever (XAI pipeline + virtual cohort + POND'S diagnostic), L'Oréal (Lactobio data + Modjoul devices + AI Skin Genius), Shiseido + Accenture's Voyager platform trained on a stated 500,000 internal R&D data points [13], COSMAX's 2nd-generation Microbiome AI platform [2]. SaaS for the long tail: Potion AI's formulation co-pilot targets indie brands and CDMOs without in-house data-science teams [12]. The split mirrors the pharma split between Big Pharma's internal AI stacks and platform providers like Cradle (Chapter 4). The cosmetic version is younger and less independently audited.

A specific limit of the SaaS layer: LLM-style formulation co-pilots (Potion GPT and equivalents) inherit hallucination risk. Such a co-pilot may confidently recommend an ingredient combination that violates a regional regulatory limit, exceeds a sensitization threshold, or destabilizes a preservation system, and present the recommendation in well-formatted prose. Cosmetic R&D adoption of LLM co-pilots is therefore gated by chemistry-constrained decoding, retrieval over verified safety databases, and human-in-the-loop review, safeguards the indie-brand market has only partially absorbed.
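The simplest of these gates is a rule check applied to whatever the co-pilot proposes, before a human ever reviews it. The sketch below is a post-hoc validator, not constrained decoding; the ingredient names and every limit in `MAX_PERCENT` are hypothetical placeholders, and real limits would have to come from a verified regulatory database.

```python
# Hypothetical maximum-use limits in % w/w. These numbers are placeholders
# for illustration and are NOT real regulatory limits.
MAX_PERCENT = {
    "preservative_P": 1.0,
    "fragrance_allergen_A": 0.001,
    "uv_filter_U": 10.0,
}

def violations(formula, limits=MAX_PERCENT):
    """Return (ingredient, proposed, limit) for every limit the formula exceeds."""
    return [(name, pct, limits[name])
            for name, pct in formula.items()
            if name in limits and pct > limits[name]]

# A formula as an LLM co-pilot might propose it (hypothetical).
proposed = {
    "water": 88.4,
    "preservative_P": 1.5,
    "uv_filter_U": 10.0,
    "fragrance_allergen_A": 0.1,
}
for name, pct, limit in violations(proposed):
    print(f"REJECT {name}: {pct}% exceeds limit {limit}%")
```

A check like this catches only what the limits table knows about; destabilized preservation systems and sensitization interactions still need the retrieval and human-review layers.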


8.6 Unilever's 30K microbiome platform and the AI virtual cohort

The Unilever skin-microbiome platform is, in disclosed operational terms, the most ambitious AI-formulation engine in the cosmetic industry. The numbers Unilever publicly states: a 30,000-sample skin microbiome dataset spanning all major body sites and accumulating ~5 billion data points; >100 microbiome-related patents filed; an AI virtual cohort of 2,500 simulated subjects that pre-screens formulas before they reach a physical R&D bench; and operational KPIs of 60% faster consumer insight, formulation cycles 5–6 → 1–2, claims generation 75% faster [14]. The 2026 forward outlook restates and extends this stack [16].

Two things are notable. First, the numbers are coherent with each other: a 30K-sample real cohort plausibly trains a virtual cohort, a virtual cohort plausibly cuts the bench-side iteration count from 5–6 to 1–2 (a 2.5- to 6-fold reduction, depending on which ends of the two ranges are compared), and a reduction of that size is consistent with claims generation running 75% faster (a 4× speed-up if read as a 75% cut in elapsed time). The internal logic is consistent. Second, none of it is externally validated. No third-party benchmark, peer-reviewed audit, or regulatory review of the virtual-cohort methodology has appeared in the literature (Gap 15). The cosmetic AI review of [4] explicitly flags "lack of harmonized regulatory frameworks for AI-derived efficacy claims" as the macro problem; Unilever's disclosed metrics live entirely inside that vacuum.

The methodological core of the virtual cohort is worth naming, because it is the cleanest example of a digital-twin claim in cosmetic R&D (Chapter 6 covers digital twins more broadly). A virtual cohort is a generative or simulation-driven population of synthetic subjects whose microbiome × skin-property distributions match the real cohort statistically. A candidate formula is then "applied" to all 2,500 simulated subjects, response distributions are predicted, and only formulas whose predicted response distributions clear thresholds reach physical trial. The logic is identical to in silico clinical trials in pharma (Chapter 9). The unsolved problem is calibration: does the virtual cohort's predicted variance match real-cohort variance for cosmetic endpoints, and across which ethnicities and skin types?
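The pre-screening logic can be sketched as follows. Unilever's actual method is undisclosed, so everything here is an assumption made for illustration: the single hydration feature, the Gaussian cohort, the response model, the thresholds, and the two formula effects.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical summary statistics of a real cohort's baseline hydration score.
REAL_MEAN, REAL_SD = 50.0, 8.0

def simulate_cohort(n):
    """Draw synthetic subjects whose baseline matches the real cohort's mean and SD."""
    return [rng.gauss(REAL_MEAN, REAL_SD) for _ in range(n)]

def predicted_response(baseline, formula_effect):
    """Hypothetical response model: drier skin (lower baseline) gains more."""
    return formula_effect * (100.0 - baseline) / 50.0 + rng.gauss(0.0, 1.0)

def passes_prescreen(formula_effect, cohort, min_gain=3.0, min_responder_share=0.8):
    """Apply a formula to every simulated subject and test the responder share."""
    gains = [predicted_response(b, formula_effect) for b in cohort]
    share = sum(g >= min_gain for g in gains) / len(gains)
    return share >= min_responder_share, share

cohort = simulate_cohort(2500)
print(f"simulated SD {statistics.stdev(cohort):.2f} vs real SD {REAL_SD}")

results = {name: passes_prescreen(effect, cohort)
           for name, effect in {"formula_A": 6.0, "formula_B": 2.0}.items()}
for name, (ok, share) in results.items():
    verdict = "advance to bench" if ok else "drop"
    print(f"{name}: {verdict} (responder share {share:.2f})")
```

The calibration question in the text corresponds to the first printed line: matching a marginal mean and SD is easy, whereas matching the variance of the response across subgroups is the part that requires real-cohort evidence.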

The cohort-coverage critique inherits from Chapter 4 and [5] — "limited geographic diversity, darker phototypes underrepresented." A 30K cohort sounds large until it is split by Fitzpatrick type, age, sex, and body site, at which point any single subgroup may be undersized for stable virtual-cohort calibration. Whether Unilever's internal calibration addresses this is undisclosed.
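The subgroup arithmetic is worth running once. The six Fitzpatrick types are standard; the numbers of age bands and body sites below are assumptions chosen for illustration, and the calculation assumes a perfectly balanced cohort, which real cohorts are not.

```python
total_samples = 30_000

# Stratification levels. Age bands and body sites are hypothetical counts.
strata = {"fitzpatrick_type": 6, "age_band": 5, "sex": 2, "body_site": 10}

cells = 1
for levels in strata.values():
    cells *= levels

mean_per_cell = total_samples / cells
print(f"{cells} subgroups, about {mean_per_cell:.0f} samples each if perfectly balanced")
# prints: 600 subgroups, about 50 samples each if perfectly balanced
```

Under these assumptions the best case is about 50 samples per subgroup; with the underrepresentation of darker phototypes that [5] describes, some cells would fall well below that.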

A modest reading: Unilever's KPIs are likely real within Unilever's launch processes, since the 5–6 → 1–2 cycle reduction is a process-engineering measurement, not a scientific claim. The aggressive reading, that virtual cohorts can substitute for human trials in efficacy or safety claims, is the methodologically aggressive position that regulators have not yet evaluated. Chapter 12 returns to this as a decision variable for Korean firms considering analogous platforms.

Figure 8.3 — Unilever virtual cohort workflow: 30K real microbiome dataset → simulated 2,500-subject cohort → AI pre-screening → bench validation. KPI overlay (60% / 5-6→1-2 / 75%). illustration by author (Gemini assisted)

8.7 POND'S in-store diagnostic — formulation AI reaching retail

The POND'S Skin Institute 60-minute in-store microbiome analysis is the consumer-facing end of the same Unilever stack [11]. A shopper provides a skin sample, an in-store device sequences a microbiome signature in roughly an hour, and an AI recommendation system maps that signature to a personalized regimen drawn from the POND'S formulation library. This is the first mass-market microbiome-to-formula retail flow. COSMAX × HelloBiome is pursuing the same loop in Korea with a 900-consumer cohort feeding two commercialized postbiotic actives (Amioter, Fillerstin) into a three-step regimen [3].

The consumer-facing flow is where AI-formulation, AI-claims, and AI-diagnostic stacks all collapse into a single retail interaction — and where the audit gap from the previous section bites hardest. The recommendation engine is a black-box mapping from a microbiome signature to a regimen, with no externally validated calibration. [5] flags this entire layer as "unaudited consumer-facing AI" — not unique to Unilever (L'Oréal AI Skin Genius rolled to 28 countries on a similar architecture), but unique in that POND'S is the first to make microbiome-specific sequencing part of the retail interaction.

The structural critique is the same as the virtual-cohort critique: the operational engineering is impressive, the externally validated science is thin. As FTC and EU CTR scrutiny on AI-derived efficacy claims tightens — likely by 2027 per the trajectory traced in Chapter 12 — retail-grade microbiome diagnostics will be the most-exposed surface. Cosmetic firms whose retail AI is best-instrumented internally (Unilever, L'Oréal, COSMAX) will be best positioned to defend, but defense itself will require a calibration layer the public literature does not yet contain.

Figure 8.4 — POND'S in-store retail flow: shopper → microbiome sampling → 60-minute device → AI signature interpretation → personalized regimen. illustration by author (Gemini assisted)

8.8 Multi-objective prediction — synergy, stability, texture, permeability, sensory

The frontier of cosmetic AI-formulation is the simultaneous prediction of multiple endpoints from a single formula representation. Each endpoint individually is tractable; their joint prediction is what compresses formulation cycles.

Synergy is the easiest endpoint to define mechanistically (do two actives produce more than additive efficacy?) and the hardest to measure (synergy in vitro often does not survive permeability and stability in the finished product). The standard pharma proxy is Loewe additivity or Bliss independence on a dose-response surface; cosmetic ports of this require an in vitro assay system that is consistent across the active panel — most easily achieved for antimicrobial endpoints (Chapter 4) and harder for anti-aging composite endpoints.
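Bliss independence is the easier of the two proxies to compute: for fractional effects in [0, 1], the expected combined effect of two independently acting agents is $E_A + E_B - E_A E_B$, and synergy is scored as the observed excess over that expectation. The inhibition values below are hypothetical.

```python
def bliss_expected(e_a, e_b):
    """Expected combined fractional effect under Bliss independence."""
    return e_a + e_b - e_a * e_b

def bliss_excess(e_a, e_b, e_ab_observed):
    """Observed minus expected effect: positive values suggest synergy,
    negative values suggest antagonism."""
    return e_ab_observed - bliss_expected(e_a, e_b)

# Hypothetical fractional inhibition values from an in vitro assay.
excess = bliss_excess(e_a=0.30, e_b=0.40, e_ab_observed=0.70)
print(f"expected {bliss_expected(0.30, 0.40):.2f}, excess {excess:+.2f}")
# prints: expected 0.58, excess +0.12
```

Whether an excess of that size is meaningful depends on assay noise, which is why the consistency of the assay system across the active panel matters more than the formula itself.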

Stability is the most data-rich endpoint inside firms because it must be measured anyway for regulatory dossiers. Real-time studies at 25°C and accelerated studies at 40–50°C produce a time-series of physical and chemical measurements (viscosity, color, pH, active concentration via HPLC). Time-series regression on this data is a standard ML application, and stability prediction is where AI delivers most consistently. [4] notes the cosmetic industry frequently uses ML-aided shelf-life forecasting in routine R&D, although peer-reviewed benchmarks are rare.
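The simplest version of shelf-life forecasting is a regression on the assay time-series. The sketch below assumes first-order degradation kinetics, which is a modeling assumption rather than something the chapter's sources specify, and fits it by ordinary least squares to hypothetical HPLC values.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

# Hypothetical HPLC assay of an active (% of label claim) stored at 40 °C.
weeks = [0, 2, 4, 8, 12]
percent = [100.0, 98.1, 96.2, 92.4, 88.9]

# Assumed first-order kinetics: ln C(t) = ln C0 - k * t.
intercept, slope = fit_line(weeks, [math.log(c) for c in percent])
k = -slope                          # degradation rate constant, per week
t90 = math.log(1.0 / 0.9) / k       # weeks until 10% of the active is lost

print(f"k = {k:.5f} per week, t90 = {t90:.1f} weeks at 40 °C")
```

Translating an accelerated-condition t90 into a 25°C shelf life requires a temperature model on top of this fit, which is where the accelerated studies become the imperfect proxies described in Section 8.1.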

Texture and sensory are the cosmetic-specific endpoints with no good in silico equivalent. Sensory data comes from trained human panels rating attributes (greasiness, pickup, spreadability, after-feel) on standardized scales. Ground-truth subjectivity is real, but inter-panelist consistency on a trained 10–20 person panel is high enough to support supervised learning. The model input is a formulation vector, the output is a sensory profile, and the bottleneck is that public sensory datasets effectively do not exist. Internal sensory datasets at majors (Unilever, L'Oréal, Shiseido, Amorepacific) have been built over decades and are among the most valuable AI training corpora the cosmetic industry holds.

Permeability through the stratum corneum is mechanistically a function of molecular weight, lipophilicity (logP), and formulation vehicle. Mathematical models from pharmacology (Potts-Guy equation, ADME-Tox suites) were ported to cosmetics in the 2000s; modern ML adds nonlinear vehicle-effect terms and trains on internal permeability datasets. For microbiome-active formulations, permeability is non-trivial: postbiotic actives are often larger molecules than classical small-molecule cosmetics, and engineered cosmetic peptides (Chapter 7) sit in a permeability gray zone.
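The Potts-Guy equation makes the size penalty explicit. The sketch below uses its commonly quoted form for permeation from an aqueous vehicle, $\log_{10} K_p = -6.3 + 0.71 \log K_{ow} - 0.0061\,MW$ with $K_p$ in cm/s; the two example molecules and their descriptor values are hypothetical.

```python
def potts_guy_log_kp(log_kow, mol_weight):
    """Potts-Guy estimate of log10 skin permeability coefficient (cm/s):
    log Kp = -6.3 + 0.71 * log Kow - 0.0061 * MW."""
    return -6.3 + 0.71 * log_kow - 0.0061 * mol_weight

# Hypothetical descriptors: a small lipophilic active and a large hydrophilic peptide.
examples = {
    "small_active (logKow 2.0, MW 200)": (2.0, 200.0),
    "peptide_like (logKow -1.0, MW 1000)": (-1.0, 1000.0),
}
for name, (log_kow, mw) in examples.items():
    print(f"{name}: log Kp = {potts_guy_log_kp(log_kow, mw):.2f}")
```

The two estimates differ by about seven orders of magnitude, although the equation was fitted to small molecules, so applying it to peptide-sized actives is an extrapolation. That extrapolation is one way to read the "permeability gray zone" in the text.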

Regulatory acceptance is the endpoint cosmetic AI-formulation most awkwardly addresses, because regulators have not yet decided how to evaluate AI-derived predictions. The current operating model: AI accelerates R&D internally, the dossier submitted to FDA, EU CTR, or Korea MFDS uses traditional measurements (HPLC, stability assays, sensitization tests), and the AI never enters the regulatory file directly. This is sustainable until a regulator demands AI provenance — which is exactly what [4] anticipates by 2027.

The cleanest multi-objective AI-formulation deployment is Pareto-front Bayesian optimization across these endpoints. The Pareto front in stability × sensory × permeability × cost space gives a formulator a set of "best trade-offs" rather than a single recommendation, and an experienced formulator chooses along the front. This is the architecture closest to deployed practice inside majors. It bridges naturally to Chapter 9's efficacy validation — once a formula on the Pareto front reaches ex vivo skin and clinical readout, the predictions are tested against real human skin.

Figure 8.5 — Pareto front in stability × sensory × permeability × cost space; formulator chooses along the front, AI proposes candidates. illustration by author (Gemini assisted)

8.9 The limits — why cosmetic formulation AI lags drug discovery AI

Three structural limits are worth naming directly.

Data scarcity relative to pharma. Pharma's ML benefits from public assay databases at million-compound scale (ChEMBL, PubChem) plus century-scale clinical-trial archives. Cosmetic formulation has neither equivalent. The largest cosmetic-formulation training corpora live inside firms (Shiseido's stated 500K data points [13], Unilever's 30K microbiome samples [14]) and are not externally accessible. [7]'s survey of microbiome AI and [10]'s ML-for-microbiome guidelines both flag this as the core reproducibility bottleneck — a finding Chapter 12 develops into a proposed open-benchmark blueprint.

Sensory ground-truth subjectivity. Drug efficacy reduces to a discrete biological readout (target engagement, biomarker shift, symptom score). Cosmetic sensory efficacy reduces to a trained-panel rating that is inherently human, inherently culturally calibrated, and inherently expensive. There is no in silico ground truth. AI models can interpolate within a firm's sensory panel calibration but cannot replace it.

Regulatory acceptance of AI-derived claims. Cosmetic claims sit between marketing and regulatory affairs. FTC, EU CTR, and Korea MFDS each have substantiation standards for efficacy claims; none has yet articulated how AI-derived predictions enter that substantiation chain. [4] notes the "opacity of cosmetic AI training data, lack of harmonized regulatory frameworks for AI-derived efficacy claims" as a present-day gap. The industry's working compromise is that AI accelerates the design phase but does not enter the dossier; whether that survives the 2027–2030 regulatory window is the open question Chapter 12 returns to.

A fourth limit, more cultural than structural: cosmetic R&D culture trusts formulators in a way drug discovery culture does not trust medicinal chemists. A senior cosmetic formulator's intuition over 20–30 years of category experience is genuinely competitive with a small AI system, and replacing that intuition is neither possible nor desirable. The realistic deployment pattern is AI-augmented formulator, not AI-replaced formulator — and the cosmetic firms making the cleanest progress (Unilever's KPI disclosures, COSMAX's platform announcements) are explicit about this.


8.10 Open Questions

  1. Public formulation benchmark: What would a public benchmark in cosmetic formulation AI look like? No equivalent of the pharma benchmarks (MoleculeNet, OGB) exists for mixtures with concentration vectors and sensory endpoints. A benchmark adapted to the cosmetic industry, modeled on MIBiG for natural products (Chapter 4), could unlock academic progress, but requires precompetitive data-sharing the industry has not yet agreed to.
  2. Sensory data labeling protocol: Is there a way to capture sensory ground truth at sufficient inter-rater agreement for cross-firm pooling? Current trained-panel protocols are internally consistent but not externally interoperable; standardization would be a precompetitive contribution analogous to ISO methods in food.
  3. AI claim regulatory pathway: How should FTC and EU CTR evaluate efficacy claims partially derived from AI predictions? Existing substantiation frameworks assume traditional measurement; AI predictions occupy a regulatory gray zone today and are likely to be tested by 2027. [4] anticipates this; no jurisdiction has yet ruled.
  4. Virtual cohort calibration: Does a 2,500-subject AI virtual cohort produce response variance distributions that match real-cohort variance across ethnicities, skin types, and body sites, for cosmetic endpoints? The claim in [14] is methodologically aggressive; no external calibration study exists.
  5. Multi-objective trade-off transparency: When a formulator chooses one solution on a Pareto front over another, can the trade-off be communicated to consumers as part of claim language? Current claim culture is single-attribute ("reduces fine lines 30%"); a multi-objective culture ("optimized stability and permeability while preserving sensory") has no marketing template yet.

References

  1. Carrieri, A. P., Haiminen, N., Maudsley-Barton, S. et al. (2021). Explainable AI reveals changes in skin microbiome composition linked to phenotypic differences. Scientific Reports 11:4565.
  2. COSMAX USA (2024). COSMAX unveils 2nd-Generation Skin Microbiome platform + Microbiome AI. Global Cosmetic Industry, 2024. [COSMAX, 2024]
  3. COSMAX × HelloBiome (2025). Korean Beauty Manufacturer COSMAX × HelloBiome microbiome-powered personalized care. WWD / Personal Care Insights, 2025. [COSMAX × HelloBiome, 2025]
  4. Di Guardo, A., Trovato, F., Cantisani, C. et al. (2025). Artificial Intelligence in Cosmetic Formulation: Predictive Modeling for Safety, Tolerability, and Regulatory Perspectives. Cosmetics 12(4):157.
  5. Haykal, D., Flament, F., Amar, D. et al. (2025). Cosmetogenomics unveiled: a systematic review of AI, genomics, and the future of personalized skincare. Frontiers in Artificial Intelligence 8:1660356.
  6. Insilico Medicine clinical authors — Ren, F., Zhavoronkov, A. et al. (2025). A generative AI-discovered TNIK inhibitor for idiopathic pulmonary fibrosis: a randomized phase 2a trial. Nature Medicine, 2025. [Insilico, 2025]
  7. Wang, X.-W., Wang, T., Liu, Y.-Y. (2024). Artificial Intelligence for Microbiology and Microbiome Research. arXiv preprint 2411.01098. [Wang et al., 2024]
  8. L'Oréal R&I (2024). L'Oréal Beauty Tech leadership at VivaTech 2024 — AI Skin Genius, Modiface, microbiome direction. L'Oréal press, May 2024. [L'Oréal, 2024]
  9. L'Oréal R&I (2024). The Future of Cosmetics Is Playing Out In The Microbiome. L'Oréal R&I editorial, 2024. [L'Oréal R&I, 2024]
  10. Papoutsoglou, G., Tarazona, S., Lopes, M. B. et al. (2023). Machine learning approaches in microbiome research: challenges and best practices. Frontiers in Microbiology 14:1261889.
  11. POND'S (Unilever) (2024). POND's Skin Institute microbiome analyzer — 60-minute in-store consumer device. Unilever press, May 2024. [POND'S, 2024]
  12. Potion AI (2025). Potion AI platform updates — formulation AI for indie brands. Potion AI product updates, 2025. [Potion AI, 2025]
  13. Shiseido + Accenture (2024). Shiseido develops AI systems for ingredient biodegradability and Voyager formulation platform. Global Cosmetics News, Feb 2024. [Shiseido + Accenture, 2024]
  14. Unilever Beauty & Wellbeing R&D (2025). How Unilever's pioneering skin microbiome research is shaping product innovation. Unilever news, 2025. [Unilever, 2025]
  15. Unilever Beauty & Wellbeing (2025). SXSW 2025 AI/ML/data behind Unilever's latest launches. Unilever news, Mar 2025. [Unilever, 2025-SXSW]
  16. Unilever (2026). How AI is transforming innovation in Unilever Beauty & Wellbeing. Unilever news, 2026. [Unilever, 2026]
  17. Zhavoronkov, A., Ivanenkov, Y. A., Aliper, A. et al. (2019). Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nature Biotechnology 37, 1038–1040.