Revolutionizing Solvent–Base Selection with Parametrized Optimization

81% Yield Achieved in 85% Fewer Experiments

Quick summary

Using SuntheticsML, a chemistry team optimized the solvent–base combination in a Suzuki–Miyaura cross-coupling, raising yield from 17% to 81% in just 12 experiments. That is 85% fewer than the 81 experiments the earlier combinatorial screen required.

Industry

Pharmaceutical and Materials R&D

Client

A synthetic chemistry research group optimizing conditions for a Suzuki–Miyaura cross-coupling reaction—a staple in pharmaceutical and materials R&D. The team faced challenges identifying ideal solvent–base pairs from a vast design space.

Challenge

Identify the optimal solvent–base combination to maximize product yield (Ar1–Ar2) using minimal experiments.
Navigate the complexity of categorical variables (solvents and bases), which have no inherent order or scale and so cannot be optimized directly with traditional numerical methods.
Avoid the inefficient combinatorial screening that had previously required 81 experiments to reach a yield above 75%.

Goal

Reach or exceed a 75% yield.
Achieve this with fewer than 15 experiments.
Uncover non-obvious or previously overlooked solvent–base combinations.

Approach & Solution

Used SuntheticsML’s Bayesian Optimization, powered by proprietary Supervised Learning (SL) and Active Learning (AL).
Encoded categorical variables with real chemical meaning:
- Solvents described by dielectric constant and polarity.
- Bases described by pKa and ionization energy.
The ML algorithm generated yield predictions and selected the most promising combinations for testing, updating after every two experiments.
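The encode-then-select idea above can be sketched in a few lines. SuntheticsML's model is proprietary, so this example swaps in a plain Gaussian process with an expected-improvement score, written with the Python standard library only. The descriptor values are approximate literature numbers chosen for illustration, and two readings are assumptions: "pKa" is taken as the aqueous pKa of the base's conjugate acid, and "ionization energy" as the first ionization energy of the counter-cation's metal. The function names (`suggest`, `scaled_features`, and so on) are hypothetical.

```python
import math
from itertools import product

# Approximate, illustrative descriptors (assumed; not taken from the case study).
# Solvent: (dielectric constant, normalized Reichardt polarity E_T^N)
SOLVENTS = {"Toluene": (2.4, 0.10), "DME": (7.2, 0.23), "EtOH": (24.5, 0.65)}
# Base: (aqueous pKa of conjugate acid, first ionization energy of the metal in eV)
BASES = {"NaOtBu": (17.0, 5.14), "KOtBu": (17.0, 4.34), "KOH": (15.7, 4.34)}


def scaled_features(pairs):
    """Turn each (solvent, base) pair into a 4-number vector, min-max scaled per column."""
    raw = [SOLVENTS[s] + BASES[b] for s, b in pairs]
    lo = [min(col) for col in zip(*raw)]
    hi = [max(col) for col in zip(*raw)]
    return [
        tuple((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi))
        for row in raw
    ]


def rbf(a, b, length=0.5):
    """Squared-exponential similarity between two descriptor vectors."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * length**2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def expected_improvement(mu, sd, best):
    """Expected gain over the best yield seen so far, under a normal prediction."""
    if sd <= 0:
        return 0.0
    z = (mu - best) / sd
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return (mu - best) * cdf + sd * pdf


def suggest(observed, batch=2, noise=1e-4, amp=0.1):
    """observed: {(solvent, base): yield as a fraction}. Returns `batch` untested pairs."""
    pairs = list(product(SOLVENTS, BASES))
    feats = dict(zip(pairs, scaled_features(pairs)))
    X = [feats[p] for p in observed]
    y = list(observed.values())
    mean = sum(y) / len(y)
    K = [
        [amp * rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
        for i, a in enumerate(X)
    ]
    alpha = solve(K, [v - mean for v in y])
    scores = {}
    for p in pairs:
        if p in observed:
            continue
        k = [amp * rbf(feats[p], x) for x in X]
        mu = mean + sum(ki * ai for ki, ai in zip(k, alpha))
        var = amp - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
        scores[p] = expected_improvement(mu, math.sqrt(max(var, 0.0)), max(y))
    return sorted(scores, key=scores.get, reverse=True)[:batch]


# Seed with the one starting point reported in the case study, then ask for two more.
picks = suggest({("Toluene", "NaOtBu"): 0.07})
print(picks)
```

Because every solvent and base is represented by numbers, the model can score pairs that have never been run by their similarity to pairs that have. After each batch of two results, the new yields would be added to `observed` and `suggest` called again.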

Results & Metrics

Final best-performing combination:
- EtOH + KOtBu → 81% yield
Performance progression:
- Started at 7% yield with Toluene + NaOtBu
- Gradual increase across 4 model iterations, ending at 81%
Total number of experiments:
- Only 12, guided across 5 iterations
Algorithm-recommended combinations:
- Included candidates that researchers had not previously considered or had erroneously ruled out
Corrected experimental error:
- An earlier false negative (0% yield for EtOH–KOtBu) was overturned thanks to the model’s suggestion, revealing experimental mislabeling
Compared to empirical (combinatorial) approach:
- 81 experiments needed to achieve 83% yield (DME + KOH)
- SuntheticsML achieved 81% in just 12 experiments
- Net result: 85% fewer experiments (69 of 81 avoided)
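The comparison above is simple arithmetic on the reported figures. A short check, using a hypothetical helper name, shows the experiment reduction alongside the yield gap between the two campaigns:

```python
def experiment_reduction(baseline: int, guided: int) -> float:
    """Fraction of experiments avoided relative to the baseline campaign."""
    return (baseline - guided) / baseline


# Figures reported in the case study.
combinatorial_runs, combinatorial_yield = 81, 83  # DME + KOH
guided_runs, guided_yield = 12, 81                # EtOH + KOtBu

reduction = experiment_reduction(combinatorial_runs, guided_runs)
yield_gap = combinatorial_yield - guided_yield

print(f"Experiments avoided: {reduction:.1%}")
print(f"Yield difference: {yield_gap} percentage points")
```

The guided campaign finished within 2 percentage points of the combinatorial best while running 69 fewer experiments.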

The Sunthetics Edge

“SuntheticsML not only guided us to an unexpected high-yield result but also flagged an experimental error that would have led us astray. It saved time and materials, and gave us confidence in our choices.”

Key Takeaways

Categorical optimization is solvable: Proper parameterization unlocks categorical ML use cases.
ML sees what humans miss: EtOH–KOtBu was dismissed by researchers but identified as optimal by the model.
Fewer iterations, better outcomes: Just 5 modeling iterations were needed to reach 81% yield.
Data-light, insight-rich: With only 12 data points, ML-guided selection reached a yield comparable to the trial-and-error screen.
Major reduction in experimentation: 85% fewer tests, less waste, and faster progress.
