Client
A synthetic chemistry research group optimizing conditions for a Suzuki–Miyaura cross-coupling reaction, a staple of pharmaceutical and materials R&D. The team struggled to identify the best solvent–base pairs within a large design space.
Challenge
- Identify the optimal solvent–base combination to maximize product yield (Ar1–Ar2) using minimal experiments.
- Handle categorical variables (solvents and bases), which have no inherent numerical ordering and so cannot be optimized directly with traditional numerical methods.
- Avoid inefficient combinatorial screening, which previously required 81 experiments to find a yield above 75%.
Goal
- Reach or exceed a 75% yield.
- Achieve this with fewer than 15 experiments.
- Uncover non-obvious or previously overlooked solvent–base combinations.
Approach & Solution
- Used SuntheticsML’s Bayesian Optimization, powered by proprietary Supervised Learning (SL) and Active Learning (AL).
- Encoded categorical variables as numerical descriptors with real chemical meaning:
  - Solvents described by dielectric constant and polarity.
  - Bases described by pKa and ionization energy.
- The ML algorithm generated predictions and selected the most promising combinations to test, iterating after every two experiments.
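The descriptor encoding and selection loop described above can be sketched in a few lines of Python. This is a generic illustration, not SuntheticsML's proprietary method: the Gaussian-process surrogate, RBF kernel, and expected-improvement rule are common stand-ins chosen for the example, and every function name is hypothetical. The descriptor values are approximate textbook figures rather than the study's data, "polarity" is assumed to mean Reichardt's normalized polarity, and "ionization energy" is assumed to mean the first ionization energy of the base's metal. Yields are expressed as fractions.

```python
import itertools
import math

# Approximate textbook descriptors for illustration only (not the study's data).
# Solvent: (dielectric constant, relative polarity on Reichardt's normalized scale)
SOLVENTS = {"Toluene": (2.4, 0.099), "DME": (7.2, 0.231), "EtOH": (24.5, 0.654)}
# Base: (approx. aqueous pKa of conjugate acid, first ionization energy of the metal, eV)
BASES = {"NaOtBu": (17.0, 5.14), "KOtBu": (17.0, 4.34), "KOH": (15.7, 4.34)}


def build_candidates():
    """Map each (solvent, base) pair to a standardized 4-D descriptor vector."""
    pairs = list(itertools.product(SOLVENTS, BASES))
    raw = [SOLVENTS[s] + BASES[b] for s, b in pairs]
    cols = list(zip(*raw))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return {p: tuple((v - m) / s for v, m, s in zip(r, means, stds))
            for p, r in zip(pairs, raw)}


def rbf(a, b, length=1.0):
    """Squared-exponential kernel: similar descriptors imply similar yields."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def gp_posterior(X, y, x_new, noise=1e-3):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k = [rbf(a, x_new) for a in X]
    mean_y = sum(y) / len(y)
    alpha = solve(K, [v - mean_y for v in y])
    mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve(K, k)
    var = max(rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, v)), 1e-12)
    return mu, math.sqrt(var)


def expected_improvement(mu, sigma, best):
    """Expected gain over the best yield observed so far."""
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mu - best) * cdf + sigma * pdf


def suggest(candidates, observed, batch_size=2):
    """Rank untested pairs by expected improvement and return the top batch."""
    X = [candidates[p] for p in observed]
    y = list(observed.values())
    best = max(y)
    scores = {}
    for pair, x in candidates.items():
        if pair not in observed:
            mu, sigma = gp_posterior(X, y, x)
            scores[pair] = expected_improvement(mu, sigma, best)
    return sorted(scores, key=scores.get, reverse=True)[:batch_size]


candidates = build_candidates()
observed = {("Toluene", "NaOtBu"): 0.07}  # the 7% starting point from the case study
print(suggest(candidates, observed))
```

After each pair of experiments, the measured yields would be added to `observed` and `suggest` called again. Taking the two highest-scoring candidates is a simplification; dedicated batch acquisition strategies account for redundancy between the picks.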
Results & Metrics
- Final best-performing combination:
  - EtOH + KOtBu, at 81% yield
- Performance progression:
  - Started at 7% yield with Toluene + NaOtBu
  - Gradual increase across 4 model iterations, ending at 81%
- Total number of experiments:
  - Only 12, guided across 5 iterations
- Algorithm-recommended combinations:
  - Included candidates that researchers had not previously considered or had erroneously ruled out
- Corrected experimental error:
  - An earlier false negative (0% yield for EtOH + KOtBu) was overturned when the model suggested the combination, revealing a mislabeling error in the earlier experiment
- Compared to the empirical (combinatorial) approach:
  - 81 experiments needed to achieve 83% yield (DME + KOH)
  - SuntheticsML achieved 81% in just 12 experiments
  - Result: 85% fewer experiments (69 of 81 avoided)
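The reduction figure follows directly from the experiment counts quoted above. A short check, using only numbers from this case study (the variable and function names are illustrative):

```python
# Figures quoted in the results above.
combinatorial_experiments = 81
guided_experiments = 12
combinatorial_best_yield = 83  # % (DME + KOH)
guided_best_yield = 81         # %


def reduction_percent(baseline: int, guided: int) -> float:
    """Percentage of baseline experiments avoided by the guided campaign."""
    return 100 * (baseline - guided) / baseline


saved = combinatorial_experiments - guided_experiments
reduction = reduction_percent(combinatorial_experiments, guided_experiments)
yield_gap = combinatorial_best_yield - guided_best_yield

print(f"Experiments avoided: {saved}")             # 69
print(f"Reduction: {reduction:.1f}%")              # 85.2%
print(f"Yield gap: {yield_gap} percentage points")  # 2
```

The guided campaign came within 2 percentage points of the exhaustive screen's best yield while running about 15% of the experiments.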
The Sunthetics Edge
“SuntheticsML not only guided us to an unexpected high-yield result but also flagged an experimental error that would have led us astray. It saved time and materials, and gave us confidence in our choices.”
Key Takeaways
- Categorical optimization is solvable: Proper parameterization unlocks categorical ML use cases.
- ML sees what humans miss: EtOH + KOtBu was dismissed by researchers but turned out to be the model's best-performing combination (81% yield).
- Fewer iterations, better outcomes: Just 5 modeling iterations were needed to reach 81% yield.
- Data-light, insight-rich: ML-based parameter tuning beats trial-and-error even with a small dataset.
- Major reduction in experimentation: 85% fewer tests, less waste, and faster progress.