Client
A synthetic chemistry research group optimizing conditions for a Suzuki–Miyaura cross-coupling reaction, a staple of pharmaceutical and materials R&D. The team struggled to identify the best solvent–base pair in a large design space.
Challenge
- Identify the solvent–base combination that maximizes product yield (Ar1–Ar2) with as few experiments as possible.
- Handle categorical variables (solvents and bases), which traditional numerical optimization methods cannot treat directly.
- Avoid inefficient combinatorial screening, which previously required 81 experiments to find a yield above 75%.
 
Goal
- Reach or exceed a 75% yield.
- Achieve this with fewer than 15 experiments.
- Uncover non-obvious or previously overlooked solvent–base combinations.
 
Approach & Solution
- Used SuntheticsML’s Bayesian Optimization, powered by proprietary Supervised Learning (SL) and Active Learning (AL).
- Encoded categorical variables as numerical descriptors with real chemical meaning:
  - Solvents described by dielectric constant and polarity.
  - Bases described by pKa and ionization energy.
- The ML algorithm generated yield predictions and selected the most promising combinations for testing, with the model updated after every two experiments.
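The descriptor encoding and the two-experiments-per-iteration loop described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not SuntheticsML's proprietary model: it swaps in a small Gaussian-process surrogate with an upper-confidence-bound selection rule, the descriptor values are rough placeholders rather than vetted reference data, and `toy_yield`, `suggest`, and the other names are hypothetical.

```python
import math
from itertools import product

# ASSUMPTION: placeholder descriptor values for illustration only,
# not vetted reference data. Replace with curated values before real use.
SOLVENTS = {  # name: (dielectric constant, polarity)
    "Toluene": (2.4, 2.4),
    "DME": (7.2, 4.0),
    "EtOH": (24.5, 5.2),
}
BASES = {  # name: (pKa, ionization energy)
    "NaOtBu": (17.0, 5.1),
    "KOtBu": (17.0, 4.3),
    "KOH": (15.7, 4.3),
}

def standardize(rows):
    """Z-score each descriptor column so no single one dominates distances."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(r, means, stds)) for r in rows]

# Every solvent-base pair becomes one 4-dimensional numeric vector.
PAIRS = list(product(SOLVENTS, BASES))
FEATURES = dict(zip(PAIRS, standardize([SOLVENTS[s] + BASES[b] for s, b in PAIRS])))

def rbf(a, b, length=1.5):
    """Squared-exponential kernel: nearby descriptors imply similar yields."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def posterior(X, y, x_new, noise=1e-3):
    """Gaussian-process posterior mean and standard deviation at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k = [rbf(a, x_new) for a in X]
    y_mean = sum(y) / len(y)
    alpha = solve(K, [v - y_mean for v in y])
    mu = y_mean + sum(ki * ai for ki, ai in zip(k, alpha))
    var = 1.0 - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mu, math.sqrt(max(var, 0.0))

def suggest(observed, batch=2, kappa=2.0):
    """Return the `batch` untested pairs with the highest upper confidence bound."""
    X = [FEATURES[p] for p in observed]
    y = list(observed.values())
    def ucb(pair):
        mu, sd = posterior(X, y, FEATURES[pair])
        return mu + kappa * sd
    untested = [p for p in FEATURES if p not in observed]
    return sorted(untested, key=ucb, reverse=True)[:batch]

def toy_yield(pair):
    """Synthetic stand-in for a lab measurement (NOT real chemistry data)."""
    return math.exp(-sum((v - 0.5) ** 2 for v in FEATURES[pair]) / 4)

# Start from one measured pair, then test two suggestions per iteration.
observed = {("Toluene", "NaOtBu"): toy_yield(("Toluene", "NaOtBu"))}
for _ in range(3):
    for pair in suggest(observed):
        observed[pair] = toy_yield(pair)

best = max(observed, key=observed.get)
print(best, round(observed[best], 2))
```

In a real campaign, `toy_yield` would be replaced by the measured yield (as a fraction between 0 and 1) of each suggested experiment, and the kernel length scale and `kappa` would need tuning to the data at hand.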
 
Results & Metrics
- Final best-performing combination: EtOH + KOtBu (81% yield)
- Performance progression:
  - Started at 7% yield with Toluene + NaOtBu
  - Gradual increase across 4 model iterations, ending at 81%
- Total number of experiments:
  - Only 12, guided across 5 iterations
- Algorithm-recommended combinations:
  - Included candidates that researchers had not previously considered or had erroneously ruled out
- Corrected experimental error:
  - An earlier false negative (0% yield for EtOH–KOtBu) was overturned thanks to the model’s suggestion, revealing experimental mislabeling
- Compared to empirical (combinatorial) approach:
  - 81 experiments needed to achieve 83% yield (DME + KOH)
  - SuntheticsML achieved 81% in just 12 experiments
  - → 85% experiment reduction
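The 85% figure follows directly from the two experiment counts reported above. A short check, where `reduction_pct` is a hypothetical helper name introduced only for this illustration:

```python
def reduction_pct(baseline: int, optimized: int) -> float:
    """Percentage of experiments saved relative to the baseline count."""
    return 100 * (baseline - optimized) / baseline

# 81 combinatorial experiments versus 12 ML-guided experiments.
print(round(reduction_pct(81, 12), 1))  # 85.2
```

The saving comes with a small trade-off in peak yield: 81% for the ML-guided route against 83% for the full combinatorial screen, a difference of 2 percentage points.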
 
 
The Sunthetics Edge
“SuntheticsML not only guided us to an unexpected high-yield result but also flagged an experimental error that would have led us astray. It saved time and materials, and gave us confidence in our choices.”
Key Takeaways
- Categorical optimization is solvable: Describing solvents and bases with physical descriptors makes categorical choices usable by ML.
- ML sees what humans miss: EtOH–KOtBu was dismissed by researchers but identified as optimal by the model.
- Fewer iterations, better outcomes: Just 5 modeling iterations were needed to reach 81% yield.
- Data-light, insight-rich: ML-guided selection outperformed trial-and-error screening with a dataset of only 12 experiments.
- Major reduction in experimentation: 85% fewer tests, less waste, and faster progress.