Development of a de novo molecular generative model using decoupled setting in multi-objective Bayesian optimization
Takamasa SUZUKI *1, Nobuaki YASUO2, Masakazu SEKIJIMA1
1School of Computing, Institute of Science Tokyo
2School of Materials and Chemical Technology, Institute of Science Tokyo
The development of new drugs is a notoriously expensive and time-consuming process. While computer-aided methods, such as molecular generative models, offer a path to accelerate this process, they face a significant bottleneck in multi-objective optimization. Existing methods are often computationally expensive because they require the simultaneous evaluation of all objective functions (e.g., toxicity, affinity) for every new candidate molecule. This research aims to address this problem by introducing a multi-objective molecular generative model that utilizes a "decoupled evaluation" setting, which promises a more efficient and comprehensive search by evaluating only one objective function at a time.
The key innovation of the proposed method is that the search step also identifies, for each candidate, the single most informative objective function to evaluate, taking its computational cost into account. By selecting objective functions in this way, the method captures the Pareto frontier, which is the solution set of the multi-objective optimization problem. Because of the decoupled setting, the method is designed to reduce the evaluation cost. The method's performance was evaluated on a bi-objective problem. The model achieved uniqueness and internal diversity scores comparable to or better than those of existing methods. The experiments showed that when one objective was set to be computationally cheaper, the model optimized that objective first before exploring the wider search space, which supports the efficiency of the decoupled approach. We conclude that the proposed method, which leverages the decoupled evaluation setting from multi-objective Bayesian optimization, can generate diverse and optimized molecules efficiently. By removing the need to evaluate all properties simultaneously, the model reduces computational cost and produces novel molecules that are distinct from the training dataset.
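To make the two central ideas concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that both objectives are maximized and that some acquisition function has already produced a per-objective information estimate for a candidate. The names `pareto_front` and `pick_objective`, the gain-per-cost ratio rule, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def pareto_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of non-dominated points (all objectives maximized)."""
    front = []
    for i, p in enumerate(points):
        # q dominates p if q is at least as good everywhere
        # and strictly better in at least one objective.
        dominated = any(
            all(qk >= pk for qk, pk in zip(q, p))
            and any(qk > pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


def pick_objective(info_gain: Dict[str, float], cost: Dict[str, float]) -> str:
    """Pick the one objective to evaluate next.

    Assumption: the choice maximizes estimated information gain per unit
    of evaluation cost. The abstract only says that cost is considered.
    """
    return max(info_gain, key=lambda name: info_gain[name] / cost[name])


# Hypothetical bi-objective scores for five candidate molecules.
scores = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (2.0, 2.0), (1.0, 1.0)]
front = pareto_front(scores)

# Hypothetical gains and costs: affinity is informative but expensive.
chosen = pick_objective(
    info_gain={"affinity": 0.8, "toxicity": 0.5},
    cost={"affinity": 10.0, "toxicity": 1.0},
)
```

Under these made-up numbers the cheaper objective is selected first, which mirrors the behavior reported in the bi-objective experiment, where the model optimized the cheaper objective before exploring the wider search space.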