Genetic Algorithm-Based Interpretable Modeling Informed by Domain Knowledge
CRIS Lab @ Columbia University
Eligibility
All Students
Accepts Applications Until
Jul 31, 2025
Project Duration
Flexible
Description
Ordinary differential equations (ODEs) are essential tools for capturing the time-dependent behavior of complex systems. Recent advances in machine learning have accelerated model discovery by deriving governing equations directly from observational data. While the resulting black-box models often achieve accurate predictions, they tend to overlook fundamental laws that are critical in chemical engineering applications. Here, we develop a hybrid framework that integrates first-principles-based feature engineering with data-driven techniques to uncover underlying physicochemical mechanisms. Our approach uses genetic algorithms to identify multiple best-fitting solutions under user-defined constraints informed by a priori knowledge. Building on our prior success in identifying algebraic systems (both linear and nonlinear in parameters), we extend AI-DARWIN, our interpretable, mechanism-based modeling framework, to dynamic systems governed by ODEs. We demonstrate the framework's robust performance across diverse domains, including atmospheric chemistry, cellular signaling, and electrochemistry, using synthetically generated sparse and noisy data.
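To give applicants a feel for the idea described above, here is a minimal, standard-library-only sketch of a genetic algorithm that searches for a sparse ODE right-hand side under a constraint taken from prior knowledge. It is not the AI-DARWIN implementation. The candidate library, the first-order decay test system, the coefficient bounds, the sparsity penalty, and all function names are assumptions chosen purely for illustration.

```python
import math
import random

# Hypothetical candidate terms for dx/dt (illustrative only, not the lab's library).
LIBRARY = [("1", lambda x: 1.0), ("x", lambda x: x), ("x^2", lambda x: x * x)]
# Assumed prior knowledge: the species is only consumed, so coefficients are <= 0.
LOW, HIGH = -2.0, 0.0

def make_data(k=0.5, noise=0.02, seed=1):
    """Sparse, noisy synthetic samples of dx/dt = -k*x with x(0) = 1."""
    rng = random.Random(seed)
    times = [0.5 * i for i in range(11)]
    data = [math.exp(-k * t) + rng.gauss(0, noise) for t in times]
    return times, data

def simulate(coeffs, x0, times, dt=0.02):
    """Integrate dx/dt = sum(c_i * f_i(x)) with explicit Euler; return x at `times`."""
    x, t, out = x0, 0.0, []
    for t_obs in times:
        for _ in range(round((t_obs - t) / dt)):
            x += dt * sum(c * f(x) for c, (_, f) in zip(coeffs, LIBRARY))
        t = t_obs
        out.append(x)
    return out

def loss(coeffs, times, data, x0, sparsity=1e-3):
    """Mean squared error plus a small penalty per active term (favors simple models)."""
    pred = simulate(coeffs, x0, times)
    mse = sum((p - d) * (p - d) for p, d in zip(pred, data)) / len(data)
    total = mse + sparsity * sum(1 for c in coeffs if c != 0.0)
    return total if math.isfinite(total) else float("inf")  # diverged trajectories

def evolve(times, data, x0, pop_size=30, generations=40, seed=0):
    """Return the final population sorted from best to worst loss."""
    rng = random.Random(seed)
    n = len(LIBRARY)
    # Include the empty model so the result is never worse than "no dynamics".
    pop = [[0.0] * n] + [[rng.uniform(LOW, HIGH) for _ in range(n)]
                         for _ in range(pop_size - 1)]
    score = lambda c: loss(c, times, data, x0)
    for _ in range(generations):
        pop.sort(key=score)
        elite = pop[: pop_size // 3]          # elitism: best models survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(n)
            if rng.random() < 0.3:
                child[i] = 0.0                # mutation: prune a term
            else:                             # mutation: perturb, then clip to the bounds
                child[i] = min(HIGH, max(LOW, child[i] + rng.gauss(0, 0.2)))
            children.append(child)
        pop = elite + children
    pop.sort(key=score)
    return pop

times, data = make_data()
ranked = evolve(times, data, x0=1.0)
for coeffs in ranked[:3]:  # several well-fitting candidates, not just one
    terms = " + ".join(f"{c:.3f}*{name}" for c, (name, _) in zip(coeffs, LIBRARY) if c != 0.0)
    print(f"dx/dt = {terms or '0'}   loss = {loss(coeffs, times, data, 1.0):.5f}")
```

Printing the top few candidates rather than a single winner mirrors the description's emphasis on multiple best-fitting solutions. A real study would likely replace the Euler integrator with an adaptive ODE solver and use a richer, mechanism-derived term library.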
Required Skills
Python
Additional Information
Unpaid research internship
Compensation
Public Recognition, Letter of Completion