Camo: Capturing the Modularity by End-to-End Models for Symbolic Regression
Liu, Jingyi; Wu, Min; Yu, Lina; Li, Weijun; Li, Wenqiang; Li, Yanjie; Hao, Meilan; Deng, Yusong; Wei, Shu
Source: SSRN, April 22, 2024
Abstract:
Modularity is a pervasive paradigm across natural phenomena, societal constructs, and human pursuits, from biological systems to corporate hierarchies and beyond. In Symbolic Regression, which seeks to deduce explicit formulas from empirical data, modularity is a strategic asset: capturing essential substructures improves fitting accuracy. Symbolic Regression is inherently a compositional optimization problem, so preserving beneficial substructures makes subsequent exploration more efficient. In this study, we introduce a methodology that integrates modularity into the search mechanism, using the term "module" to denote these invaluable substructures. We select an end-to-end model to incorporate modules into the search process, chosen for its scalability and ability to generalize. Modules function as higher-order knowledge and serve as primary operators, thereby enriching the search directory for Symbolic Regression. Our algorithm allows modules to be learned, or to evolve, autonomously within the learning framework. We also introduce a hierarchical strategy for extracting modules from the expression tree and a module refinement system that discards redundant modules while assimilating new advantageous ones. In ablation studies, the model with both the module extraction and the module update mechanisms achieved the highest average R² across several datasets. Furthermore, our method achieved R² comparable to state-of-the-art methods with half the expression complexity.
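To make the idea of extracting modules from expression trees concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes expressions are stored as nested tuples of the form (operator, child, ...), with strings as leaves, and it treats any small subtree that recurs across several candidate expressions as a module. The frequency criterion, the names `subtrees`, `size`, and `extract_modules`, and the parameters `min_count` and `max_size` are all hypothetical; the abstract does not specify how Camo scores or refines modules.

```python
from collections import Counter


def subtrees(expr):
    """Yield every non-leaf subtree of expr, including expr itself."""
    if isinstance(expr, tuple):
        yield expr
        for child in expr[1:]:
            yield from subtrees(child)


def size(expr):
    """Count the nodes (operators and leaves) in an expression tree."""
    if isinstance(expr, tuple):
        return 1 + sum(size(child) for child in expr[1:])
    return 1


def extract_modules(exprs, min_count=2, max_size=5):
    """Return subtrees that occur in at least min_count expressions.

    Assumption: frequency across candidate expressions stands in for
    usefulness. This heuristic is illustrative, not taken from the paper.
    """
    counts = Counter()
    for expr in exprs:
        # Use a set so each subtree is counted once per expression.
        for sub in set(subtrees(expr)):
            if size(sub) <= max_size:
                counts[sub] += 1
    return [sub for sub, n in counts.most_common() if n >= min_count]


# Two candidate expressions: sin(x) + x*x and sin(x) * c.
example = [
    ("add", ("sin", "x"), ("mul", "x", "x")),
    ("mul", ("sin", "x"), "c"),
]
modules = extract_modules(example)
```

Here `modules` contains only the shared subtree `("sin", "x")`, which a search procedure could then treat as a single reusable operator. A refinement step in the spirit of the abstract would periodically drop modules that stop recurring and admit newly frequent ones.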
© 2024, The Authors. All rights reserved. (42 refs.)