# Josef Ressel Centre for Symbolic Regression

Within the Josef Ressel Centre for Symbolic Regression we work on new algorithms for symbolic regression and develop a methodological and technical framework for incremental model adaptation for handling concept drift with symbolic regression models.

## Facts

- Lead: Gabriel Kronberger, University of Applied Sciences Upper Austria (FH OÖ)
- Duration: 2 years initial phase, 3 years extension after successful evaluation
- Total budget: EUR 1,140,000
- 2 PhD Students, 2 Postdocs, Master students
- Partners: FH OÖ, AVL List, Miba Frictec, Christian Doppler Research Association

## Partners

The Josef Ressel Centre is a cooperation of three partners. It is led by the University of Applied Sciences Upper Austria, Hagenberg Campus (FH OÖ). The two company partners are AVL List GmbH and Miba Frictec GmbH. They provide relevant business cases and expert knowledge as well as infrastructure (test benches) and data. The company partners also provide 50% of the funding for the project. The other 50% is funded by the BMDW through the Christian Doppler Research Association.

## Publications

The project started in January 2018. Publications so far:

**2018:**

B. Burlacu, M. Affenzeller (2018, July). "Schema-based diversification in genetic programming". In *Proceedings of the Genetic and Evolutionary Computation Conference* (pp. 1111-1118). ACM. PDF

G. Kronberger, M. Kommenda, A. Promberger, F. Nickel (2018, July). "Predicting friction system performance with symbolic regression and genetic programming with factor variables". In *Proceedings of the Genetic and Evolutionary Computation Conference* (pp. 1278-1285). ACM. PDF

F. Bachinger, J. Zenisek, L. Kammerer, M. Stimpfl, G. Kronberger (2018, September). "Performance of industrial sensor data persistence in data vault". In *Proceedings of the EMSS Conference* (pp. 226-233). PDF

**2019:**

W. Roland, M. Kommenda, C. Marschik, J. Miethinger (2019, February). "Extended Regression Models for Predicting the Pumping Capability and Viscous Dissipation of Two-Dimensional Flows in Single-Screw Extrusion". In *Polymers*. PDF.

G. Kronberger, L. Kammerer, B. Burlacu, S. M. Winkler, M. Kommenda, M. Affenzeller. "Cluster Analysis of a Symbolic Regression Search Space". *Genetic Programming Theory and Practice XVI* pp. 85-102. Springer. (2019). PDF.

E. Kabliman, A.H. Kolody, M. Kommenda, G. Kronberger. “Prediction of Stress-Strain Curves for Aluminium Alloys using Symbolic Regression”. *Proceedings of the 22nd International Conference on Material Forming (ESAFORM 2019)*, Vitoria-Gasteiz, Spain. AIP Conference Proceedings. (2019). PDF.

B. Burlacu, M. Affenzeller, G. Kronberger, M. Kommenda. “Online Diversity Control in Symbolic Regression via a Fast Hash-based Tree Similarity Measure”. *Conference Proceedings of the IEEE Congress on Evolutionary Computation (CEC)*, Wellington, New Zealand. (2019). PDF.

B. Burlacu, M. Kommenda, G. Kronberger, M. Affenzeller. “Parsimony Measures in Multi-objective Genetic Programming for Symbolic Regression”. In *Proceedings of the Genetic and Evolutionary Computation Conference 2019 (GECCO ’19)*. ACM, New York, NY, USA, 9 pages. PDF.

G. Kronberger, L. Kammerer, M. Kommenda, B. Burlacu. “Visualisierung eines Suchraums für Symbolische Regression”. *Tagungsband Forschungsforum der Österreichischen Fachhochschulen (FFH 2019)*, Wiener Neustadt. (2019).

S. Prieschl, D. Girardi, G. Kronberger, “Using Ontologies to Express Prior Knowledge for Genetic Programming”, in *Proceedings of Cross Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE) co-located with ARES 2019*, Canterbury, UK, August 26-29, LNCS. (2019). PDF.

Kommenda, M., Burlacu, B., Kronberger, G. et al., "Parameter identification for symbolic regression using nonlinear least squares", in *Genetic Programming and Evolvable Machines*. Springer. (2019). PDF.

Kronberger G., Scheidel S., Haider C., Kommenda M., Kordon M., "Integration of Physical Knowledge in Empirical Models - A New Approach to Regression Analysis", *8th International Symposium on Development Methodology*, Wiesbaden, Germany. pp. 1-9. (2019).

**2020:**

Kronberger G., Bachinger F., Affenzeller M. "Smart Manufacturing and Continuous Improvement and Adaptation of Predictive Models", in *Procedia Manufacturing*, Volume 42. (2020). *Part of special issue: International Conference on Industry 4.0 and Smart Manufacturing (ISM 2019)*. PDF.

Kammerer L., Kronberger G., Burlacu B., Winkler S.M., Kommenda M., Affenzeller M., "Symbolic Regression by Exhaustive Search: Reducing the Search Space Using Syntactical Constraints and Efficient Semantic Structure Deduplication", *Genetic Programming Theory and Practice XVII*. pp. 79-99. Springer. (2020). PDF.

Bachinger F., Kronberger G., "Concept for a Technical Infrastructure for Management of Predictive Models in Industrial Applications", *Computer Aided Systems Theory - EUROCAST 2019*. pp. 263-270. Springer. (2020). PDF.

Affenzeller M., Burlacu B., Dorfer V. et al., "White Box vs. Black Box Modeling: On the Performance of Deep Learning, Random Forests, and Symbolic Regression in Solving Regression Problems", *Computer Aided Systems Theory - EUROCAST 2019*. pp. 288-295. Springer. (2020). PDF.

Kronberger G., Kammerer L., Kommenda M., "Identification of Dynamical Systems Using Symbolic Regression", *Computer Aided Systems Theory - EUROCAST 2019*. pp. 370-377. Springer. (2020). PDF.

Burlacu B., Kammerer L., Affenzeller M., Kronberger G., "Hash-Based Tree Similarity and Simplification in Genetic Programming for Symbolic Regression", *Computer Aided Systems Theory - EUROCAST 2019*. pp. 361-369. Springer. (2020). PDF.

Kammerer L., Kronberger G., Kommenda M., "Data Aggregation for Reducing Training Data in Symbolic Regression", *Computer Aided Systems Theory - EUROCAST 2019*. pp. 378-386. Springer. (2020). PDF.

Kronberger G., Colmenar J.M., Winkler S.M., Hidalgo J.I., "Multilayer analysis of population diversity in grammatical evolution for symbolic regression", *Soft Computing (2020)*. Springer. PDF.

## Research

Symbolic regression is a data-based modelling method whose goal is to find a formula that describes the given data. As with other regression methods, the aim is to create a model that predicts one or more target variables from known values of the input variables. However, in symbolic regression one does not merely fit parameters to a fixed model structure. Instead, the goal is to identify the necessary model structure as well as optimal model parameters for the given dataset.

*Symbolic regression means finding a simple symbolic expression (formula) that fits a given dataset. It is a supervised learning technique.*
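The difference between fitting parameters for a fixed structure and searching over structures can be illustrated with a small sketch. It is a toy example and not one of the algorithms developed in the centre: the dataset, the four candidate structures in `STRUCTURES`, and all function names are assumptions made for illustration. For every candidate structure the parameters `a` and `b` are fitted by ordinary least squares, and the structure with the lowest error is kept.

```python
import math

# Toy dataset generated from y = 2*x^2 + 1 (an assumption made for this example).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [2.0 * x**2 + 1.0 for x in xs]

# Hypothetical candidate structures: each maps x to a feature f(x),
# and the model is y = a * f(x) + b with free parameters a and b.
STRUCTURES = {
    "a*x + b": lambda x: x,
    "a*x^2 + b": lambda x: x * x,
    "a*log(x) + b": math.log,
    "a*exp(x) + b": math.exp,
}


def fit_linear(fs, ys):
    """Closed-form ordinary least squares for y = a*f + b."""
    n = len(fs)
    mean_f = sum(fs) / n
    mean_y = sum(ys) / n
    var_f = sum((f - mean_f) ** 2 for f in fs)
    cov = sum((f - mean_f) * (y - mean_y) for f, y in zip(fs, ys))
    a = cov / var_f
    b = mean_y - a * mean_f
    return a, b


def mse(a, b, fs, ys):
    """Mean squared error of the fitted model on the data."""
    return sum((a * f + b - y) ** 2 for f, y in zip(fs, ys)) / len(ys)


def search(xs, ys):
    """Fit every structure and return (name, a, b, error) of the best one."""
    results = []
    for name, feature in STRUCTURES.items():
        fs = [feature(x) for x in xs]
        a, b = fit_linear(fs, ys)
        results.append((mse(a, b, fs, ys), name, a, b))
    err, name, a, b = min(results, key=lambda r: r[0])
    return name, a, b, err


print(search(xs, ys))
```

A fixed-structure method would only ever run `fit_linear` for one entry of `STRUCTURES`. Symbolic regression algorithms search a far larger space of structures than this hand-written list of four.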

The term symbolic regression stems from earlier work by John Koza on genetic programming. Since then, different algorithm variants for symbolic regression have been proposed, many of which are based on evolutionary algorithms.

When using genetic programming, the user specifies which operators and basic functions may be used in the symbolic regression model. The algorithm starts with a set of random expressions and, through selection and recombination, evolves a well-fitting symbolic regression model.

The resulting symbolic regression model might look like the following:

*A symbolic regression model produced with genetic programming.*
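The evolutionary loop described above (random initial expressions, selection, recombination) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in HeuristicLab: expressions are nested tuples, the operator set `OPS`, the terminals, tournament selection, the size limit `MAX_NODES`, and keeping the best individual in every generation are all choices made for this example. Constants are not optimised and there is no mutation.

```python
import random

# User-chosen building blocks (binary operators only, to keep the sketch short).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
TERMINALS = ["x", 1.0, 2.0]
MAX_NODES = 31  # assumed size limit that keeps expressions small


def random_tree(rng, depth):
    """Create a random expression: a terminal or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def error(tree, data):
    """Mean squared error of the expression on (x, y) pairs."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in data) / len(data)


def subtrees(tree, path=()):
    """Yield (path, subtree) for each node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from subtrees(tree[i], path + (i,))


def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)


def crossover(rng, a, b):
    """Recombination: a random subtree of a is replaced by a random subtree of b."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)


def tournament(rng, scored, k=3):
    """Selection: the best of k randomly drawn (error, tree) pairs."""
    return min(rng.sample(scored, k), key=lambda s: s[0])[1]


def evolve(data, generations=30, pop_size=60, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []  # best error of each generation
    for _ in range(generations):
        scored = sorted(((error(t, data), t) for t in population), key=lambda s: s[0])
        history.append(scored[0][0])
        next_pop = [scored[0][1]]  # keep the best expression unchanged
        while len(next_pop) < pop_size:
            parent = tournament(rng, scored)
            child = crossover(rng, parent, tournament(rng, scored))
            size = sum(1 for _ in subtrees(child))
            next_pop.append(child if size <= MAX_NODES else parent)
        population = next_pop
    best = min(population, key=lambda t: error(t, data))
    return best, error(best, data), history


# Toy target y = x^2 + 2x + 1 (an assumption for this example).
data = [(i / 2, (i / 2) ** 2 + 2 * (i / 2) + 1) for i in range(-4, 5)]
best, best_error, history = evolve(data)
print(best, best_error)
```

With a fixed `seed` a run is reproducible, but different seeds can lead to different expressions.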

A drawback of evolutionary algorithms is that they are non-deterministic. For industrial applications we need deterministic and efficient parameter-less solvers for symbolic regression problems. In the Josef Ressel Centre we develop and implement such algorithms.

We take up the idea of “Prioritized Grammar Enumeration” (Worm and Chiu, 2013), which uses dynamic programming to create symbolic regression models. The algorithm takes as input a formal grammar that describes the structure of the symbolic regression models and can then produce all models for this structure up to a certain maximum size. The runtime of this approach can grow exponentially with formula size or number of variables. Therefore, heuristics are necessary to guide the search process to potentially more interesting parts of the search tree.

*Prioritized grammar enumeration is a non-evolutionary solution method for symbolic regression. Formulas are generated from a formal grammar, identical functions are detected via dynamic programming.*
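The idea of enumerating formulas from a grammar while skipping duplicates can be sketched as follows. This is a strongly simplified illustration and not the algorithm of Worm and Chiu: the grammar `E -> x | y | (E + E) | (E * E)`, the node-count size measure, and all names are assumptions for this example, and the prioritisation heuristics are left out. Expressions of a given size are built from the stored tables of smaller sizes, and a canonical operand order lets the `seen` set recognise `a + b` and `b + a` as the same formula.

```python
from itertools import product

VARIABLES = ["x", "y"]
COMMUTATIVE_OPS = ["+", "*"]


def canonical(op, left, right):
    """Order the operands of a commutative operator so a+b and b+a coincide."""
    a, b = sorted([left, right], key=repr)
    return (op, a, b)


def enumerate_expressions(max_size):
    """Return {size: list of distinct expressions} for sizes 1..max_size.

    Size counts nodes. Tables for smaller sizes are reused when building
    larger expressions, and `seen` filters formulas already generated
    with the operands in the other order.
    """
    table = {1: list(VARIABLES)}
    seen = set(VARIABLES)
    for size in range(2, max_size + 1):
        table[size] = []
        # The operator is one node; the remaining nodes are split between children.
        for left_size in range(1, size - 1):
            right_size = size - 1 - left_size
            for op in COMMUTATIVE_OPS:
                for left, right in product(table[left_size], table[right_size]):
                    expr = canonical(op, left, right)
                    if expr not in seen:
                        seen.add(expr)
                        table[size].append(expr)
    return table


def to_string(expr):
    """Render an expression tuple as an infix formula."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    return f"({to_string(left)} {op} {to_string(right)})"


table = enumerate_expressions(5)
for size, expressions in table.items():
    print(size, len(expressions))
print([to_string(e) for e in table[3]])
```

Even with this duplicate filter the number of formulas grows quickly with the size limit, which is why a practical enumeration needs heuristics to decide which formulas to expand first.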

## Applications

Symbolic regression is a general technique that can be used for modelling technical systems. In the Josef Ressel Centre we use it for modelling components of powertrains.

*An example of a hybrid powertrain with engine, electric motor with battery, transmission and differential drive. Modern powertrains are made up of many complex components and there are many possible design variants. Models are necessary for simulating powertrains in a virtual design process. We plan to use symbolic regression, for example, for modelling engine performance, friction performance, or battery load state.*

For this we use data from test benches operated by our company partners AVL and Miba.

*Test bench for friction plates as operated by Miba Frictec in Roitham. Photo: © Bernhard Plank (http://imBilde.at)*

Two company partners participate in the proposed JRC. AVL List GmbH (AVL) is the largest independent company for development, simulation and testing technology of powertrains. AVL uses various methods for regression modelling for powertrain development. We investigate whether symbolic regression models can be implemented successfully on engine or transmission control units for optimized control of powertrain components.

Miba Frictec GmbH (Miba) produces friction materials and components for friction systems, as used for instance in clutch and automatic transmission systems, and provides services for the design and dimensioning of friction systems. Miba plans to integrate the results of the JRC into its in-house software tools so that Miba engineers can easily create and validate symbolic regression models for friction systems.

## HeuristicLab

We plan to integrate the methods developed in the JRC into our open-source software framework HeuristicLab.

HeuristicLab is used by academics and practitioners all over the world for teaching and industrial projects.

## Team

*Christian Haider, Florian Bachinger, Lukas Kammerer, Michael Kommenda, Eva-Maria Holzleitner, Bogdan Burlacu, Gabriel Kronberger (Lead) (© Petra Wiesinger, FH OÖ)*

## Further Information

Project description on the CDG website:

http://www.cdg.ac.at/forschungseinheiten/labor/symbolische-regression/?tx_cdglabors_labors%5Baction%5D=show&tx_cdglabors_labors%5Bcontroller%5D=Labor