Data-based Modeling and Analysis

In all our research projects we use data to parameterize models for optimization or simulation. In many of these cases no mathematical model exists and building one is infeasible. We can then use machine learning or statistical methods to find purely data-based phenomenological models.

For example, the fabrication of high-quality steel strips is a complex process with many sub-processes along the production chain. So many parameters along this chain affect the quality of the steel strip that it is impossible to capture them all in a physics-based mathematical model. However, we can use the large amount of data measured in the process to find the main drivers of product quality and build a phenomenological model for predicting that quality. These purely data-based models can subsequently be used to solve the "inverse problem" of optimizing the main parameters of the production process, e.g. to achieve the highest possible quality.
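As a rough illustration of the inverse problem, the sketch below takes a stand-in quality model and searches a grid of process settings for the combination with the highest predicted quality. The function `predicted_quality`, the parameter names `temperature` and `speed`, and all numeric values are hypothetical placeholders, not a real steel-strip model. Exhaustive grid search is used only because it is the simplest possible optimizer.

```python
import itertools

def predicted_quality(temperature, speed):
    """Hypothetical stand-in for a fitted data-based quality model.

    Higher is better; the optimum is placed at (850, 3) purely for illustration.
    """
    return -((temperature - 850.0) / 50.0) ** 2 - ((speed - 3.0) / 0.5) ** 2

def best_settings(model, grid):
    """Evaluate the model on every grid point and return the best settings.

    `grid` maps parameter names to candidate values; the key order must match
    the positional argument order of `model`.
    """
    names = list(grid)
    best = max(itertools.product(*grid.values()), key=lambda point: model(*point))
    return dict(zip(names, best))

# Assumed operating ranges (illustrative only).
grid = {
    "temperature": [800.0 + 10.0 * i for i in range(11)],  # 800 .. 900
    "speed": [2.0 + 0.25 * i for i in range(9)],           # 2.0 .. 4.0
}
settings = best_settings(predicted_quality, grid)
print(settings)
```

In practice the grid search would typically be replaced by a proper optimization algorithm, and the stand-in function by a model identified from measured process data.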

In the research area of data-based modeling and analysis, we mainly use machine learning methods to identify predictive models. When physics-based models are available, we may also combine them with ML models. We rely heavily on visualization techniques to quickly explore data sets and find potentially useful patterns. Our main aim, however, is always to capture these patterns explicitly in the form of a mathematical model.

One of our specialities is symbolic regression and we are recognized as one of the leading groups for the development of symbolic regression methods worldwide. Symbolic regression is a supervised learning method which produces models as short mathematical expressions.
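To give a feel for what "models as short mathematical expressions" means, the following toy sketch enumerates a handful of candidate expressions and keeps the one with the lowest mean squared error on synthetic data. This is a deliberately simplified exhaustive search over an assumed mini-grammar, not the genetic programming methods developed by the group. The building blocks and the target function are illustrative assumptions.

```python
import itertools
import math

# Assumed building blocks; the names are used to print the resulting expression.
UNARY = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "sin(x)": math.sin,
    "exp(x)": math.exp,
}
BINARY = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}

def candidates():
    """Yield (text, function) pairs for every expression of the form 'u1 op u2'."""
    pairs = itertools.combinations_with_replacement(UNARY.items(), 2)
    for (n1, f1), (n2, f2) in pairs:
        for op, g in BINARY.items():
            # Default arguments bind the current f1, f2, g to this lambda.
            yield f"({n1}) {op} ({n2})", (lambda x, f1=f1, f2=f2, g=g: g(f1(x), f2(x)))

def mse(f, xs, ys):
    """Mean squared error of model f on the data (xs, ys)."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def symbolic_regression(xs, ys):
    """Return the (text, function) candidate with the lowest error."""
    return min(candidates(), key=lambda c: mse(c[1], xs, ys))

# Synthetic data generated from a known expression.
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x + math.sin(x) for x in xs]

text, model = symbolic_regression(xs, ys)
print(text, mse(model, xs, ys))
```

The result is a readable formula rather than an opaque set of weights. Realistic search spaces are far too large to enumerate like this, which is why heuristic search methods such as genetic programming are used.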

Symbolic Regression Example

We have more than 15 years of experience in developing customized machine learning algorithms for industrial applications. HEAL researchers all have a background in computer science, mathematics, software engineering, and data science, which enables us to develop production-ready software for our partners. Some of our software is open source.

Milestones in the area of data-based modeling

The following list shows some important milestones that we have reached in the development of data-based modeling methods and functionality implemented in HeuristicLab.
  • 2021 - Shape-constrained symbolic regression for physics-informed regression implemented, published, and available in HeuristicLab
  • 2018 - Factor variables for symbolic regression implemented, published, and available in HeuristicLab
  • 2018 - Kernel ridge regression and Barnes-Hut t-SNE available in HeuristicLab
  • 2016 - Sensitivity analysis for symbolic regression models. Model response plots / partial dependence plots available in HeuristicLab
  • 2016 - Elastic-net regularized linear regression (glmnet) available in HeuristicLab
  • 2015 - Multi-objective algorithms for symbolic regression to optimize model complexity and error (ParetoGP) available in HeuristicLab
  • 2015 - Gradient boosted trees for regression and classification available in HeuristicLab
  • 2013 - Memetic local optimization of numeric parameters via gradient-based trust-region algorithms and automatic differentiation for symbolic regression models implemented, published, and available in HeuristicLab
  • 2013 - Gaussian process regression available in HeuristicLab
  • 2012 - Grammar-guided genetic programming for symbolic regression available in HeuristicLab
  • 2012 - Support vector regression (libSVM) available in HeuristicLab
  • 2011 - Automatic algebraic simplification of symbolic regression models in HeuristicLab
  • 2011 - Random forest regression available in HeuristicLab
  • 2010 - Complete re-implementation of symbolic regression and genetic programming for HeuristicLab 3.3
  • 2009 - Linear scaling for symbolic regression in HeuristicLab
  • 2008 - Complete re-implementation of symbolic regression and genetic programming for HeuristicLab 2.0
  • 2005 - Variable frequency analysis for estimating variable relevance in HeuristicLab
  • 2004 - First implementation of symbolic regression and genetic programming for HeuristicLab 1.0
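To make one of the listed techniques concrete, here is a minimal sketch of linear scaling as it is commonly described in the symbolic regression literature: the raw output p of a candidate expression is replaced by a + b·p, with a and b chosen by ordinary least squares, so the search only has to find the right shape of the function. This is not taken from the HeuristicLab source; the function name and the example data are illustrative assumptions.

```python
def linear_scaling(preds, targets):
    """Return (a, b) minimising sum((t - (a + b * p)) ** 2) over all data points.

    Closed-form least squares: b = cov(p, t) / var(p), a = mean(t) - b * mean(p).
    """
    n = len(preds)
    mean_p = sum(preds) / n
    mean_t = sum(targets) / n
    var_p = sum((p - mean_p) ** 2 for p in preds)
    if var_p == 0.0:
        # Constant model output: the best achievable fit is the target mean.
        return mean_t, 0.0
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(preds, targets))
    b = cov / var_p
    a = mean_t - b * mean_p
    return a, b

# A candidate expression x^2 has the right shape but the wrong offset and scale.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
raw = [x * x for x in xs]
targets = [3.0 + 2.0 * x * x for x in xs]

a, b = linear_scaling(raw, targets)
scaled = [a + b * p for p in raw]
print(a, b)
```

After scaling, the candidate fits the targets up to floating-point error even though the unscaled expression had a large error.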

Selected projects

Josef Ressel Center for Symbolic Regression

Within the Josef Ressel Centre for Symbolic Regression we plan to develop new symbolic regression algorithms as well as a methodological and technical framework for incremental model adaptation for handling concept drift. We will apply the newly developed algorithms and frameworks for modelling powertrains and friction systems.

TransMet - Fundamentals and tools for engineering of high quality recycled and CO2 reduced strip steels

In this project, which is part of the COMET Center MCL (Material Center Leoben), we work on algorithms for the adaptation of material models. The focus of our activities is the combination of physics-based models with purely data-driven models.

Machine Learning Methods for Identifying Features of Global Optimization Problems in the Non-Stationary Environment and for Automatic Adaptation of Evolutionary and Bio-Inspired Algorithms

Most optimization and machine learning tasks are modeled in a stationary fashion. This means that the optimization or modeling objective does not change during an algorithm run. This international FWF project is concerned with advancing into the non-stationary domain using various methodological approaches.

K2 Competence Center: Symbiotic Mechatronics

Predictive Maintenance for Industry Radial Ventilators

Research focuses on investigating proactive maintenance strategies for radial fans. To this end, the fans will be equipped with additional sensors that record vibration, acceleration, air pressure, temperature, flux, power consumption, rotation, abrasive wear, and deposit build-up, so that the system can be monitored at any given moment. This installation enables predictive and condition-based maintenance, which considers the usage history, the current condition, and the prospective operational load.

FlashCheck - Electric Arc Detection in DC Circuits

This project deals with the interdependencies between electrical arcing and the inverter electronics, different battery solutions for buffering electrical energy, and other influencing factors such as the cable length of the electrical wires. The result is a description of the common characteristics of electrical arcs in different system configurations and a general concept for detecting arcing.

HOPL - Heuristic Optimization in Production and Logistics

In this project we developed novel algorithms for integratively optimizing interrelated logistics and production processes.

AI for lead generation

Out of a database of more than 20 million companies we recommend those that are most similar to existing customers.

Recycling4Future

Together with EREMA Engineering Recycling Maschinen und Anlagen GmbH we develop customized machine learning algorithms that can be used for automatic and stable control of plastics recycling plants.

LIPOL - Learning in process optimization in the food industry

Together with the company partner GAMED GmbH and customers from the food industry we develop software tools using AI methods for monitoring and controlling product quality in industrial food production processes.

Selected talks

  • 23.02.2022: "Scientific Machine Learning" (online talk). Iași AI Community Meetup, Romania. DI Dr. Bogdan Burlacu. Slides
  • 17.02.2022: "Welchen Beitrag kann symbolische Regression zu Explainable AI leisten? – Praxis-Beispiele aus dem Josef Ressel Zentrum" (English: "What can symbolic regression contribute to explainable AI? Practical examples from the Josef Ressel Centre") (online talk). Softwarepark Hagenberg IT-Expert*innenreihe "Thinking AI", FH-Prof. DI Dr. Gabriel Kronberger. Slides
  • 12.12.2020: "Genetic Programming and Symbolic Regression". AI Meetup Graz, FH-Prof. DI Dr. Gabriel Kronberger. Slides

Selected publications

2022:

C. Haider, F.O. de Franca, B. Burlacu, G. Kronberger. "Shape-constrained multi-objective genetic programming for symbolic regression". Applied Soft Computing, Volume 132, 2023. 109855. ISSN 1568-4946. https://doi.org/10.1016/j.asoc.2022.109855.

P. Fleck, S.M. Winkler, M. Kommenda, M. Affenzeller. "Vectorial Genetic Programming - Optimizing Segments for Feature Extraction". Proceedings of the 18th International Conference on Computer Aided Systems Theory - EUROCAST 2022, Las Palmas de Gran Canaria, Spain (2022). Slides

F. Holzinger, A. Beham. "Multi-criteria Optimization of Workflow-based Assembly Tasks in Manufacturing". Proceedings of the 18th International Conference on Computer Aided Systems Theory - EUROCAST 2022, Las Palmas de Gran Canaria, Spain (2022). Slides

D. Piringer, S. Wagner, C. Haider, G. Kronberger, A. Fohler, S. Silber, M. Affenzeller. "Improving the Flexibility of Shape-Constrained Symbolic Regression with Extended Constraints". Proceedings of the 18th International Conference on Computer Aided Systems Theory - EUROCAST 2022, Las Palmas de Gran Canaria, Spain (2022). Slides

J. Zenisek, S. Dorl, S.M. Winkler, M. Affenzeller. "Shapley Value based Variable Interaction Networks for Data Stream Analysis". Proceedings of the 18th International Conference on Computer Aided Systems Theory - EUROCAST 2022, Las Palmas de Gran Canaria, Spain (2022). Slides

G. Kronberger, E. Kabliman, J. Kronsteiner, M. Kommenda. "Extending a Physics-Based Constitutive Model using Genetic Programming". Applications in Engineering Science. Vol. 9, 100080 (2022). doi: https://doi.org/10.1016/j.apples.2021.100080

2021:

C. Haider, F. O. de França, B. Burlacu, G. Kronberger. "Using Shape Constraints for Improving Symbolic Regression Models" (2021). arXiv preprint arXiv:2107.09458.

L. Kammerer, G. Kronberger, S. Winkler. "Empirical analysis of variance for genetic programming based symbolic regression". In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '21). Association for Computing Machinery, New York, NY, USA, pp. 251–252 (2021). doi: https://doi.org/10.1145/3449726.3459486

W. La Cava, P. Orzechowski, B. Burlacu, F. Olivetti de França, M. Virgolin, Y. Jin, M. Kommenda, J. H. Moore. “Contemporary symbolic regression methods and their relative performance” (2021). arXiv preprint arXiv:2107.14351.

W. Roland, C. Marschik, M. Kommenda, A. Haghofer, S. Dorl, S. Winkler. "Predicting the Non-Linear Conveying Behavior in Single-Screw Extrusion: A Comparison of Various Data-Based Modeling Approaches used with CFD Simulations". International Polymer Processing. Vol. 36, Issue 5, pp. 529-544 (2021). doi: https://doi.org/10.1515/ipp-2020-4094

L. Millán, G. Bokuchava, J.I. Hidalgo, et al. "Study of Microscopic Residual Stresses in an Extruded Aluminium Alloy Sample after Thermal Treatment". Journal of Surface Investigation: X-ray, Synchrotron and Neutron Techniques. 15, pp. 763–767 (2021). doi: https://doi.org/10.1134/S1027451021040145

E. Kabliman, A. H. Kolody, J. Kronsteiner, M. Kommenda, G. Kronberger. "Application of symbolic regression for constitutive modeling of plastic deformation". In Applications in Engineering Science, Volume 6, 100052, Elsevier. (June 2021). doi: https://doi.org/10.1016/j.apples.2021.100052

G. Kronberger, F. O. de Franca, B. Burlacu, C. Haider, M. Kommenda. "Shape-constrained Symbolic Regression – Improving Extrapolation with Prior Knowledge". In Evolutionary Computation (2021). doi: https://doi.org/10.1162/evco_a_00294

L. Millán, G. Kronberger, J. I. Hidalgo, R. Fernández, O. Garnica, G. González-Doncel. "Estimation of Grain-Level Residual Stresses in a Quenched Cylindrical Sample of Aluminum Alloy AA5083 Using Genetic Programming". In Applications of Evolutionary Computation (Conference Proceedings EvoApplications 2021), Vol. 12694, pp. 421–436, (2021). doi: https://doi.org/10.1007/978-3-030-72699-7_27

F. Bachinger, G. Kronberger, M. Affenzeller. "Continuous improvement and adaptation of predictive models in smart manufacturing and model management". In IET Collaborative Intelligent Manufacturing, Vol. 3, Iss. 1, Special Issue: Selected Papers from Collaborative and Intelligent Manufacturing in Industry 4.0 (ISM @SMM 2019), pp. 48-63, (March 2021). doi: https://doi.org/10.1049/cim2.12009

2020:

G. Kronberger, F. Bachinger, M. Affenzeller. "Smart Manufacturing and Continuous Improvement and Adaptation of Predictive Models". In Procedia Manufacturing, Volume 42. (2020). Part of special issue: International Conference on Industry 4.0 and Smart Manufacturing (ISM 2019). PDF

L. Kammerer, G. Kronberger, B. Burlacu, S. M. Winkler, M. Kommenda, M. Affenzeller. "Symbolic Regression by Exhaustive Search: Reducing the Search Space Using Syntactical Constraints and Efficient Semantic Structure Deduplication". In Genetic Programming Theory and Practice XVII., pp. 79-99. Springer. (2020). PDF

F. Bachinger, G. Kronberger. "Concept for a Technical Infrastructure for Management of Predictive Models in Industrial Applications". In Computer Aided Systems Theory - EUROCAST 2019, pp. 263-270. Springer. (2020). PDF

M. Affenzeller, B. Burlacu, V. Dorfer et al. "White Box vs. Black Box Modeling: On the Performance of Deep Learning, Random Forests, and Symbolic Regression in Solving Regression Problems". In Computer Aided Systems Theory - EUROCAST 2019, pp. 288-295. Springer. (2020). PDF

G. Kronberger, L. Kammerer, M. Kommenda. "Identification of Dynamical Systems Using Symbolic Regression". In Computer Aided Systems Theory - EUROCAST 2019, pp. 370-377. Springer. (2020). PDF Slides

B. Burlacu, L. Kammerer, M. Affenzeller, G. Kronberger. "Hash-Based Tree Similarity and Simplification in Genetic Programming for Symbolic Regression". In Computer Aided Systems Theory - EUROCAST 2019, pp. 361-369. Springer. (2020). PDF

L. Kammerer, G. Kronberger, M. Kommenda. "Data Aggregation for Reducing Training Data in Symbolic Regression". In Computer Aided Systems Theory - EUROCAST 2019, pp. 378-386. Springer. (2020). PDF

J. Zenisek, G. Kronberger, J. Wolfartsberger, N. Wild, M. Affenzeller. "Concept Drift Detection with Variable Interaction Networks". In Computer Aided Systems Theory - EUROCAST 2019, pp. 296-303. Springer. (2020). PDF

G. Kronberger, J. M. Colmenar, S. M. Winkler, J. I. Hidalgo. "Multilayer analysis of population diversity in grammatical evolution for symbolic regression". In Soft Computing. Springer. (2020). PDF

B. Burlacu, G. Kronberger, M. Kommenda. "Operon C++: an efficient genetic programming framework for symbolic regression". In Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion (GECCO ’20), pp. 1562–1570, (July 2020). PDF

2019:

W. Roland, M. Kommenda, C. Marschik, J. Miethlinger. "Extended Regression Models for Predicting the Pumping Capability and Viscous Dissipation of Two-Dimensional Flows in Single-Screw Extrusion". In Polymers. (February 2019). PDF

G. Kronberger, L. Kammerer, B. Burlacu, S. M. Winkler, M. Kommenda, M. Affenzeller. "Cluster Analysis of a Symbolic Regression Search Space". In Genetic Programming Theory and Practice XVI, pp. 85-102. Springer. (2019). PDF Slides

E. Kabliman, A.H. Kolody, M. Kommenda, G. Kronberger. “Prediction of Stress-Strain Curves for Aluminium Alloys using Symbolic Regression”. In Proceedings of the 22nd International Conference on Material Forming (ESAFORM 2019), Vitoria-Gasteiz, Spain. AIP Conference Proceedings. (2019). PDF

B. Burlacu, M. Affenzeller, G. Kronberger, M. Kommenda. “Online Diversity Control in Symbolic Regression via a Fast Hash-based Tree Similarity Measure”. In Conference Proceedings of the IEEE Congress on Evolutionary Computation (CEC), Wellington, New Zealand. (2019). PDF

B. Burlacu, M. Kommenda, G. Kronberger, M. Affenzeller. “Parsimony Measures in Multi-objective Genetic Programming for Symbolic Regression”. In Proceedings of the Genetic and Evolutionary Computation Conference 2019 (GECCO ’19). ACM, New York, NY, USA. (2019). PDF

G. Kronberger, L. Kammerer, M. Kommenda, B. Burlacu. “Visualisierung eines Suchraums für Symbolische Regression”. In Tagungsband Forschungsforum der Österreichischen Fachhochschulen (FFH 2019), Wiener Neustadt. (2019)

S. Prieschl, D. Girardi, G. Kronberger. “Using Ontologies to Express Prior Knowledge for Genetic Programming”. In Proceedings of Cross Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE) co-located with ARES 2019, Canterbury, UK, August 26-29, LNCS. (2019). PDF

M. Kommenda, B. Burlacu, G. Kronberger, et al. "Parameter identification for symbolic regression using nonlinear least squares". In Genetic Programming and Evolvable Machines. Springer. (2019). PDF

G. Kronberger, S. Scheidel, C. Haider, M. Kommenda, M. Kordon. "Integration of Physical Knowledge in Empirical Models - A New Approach to Regression Analysis". In Proceedings of the 8th International Symposium on Development Methodology, Wiesbaden, Germany pp. 1 - 9. (2019) Slides

F. Bachinger, G. Kronberger. "Management von lernfähigen Vorhersage-Modellen für Industrie-Anwendungen". In Proceedings of The Future of Work, Education and Living, Linz, pp. 253-59. (2019).

2018:

B. Burlacu, M. Affenzeller. "Schema-based diversification in genetic programming". In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1111-1118. ACM. (July 2018). PDF

G. Kronberger, M. Kommenda, A. Promberger, F. Nickel. "Predicting friction system performance with symbolic regression and genetic programming with factor variables". In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1278-1285. ACM. (July 2018). PDF Slides

F. Bachinger, J. Zenisek, L. Kammerer, M. Stimpfl, G. Kronberger. "Performance of industrial sensor data persistence in data vault". In Proceedings of the EMSS Conference, pp. 226-233. (September 2018). PDF