INTERNATIONAL JOURNAL OF RESEARCH AND SCIENTIFIC INNOVATION (IJRSI)
ISSN No. 2321-2705 | DOI: 10.51244/IJRSI | Volume XII Issue VIII August 2025
www.rsisinternational.org
Symbolic Regression for Approximating Analytic Solutions to
Differential Equations
Dionisel Y. Regalado
Northwestern Mindanao State College of Science and Technology, Philippines
DOI: https://doi.org/10.51244/IJRSI.2025.120800044
Received: 24 July 2025; Accepted: 31 July 2025; Published: 02 September 2025
ABSTRACT
Approximate analytic expressions are obtained for initial value problems that admit only numerical solutions.
Symbolic regression was utilized to obtain such analytic expressions. For functions f that are Lipschitz
continuous, results revealed that the maximum absolute error (sup-norm) is bounded.
Keywords: symbolic regression, initial value problem, finite difference method
INTRODUCTION
Many differential equations used in Engineering and the Sciences have no closed-form analytic solutions but
are numerically solved (Hoffman et al., 2001). Numerical methods are techniques used to evaluate a solution at
a point without necessarily knowing the exact analytic form of the solution to an ordinary differential equation
(ODE). In most applications, approximate numerical solutions may be sufficient, but for purposes of analyzing
the mathematical properties of the solution, it may be necessary to derive an analytic expression for it. For
such situations, approximate symbolic expressions may serve as important bases for further mathematical
analysis. Recent developments in genetic programming can be exploited to provide such approximate solutions
through symbolic regression.
A first order differential equation is an initial value problem (IVP) if (Iserles, 2008):

u'(t) = f(t, u(t)), u(t_0) = u_0, (1)

where f : [t_0, ∞) × R^d → R^d and u_0 ∈ R^d. For higher order systems, it is possible to analyze the system as a
larger set of first order systems such as (1) by employing extra variables; thus, without loss of generality, one
may restrict attention to (1). For instance, the second order equation u''(t) = f(t, u(t)) can be written as the
pair of first order equations

z'(t) = f(t, u(t)) and z(t) = u'(t), (2)

where z is an extra variable.
Existence of a unique solution to (1) is guaranteed by the Picard-Lindelöf Theorem, which states that a unique
solution exists provided f is Lipschitz continuous. We state Lipschitz continuity in the case of real-valued
functions. A real-valued function f : R → R is Lipschitz continuous if there exists a positive real K such that
for all real x_1 and x_2:

|f(x_1) - f(x_2)| ≤ K |x_1 - x_2|. (3)

In fact, let g : R → R be everywhere differentiable; then g is Lipschitz continuous if and only if g has a
bounded first derivative (Grossmann et al., 2007).
A numerical method for solving (1) with boundary conditions is the finite difference method. Starting from (1),
replace the derivative u'(t) by the approximation (Strikwerda, 2004):

u'(t) ≈ (u(t + h) - u(t)) / h, (4)

which gives:

u(t + h) ≈ u(t) + h u'(t) = u(t) + h f(t, u(t)). (5)
To apply formula (5), choose a step-size h and construct the sequence t_0, t_1 = t_0 + h, t_2 = t_0 + 2h, …,
t_n = t, and denote u_n = u(t_n).
Expression (5) is then transformed into the recursive relation:

u_{n+1} = u_n + h f(t_n, u_n). (6)

By choosing h small enough, we obtain the ordered pairs (t_0, u_0), (t_1, u_1), (t_2, u_2), …, (t_n, u_n). We now
seek a function u*(t) that passes through all of the points such that:

sup_{t_i} |u*(t_i) - u(t_i)| (7)

is minimum.
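As a concrete illustration of recursion (6), the following Python sketch advances an IVP with the forward Euler
step and collects the ordered pairs (t_n, u_n). The right-hand side f, the interval, and the step size below are
placeholders chosen for illustration, not values from this paper.

import math

# Forward Euler iteration of recursion (6): u_{n+1} = u_n + h*f(t_n, u_n).
def euler(f, t0, u0, t_end, h):
    ts, us = [t0], [u0]
    t, u = t0, u0
    while t < t_end - 1e-12:      # guard against floating-point overshoot
        u = u + h * f(t, u)       # one step of (6)
        t = t + h
        ts.append(t)
        us.append(u)
    return ts, us

# Example: u' = -u, u(0) = 1, whose exact solution is exp(-t).
ts, us = euler(lambda t, u: -u, 0.0, 1.0, 1.0, 0.001)
print(us[-1], math.exp(-1.0))     # approx 0.3677 vs 0.3679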
Theoretical and Experimental Results
We start by defining an initial value problem.
Definition 1. Let f : [t_0, ∞) × R → R be a real valued function. A first order equation is an initial value
problem (IVP) if

u'(t) = f(t, u(t)), t ≥ t_0, u(t_0) = u_0.

The function u(t) is a solution to the IVP if it satisfies the differential equation. We seek necessary and
sufficient conditions for the IVP to have a unique solution.
Definition 2. A real-valued function f is Lipschitz continuous if there exists a constant K > 0 such that for
all x_1, x_2:

|f(x_1) - f(x_2)| ≤ K |x_1 - x_2|.
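For instance, f(x) = sin(x) is Lipschitz continuous with K = 1, since |sin(x_1) - sin(x_2)| ≤ |x_1 - x_2| for
all real x_1, x_2. A quick numerical spot-check of this bound (illustrative only, not part of the development):

import math, random

# Verify |sin(x1) - sin(x2)| <= 1 * |x1 - x2| on random pairs (K = 1).
random.seed(0)
for _ in range(1000):
    x1, x2 = random.uniform(-10, 10), random.uniform(-10, 10)
    assert abs(math.sin(x1) - math.sin(x2)) <= abs(x1 - x2) + 1e-12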
Lemma 1. Let f be Lipschitz continuous; then f has a bounded first derivative.

Proof: From the definition of a derivative:

f'(x) = lim_{h→0} (f(x + h) - f(x)) / h.

By Lipschitz continuity of f, there exists a K > 0 for which

|f(x + h) - f(x)| ≤ K |h|,

which yields

|(f(x + h) - f(x)) / h| ≤ K for all h ≠ 0.

It follows that

|f'(x)| ≤ K. ∎
The Picard-Lindelöf Theorem guarantees a unique solution to the IVP.

Theorem 1 (Picard-Lindelöf). Let

u'(t) = f(t, u(t)), u(t_0) = u_0.

Let f be uniformly Lipschitz continuous in u and continuous in t. Then for some ε > 0, there exists a unique
solution to the IVP on [t_0 - ε, t_0 + ε].
Proof. Write the IVP as the integral equation:

u(t) = u_0 + ∫_{t_0}^{t} f(s, u(s)) ds, t ∈ [t_0 - ε, t_0 + ε], (8)

where ε is to be determined. Define:

T : C[t_0 - ε, t_0 + ε] → C[t_0 - ε, t_0 + ε], (Tu)(t) = u_0 + ∫_{t_0}^{t} f(s, u(s)) ds.

Hence, a solution of (8) is a fixed point of T. We show that T satisfies a Lipschitz condition:

||Tu - Tv|| = sup_t | ∫_{t_0}^{t} [f(s, u(s)) - f(s, v(s))] ds |
           ≤ sup_t ∫_{t_0}^{t} |f(s, u(s)) - f(s, v(s))| ds
           ≤ L sup_t ∫_{t_0}^{t} |u(s) - v(s)| ds
           ≤ L ε ||u - v||, (9)

where L is the Lipschitz constant of f in u (independent of u, by uniform Lipschitz continuity) and the norm on
C[t_0 - ε, t_0 + ε] is:

||w|| = sup { |w(t)| : t ∈ [t_0 - ε, t_0 + ε] }.

Choose ε < 1/L. It follows from this choice of ε and (9) that T is a contraction mapping. By Banach's fixed
point theorem, there exists a unique fixed point u(t) for which:

(Tu)(t) = u(t),

which solves the IVP. ∎
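The contraction argument is constructive: iterating T from any starting guess converges to the fixed point. The
sketch below performs this Picard iteration numerically, with the integral in (8) evaluated by the trapezoidal
rule; the test problem u' = u, u(0) = 1 on [0, 0.5] is an assumption chosen purely for illustration.

import numpy as np

# Picard iteration u_{k+1} = T(u_k) on a fixed grid over [t0, t0 + eps].
def picard(f, t0, u0, eps, n_grid=200, n_iter=20):
    t = np.linspace(t0, t0 + eps, n_grid)
    u = np.full_like(t, u0)                  # starting guess u_0(t) = u0
    for _ in range(n_iter):
        g = f(t, u)
        integral = np.concatenate(([0.0],    # cumulative trapezoidal rule
            np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(t))))
        u = u0 + integral                    # apply the operator T
    return t, u

# Example: u' = u, u(0) = 1; the fixed point is u(t) = exp(t).
t, u = picard(lambda t, u: u, 0.0, 1.0, 0.5)
print(abs(u - np.exp(t)).max())              # small, limited by the grid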
Finite Difference Method.

Consider the IVP and suppose that we wish to find u(t_1) at some later value t_1 > t_0. If a closed form analytic
solution can be obtained, then the problem is trivial. If, however, no such closed form solution is available,
then we turn to numerical methods (Strikwerda, 2004). The simplest and most commonly-used method is the
finite difference method. The method is based on the Taylor series expansion:
u(t + h) = u(t) + h u'(t) + (h²/2!) u''(t) + ⋯ (10)
By truncating all terms after the first derivative term, we have:

u(t + h) = u(t) + h u'(t) + O(h²). (11)
Definition 4. A function f(h) is "big Oh" of g(h), written

f(h) = O(g(h)),

if there exists a constant C such that

|f(h)| ≤ C |g(h)| as h → 0,

or:

lim_{h→0} |f(h)| / |g(h)| < ∞.

For example, 3h² + h³ = O(h²), since |3h² + h³| ≤ 4h² for |h| ≤ 1.
The error term in (11) is O(h²), which tends to zero as h → 0. It follows that:

u(t + h) ≈ u(t) + h u'(t) (12)
with an error proportional to h². Equation (12) can be rewritten as the recursive relation:

u(t_n + h) = u(t_n) + h f(t_n, u(t_n)), (13)

that is, writing u_n = u(t_n),

u_{n+1} = u_n + h f(t_n, u_n).
The closed interval [t_0, t] is divided into non-overlapping sub-intervals of length h, i.e. [t_n, t_n + h]. The
number of such sub-intervals is:

n = (t - t_0) / h. (14)
Convergence. For (13) to be useful, we need to show that the sequence {u_n}_{n=0}^∞ converges.

Lemma 2. On the closed interval [t_0, t], the sequence {u_n}_{n=0}^∞ of (13) is a Cauchy sequence.
Proof.

Let m > n and let u_{k+1} = u_k + h f(t_k, u_k), where h is the step size and the function f is Lipschitz
continuous. Now,

|u_m - u_n| = |(u_m - u_{m-1}) + (u_{m-1} - u_{m-2}) + ⋯ + (u_{n+1} - u_n)|
            ≤ h ( |f(t_{m-1}, u_{m-1})| + |f(t_{m-2}, u_{m-2})| + ⋯ + |f(t_n, u_n)| ), (15)

where each term is finite because the Lipschitz continuous function f is bounded on the bounded set of iterates.
Let

M = max_{n ≤ k ≤ m-1} |f(t_k, u_k)|. (16)

Then:

|u_m - u_n| ≤ (m - n) h M → 0 as h → 0, (17)

hence,

{u_n}_{n=0}^∞ is a Cauchy sequence. ∎ (18)
Theorem 2. The sequence {u_n}_{n=0}^∞ converges.

Proof.

Since {u_n(t)}_{n=0}^∞ is a Cauchy sequence of real numbers for each t, it follows that {u_n(t)}_{n=0}^∞
converges. The Picard-Lindelöf theorem guarantees that u_n(t) → u(t) as h → 0. ∎
Symbolic Regression

Symbolic regression is a type of regression analysis that does not specify the functional form of the relationship
between two variables. It utilizes a genetic algorithm to execute the analysis and is included as an option in
several statistical software packages. For this purpose, a one-month trial version of the software Eureqa was used
to analyze the given data set (Regalado et al., 2019).
The data set analyzed consists of the ordered pairs {(t_i, u_i)}_{i=0}^{n} generated by the recursive relation
(13). As shown in the preceding sections, the computed values u_i = u_i(t_i) can approximate the true solution
u(t_i) as closely as desired. These approximations provide a reliable input for symbolic regression, enabling the
discovery of an analytic expression that best fits the numerical solution.
Let u_i = u(t_i) and consider the pairs {(t_i, u_i)}_{i=0}^{n}. In traditional regression analysis, we assume a
model of the form:

u(t) = f(t; β) + ε(t), (19)

where f(t; β) is a functional form that is completely specified except for the parameter values β, and the ε(t)
are random errors with zero expectation and constant variance. For instance, f(t) may be a third degree
polynomial:

u(t) = a + bt + ct² + dt³ + ε(t), (20)

where a, b, c and d are parameters to be estimated from the data.
In symbolic regression, the functional form of f(t) is not specified but is assumed to be derived from a class of
functions called building blocks. Let

B ⊆ C[t_0, t] (21)

be the set of building blocks, each a continuous function. Symbolic regression then searches this space for an
optimal combination of building blocks that best fits the observations. Fitness is a user-defined quantity such as
the mean absolute error (MAE), the maximum error (ME), or the squared correlation goodness of fit (R²).
The search process is implemented by applying the principles of a genetic algorithm (GA). We consider the case
when B has a finite number of building blocks:

B = { b_1, b_2, …, b_m }.

Each b_i(t) is assigned a fitness value, e.g. the maximum error, when fitted to the observations. Let
Φ = { φ(b_1), φ(b_2), …, φ(b_m) } (22)

be the set of fitness values for the building blocks. The building blocks are then arranged from the fittest to
the least fit. Let

b_(1), b_(2), …, b_(m)

denote this arrangement, where the subscripts indicate ordering based on the fitness values, i.e. b_(1) is more
fit than b_(2).
A second generation of building blocks is obtained by combining the most fit building blocks and recomputing the
fitness values of the resulting combinations. The process continues until a desired fitness value is obtained.
Gelly et al. (2017) proved that, under sufficient conditions, the solution given by genetic programming converges,
as the number of examples goes to infinity, toward the actual function used to generate the examples. This
property is known in statistical learning as Universal Consistency. The authors also provided new results on bloat
analysis, distinguishing structural bloat from functional bloat. Structural bloat occurs when no optimal solution,
that is, no function exactly matching all possible examples, lies in the search space; in such cases, the authors
demonstrated that optimal solutions of increasing accuracy will also exhibit increasing complexity. Functional
bloat, on the other hand, is the bloat that takes place when program length keeps growing even though an optimal
solution (of known complexity) does lie in the search space.
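To make the search loop concrete, the sketch below implements a deliberately small version of it in Python:
candidates are random combinations of a few building blocks, fitness is the maximum absolute error on the data,
and each generation keeps and recombines the fittest candidates. Every name and building block here is an
assumption chosen for illustration; production tools such as Eureqa use far richer expression trees and operators.

import math, random

# Illustrative (assumed) building blocks and binary combiners.
BLOCKS = [lambda t: 1.0, lambda t: t, lambda t: t * t,
          lambda t: math.sin(t), lambda t: math.exp(t)]
COMBINE = [lambda a, b: a + b, lambda a, b: a - b, lambda a, b: a * b]

def random_expr(depth=2):
    # A candidate is a nested combination of building blocks.
    if depth == 0 or random.random() < 0.3:
        return random.choice(BLOCKS)
    op, l, r = random.choice(COMBINE), random_expr(depth - 1), random_expr(depth - 1)
    return lambda t: op(l(t), r(t))

def fitness(expr, data):
    # Maximum absolute error over the observations (smaller is fitter).
    try:
        return max(abs(expr(t) - u) for t, u in data)
    except OverflowError:
        return float("inf")

def evolve(data, pop=200, gens=30):
    population = [random_expr() for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda e: fitness(e, data))
        elite = population[: pop // 4]           # keep the fittest quarter
        children = []
        for _ in range(pop - len(elite)):        # recombine elite candidates
            a, b, op = random.choice(elite), random.choice(elite), random.choice(COMBINE)
            children.append(lambda t, a=a, b=b, op=op: op(a(t), b(t)))
        population = elite + children
    return min(population, key=lambda e: fitness(e, data))

# Example: recover u(t) = t^2 + sin(t) from sampled points.
random.seed(1)
data = [(i / 20, (i / 20) ** 2 + math.sin(i / 20)) for i in range(21)]
best = evolve(data)
print(fitness(best, data))   # small maximum error if the target is reachable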
Illustration and Application

We provide a simple illustration of the proposed method for deriving an approximate analytic solution to a
differential equation based on numerically-derived ordered pairs. The illustration can be solved analytically;
thus, the solution is known, which allows us to evaluate the proposed procedure more clearly.
Consider the IVP:

u'(t) = 2 + 5u(t), u(0) = 0,

whose exact solution is:

u(t) = (2/5)(e^{5t} - 1).
We confine the search domain to the interval [0, 1] and we seek the value u(1). We chose step sizes h = 0.1,
0.01, 0.001, 0.0001, 0.00001 and observed the maximum absolute error:

sup_{t ∈ [0,1]} |u_h(t) - u(t)|,

where u_h denotes the finite difference solution with step size h.
The finite difference equation gives:

u_{n+1} = u_n (1 + 5h) + 2h, u_0 = 0.
Table 1 shows the relationship between step size and the maximum absolute error (sup).

Table 1: Relationship of step size and maximum absolute error

step size    sup
0.1          36.3
0.01         6.765
0.001        0.735
0.0001       0.0744
0.00001      0.0742
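As a check on Table 1, the short sketch below iterates the finite difference equation and tracks the largest
deviation from the exact solution over [0, 1] (a minimal verification, assuming the IVP and exact solution stated
above):

import math

def sup_error(h):
    # Iterate u_{n+1} = u_n(1 + 5h) + 2h on [0, 1] and compare with the
    # exact solution u(t) = (2/5)(exp(5t) - 1) at each grid point.
    n = round(1.0 / h)
    u, worst = 0.0, 0.0
    for k in range(n):
        u = u * (1.0 + 5.0 * h) + 2.0 * h
        t = (k + 1) * h
        worst = max(worst, abs(u - 0.4 * (math.exp(5.0 * t) - 1.0)))
    return worst

for h in (0.1, 0.01, 0.001, 0.0001):
    print(h, sup_error(h))   # approx 36.3, 6.77, 0.735, 0.074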
The estimated regression equation shows that as the step size increases, the maximum absolute error likewise
increases.
sup = 0.865 + 357 (step size)

Predictor    Coef      SE Coef    T        P
Constant     0.8647    0.7886     1.10     0.353
step size    356.63    17.55      20.33    0.000

S = 1.533    R-Sq = 99.3%    R-Sq(adj) = 99.0%
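The same fit can be reproduced with an ordinary least squares line through the five (step size, sup) pairs of
Table 1; a minimal check:

import numpy as np

# OLS of the sup-error on the step size, matching the output above.
h = np.array([0.1, 0.01, 0.001, 0.0001, 0.00001])
sup = np.array([36.3, 6.765, 0.735, 0.0744, 0.0742])
slope, intercept = np.polyfit(h, sup, 1)
print(intercept, slope)   # approx 0.865 and 356.6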
Using h = 0.001, we applied symbolic regression, using a trial version of Eureqa, to obtain a closed form
expression for the solution of the IVP. A portion of the data set is reproduced below:
Table 2: Portion of the Finite Difference Solution with h = 0.001

t        u
0        0
0.001    0.002
0.002    0.00401
0.003    0.00603
0.004    0.00806
0.005    0.010101
0.006    0.012151
0.007    0.014212
0.008    0.016283
0.009    0.018364
0.01     0.020456
The solution chosen has a maximum error of 0.00000515. The exact expression is shown in the screenshot below:
Figure 1: Screenshot of the Final Solution
The final approximate solution u*(t) to the IVP, with constants rounded to the first decimal place, is the closed
form expression shown in Figure 1.
The maximum difference between the approximate solution and the actual solution is:

sup_t |u*(t) - u(t)| = 0.00000515.
Let

u_h(t_i) = the solution obtained at t_i by the finite difference method,
u*(t_i) = the symbolic regression value at t_i using (t_i, u_h(t_i)) as inputs,
u(t_i) = the actual value of the solution at t_i.

Then, by the triangle inequality,

sup_{t_i} |u*(t_i) - u(t_i)| ≤ sup_{t_i} |u*(t_i) - u_h(t_i)| + sup_{t_i} |u_h(t_i) - u(t_i)|.

As h → 0, sup_{t_i} |u_h(t_i) - u(t_i)| → 0, and if the solution u(t) is in the search space,
sup_{t_i} |u*(t_i) - u_h(t_i)| → 0. Thus,

u*(t) → u(t) as h → 0.
REFERENCES
1. Blickle, T. and Thiele, L. (1994). Genetic programming and redundancy. In J. Hopf, editor, Genetic Algorithms Workshop at KI-94, pages 33-38. Max-Planck-Institut für Informatik.
2. Gelly, S., Teytaud, O., Bredeche, N., and Schoenauer, M. (2017). Symbolic regression, parsimony, and some theoretical considerations about GP search space. Equipe TAO, INRIA Futurs, LRI, Bat. 490, University Paris-Sud, 91405 Orsay Cedex, France.
3. Grossmann, C., Roos, H.-G., and Stynes, M. (2007). Numerical Treatment of Partial Differential Equations. Springer Science & Business Media.
4. Hoffman, J. D. and Frankel, S. (2001). Numerical Methods for Engineers and Scientists. CRC Press, Boca Raton.
5. Iserles, A. (2008). A First Course in the Numerical Analysis of Differential Equations. Cambridge University Press.
6. Koza, J. R. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA.
7. Regalado, D. Y., et al. (2019). Approximate Analytic Solution to the Lotka-Volterra Predator-Prey Differential Equations Model. Journal of Higher Education Research Disciplines, 4(1), 38-47.
8. Strikwerda, J. (2004). Finite Difference Schemes and Partial Differential Equations (2nd ed.). SIAM.