
Using date and time variables

Posted by Michael Schmidt

28.06.2013 04:01 PM

This post describes the best way to convert date or time values into numeric time values that can be used in Eureqa.

Time Values in Eureqa:
Eureqa can only store date and time values as numeric values (e.g. total seconds or total days). Therefore, you need to pick a reference point to measure a time duration from, and units to measure the duration in. For example, you could convert a time value “8:31 am” to 8.52 total hours since midnight. Similarly for dates, you could convert a date like “Dec. 6, 1981 8:31 am” to 81.9 total years since 1900.

You need to convert date and time values to numeric durations in another program, such as Excel, before entering them into Eureqa (see below for an example).
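As a rough illustration of the two conversions described above, here is a minimal Python sketch. The helper names `hours_since_midnight` and `years_since` are made up for this example, and the year length of 365.25 days is an assumption; any tool that can subtract dates would work equally well.

```python
from datetime import datetime

def hours_since_midnight(ts: datetime) -> float:
    """Return the time of day as fractional hours since 00:00."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return (ts - midnight).total_seconds() / 3600.0

def years_since(ts: datetime, reference: datetime) -> float:
    """Return elapsed time in fractional years (assumes 365.25 days per year)."""
    return (ts - reference).total_seconds() / (365.25 * 24 * 3600)

ts = datetime(1981, 12, 6, 8, 31)
print(round(hours_since_midnight(ts), 2))               # 8.52
print(round(years_since(ts, datetime(1900, 1, 1)), 1))  # 81.9
```

Both results are ordinary floating-point numbers of modest size, which is the form Eureqa expects.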

Pitfalls:

1) Do not concatenate date and time strings to get a numeric value. For example, do not convert a date like “1981-12-06” to 19811206. This representation of time is extremely nonlinear: it preserves order, but the spacing between values no longer corresponds to elapsed time. Additionally, the values are very large and numerically unstable.

2) Avoid measuring time durations from a very distant reference point. For example, if your data uses time values that span a few days, do not convert these time values to total seconds since the beginning of the century. The numeric values would be enormous and numerically unstable.

Instead, the best practice is to measure time durations from the first time point in your data set.
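To make the first pitfall concrete, the following Python sketch compares the concatenated encoding with a plain day count. The function names and the dates are chosen only for illustration.

```python
from datetime import date

def concatenated(d: date) -> int:
    """The discouraged encoding: the digits YYYYMMDD read as one integer."""
    return d.year * 10000 + d.month * 100 + d.day

def days_since(d: date, reference: date) -> int:
    """Whole days elapsed between a reference date and d."""
    return (d - reference).days

ref = date(1981, 12, 1)  # hypothetical first date in the data set

# Two pairs of consecutive days, one mid-month and one across a year boundary.
mid_a, mid_b = date(1981, 12, 6), date(1981, 12, 7)
end_a, end_b = date(1981, 12, 31), date(1982, 1, 1)

concat_gap_mid = concatenated(mid_b) - concatenated(mid_a)   # 1
concat_gap_end = concatenated(end_b) - concatenated(end_a)   # 8870
linear_gap_mid = days_since(mid_b, ref) - days_since(mid_a, ref)  # 1
linear_gap_end = days_since(end_b, ref) - days_since(end_a, ref)  # 1

print(concat_gap_mid, concat_gap_end, linear_gap_mid, linear_gap_end)
```

The same one-day step shows up as either 1 or 8870 in the concatenated encoding, while the day count from a nearby reference stays at 1 and keeps the values small.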

Convert in Excel:

Many programs can convert date and time values to numeric time duration values. In Excel, if you subtract two date cells, the result is the fractional number of days between the two dates. You could then convert days to hours or some other unit to get numeric values with reasonable numeric magnitudes. For example:

    =(A1-A$1)*24

and then repeated for all rows, would subtract the first date in cell A1 from each date and convert the resulting day values into hours.

Another useful function is YEARFRAC, which converts the difference between a reference date and another date into a fractional number of years. For example:

    =YEARFRAC(A$1, A1)

and repeated for all rows, returns the fractional number of years elapsed since the first date in cell A1.
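If your data lives outside a spreadsheet, the same subtraction can be sketched in Python. The timestamps below are invented sample values, and the sketch assumes the first entry is the earliest one, mirroring the fixed first-cell reference in the Excel formula.

```python
from datetime import datetime

# Hypothetical time column; the first row is used as the reference point.
timestamps = [
    datetime(2013, 6, 28, 16, 1),
    datetime(2013, 6, 28, 22, 1),
    datetime(2013, 6, 29, 4, 31),
    datetime(2013, 6, 30, 16, 1),
]

start = timestamps[0]
# Subtracting datetimes gives a timedelta; convert it to fractional hours.
elapsed_hours = [(ts - start).total_seconds() / 3600.0 for ts in timestamps]

print(elapsed_hours)  # [0.0, 6.0, 12.5, 48.0]
```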

Topics: Advanced Techniques, Eureqa, Preparing Data, Techniques, Time Series

Use time-delays or time-lags of a variable in Eureqa

Posted by Michael Schmidt

A time delay retrieves the value of a variable or expression at a fixed offset in the past, according to the time ordering or index of each data point in the data set. This post describes the time-delay building-blocks available in Eureqa and different modeling techniques with delayed values.

Time Delay Building Blocks:

Eureqa provides the delay(x, c) building block to represent an arbitrary time-delay, where x can be any expression. The expression delay(x, c) returns the value of x at c time units in the past. When delay is enabled as a building block, Eureqa can automatically optimize both the expressions or variables to be delayed and the time-delay amount c.

[Figure: Eureqa Time Delay Building Blocks]

The figure above plots an arbitrary variable x and a delayed value delay(x, 1.0), where the values are ordered by some time variable t. The delayed version is equal to x at 1.0 time units into the past.

To use time-delay building-blocks, your data must have some notion of time or ordering. You also need to tell Eureqa which variable in your data represents the time or ordering value:

[Figure: Eureqa Time Delay Building Blocks]

If you don’t specify a time variable, Eureqa will use the row number in the spreadsheet as the time value of each data point.

If a particular delayed time value falls between two points in the data set, the value is linearly interpolated between the two data points using the time value.
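The interpolation behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Eureqa's actual implementation: the function `delay` below is a stand-in written for this example, it assumes the time values are strictly increasing, and it marks rows without enough history as None.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence

def delay(t: Sequence[float], x: Sequence[float], c: float) -> List[Optional[float]]:
    """For each time t[i], estimate x at time t[i] - c by linear interpolation.

    Assumes t is strictly increasing. Rows whose delayed time falls outside
    the range [t[0], t[-1]] get None.
    """
    out: List[Optional[float]] = []
    for ti in t:
        target = ti - c
        if target < t[0] or target > t[-1]:
            out.append(None)
            continue
        j = bisect_right(t, target) - 1   # index with t[j] <= target
        if j == len(t) - 1:               # target is exactly the last time point
            out.append(x[j])
            continue
        w = (target - t[j]) / (t[j + 1] - t[j])   # position between the two rows
        out.append(x[j] + w * (x[j + 1] - x[j]))
    return out

t = [0.0, 1.0, 2.0, 3.0, 4.0]
x = [10.0, 20.0, 40.0, 30.0, 50.0]
delayed = delay(t, x, 1.5)
print(delayed)  # [None, None, 15.0, 30.0, 35.0]
```

For the row at t = 2.0, the delayed time is 0.5, which lies halfway between the first two rows, so the result is halfway between 10.0 and 20.0.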

Eureqa also provides the delay_var(x, c) building block, which is identical to delay(x, c) except that it only accepts a variable as input. It’s provided as a special case of the delay(x, c) building block to allow you to constrain the types of delays used in the solutions. Otherwise the two behave identically.

Control the Fraction of Data Used for History

Notice that the delayed output plotted above does not have values on the left side of the graph for the first few time points. This is because these points request previous values of x that lie before the first point in our data set. Eureqa will automatically ignore these data points when calculating errors.

However, there is a way to control how much of the data set Eureqa is allowed to ignore – or effectively, specify a maximum delay offset. You can limit the fraction of data used for time-delay history values in the Advanced Solutions Options menu:

[Figures: Eureqa Time Delay Modeling; How to do time delay modeling in Eureqa]

The default maximum fraction is 50% of the data. If you find that Eureqa is identifying solutions with very large time delays, perhaps just to avoid modeling difficult features in the first half of the data set, you may want to reduce this fraction.

Additionally, you can control the number of delayed values per variable (including a zero delay of an ordinary variable use) in this dialog.
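To get a feel for how a delay amount relates to the history fraction, the sketch below counts the rows whose delayed time falls before the first data point. How Eureqa measures this fraction internally is not described here, so treat `history_fraction` and the 0.5 threshold check as an assumption-laden illustration only.

```python
def history_fraction(t, c):
    """Fraction of rows whose delayed time t[i] - c precedes the first time point."""
    lost = sum(1 for ti in t if ti - c < t[0])
    return lost / len(t)

t = [float(i) for i in range(10)]   # hypothetical rows at t = 0, 1, ..., 9
max_fraction = 0.5                  # mirrors the default described above

small = history_fraction(t, 2.5)    # rows at t = 0, 1, 2 lack history
large = history_fraction(t, 6.0)    # rows at t = 0, ..., 5 lack history

print(small, small <= max_fraction)  # 0.3 True
print(large, large <= max_fraction)  # 0.6 False
```

Under this reading, a delay of 6.0 time units would sacrifice more than half of this small data set, which is the kind of solution a lower limit would rule out.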

Fixed Time-delays:

Another way to model a value as a function of its previous values is with fixed delays. You can enter in fixed time-delays, or “lags” of the variable, directly into the Search Relationship option. For example:

    x = f( delay(x, 2.1), delay(x, 5.6) )

This search relationship tells Eureqa to find an equation to model the value of x as a function of its value at 2.1 and 5.6 time units in the past.
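Outside of Eureqa, the inputs to such a relationship can be pictured as extra columns of lagged values. The Python sketch below builds those columns for the lags 2.1 and 5.6 using a small interpolation helper, `value_at`, written for this example; the series x = t² is invented sample data.

```python
from bisect import bisect_right

def value_at(t, x, target):
    """Linearly interpolate x at time `target`; None if outside the data range.

    Assumes t is strictly increasing and has at least two points.
    """
    if target < t[0] or target > t[-1]:
        return None
    j = min(bisect_right(t, target) - 1, len(t) - 2)
    w = (target - t[j]) / (t[j + 1] - t[j])
    return x[j] + w * (x[j + 1] - x[j])

t = [float(i) for i in range(9)]   # hypothetical times 0, 1, ..., 8
x = [ti * ti for ti in t]          # hypothetical series x = t^2
lags = (2.1, 5.6)

rows = []
for ti, xi in zip(t, x):
    lagged = [value_at(t, x, ti - lag) for lag in lags]
    if None not in lagged:         # keep only rows with full history
        rows.append((xi, *lagged))

for row in rows:
    print(row)   # (x, x delayed by 2.1, x delayed by 5.6)
```

Only the rows at t = 6, 7 and 8 survive, because the larger lag of 5.6 needs that much history before the first usable row.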

Minimum Time-delays:

You may also want to specify a minimum time-delay offset. If you entered a search relationship such as x = f(x), Eureqa would find a trivial answer f(x) = x. More likely, you wanted to find a model of x, but as a function of x at least some amount of time in the past. The way to do this is to again use a fixed delay, such as:

    x = f(delay(x, 3.21))

Here, 3.21 is the minimum time-delay. Now, if the time-delay building-blocks are enabled, Eureqa can delay this delayed input further if necessary.

Delay Differential Equations:

Another common use for time-delays is modeling with delay differential equations. Finding delay differential equations is just like searching for ordinary differential equations: enter a search relationship like:

    D(y,t) = f(y)

and also enable the time-delay building blocks. This relationship has a trivial solution, however: Eureqa will return a slope formula such as

    f(y) = ( y - delay(y, 0.1) )/0.1

Therefore, you most likely want to limit the total number of delays per variable to one (which includes the zero delay of the normal variable use), so that a solution cannot combine y with a delayed copy of y to rebuild the slope. You can set this in the Advanced Solution Settings menu. The default is unlimited.
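The reason the slope formula is a trivial solution can be checked numerically: it is just a backward finite difference, which approximates the derivative for any smooth signal without saying anything about the underlying dynamics. The Python sketch below uses an invented signal y = t² on a uniform grid, where delaying by one grid step is the same as taking the previous row.

```python
h = 0.1                                # delay amount and grid spacing
t = [i * h for i in range(11)]         # 0.0, 0.1, ..., 1.0
y = [ti ** 2 for ti in t]              # hypothetical signal, dy/dt = 2t

# On this uniform grid, delay(y, h) at row i is simply y[i - 1].
slope = [(y[i] - y[i - 1]) / h for i in range(1, len(t))]
exact = [2 * t[i] for i in range(1, len(t))]

# For y = t^2 the backward difference equals 2t - h, so the error is h.
max_err = max(abs(s - e) for s, e in zip(slope, exact))
print(round(max_err, 6))  # 0.1
```

Shrinking h makes the approximation as accurate as desired, so a search that is free to use both y and delay(y, h) can always fit D(y,t) this way.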

Implementing Delays Outside of Eureqa

In Matlab, you can implement a time delay using the interp1 function. For example, the expression delay(x, 1.23) would be implemented as:

    interp1(t, x, t - 1.23, 'linear')

Implementing delays in Excel is a little harder. You need to download an Excel add-on that adds an interpolation function. For example, the package XlXtrFun adds a function “Interpolate” that works much like Matlab’s interp1. There are also other guides for linear interpolation with Excel.

Topics: Advanced Techniques, Eureqa, Techniques, Time Series, Tutorial
