Blog

Customer Spotlight: Radnostics

Posted by Jess Lin

19.11.2013 10:00 AM

We have amazing customers doing even more amazing things with their data. As we hear from our customers with their stories, we will be sharing them with you here on our blog. Hopefully they will help inspire you to think about what more you could be doing with your own data! Contact us if you have a case study using Eureqa you would like to share as well.

A diagnosis of acute appendicitis requires immediate surgery when the appendix is still intact. To avoid exposing children to the ionizing radiation of CT scans, ultrasound can be used to detect other features of the appendix and the surrounding area, but it cannot directly reveal the presence or absence of a perforation in the appendix. A team led by Dr. Einat Blumfield set out to discover the specific associated ultrasound findings that would accurately point to perforation. Given the ultrasound images of 161 subjects diagnosed with acute appendicitis, the researchers needed a model that would identify correlations in the data.

Anthony Blumfield, the statistical modeler for the team, got a suggestion from his former classmate Hod Lipson (now an advisor to Nutonian) to download Eureqa and use the power of symbolic regression to find a model for the data. After setting the software to work, Anthony Blumfield returned a couple of hours later to find that it had already found a model that predicted perforation as accurately as CT scans. More surprisingly, the model had discovered age categories that better defined which ultrasound findings would predict perforation. “We want to look at clinical findings that are associated with perforation such as duration of symptoms, white blood count, and fever,” Dr. Einat Blumfield explains. “We’ll then use Eureqa to search for a formula that combines ultrasound findings and clinical findings, and we hope to achieve an even higher level of accuracy.”

For more details, read the full case study here!

Jess

Topics: Case study, Eureqa, hod lipson, Modeling Outputs

Customer Spotlight: Performance Genetics

Posted by Jess Lin

05.11.2013 10:00 AM

Performance Genetics empowers their customers to make confident purchasing decisions at yearling and two-year-old training sales. Using a combination of practical horse knowledge and horsemanship along with an intensely computational approach to data, they are able to spot elite potential in not-yet-raced horses. Knowing the key data points for generating an accurate prediction and being able to gather and analyze those data points quickly is critical in giving Performance Genetics an edge in the sale ring.

CEO Byron Rogers tried out Eureqa after his partner, Alan Porter, saw the software featured in a New Scientist article. Using Eureqa to analyze a historical data set, Rogers quickly determined that the tool returned powerful models that were immediately applicable in the field. Eureqa’s models were able to simplify the prediction algorithm with no loss of accuracy, generating improved algorithms with fewer data points. This not only allowed Rogers to generate predictions in the field more quickly, but also freed him to focus on capturing more nuances in the remaining key data points. “As with the biomechanical data, I expect that Eureqa’s ability to pick out the important data points in these other areas, along with its ability to make use of prior solutions, will be integral in the development of a unified model,” he says. “It’s going to be very interesting.”

For more details, read the full case study here!

Topics: Case study, Eureqa, Modeling Outputs

Modeling outputs that have a range of values

Posted by Michael Schmidt

28.06.2013 03:57 PM

Often you might want to specify that the output of a model should fall within a certain range rather than match an exact numerical value. This post shows one way to do this with Eureqa. The goal is to find the simplest equations whose outputs always lie between the min and max values for each data point.

Enter Min and Max Values for each Data Point:

Step 1: For each data point where you only have a range of output values (the min and max values), add two rows for that data point: one with the minimum value and one with the maximum value, keeping all other variables in the row the same.

Step 2: Next, set the fitness metric to the “Mean Absolute Error” option.

Step 3: Start the Eureqa search as usual. Solutions that fall between the min and max values will have identical absolute error.

If a model output lies between the min and max values, the total absolute error is mathematically indifferent to where exactly the value lies. If the value moves closer to the max value, the error on the max value data point decreases linearly, but the error on the min value data point increases linearly by the same amount, so the sum stays equal to the width of the range.
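The indifference described above can be checked numerically. The following is a minimal Python sketch, not part of Eureqa; the function name `pair_abs_error` and the range values are made up for illustration.

```python
def pair_abs_error(prediction, y_min, y_max):
    """Summed absolute error over the two rows (min and max) of one data point."""
    return abs(y_min - prediction) + abs(y_max - prediction)


# Illustrative range only; these numbers are not from the post.
y_min, y_max = 2.0, 5.0

# Any prediction inside [y_min, y_max] gives the same total: the range width.
inside = [pair_abs_error(p, y_min, y_max) for p in (2.0, 3.0, 4.5, 5.0)]

# Outside the range, the total grows by twice the distance to the nearest bound.
outside = [pair_abs_error(p, y_min, y_max) for p in (1.0, 6.0)]

print(inside)   # [3.0, 3.0, 3.0, 3.0]
print(outside)  # [5.0, 5.0]
```

Because every in-range prediction scores the same, the error metric alone does not prefer one in-range solution over another.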

Example

In Eureqa, your data view should look similar to:

[Image: Modeling outputs that have a range of values in Eureqa]

Here each input x appears twice: once with the minimum y value and once with the maximum y value.
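If your ranges start out as one record per data point, the doubled rows can be generated before pasting them into Eureqa. This is a small Python sketch under that assumption; `expand_ranges` and the sample values are hypothetical and only illustrate the layout.

```python
def expand_ranges(points):
    """Turn (x, y_min, y_max) records into two (x, y) rows each."""
    rows = []
    for x, y_min, y_max in points:
        rows.append((x, y_min))  # row carrying the minimum output
        rows.append((x, y_max))  # row carrying the maximum output
    return rows


# Made-up sample data: (x, y_min, y_max)
ranged_points = [(0.0, 1.0, 2.0), (1.0, 2.5, 4.0), (2.0, 4.0, 7.0)]

rows = expand_ranges(ranged_points)

# Print tab-separated x and y columns, one row per line.
for x, y in rows:
    print(f"{x}\t{y}")
```

Keeping the two rows of each pair adjacent makes it easier to confirm by eye that every x value has both a min and a max entry.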

We can then start the search using the Mean Absolute Error fitness metric, and get various solutions that fall into the min/max ranges:

[Images: Modeling outputs with a range of values in Eureqa]

These solutions may have slightly different fitness values because the two rows of a min/max pair can end up split between the training and validation data sets. One way to avoid this is to open the Advanced Genetic Program Settings menu and either set the training and validation sets to use all of the data or turn off shuffling of their points.

Using Separate Min and Max Values in a Custom Error Metric

Another option is to specify a custom error metric in the Search Relationship. This allows you to enter your min and max range values in different columns. For example, consider the following search relationship:

abs(y_min - f(x)) + abs(y_max - f(x)) = 0

where x is the input, and y_min and y_max are two different variables representing the min and max values of the range of outputs for each input x. The custom error in this relation is equivalent to the previous method.
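The equivalence can be illustrated outside Eureqa with a short Python sketch. It assumes that the error of the custom relationship is the mean absolute value of its left-hand side over the rows, which is an assumption about how the residual is scored rather than something stated in this post. The model `f` and the data are hypothetical.

```python
import math


def f(x):
    """Hypothetical candidate model, standing in for an equation from a search."""
    return 1.5 * x + 1.0


# Made-up sample data: (x, y_min, y_max)
data = [(0.0, 1.0, 2.0), (1.0, 2.5, 4.0), (2.0, 4.0, 7.0)]

# Method 1: two rows per point, mean absolute error over all 2N rows.
two_row_errors = []
for x, y_min, y_max in data:
    two_row_errors.append(abs(y_min - f(x)))
    two_row_errors.append(abs(y_max - f(x)))
mae_two_rows = sum(two_row_errors) / len(two_row_errors)

# Method 2: one row per point, mean of the custom residual over N rows.
residuals = [abs(y_min - f(x)) + abs(y_max - f(x)) for x, y_min, y_max in data]
mae_custom = sum(residuals) / len(residuals)

# The two scores differ only by a constant factor of 2, so they rank models the same.
print(math.isclose(mae_custom, 2 * mae_two_rows))
```

Under that assumption the custom metric is exactly twice the two-row mean absolute error, so both methods order candidate models identically.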

Topics: Advanced Techniques, Eureqa, Modeling Outputs
