The Golden Age of Analytics

Posted by Guest Author

17.09.2015 10:30 AM

by Dan Woods

The supply chain of data in the modern world has evolved beyond careful curation in controlled data warehouses. A fundamental change to the analytic workflow is needed in order to make advanced analytics available to a mass audience.

Attaining the golden age of analytics requires the democratization of advanced analytics. We need systems that separate the ability to create data science analysis from the ability to consume it, allowing anyone to intuitively interact with the results. In this golden age, users shouldn’t need to know the difference between a decision tree and logistic regression, or debate the benefits of R^2 over MAE, in order to create personalized action plans for thousands or millions of products. The growing need for predictive models to uncover these hidden, data-driven business solutions will continue to outstrip the limited numbers of data scientists who can create them.

Nutonian’s machine intelligence leads the way to the golden age of analytics. Companies that adopt machine intelligence can automate the discovery of analytic models, bringing predictive modeling out of the shadows and into the light. With machine intelligence, creating complex, non-linear models is no longer a virtuoso activity but something the average business user can accomplish on their own to quickly generate viable business actions. As a data science productivity tool, machine intelligence also empowers already proficient data scientists to automate menial data tasks and extend their existing abilities.

How is this possible? Instead of simple, incremental improvements over existing, decades-old data science processes, machine intelligence combines the virtually unlimited computational power available today with a proprietary evolutionary search process to take a fresh approach to analytics. Hand in hand with machine intelligence, anyone can: 1) sift big data down to the right data, 2) generate completely new models to describe previously unknown systems, 3) optimize the complexity and application of a solution for the exact situation at hand, and 4) incorporate human expertise and creativity into the machine through interactive iteration – all within one user-friendly system.

Machine intelligence is key to unlocking the golden age of analytics, as it transforms predictive modeling into a company-wide application for developing optimal strategies and driving sustainable competitive advantages.



Dan Woods is CTO and founder of CITO Research. He has written more than 20 books about the strategic intersection of business and technology. Dan writes about data science, cloud computing, mobility, and IT management in articles, books, and blogs, as well as in his popular column on

Topics: Eureqa, Machine Intelligence

How to Anchor Your Fantasy Football Team: Using Advanced Analytics to Pick the Best Available Quarterback

Posted by Jon Millis

10.09.2015 10:40 AM

Fantasy football: where owning a sports franchise is within anyone’s reach, a year of pride is put on the line, and grown adults cry…more than once…almost every Sunday. Yet for many football fans, we wouldn’t have it any other way. Except for you, Reggie Bush. 550 total yards…FIVE HUNDRED AND FIFTY YARDS?!? I COULD JOG BACKWARDS FOR MOR—let’s move on.

I’ve been playing competitive fantasy football for more than 10 years with 25 friends, all avid NFL fans. We play a “Champions League”-style system with a winners league and a consolation league, moving bottom and top finishers between leagues to reward high-performers and punish the slackers. Yes, to the chagrin of our bosses, families, and friends, we take our imaginary sports pretty seriously.

I’m the Peyton Manning of my league. I’m a fantastic regular season performer. Then the playoffs roll around, and every position player on my roster forgets he’s supposed to be good at football. Nutonian’s data scientists tell me this is likely either due to expected statistical variation in player performance or bad karma for wearing flip-flops to the office. I’m convinced it’s a curse from when I told my parents I knew multiplication but was actually holding flashcards under the dinner table. (Sorry, Mom and Dad. I learned them eventually, though. That’s what counts. Right?…Right???)

This fall, I’ll take any advantage I can get. One of the biggest strategic changes over the last five-plus seasons has been a power-shift from position players to quarterbacks. Four things have contributed to this development:

1)    Running backs: Most NFL teams have shifted to a two- or three-running back system, lowering season averages for feature backs but increasing the total number of viable fantasy options.
2)    Wide receivers: Receivers are running wild. There are more 1,000-yard receivers now than there have ever been in the NFL. Much of this can be attributed to relatively recent rule changes that limit the amount of contact defenses can make with receivers. But despite the increase in high-performing wide receivers, they tend to be among the most hair-pull-inducing of fantasy players because their week-to-week output is wildly inconsistent. One dropped ball could be the difference between a great week and a hole through the living room wall.
3)    Quarterbacks: The same rule changes benefitting receivers also benefit quarterbacks. But in addition to having more open receivers to throw to, quarterbacks have also been granted new protections, such as the NFL’s creation of a limited “strike zone” where defensive players can hit them. Additionally, officials are nowadays much quicker to throw flags for late hits and unsportsmanlike conduct. This has made quarterbacks safer and kept them upright enough, for long enough, to deliver more passes down the field.
4)    Game plans: Teams are throwing the ball more. There are varying theories behind this. Some people think today’s QBs are drafted more “league-ready”. Some people cite the new rules. Others, the aggressiveness of new offensive coordinators. I think it’s karma from wearing flip-flops to the practice facilities.

The takeaway? You want a good quarterback. You really want a good quarterback. Good quarterbacks now account for the most points in almost every league, and the guys at the top account for many more points than the guys in the middle of the pack. Long story short, for the love of Jon Gruden, draft a good quarterback.


What’s cool is that instead of relying on online research and fantasy football manuals that aggregate “expert analysis” from old guys living in their mothers’ basements, I remembered that I worked for a company that, you know…does analytics for a living. The benefit of machine intelligence over black box predictions from places like ESPN and Yahoo is that machine intelligence will not only project a quarterback’s stats, but it’ll tell me how it arrived at each prediction. Knowing the underlying components of each predictive model enables me to apply my “domain expertise” to determine whether or not the model, and the prediction, checks out. We thought it’d be interesting to a whole lot of people if we could leverage Eureqa to eat data for breakfast and have it tell us which quarterbacks we should be eyeing in our upcoming fantasy drafts. Eureqa has had pretty impressive results in the past when predicting things like the Kentucky Derby and March Madness, so we wanted to see if it could continue the trend.

We uploaded data¹ from 2007-2014 for all quarterbacks who started at least 10 games in a season. Could Eureqa build mathematical equations – “rules” – that accurately predict quarterback performance in 2015?

Sure enough, it can.² Using Eureqa, we automatically generated seven unique predictive models for pass yards, passing touchdowns, interceptions, rush yards, rushing touchdowns, two-point conversions made, and fumbles lost.³ That is, what are the signals from past data that are most influential in predicting a quarterback’s passing yards this season? And, based on those signals, how many yards will he actually throw for? We then aggregated each player’s predicted statistics to yield a “total points” column and stack-ranked our top 20 performers. Football fans will not only enjoy our recommended “cheat sheet” for quarterbacks; they’ll also be fascinated to learn about the most important signals that guided our predictions.
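For readers curious about that aggregation step, here’s a minimal sketch of how per-stat predictions roll up into a total-points ranking. The scoring weights below are typical standard-league values, and the sample stat line is made up for illustration; neither reflects our actual model outputs.

```python
# Roll seven per-stat predictions up into one projected fantasy-point total,
# then stack-rank players by that total (standard-league scoring assumed).
SCORING = {
    "pass_yds": 1 / 25,   # 1 pt per 25 passing yards
    "pass_tds": 4,
    "interceptions": -2,
    "rush_yds": 1 / 10,   # 1 pt per 10 rushing yards
    "rush_tds": 6,
    "two_pt": 2,
    "fumbles_lost": -2,
}

def total_points(predicted_stats):
    """Weighted sum of a quarterback's predicted stat lines."""
    return sum(SCORING[stat] * value for stat, value in predicted_stats.items())

def rank_quarterbacks(predictions, top_n=20):
    """Stack-rank players by projected total points."""
    ranked = sorted(predictions.items(),
                    key=lambda kv: total_points(kv[1]), reverse=True)
    return ranked[:top_n]

# Hypothetical stat line for one quarterback:
qb = {"pass_yds": 4500, "pass_tds": 35, "interceptions": 10,
      "rush_yds": 200, "rush_tds": 2, "two_pt": 1, "fumbles_lost": 3}
print(round(total_points(qb), 1))  # 328.0
```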

Here’s how things shake out:

At the top of the pack are the usual suspects. Aaron Rodgers is pretty good at throwing a football, it turns out, and Eureqa doesn’t expect that to change. He did just lose his best receiver in Jordy Nelson for the year with a torn ACL, so we’ll see if that affects his performance. If we’ve learned anything about Green Bay over the years, they’re a wide receiver factory, and they’ll have guys step up. It does surprise me to see Andrew Luck ranked behind Manning, though not by much (I’d be astounded if Luck threw for only 29 TDs, though he may be an outlier in the data). It’s also surprising to see Eli Manning in the eighth spot, but machine intelligence is perhaps telling us something we don’t know. Eureqa also didn’t factor in Brady’s unquantifiable “unleash hell” variable, where former teammates vow a pissed-off Brady’s going to put on a performance for the ages to spite the clowns running the NFL (sorry, I’m a Pats fan. Had to throw in at least one shot at Goodell). Tony Romo, Philip Rivers, Matt Stafford and Cam Newton also look low to me, but they’ve all also had a few pretty dud seasons mixed into their careers; this could be another one of those years. Overall, this list passes the sniff test pretty well. Let’s dig into how we got there.


Pass Yards

Most Important Predictive Signals: total fantasy points (positive correlation), rush attempts (negative correlation)

Analysis: This makes some sense. A QB’s total points from last year is generally an indicator of his quality as a fantasy quarterback. A quarterback who performed well last year is likely to perform well this year, and throwing for lots of yards is a large part of that. Interestingly, last year’s rushing attempts also shows up. The more a QB runs the ball, the fewer yards he tends to throw for. Easy enough.

Pass TDs

Most Important Predictive Signals: passing TDs (positive), fumbles lost (positive), two-point conversions (positive), sack percentage (negative)

Analysis: This is where it starts to get fun. We have a limited data set, so we have to hypothesize which of the things Eureqa found are genuinely valuable signals, and which might’ve shown up as curious results simply because we didn’t have enough data. Last year’s pass TDs are highly predictive of this year’s pass TDs. Once a QB learns how to get in the end zone, assuming little roster turnover year-over-year, he’ll generally stay in the ballpark of last year’s performance. Fumbles lost (the more fumbles you lose, the more TDs you tend to throw) is the one that made me raise an eyebrow. Maybe we don’t have enough data. Maybe it’s right, and fumbles lost is an interesting proxy for how risky a quarterback is (he holds onto the ball longer, runs more, etc.), which yields more big plays and touchdowns.

More two-point conversions could mean a few things. It could mean you’re scoring more touchdowns, which unsurprisingly means more opportunities to go for two. It could also mean you’re playing from behind more often, and teams tend to throw the ball more when they’re behind, which leads to more touchdowns. And lastly, and also pretty intuitively: don’t let your quarterback hit the ground. Sack percentage is the percentage of time a quarterback is sacked when he drops back for a pass play. A quarterback who doesn’t get sacked/pressured throws for more touchdowns. And he stays healthy. And he’s better friends with his offensive line.
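To make the sack percentage stat concrete, here’s the standard calculation described above (sacks divided by total dropbacks); the numbers are made up for illustration.

```python
# Sack percentage: how often a QB is sacked when he drops back to pass.
def sack_percentage(sacks, pass_attempts):
    dropbacks = pass_attempts + sacks  # every sack was also a dropback
    return 100 * sacks / dropbacks

# A QB sacked 30 times on 570 pass attempts went down on 5% of dropbacks.
print(round(sack_percentage(30, 570), 1))  # 5.0
```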


Interceptions

Most Important Predictive Signals: sack percentage (negative)

Analysis: Here’s the tough one to rationalize, so I’ll first explain why this could be wrong, and then I’ll move along to sound like a crackhead and vigorously explain why it could make sense. Out of all 150 variables that could predict next year’s interceptions, Eureqa found one that’s more explanatory than all the others, or any combination of any of the others: sack percentage. The more you got sacked last year, the fewer interceptions you’ll throw this year. Really, Eureqa? You could’ve found anything: pass attempts, TDs, age, complex stats that ESPN nerds code together, etc. Instead you brought me sack percentage, and made it negatively correlated. This could very well be a case of not enough data. Or, it could be that interceptions are just ridiculously hard to predict. Many of them are fluke incidents: a ball is tipped, the wind carries a throw too far, the receiver trips, the QB is hit right as he throws the ball. These are things that just can’t be accounted for with data.

But hang on a second. Season interceptions are somewhat predictable. Who’s going to throw more picks: 16 games of Geno Smith, or 16 games of Aaron Rodgers? What’s that? Geno Smith is out for six weeks? When did this happen?! (Sorry, Jets fans. I know, I know, too soon.) The point is, some quarterbacks absolutely, consistently, throw more interceptions over the course of the season. What if a quarterback threw a lot of picks in large part because he was consistently pressured (or sacked)? That means his sack percentage would be high that year. In response: 1) the quarterback works his tail off in the offseason to get the ball out of his hands more quickly, and/or 2) the coaches and front office work vigorously to improve the offensive line…which means in the following year, the quarterback has much better protection and time to throw. And most NFL QBs are talented enough to make most throws when they have the time. There is no data to support my second theory. All logic. All heavily coffee-driven logic.

Rush Yards

Most Important Predictive Signals: rush yards (positive), deep pass percentage (positive), times sacked (positive)

Analysis: I like what this one is telling me. If a QB had a lot of rushing yards last year, he’ll probably run this year. Once a runner, always a runner. Deep pass percentage is the percentage of throws in which a receiver is at least 15 yards down the field. This could be a proxy for how risky a quarterback plays. He runs, he throws deep, he goes for big plays. I don’t know if this is true, but it could be. Lastly, times sacked. If a QB was sacked a bunch last year, that’s probably an indicator that he was under a lot of pressure. When you’re under pressure, you scramble more. If you were under pressure last year, you’ll probably be under pressure again this year, and keep scrambling. Oh…that directly contradicts what I said in my interceptions analysis? Let’s move along.

Rush TDs

Most Important Predictive Signals: rush attempts (positive)

Analysis: Quarterbacks who run the ball more score more touchdowns. Scrambling quarterbacks continue to scramble when they’re in the red zone. Go figure.

Fumbles Lost and 2PT 

Most Important Predictive Signals: mean

Analysis: Our best models for fumbles lost (2.6) and two-point conversions (0.7) are constants. Fumbles lost and two-point conversions are both highly unpredictable, so guessing the average is our best bet to predict player-by-player outcomes. Many times, fumbles lost depend on fluke hits and bad bounces. Two-point conversions are relatively rare and dependent on unique in-game situations. While I know there will be a few players that deviate from the trend, I’ll gladly take averages here.
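A quick sketch of why guessing the average is the right move for a stat like this: among all constant predictors, the sample mean minimizes mean squared error. The fumble counts below are hypothetical, just to demonstrate the point.

```python
import numpy as np

# Hypothetical season fumbles-lost totals for a handful of quarterbacks:
fumbles = np.array([1, 4, 2, 3, 0, 5, 3, 2])

def mse(constant, y):
    """Mean squared error of predicting the same constant for everyone."""
    return np.mean((y - constant) ** 2)

# Grid-search constants from 0 to 6; the winner lands on the sample mean.
candidates = np.linspace(0, 6, 601)
best = candidates[np.argmin([mse(c, fumbles) for c in candidates])]
print(best, fumbles.mean())  # the best constant equals the mean, 2.5
```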

And with the above chart, and the smartest machine intelligence application on the planet informing us what’s likely to happen, we recommend applying any insider information you have about new player acquisitions, preseason performance, etc. to the mix to make the best judgment call possible. Man and machine, working together, to solve the world’s most pressing problems…like fantasy football.


¹ We included about 150 variables in all, ranging from the simple (age, completions, TDs) to the complex (clutch-weighted expected points added through rushes). Stats were pulled from,,, and
² This would be way less interesting if it couldn’t.
³ These are the stats that earn/lose a quarterback points in standard leagues.

Topics: Eureqa, Fantasy Football

Eureqa Will Augment – But Not Replace – Engineers

Posted by Jon Millis

12.08.2015 09:50 AM

A few weeks ago, wrote an article that nicely summarized Eureqa’s impact in the manufacturing space, slashing the cost and boosting the performance of metals. But while we lovingly call our product the “Robotic Data Scientist” for its ability to automatically inform engineers (or any analyst/data scientist) how to optimize processes, one thing Eureqa will never do is eliminate the need for good engineers.

Eureqa extends engineers’ capabilities so they can get better results, faster. Manufacturing complex widgets and running efficient supply chains involves a significant amount of predictive and analytical modeling to problem-solve: If we feed these inputs into the assembly line and manufacture them in this way, what’s likely to happen? Based on how we currently run this process and all the inputs at our disposal, what can we change to generate a more durable, lower-cost widget or an entirely new material?

Typically, the modeling process is extremely time-consuming. Most engineers do not come from statistical or computer science backgrounds. While brilliant, their comparative advantage is not creating analytical models to decipher which inputs and processes lead to the best results. Engineers are smart enough to figure it out, but doing so asks them to put substantial time into something that’s not their specialty, and into something that can be accomplished faster and more accurately by computers than by any possible team of humans.

Engineers are domain experts. They understand their project, and they understand all of the various factors at play. They are Eureqa’s coaches. They can apply Eureqa to a problem (why is this process failing 15% of the time?), feed it all of their data, and let it churn through to determine the best equations – from the very simple to the very complex – that characterize the problem. Eureqa can do in minutes what it takes even a well-trained team weeks or months to do (in some cases, if ever). Under the hood, Eureqa leverages free form modeling, a treasure chest of techniques developed by the world’s top data scientists, to automatically search the infinite space of equations that could explain the data. Once its search converges, Eureqa gives the engineer a few models to choose from and iterate on.

This is where the engineer’s expertise is not only a nice-to-have, but a need-to-have. There may be certain features Eureqa finds and recommends changing, for example, which can’t be touched due to regulatory compliance measures. Other attributes, like process conditions, may be prohibitively expensive to adjust. Engineers can recognize this and omit such variables from the data for more actionable, telling models. They can put thresholds on variables, steer away from models that are too simple or too complex, and easily interpret what the models actually say. They don’t need to work for weeks manually testing hypotheses. They can build accurate predictive and analytical models that are easily understandable in plain English, orders of magnitude faster than they could with legacy tools and techniques.

One of Nutonian’s goals is to enable any company – any engineer – to accelerate the speed and scale of any data science initiative. We’re teaching engineers to fish, not snatching the fish from their hands.

Topics: Eureqa, Manufacturing

Data Scientists Don’t Scale. Machine Intelligence Does.

Posted by Jon Millis

29.06.2015 12:30 PM

Data scientists are unicorns. We know they exist, we know they’re magical, we know they hold the answers to many of our business intelligence hopes and dreams. Unfortunately, the current market runs into three problems in trying to find, tame and leverage these unicorns.

  1. Data scientists are hard to find and attract.
  2. If you’re lucky enough to have a data scientist, he/she may already be overwhelmed with questions.
  3. Data scientists use pixie dust: complex programming languages and models. If a chef came to your office and cooked you a meal using pixie dust, and he promised it was going to be delicious for the company, wouldn’t you have some questions before you went feeding it to all your colleagues?
    “Chefman. What’s in this?”
    “Powdery white stuff.”
    “And what’s the white stuff made out of?”
    “Ingredients that I collected.”
    “How do I know it’s good for me?”
    “Because I’m a professional chef…”

    *headdesk*

Side note: if you are a data scientist, God bless you, and come use us. You’ll be like Superman without a kryptonite (short time-to-answer cycles and transparent models).

If you’re not a data scientist, welcome back to my fantasy world, where I haven’t used the words “unicorn” or “pixie dust” since my parents left my merciless older sister home alone with me in grade school.

Stuart Frankel, a guest columnist for Harvard Business Review, recently published an article explaining that executives are finally becoming frustrated with their big data investments. Despite pumping large budgets into storage, analysis, reporting and visualization technologies, employees are still burning the midnight oil manually creating reports and interpretations of their data. Stuart notes:

To solve this problem and increase utilization of existing solutions, organizations are now contemplating even further investment, often in the form of $250,000 data scientists (if all of these tools we’ve purchased haven’t completely done the trick, surely this guy will!). However valuable these PhDs are, the organizations that have been lucky enough to secure these resources are realizing the limitations in human-powered data science: it’s simply not a scalable solution. The great irony is of course that we have more data and more ways to access that data than we’ve ever had; yet we know we’re only scratching the surface with these tools.

Instead of throwing in the towel on their big data initiatives, execs are doubling down and going data scientist-hunting. They know there’s gold in the data, but they’re having triple-bypasses trying to find it. They could wait on the sidelines and risk their competition making a breakthrough before them, or they could attempt the annexation of Puerto Rico to win the game. Naturally, as Little Giants fans, many execs are attempting the annexation of Puerto Rico.

So, where does this leave the market? Short.

Data scientists are rare commodities. The U.S. alone faces a shortage of 140,000-190,000 of them. What does that mean? An arms race for talent…all while data generation is increasing exponentially and the tools (R, SAS, Python, etc.) remain largely the same. So while the Googles, the Amazons, the ExxonMobils and the Walmarts may be able to shovel out the dough to attract data science talent, most can’t, even companies as well-off as the Fortune 500. And this is not a market that will quickly “clear” to match supply with demand; massive salaries have not been enough to lure more data scientists into training. One of the largest financial institutions in the world, now evaluating Eureqa, said it’s footing the bill to send 30 analysts to receive their Master’s degrees in Data Science, a program that won’t return them to the company for years, because they were pessimistic they could fill the void in the open market. What data scientists do – curate data, ask the right questions, build explanatory analytical models, implement the models into various applications – is simply not scaling at the pace of demand.

Oh, is that all?

If we can’t address the problem from the demand side, let’s address it from the supply side. Give people the tools they want, that they need, to be successful information archaeologists and unearth transparent analytical models that communicate the patterns and relationships hidden within the data. Eureqa is automated machinery that does all the heavy lifting. We’ll help an analyst unearth dinosaurs, with museum-ready biographies attached. And with the speed and talents that data scientists already have, we’ll help them fill an entire museum.

Analytical models are the bridge between raw data and meaning. They impose structure on chaos, isolating the important factors driving a system. They explain how something works: how everything is connected to form the whole, how each individual input blends to drive an outcome. The engine of big data, of big insights, fundamentally is the analytical model. But models have shortcomings, too. They typically take weeks to hypothesize and to build. They’re inherently manual, technical. Eureqa is entirely automated, leveraging an evolutionary algorithm that searches an infinite equation space to bring the user the simplest, most accurate models that explain the data’s behavior. Models are typically difficult to interpret. Eureqa has an “explainer” section that translates the models into plain English, and even features an “interactive explainer” module that allows users to toggle changes in variable values to simulate “what-if” scenarios in real time. These are things that data scientists years ago could only dream about.
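To make the evolutionary-search idea concrete, here’s a toy sketch that evolves small expression trees toward data generated by y = 3x + 2. This is a bare-bones illustration of the general technique, not Eureqa’s proprietary algorithm.

```python
import random

random.seed(0)
X = [x / 10 for x in range(-20, 21)]
Y = [3 * x + 2 for x in X]  # target relationship the search should recover

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b}

def random_tree(depth=2):
    """Build a random expression tree over x, constants, and +, -, *."""
    if depth == 0 or random.random() < 0.3:
        return "x" if random.random() < 0.5 else random.uniform(-5, 5)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def error(tree):
    """Fitness: mean squared error against the data."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in zip(X, Y)) / len(X)

def mutate(tree):
    """Replace the whole tree or a random subtree with a fresh one."""
    if random.random() < 0.5:
        return random_tree()
    if isinstance(tree, tuple):
        op, left, right = tree
        if random.random() < 0.5:
            return (op, mutate(left), right)
        return (op, left, mutate(right))
    return random_tree(1)

population = [random_tree() for _ in range(50)]
for _ in range(200):
    population.sort(key=error)
    survivors = population[:10]  # elitism: keep the fittest models
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(40)]
population.sort(key=error)
best = population[0]
print("best error:", round(error(best), 4))
```

With elitism, the best error can only improve across generations; a real system layers far more sophistication (crossover, accuracy-versus-complexity trade-offs, massively parallel search) on top of this basic loop.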

It may be true that data scientists don’t scale. But answers do. Evaluate Eureqa to see for yourself.

Topics: Eureqa, Scaling data science

Letter from a Grateful Hobbyist Who’s Predicting the Financial Markets with Eureqa

Posted by Jon Millis

22.06.2015 02:00 PM

Nutonian users aren’t just large corporations. They’re also hobbyist data modelers leveraging Eureqa to predict the popularity of songs, analyze home science experiments, and even determine what makes some Porsches faster than others. The letter below was sent to us by a former real estate investor and manager named Bill Russell, who’s been using Eureqa to anticipate relatively short-term movements in stock prices. Hopefully Bill’s note will not only shed light on Eureqa’s potential, but will encourage our non-commercial fans to start thinking how they might apply Eureqa to some cool personal projects outside the office.


To Michael Schmidt and the team at Nutonian:

Michael, I want to express my deep appreciation for what you have created and shared. I first started following Eureqa in early 2010 when my mathematician brother alerted me to your double pendulum demo and beta download when you were at Cornell.

By way of background, I’m 70 years old, retired from a career in real estate finance and management. My degree was in economics, but I always loved numbers and the numerical analysis side of that business. My serious hobby over the years has been an attempt to predict short-term moves in the financial markets. I never had an impressive level of success, but always a lot of enjoyment with the puzzle of it all. In retrospect, I am sobered by how much time and how many resources I’ve put into this hobby over the years.

My attempts at market prediction began with Fourier analysis (thanks to my brother’s programming and math skills) on an HP-85 desktop computer that had 16 KB of RAM. Next, things got more serious with the IBM XT, Lotus, and very large worksheets of pre-processed data obtained from Pinnacle Data and TradeStation. Over the years, I went to seminars given by John Hill, Larry Williams, Bill Williams, Tom DeMark and others. I purchased the Market Logic course for Peter Steidlmayer’s market profile approach and the trading course Relevance III from Maynard Holt in Nashville. There were many ideas here and there for indicators and inter-market relationships, but choosing which to use, and how to use them together, was daunting. Eureqa has changed that. Along the way, I used some programs that were impressive at the time. Brainmaker Professional, a neural network program, took plenty of my time in searching for usable predictions. HNET Professional, a holographic neural network program, was fast and impressive. AbTech’s Statnet was excellent, as was Neuralware’s Neuralsim. Yet despite the prolonged, multi-year and serious approach, I could never find an integrated, consistent pathway to success.

Because Eureqa incorporates so much analytical power in one place and finds relationships that were simply impossible to find previously, I am encouraged as never before. With the opportunity to utilize Eureqa, so much of my past approach is obsolete and elementary. I have left most of my previous analytical programs behind and many of my technical market books have now been donated to the public library. Of great significance for individual traders is that you have diminished the gap between the professional and nonprofessional in approaching the markets. Each group can utilize Eureqa, and Eureqa is equally powerful for each.

In the past, my best insights into what data might be useful came from hundreds of tedious runs of Pearson correlations and trial-and-error runs in the neural networks. I looked for ways to recast and understand the data in S-Plus and now the R language, but I am not a programmer. Trying to smooth data with splines in R was almost an insurmountable task for me. Eureqa is enabling me now to pursue options that were previously impossible. Here are some of the reasons:

1) Power and Speed: I’m able to pursue so many more alternatives than were previously within my reach. Because Eureqa is so fast, I am now able to compare runs with a) raw data; b) the same data recast to binary form; c) the data uniformly redistributed; d) the data in a de-noised wave-shrunk form. There was simply not enough time to do this before I found Eureqa.

2) Fast Data Processing and Visualization in Eureqa: I had previously done smoothing, normalizing, and rescaling in S-Plus or R. Here Eureqa saves significant time and I have complete confidence that it is being done correctly. I was often uncertain if I was getting it right on my own with the R language.

3) Tighter Selection of Input Variables: I had previously looked for any correlated relationships among a bar’s open, high, low, close, and volume, and relationships with each of those inputs delayed four periods back. I likewise did this for inter-market correlations. There was lots of manual work with Excel. All this has become moot since Eureqa does this in a flash. I have been able to substantially reduce the number of input variables.

4) Most importantly, Eureqa is finding predictive relationships that had simply been impossible to find.

Michael, it is a delight to be alive at 70, and see the breathtaking leaps in technology. I programmed a little in college, utilizing punched cards; I bought a cutting-edge four-function electronic calculator before finals in 1971 for $345 (a Sharp EL-8) and thought it was a bargain. And now there is Eureqa… Wow!!! I can appreciate some of the incredible differences this product will continue to make in so many areas. Thank you so much for what you and your team have created, for sharing it in beta form in the past, and for still keeping it within reach for individuals.

With much appreciation,

Bill Russell

Topics: Big data, Eureqa, Financial Services
