
Machine Intelligence Strips Off Our Data Science Blinders

Posted by Guest Author

07.10.2015 10:00 AM

by Dan Woods

In our increasingly digital lives, we have been trained to trust the way that technology works. That is, right up until it doesn’t.

Consider a GPS. A great deal of sophisticated technology goes into computing an optimal route. Few people understand why their GPS chooses the routes it does, but we’ve come to accept its navigation directions because they tend to be good enough. Even when a predicted route doesn’t work – say, it prompts you to turn the wrong way onto a one-way street, or you run into construction and need a detour – it’s OK, because we have corrective mechanisms in place to override its instructions.

However, accepting blinders on data-driven solutions can be dangerous. The higher the cost of a mistake, the more severe the consequences of false positives and false negatives. Have you ever gone internet sleuthing and found a symptom checker declaring that your runny nose and painful headache meant you had cancer? Instead of being gently let down by your exasperated doctor the next morning, imagine if the hospital immediately enrolled you in chemotherapy based solely on that output. This is an extreme example, but outsourcing too much responsibility to machines could lead to mistakes just as costly.

A fundamentally new approach to data science is needed – a partnership in which each side communicates ideas and strategies to the other as an equal, rather than one side dictating the constraints of the connection. This approach is machine intelligence, built on the driving philosophy that the partnership between man and machine is greater than the sum of its parts.

Nutonian’s machine intelligence system, Eureqa, doesn’t put blinders on users. In fact, the system purposefully shows its work, surfaces user-friendly ways to reach advanced results, and encourages rapid iteration to incorporate the user’s domain expertise into the results. Regardless of technical expertise, users all across the organization can use Eureqa to discover new business strategies, while retaining the ability to audit and correct sub-optimal paths before committing to them.

Making sense of the abundance of data in the business world takes more than a one-sided discussion. Use machine intelligence to open up a new horizon of possibilities in the golden age of analytics.

 


ABOUT THE AUTHOR

Dan Woods is CTO and founder of CITO Research. He has written more than 20 books about the strategic intersection of business and technology. Dan writes about data science, cloud computing, mobility, and IT management in articles, books, and blogs, as well as in his popular column on Forbes.com.

Topics: Eureqa, Golden Age of Analytics, Machine Intelligence

Intelligent Partners: Man and the Machine

Posted by Guest Author

30.09.2015 10:30 AM

by Dan Woods

When it comes to the creative processes inherent in predictive modeling, it is time for a new paradigm, one in which the user and machine learning work in tandem to achieve better results than could be achieved working separately. Nutonian’s vision for this is machine intelligence.

What’s important to understand is how collaboration between people and machine intelligence powers statistical creativity. However, this new paradigm first requires unlearning some of the patterns established by early forms of artificial intelligence.

Consider a game of chess. A chess master has a great memory and can evaluate a lot of positions, but that’s child’s play compared to Deep Blue, the chess-playing computer known for being the first machine to win both a game and a full match against a reigning world champion. Deep Blue evaluated every possible move available at each turn, considering 200 million positions every second.

While Deep Blue’s power to play great chess was an awesome achievement, we need to put it into context. Deep Blue’s wins were the culmination of 12 years of development toward an extremely specialized task, and its play relied on a static database of previous games. The computer couldn’t invent moves that weren’t already in its database, and it would have had to start back at square one if the rules of chess ever changed.

Imagine instead that the chess master and Deep Blue were on the same side of the table, working together. What if the two could communicate, collaboratively creating and vetting potential strategies – one using his hard-earned expertise to handle new information and uncommon situations, the other using its vast database to discover optimal strategies and serve as a sounding board? Wouldn’t this combination be more powerful?

The machine intelligence paradigm puts man and machine learning on the same team as equal partners. While Nutonian’s Eureqa automatically generates potential solutions through a powerful evolutionary search process, it communicates how it arrived at its results and flexibly accommodates outside guidance. This transparency allows anyone to incorporate their expertise into the system and seed the next round of discovery.
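To make the idea concrete, here is a minimal sketch of an evolutionary search in Python. It is emphatically not Eureqa’s proprietary algorithm – for simplicity it evolves the parameters of a fixed linear form rather than whole symbolic expressions – but it shows the general loop described above: generate candidates, score them against the data, keep the fittest, and mutate them to seed the next round.

```python
# Minimal sketch of an evolutionary search for a predictive model.
# NOT Eureqa's proprietary algorithm -- just the general idea:
# generate candidates, score them, keep the best, mutate, repeat.
import random

# Toy data: y = 3*x + 2 with a little noise.
DATA = [(x, 3 * x + 2 + random.gauss(0, 0.1)) for x in range(20)]

def random_model():
    """A candidate model here is just a (slope, intercept) pair."""
    return (random.uniform(-10, 10), random.uniform(-10, 10))

def error(model):
    """Mean squared error of the candidate over the data."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in DATA) / len(DATA)

def mutate(model):
    """Perturb one parameter -- the 'evolution' step."""
    a, b = model
    if random.random() < 0.5:
        return (a + random.gauss(0, 0.5), b)
    return (a, b + random.gauss(0, 0.5))

population = [random_model() for _ in range(50)]
for generation in range(200):
    population.sort(key=error)   # fittest candidates first
    survivors = population[:10]
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(40)]

best = min(population, key=error)
print(f"best model: y = {best[0]:.2f}*x + {best[1]:.2f}")
```

Because every candidate here is a readable formula rather than a black box, a user can inspect the winner, veto it, or constrain the next round – which is exactly the transparency point made above.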

Nutonian believes that the best results happen when the user and the machines work together as partners in the process of invention. This productive partnership between man and machine heralds the golden age of analytics.

 


ABOUT THE AUTHOR

Dan Woods is CTO and founder of CITO Research. He has written more than 20 books about the strategic intersection of business and technology. Dan writes about data science, cloud computing, mobility, and IT management in articles, books, and blogs, as well as in his popular column on Forbes.com.

Topics: Eureqa, Golden Age of Analytics, Machine Intelligence

Are Machines Partners or Foes?

Posted by Guest Author

22.09.2015 10:30 AM

by Dan Woods

The exploitation of data in the business world demands a new data-driven approach to innovation. Human-driven data analysis needs to make way for machine-driven methods capable of handling the new abundance of data. However, much hysteria has recently been directed at the dangers of big data and over-reliance on Artificial Intelligence (AI). Is this fear warranted, or is it much ado about nothing?

Some science and technology experts have called AI “our greatest threat,” one which may spell “the end of the human race.” Turning specifically to business data, many more have decried the perils of using “big data” to predict the future. The traditional data science approach is blind to unquantifiable factors and can be fooled by misleading correlations, yet many businesses deal with sensitive subjects that require informed judgment about imprecise factors. On a more personal level, if even a skilled career like data science can be automated by a machine, what is left for the rest of us?

What is important to remember is that machines have their own strengths and weaknesses, just as humans do, and both sides have important roles to play in supporting each other. Machine algorithms can churn through endless amounts of data but are subject to the biases of how they were programmed and limited by the inputs they are given. Humans can synthesize decades of experience into bursts of creativity but struggle to visualize data once it goes beyond three dimensions.

Here is where machine intelligence comes in. Machine intelligence allows its users, regardless of technical expertise, to harness and guide today’s virtually unlimited computing power while encoding the user’s nuance and domain expertise into the results. While automation allows machine intelligence to create new predictive models, the results are specifically designed to be transparent, interpretable, and interactive. The end user can investigate how the system arrived at its conclusions: easily kick out false correlations, recognize mismatches with business realities, audit the robustness of potential models to assuage stakeholder concerns, and export the results into any number of other systems for further analysis.

Instead of perpetuating the machine-vs.-man rhetoric, Nutonian’s machine intelligence establishes a new machine-as-partner paradigm. Letting the strengths of each side shore up the other’s weaknesses empowers businesses to scale their data science initiatives across all levels and functional areas, dramatically increasing their capacity to answer high-value questions. Augmentation, not replacement, is the key to the golden age of analytics.

 


ABOUT THE AUTHOR

Dan Woods is CTO and founder of CITO Research. He has written more than 20 books about the strategic intersection of business and technology. Dan writes about data science, cloud computing, mobility, and IT management in articles, books, and blogs, as well as in his popular column on Forbes.com.

Topics: Eureqa, Golden Age of Analytics, Machine Intelligence

The Golden Age of Analytics

Posted by Guest Author

17.09.2015 10:30 AM

by Dan Woods

The supply chain of data in the modern world has evolved beyond careful curation in controlled data warehouses. A fundamental change to the analytic workflow is needed in order to make advanced analytics available to a mass audience.

Attaining the golden age of analytics requires the democratization of advanced analytics. We need systems that separate the ability to create data science analysis from the ability to consume it, allowing anyone to interact intuitively with the results. In this golden age, users shouldn’t need to know the difference between a decision tree and logistic regression, or debate the benefits of R^2 over MAE, in order to create personalized action plans for thousands or millions of products. The growing need for predictive models to uncover these hidden, data-driven business solutions will continue to outstrip the limited number of data scientists who can create them.
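For readers who do want to peek under the hood, here is a quick, hedged illustration of the two metrics mentioned above, computed by hand on made-up numbers. Nothing in it is specific to Eureqa; it simply shows what that debate is about.

```python
# The two model-quality metrics mentioned above, computed by hand.
# R^2 is the fraction of variance the model explains (unitless);
# MAE is the average error size, in the target's own units.
actual    = [3.0, 5.0, 7.0, 9.0]   # made-up observations
predicted = [2.8, 5.3, 6.9, 9.4]   # made-up model outputs

mean_actual = sum(actual) / len(actual)
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
ss_tot = sum((a - mean_actual) ** 2 for a in actual)

r_squared = 1 - ss_res / ss_tot
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

print(f"R^2 = {r_squared:.3f}, MAE = {mae:.3f}")
```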

Nutonian’s machine intelligence leads the way to the golden age of analytics. Companies that adopt machine intelligence can automate the discovery of analytic models, bringing predictive modeling out of the shadows and into the light. With machine intelligence, creating complex, non-linear models is no longer a virtuoso activity but something the average business user can accomplish on his own to quickly generate viable business actions. As a productivity tool, machine intelligence also empowers proficient data scientists to automate menial data tasks and extend their existing abilities.

How is this possible? Instead of simple, incremental improvements over existing, decades-old data science processes, machine intelligence combines the virtually unlimited computational power available today with a proprietary evolutionary search process to take a fresh approach to analytics. Hand in hand with machine intelligence, anyone can: 1) sift big data down to the right data, 2) generate completely new models to describe previously unknown systems, 3) optimize the complexity and application of a solution for the exact situation at hand, and 4) incorporate human expertise and creativity into the machine through interactive iteration – all within one user-friendly system.
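One of those steps is worth a small illustration. Step 3 – optimizing the complexity of a solution – reflects the way symbolic-modeling tools like Eureqa surface a frontier of candidate models trading accuracy against complexity, rather than a single answer. Here is a hedged toy sketch of that idea; the candidate models and scores are invented for illustration.

```python
# Hedged sketch: trading model accuracy against complexity.
# Tools like Eureqa present a frontier of candidates rather than one
# answer; this toy filter keeps the non-dominated models.
# The candidate list below is made up for illustration.
candidates = [
    {"formula": "y = c",              "complexity": 1, "error": 9.0},
    {"formula": "y = a*x",            "complexity": 3, "error": 4.0},
    {"formula": "y = a*x + b",        "complexity": 5, "error": 1.2},
    {"formula": "y = a*x + b*sin(x)", "complexity": 9, "error": 1.1},
]

def pareto_front(models):
    """Keep models that no other model beats on BOTH axes."""
    return [m for m in models
            if not any(o["complexity"] <= m["complexity"]
                       and o["error"] < m["error"] for o in models)]

for m in pareto_front(candidates):
    print(f'{m["formula"]:<22} complexity={m["complexity"]}'
          f' error={m["error"]}')
```

The user then picks the simplest model that is accurate enough for the situation at hand, rather than defaulting to the most complex one.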

Machine intelligence is key to unlocking the golden age of analytics, as it transforms predictive modeling into a company-wide application for developing optimal strategies and driving sustainable competitive advantages.

 


ABOUT THE AUTHOR

Dan Woods is CTO and founder of CITO Research. He has written more than 20 books about the strategic intersection of business and technology. Dan writes about data science, cloud computing, mobility, and IT management in articles, books, and blogs, as well as in his popular column on Forbes.com.

Topics: Eureqa, Machine Intelligence

How to Anchor Your Fantasy Football Team: Using Advanced Analytics to Pick the Best Available Quarterback

Posted by Jon Millis

10.09.2015 10:40 AM

Fantasy football: where owning a sports franchise is within anyone’s reach, a year of pride is put on the line, and grown adults cry…more than once…almost every Sunday. Yet for many football fans, we wouldn’t have it any other way. Except for you, Reggie Bush. 550 total yards…FIVE HUNDRED AND FIFTY YARDS?!? I COULD JOG BACKWARDS FOR MOR—let’s move on.

I’ve been playing competitive fantasy football for more than 10 years with 25 friends, all avid NFL fans. We play a “Champions League”-style system with a winners league and a consolation league, moving bottom and top finishers between leagues to reward high-performers and punish the slackers. Yes, to the chagrin of our bosses, families, and friends, we take our imaginary sports pretty seriously.

I’m the Peyton Manning of my league. I’m a fantastic regular season performer. Then the playoffs roll around, and every position player on my roster forgets he’s supposed to be good at football. Nutonian’s data scientists tell me this is likely either due to expected statistical variation in player performance or bad karma for wearing flip-flops to the office. I’m convinced it’s a curse from when I told my parents I knew multiplication but was actually holding flashcards under the dinner table. (Sorry, Mom and Dad. I learned them eventually, though. That’s what counts. Right?…Right???)

This fall, I’ll take any advantage I can get. One of the biggest strategic changes over the last five-plus seasons has been a power-shift from position players to quarterbacks. Four things have contributed to this development:

1) Running backs: Most NFL teams have shifted to a two- or three-running back system, lowering season averages for feature backs but increasing the total number of viable fantasy options.
2) Wide receivers: Receivers are running wild. There are more 1,000-yard receivers now than there have ever been in the NFL. Much of this can be attributed to relatively recent rule changes that limit the amount of contact defenses can make with receivers. But despite the increase in high-performing wide receivers, they tend to be among the most hair-pull-inducing of fantasy players because their week-to-week output is wildly inconsistent. One dropped ball could be the difference between a great week and a hole through the living room wall.
3) Quarterbacks: The same rule changes benefiting receivers also benefit quarterbacks. But in addition to having more open receivers to throw to, quarterbacks have been granted new protections, such as the NFL’s creation of a limited “strike zone” where defensive players can hit them. Officials are also much quicker nowadays to throw flags for late hits and unsportsmanlike conduct. This has made quarterbacks safer and kept them upright enough, for long enough, to deliver more passes down the field.
4) Game plans: Teams are throwing the ball more. There are varying theories behind this. Some people think today’s QBs are drafted more “league-ready”. Some people cite the new rules. Others, the aggressiveness of new offensive coordinators. I think it’s karma from wearing flip-flops to the practice facilities.

The takeaway? You want a good quarterback. You really want a good quarterback. Good quarterbacks now account for the most points in almost every league, and the guys at the top score far more points than the guys in the middle of the pack. Long story short, for the love of Jon Gruden, draft a good quarterback.

[Chart: 2015 quarterback projections]

What’s cool is that instead of relying on online research and fantasy football manuals that aggregate “expert analysis” from old guys living in their mothers’ basements, I remembered that I work for a company that, you know…does analytics for a living. The benefit of machine intelligence over black-box predictions from places like ESPN and Yahoo is that machine intelligence will not only project a quarterback’s stats, but also tell me how it arrived at each prediction. Knowing the underlying components of each predictive model lets me apply my “domain expertise” to determine whether or not the model, and the prediction, checks out. We figured plenty of people would be interested if we let Eureqa eat data for breakfast and tell us which quarterbacks to eye in our upcoming fantasy drafts. Eureqa has had pretty impressive results in the past predicting things like the Kentucky Derby and March Madness, so we wanted to see if it could continue the trend.

We uploaded data¹ from 2007-2014 for all quarterbacks who started at least 10 games in a season. Could Eureqa build mathematical equations – “rules” – that accurately predict quarterback performance in 2015?

Sure enough, it can.² Using Eureqa, we automatically generated seven unique predictive models, one each for pass yards, passing touchdowns, interceptions, rush yards, rushing touchdowns, two-point conversions made, and fumbles lost.³ That is, which signals from past data are most influential in predicting a quarterback’s passing yards this season? And, based on those signals, how many yards will he actually throw for? We then aggregated each player’s predicted statistics into a “total points” column and stack-ranked our top 20 performers. Football fans will not only enjoy our recommended “cheat sheet” for quarterbacks; they’ll also be fascinated to learn about the most important signals that guided our predictions.
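To make that aggregation step concrete, here’s a minimal sketch of scoring a predicted stat line. The weights are common standard-league defaults and the sample numbers are invented – the post doesn’t publish its exact scoring table, so treat this as an assumption-laden illustration.

```python
# Hedged sketch: turning the seven predicted stats into "total points".
# Weights are common standard-league defaults -- an assumption, since
# the exact scoring rules aren't spelled out in the post.
SCORING = {
    "pass_yds": 1 / 25,          # 1 point per 25 passing yards
    "pass_tds": 4,
    "interceptions": -2,
    "rush_yds": 1 / 10,          # 1 point per 10 rushing yards
    "rush_tds": 6,
    "two_pt_conversions": 2,
    "fumbles_lost": -2,
}

def total_points(stat_line):
    """Weighted sum of a player's predicted stat line."""
    return sum(SCORING[stat] * value for stat, value in stat_line.items())

# Invented example stat line, for illustration only.
example = {"pass_yds": 4500, "pass_tds": 35, "interceptions": 12,
           "rush_yds": 150, "rush_tds": 2, "two_pt_conversions": 1,
           "fumbles_lost": 3}
print(f"projected total points: {total_points(example):.1f}")  # 319.0
```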

Here’s how things shake out:

At the top of the pack are the usual suspects. Aaron Rodgers is pretty good at throwing a football, it turns out, and Eureqa doesn’t expect that to change. He did just lose his best receiver, Jordy Nelson, for the year with a torn ACL, so we’ll see if that affects his performance. But if we’ve learned anything about Green Bay over the years, they’re a wide receiver factory, and they’ll have guys step up. It does surprise me to see Andrew Luck ranked behind Manning, though not by much (I’d be astounded if Luck threw for only 29 TDs, though he may be an outlier in the data). It’s also surprising to see Eli Manning in the eighth spot, but machine intelligence is perhaps telling us something we don’t know. Eureqa also didn’t factor in Brady’s unquantifiable “unleash hell” variable, where former teammates vow a pissed-off Brady’s going to put on a performance for the ages to spite the clowns running the NFL (sorry, I’m a Pats fan; I had to throw in at least one shot at Goodell). Tony Romo, Philip Rivers, Matt Stafford, and Cam Newton also look low to me, but they’ve all had a few dud seasons mixed into their careers; this could be another one of those years. Overall, this list passes the sniff test pretty well. Let’s dig into how we got there.

Yards

Most Important Predictive Signals: total fantasy points (positive correlation), rush attempts (negative correlation)

Analysis: This makes some sense. A QB’s total points from last year is generally an indicator of his quality as a fantasy quarterback: a quarterback who performed well last year is likely to perform well this year, and throwing for lots of yards is a large part of that. Interestingly, last year’s rushing attempts also show up. The more a QB runs the ball, the fewer yards he tends to throw for. Easy enough.
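If you want to sanity-check signals like these yourself, here’s a minimal sketch of a lag-one correlation test in pandas. The column names and the tiny inline dataset are invented; the real analysis used roughly 150 variables from the sources listed in footnote 1.

```python
# Hedged sketch: how well do last season's stats track next season's
# passing yards? Data and column names are invented for illustration.
import pandas as pd

# One row per (player, season).
df = pd.DataFrame({
    "player":        ["A", "A", "B", "B", "C", "C"],
    "season":        [2013, 2014, 2013, 2014, 2013, 2014],
    "pass_yds":      [4200, 4400, 3100, 2900, 3800, 4100],
    "total_points":  [310, 330, 210, 190, 280, 300],
    "rush_attempts": [25, 20, 80, 95, 40, 35],
})

# Pair each season's predictors with the NEXT season's passing yards.
df = df.sort_values(["player", "season"])
df["next_pass_yds"] = df.groupby("player")["pass_yds"].shift(-1)
pairs = df.dropna(subset=["next_pass_yds"])

for signal in ["total_points", "rush_attempts"]:
    r = pairs[signal].corr(pairs["next_pass_yds"])
    print(f"corr({signal}, next season's pass yards) = {r:.2f}")
```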

Pass TDs

Most Important Predictive Signals: passing TDs (positive), fumbles lost (positive), two-point conversions (positive), sack percentage (negative)

Analysis: This is where it starts to get fun. We have a limited data set, so we have to hypothesize which of the signals Eureqa found are genuinely valuable, and which might have shown up as curiosities simply because we didn’t have enough data. Last year’s pass TDs are highly predictive of this year’s pass TDs: once a QB learns how to get into the end zone, assuming little roster turnover year-over-year, he’ll generally stay in the ballpark of last year’s performance. Fumbles lost (the more fumbles you lose, the more TDs you tend to throw) is the one that made me raise an eyebrow. Maybe we don’t have enough data. Or maybe it’s right, and fumbles lost is an interesting proxy for how much risk a quarterback takes (holding onto the ball longer, running more, etc.), which yields more big plays and touchdowns.

More two-point conversions could mean a few things. It could mean you’re scoring more touchdowns, which unsurprisingly means more opportunities to go for two. It could also mean you’re playing from behind more often, and teams tend to throw the ball more when they’re behind, which leads to more touchdowns. And lastly, and also pretty intuitively: don’t let your quarterback hit the ground. Sack percentage is the percentage of time a quarterback is sacked when he drops back for a pass play. A quarterback who doesn’t get sacked/pressured throws for more touchdowns. And he stays healthy. And he’s better friends with his offensive line.

Interceptions

Most Important Predictive Signals: sack percentage (negative)

Analysis: Here’s the tough one to rationalize, so I’ll first lay out why this result could be wrong, and then I’ll move along to sound like a crackhead and vigorously explain why it could make sense. Out of all 150-odd variables that could predict next year’s interceptions, Eureqa found one more explanatory than all the others, or any combination of the others: sack percentage. The more you got sacked last year, the fewer interceptions you’ll throw this year. Really, Eureqa? You could’ve found anything: pass attempts, TDs, age, complex stats that ESPN nerds code together, etc. Instead you brought me sack percentage, and made it negatively correlated. This could very well be a case of not enough data. Or it could be that interceptions are just ridiculously hard to predict. Many of them are fluke incidents: a ball is tipped, the wind carries a throw too far, the receiver trips, the QB is hit right as he throws. These are things that just can’t be accounted for with data.

But hang on a second. Season interceptions are somewhat predictable. Who’s going to throw more picks: 16 games of Geno Smith, or 16 games of Aaron Rodgers? What’s that? Geno Smith is out for six weeks? When did this happen?! (Sorry, Jets fans. I know, I know, too soon.) The point is, some quarterbacks absolutely, consistently throw more interceptions over the course of a season. So what if a quarterback threw a lot of picks in large part because he was consistently pressured (or sacked)? His sack percentage would be high that year. In response, 1) the quarterback works his tail off in the offseason to get the ball out of his hands more quickly, and/or 2) the coaches and front office work vigorously to improve the offensive line…which means that in the following year, the quarterback has much better protection and time to throw. And most NFL QBs are talented enough to make most throws when they have time. There is no data to support my second theory. All logic. All heavily coffee-driven logic.

Rush Yards

Most Important Predictive Signals: rush yards (positive), deep pass percentage (positive), times sacked (positive)

Analysis: I like what this one is telling me. If a QB had a lot of rushing yards last year, he’ll probably run this year. Once a runner, always a runner. Deep pass percentage is the percentage of throws in which a receiver is at least 15 yards down the field. This could be a proxy for how risky a quarterback plays. He runs, he throws deep, he goes for big plays. I don’t know if this is true, but it could be. Lastly, times sacked. If a QB was sacked a bunch last year, that’s probably an indicator that he was under a lot of pressure. When you’re under pressure, you scramble more. If you were under pressure last year, you’ll probably be under pressure again this year, and keep scrambling. Oh…that directly contradicts what I said in my interceptions analysis? Let’s move along.

Rush TDs

Most Important Predictive Signals: rush attempts (positive)

Analysis: Quarterbacks who run the ball more score more touchdowns. Scrambling quarterbacks continue to scramble when they’re in the red zone. Go figure.

Fumbles Lost and 2PT 

Most Important Predictive Signals: mean

Analysis: Our best models for fumbles lost (2.6) and two-point conversions (0.7) are constants. Both stats are highly unpredictable, so guessing the average is our best bet for player-by-player outcomes. Fumbles lost often comes down to fluke hits and bad bounces, and two-point conversions are relatively rare and depend on unique in-game situations. While I know there will be a few players who deviate from the trend, I’ll gladly take averages here.
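A constant “model” sounds like a cop-out, but it’s a standard baseline: if nothing beats predicting the historical average out of sample, the average is the model. Here is a minimal sketch of that comparison, with invented numbers.

```python
# Hedged sketch: why "predict the mean" can be the best model.
# If no candidate beats the historical average out of sample, the
# average IS the model. Numbers are invented for illustration.
fumbles_history = [1, 4, 2, 3, 0, 5, 2, 3, 4, 2]   # past seasons
fumbles_next    = [3, 1, 2, 4, 2]                   # held-out seasons

mean_model = sum(fumbles_history) / len(fumbles_history)   # 2.6

# Mean absolute error of always guessing the historical mean.
baseline_mae = (sum(abs(y - mean_model) for y in fumbles_next)
                / len(fumbles_next))
print(f"constant model: predict {mean_model:.1f} -> MAE {baseline_mae:.2f}")
# Any fancier model has to beat this MAE out of sample to earn its keep.
```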

And with the above chart, and the smartest machine intelligence application on the planet informing us what’s likely to happen, we recommend applying any insider information you have about new player acquisitions, preseason performance, etc. to the mix to make the best judgment call possible. Man and machine, working together, to solve the world’s most pressing problems…like fantasy football.

 

¹ We included about 150 variables in all, ranging from the simple (age, completions, TDs) to the complex (clutch-weighted expected points added through rushes). Stats were pulled from advancedfootballanalytics.com, espn.com, footballoutsiders.com, and pro-football-reference.com.
² This would be way less interesting if it couldn’t.
³ These are the stats that earn/lose a quarterback points in standard leagues.

Topics: Eureqa, Fantasy Football
