
How to Master Business Planning with Time Series Forecasting

Posted by Jess Lin

17.01.2017 10:15 AM

Every analyst report, news article, and business conference has drummed into our collective minds that predictive analytics is the way of the future. If you can predict future sales, you can funnel that knowledge into driving optimal sales and business operations. Smart businesses are leveraging big data insights to leave their competitors in the dust, or at least so they say.

As this new time series forecasting tech makes its way into the enterprise, with billions of dollars trumpeted behind it, you feel ready for the challenge. You’ve learned a bewildering array of new vocabulary, sat through endless meetings on empowering a data-driven mindset, invested in top-of-the-line data warehouses, and cracked your allotted share of jokes about the size of your Big Data. Best of all, you’ve somehow even managed to hire a crack data science team. Profit?

Not so fast.

[Image: The Three Stages of Data Science]

Before those stacks of paper come rolling in and you deliver on all your ROI promises, let’s walk through this step by step.

  • Phase 1: Connect a beautiful data pipeline of well-cleaned and transformed data (yeah, we all wish) over to your data scientists.
  • Phase 2: Convert data into actionable insights through predictive analytics.
  • Phase 3: Feed those insights into well-executed, shareable dashboards that enable data-driven business planning decisions.

Even setting aside the problematic Phases 1 and 3, let’s level-set on what your data scientists can do. Even a team of rock star data scientists has limits, and working at top efficiency can still only produce a limited number of predictive models. Do you want dependable results at frequent, regular intervals? The best you can get is a single prediction for overall sales. Do you want per-unit forecasts across thousands of SKUs? Be prepared for a long wait and infrequent forecast updates. Bringing in consulting firms doesn’t solve the problem either – fees can run into the multi-million-dollar range for a years-long engagement that still only addresses part of your needs.
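To make the scale problem concrete, here is a minimal sketch of what per-SKU forecasting looks like when done by hand: one model fit per SKU, refit at every interval. The file name, column names, and toy smoothing model are assumptions for illustration, not any particular team’s method.

```python
# Minimal per-SKU forecasting sketch: one model per SKU, refit each cycle.
# Illustrative only -- column names and the smoothing model are assumed.
import pandas as pd

def forecast_next_month(sales: pd.Series, alpha: float = 0.3) -> float:
    """Simple exponential smoothing as a stand-in for a real per-SKU model."""
    level = sales.iloc[0]
    for y in sales.iloc[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Assumed layout: columns ["sku", "month", "units"].
monthly_sales = pd.read_csv("monthly_sales.csv")

forecasts = (
    monthly_sales.sort_values("month")
    .groupby("sku")["units"]
    .apply(forecast_next_month)  # thousands of SKUs -> thousands of fits
)
print(forecasts.head())
```

Multiply those fits by every product, region, and reporting interval, and it’s easy to see why even a strong team falls behind.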

This is the situation a major car manufacturer found themselves in when they came across Eureqa. With ever-increasing volumes of increasingly complex data streaming in from all sources, they had outgrown their existing, manual methods of data analysis and needed a new solution that could handle their data and provide innovative predictive analytics. Nutonian’s AI-powered modeling engine, Eureqa, seemed like the perfect fit.

Can Eureqa automatically ingest and analyze hundreds of data sources at once? Check ✓

Can Eureqa provide hundreds of highly accurate sales forecasts on a monthly basis across all levels of operations, from high-level national forecasts all the way down to individual dealers? Check ✓

Can Eureqa allow end-users to manipulate model results and simulate the impact on the sales forecast given different potential scenarios? Check ✓

Can Eureqa isolate and quantify the impact of individual attributes on the sales forecast? Check ✓

Can Eureqa identify and automatically recommend the highest-impact actions for both a corporate strategist and an individual dealership? Check ✓

So we did all of that, and wrapped it up into a user-friendly application that works across the entire business team. With a marketing budget that runs into the millions, corporate can optimize their ad spend mix by using Eureqa to pinpoint where to increase spend and where spend can be safely decreased. At the same time, managers can use Eureqa to evaluate upcoming sales forecasts for their dealerships and design the best action plan for success, both short-term and long-term. By providing actionable insights and leveraging AI to automate interpretable predictive modeling, Eureqa is the “wheel deal” for this client.

Eureqa doesn’t stop at forecasting sales, either. Sales forecasts can be leveraged into powerful use cases like optimizing staffing, pricing, inventory, supply chain, marketing, and store expansion. Eureqa’s fundamental time series forecasting approach can also be applied to detecting financial fraud, predictive maintenance for equipment, building a portfolio trading strategy, and more.

What could you do with thousands of accurate, automated, and actionable time series forecasts? Reach out to us at contact@nutonian.com to see how we can work together!

Topics: Business Planning, Eureqa, Time Series Forecasting

Eureqa Hits Wall Street; Automatically Identifies Key Predictive Relationships

Posted by Jason Kutarnia

01.12.2016 10:46 AM

As a team of data scientists, analysts and software developers, we didn’t expect to be praised as financial gurus. But in an industry of ever-present uncertainty, with huge financial gains and losses at stake, Eureqa, the dynamic modeling engine, displays a unique competitive advantage in the technology stack: the ability to quickly derive extremely accurate and simple-to-understand models that predict what will happen in the future, and why.

Typically, Wall Street employs elite squadrons of quants and analysts to build models that forecast where individual stocks and other financial instruments are headed. Some firms, such as the consistently elite hedge funds, make delightful profits by “beating the market,” i.e., outperforming an industry-standard index like the S&P 500. Other financial institutions make their money simply off the fees they charge for commissions. The laggards have significant room for improvement: instead of leveraging only industry news and well-known metrics like return on equity, price/earnings ratio and idiosyncratic volatility, they could use stockpiles of data to search for signals and early indicators that an investment is primed to tumble or soar. Hunches and over-simplified metrics should be a thing of the past, and the proof should be in the pudding (the data). Some things, like natural disasters and leadership changes, are not always part of the data, but for everything else…there’s Mastercard. Err, Eureqa.

And for those overachievers – the hedge funds, the private wealth management firms, the day traders – who think they have mastered their own domain, we’re here to tell you: there’s a lot of room for improvement. Financial models are time-consuming to build, often taking weeks or months to refine…and meanwhile, the markets, whether moving up or down, are making people money while you’re on the sidelines crafting your models. In addition to the time sink, manually built models from tools like R and SAS are neither as accurate as they could be nor easy to interpret. The result is that firms are leaving millions on the table and not understanding why the markets or assets behave as they do. It’s one thing to predict that real estate will beat the market in 2017, based on an algorithm that contains 2,000 variables and mind-numbingly complex transformations of those variables. But what if I could accurately predict that real estate in the Northeast U.S. will appreciate 10-12% while the Midwest should be left untouched, and that the “drivers” of this growth will be four truly impactful variables: demographic growth of Millennials moving into the cities, wage increases, job growth, and a slowing of new construction permits? I could not only make more money, but I could justify all of my investments beforehand with a comprehensive understanding of “how things work.”

In order to validate Eureqa’s approach to a major investment firm, I built a simple trading strategy using the stocks in the S&P 500. The goal was to forecast whether a stock’s excess monthly return – the difference between the stock’s return and the overall S&P 500 return – would be positive or negative. In our strategy, we bought a stock if Eureqa predicted its excess return would be positive, and we shorted any stocks Eureqa thought would be negative.
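For readers who want the mechanics spelled out, here is a minimal backtest sketch of that long/short rule. The prediction step is a deliberate placeholder (a naive momentum signal stands in for the trained classifier), and the file and column names are assumptions for illustration – none of this reproduces Eureqa’s actual models.

```python
# Long/short sketch: buy predicted positive excess return, short negative.
# Column names and the placeholder signal are illustrative assumptions.
import pandas as pd

# Assumed layout: ticker, month, stock_ret, sp500_ret, prior_excess_ret
returns = pd.read_csv("monthly_returns.csv")
returns["excess_ret"] = returns["stock_ret"] - returns["sp500_ret"]

# Placeholder for the classifier: long (+1) if last month's excess return
# was positive, short (-1) otherwise.
signal = returns["prior_excess_ret"].gt(0).map({True: 1, False: -1})

# Each position earns side * excess return; equal-weight within each month.
returns["pnl"] = signal * returns["excess_ret"]
monthly = returns.groupby("month")["pnl"].mean()
compound_excess = (1 + monthly).prod() - 1
print(f"Compound excess return: {compound_excess:.1%}")
```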

Immediately, the client saw the enormous value of Eureqa. Leveraging publicly available data sets through 2014, in a matter of a few hours Eureqa created classification models unique to each industry (retail, finance, technology, etc.), and we plugged individual companies into the models to predict whether each stock would achieve excess return for 2015. We then built a hypothetical, equal-weighted portfolio of the predicted “overachievers”. Remarkably, Eureqa’s anticipated winning portfolio achieved a compound excess return of 14.1% for the following year, compared with the S&P 500’s disappointing -0.7%. Not only was our portfolio’s performance exceptional, but so was our fundamental understanding of the causes of its success. We could convey to our hypothetical clients, bosses and others that not only did our strategy work this year, but it’s likely to work again next year, because some of the key drivers of excess returns for stock X are variables Q, R, S, T, U and V, and this is how it’ll move in the context of the current economy. In a matter of hours, with Eureqa at my side, a graduate student in tissue motion modeling transformed into a powerful financial analyst with a theoretical market-beating investment portfolio. Now, imagine what this application could do with even more data, and in the hands of a true industry expert…

Topics: Eureqa, Financial Analysis, Machine Intelligence

Machine Intelligence with Michael Schmidt: Searching data for causation

Posted by Michael Schmidt

27.07.2016 10:03 AM

The holy grail of data analytics is finding “causation” in data: identifying which variables, inputs, and processes are driving the outcome of a problem. The entire field of econometrics, for example, is dedicated to studying and characterizing where causation exists. Actually proving causation, however, is extremely difficult, typically involving carefully controlled experiments. To even get started, analysts need to know which variables are important to include in the evaluation, which need to be controlled for, and which to ignore. From there, they can build a model, design an experiment to test its causal predictions, and iterate until they arrive at a conclusion.

Proving causation relies heavily on these smart assumptions. What if you forgot to control for age, demographics, or socioeconomic conditions? It’s difficult to figure out how to start framing the problem to analyze causal impact. But this is a task that machines were born to solve.

There are two important steps required to identify causation: 1) among many possible variables, finding the few that are actually relevant, and 2) given a limited set of variables, executing the transformations needed to reveal the extent of each variable’s impact.
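As a rough illustration of those two steps using generic, off-the-shelf tools (not Eureqa’s engine), the sketch below screens many candidate variables down to a relevant few, then fits a deliberately simple model whose terms expose each variable’s contribution. File and column names are assumptions, and the resulting coefficients are associational, not proof of causation.

```python
# Two-step sketch: (1) find the few relevant variables, (2) fit a small,
# interpretable model to quantify each one's impact. Illustrative only.
import pandas as pd
from sklearn.feature_selection import mutual_info_regression
from sklearn.linear_model import LinearRegression

data = pd.read_csv("observations.csv")  # assumed: "outcome" + many candidates
X, y = data.drop(columns="outcome"), data["outcome"]

# Step 1: rank candidates by (possibly nonlinear) relevance to the outcome.
relevance = pd.Series(mutual_info_regression(X, y), index=X.columns)
top_vars = relevance.nlargest(4).index.tolist()

# Step 2: fit a simple model on just those variables so each coefficient
# can be read as an estimated (associational) impact.
model = LinearRegression().fit(X[top_vars], y)
for name, coef in zip(top_vars, model.coef_):
    print(f"{name}: {coef:+.3f}")
```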

[Image: Determining causation with machine intelligence]

For the first time, there exists software that helps companies reliably determine causation from raw, seemingly chaotic data.

People often use Eureqa for its ability to start from the ground up and “think like a scientist,” sifting through billions of potential models, structures, and nonlinear operations from scratch to create the ideal analytical model for your unique dataset – without needing to know the important variables or model algorithm ahead of time. Eureqa’s modeling engine effectively generates theories of causation via its process of building analytical models from a dataset. Eureqa doesn’t attempt to prove causality on its own, but instead yields a very special form of model that can be interpreted physically for causal effects.

One of the biggest open problems in machine learning (and analytics in general) is avoiding spurious correlations and similar non-causal effects. In fact, there’s likely no perfect solution despite the advances we’ve made; ultimately a person needs to interpret the findings and provide context not contained in the data alone. One of the most-used visuals in Eureqa is the covariates window, along with the ability to block and replace variables in a model – features we’ve added specifically so users can work with the engine interactively when modeling complex systems.
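To see why spurious correlations are so easy to stumble into, consider this quick, generic demonstration (unrelated to any product): two completely independent random walks will frequently appear strongly correlated.

```python
# Two *independent* random walks often show large spurious correlation,
# which is why a human still needs to vet candidate causal stories.
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=500).cumsum()  # random walk A
b = rng.normal(size=500).cumsum()  # random walk B, independent of A

corr = np.corrcoef(a, b)[0, 1]
print(f"Correlation between two unrelated series: {corr:.2f}")
```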

There is some exciting research taking place, however, connecting Eureqa to live biological experiments to automatically guide experimentation and test predictions. While this research is still ongoing, perhaps a physical robot scientist is just around the corner.

Topics: Causation, Eureqa, Machine Intelligence

The “First Mover’s” Analytics Stack, 2015 vs. 2016

Posted by Jon Millis

01.07.2016 10:00 AM

The irony of data science is the simultaneously glacial and blazing speed at which the industry seems to move. It’s been more than 10 years since the origin of the phrase “big data”, and yet what we initially set out to accomplish – extracting valuable answers from data – is still a painstaking process. Some of this could be attributed to what Gartner refers to as the “Hype Cycle”, which hypothesizes that emerging technologies experience a predictable wave of hype, trials and tribulations before they hit full-scale market maturity: technology trigger → peak of inflated expectations → trough of disillusionment → slope of enlightenment → plateau of productivity.

The true skeptics call it all a data science bubble. But answer me this: if we’re in the midst of a bubble, how can we explain the sustained, consistent movement of tech luminaries and innovators into the market over the course of years and years? Sure, a healthy economy is full of new competitors fighting for market share, creative destruction, and eventual consolidation, but take a look at this diagram and try to explain how so many people could be so wrong about data science. It’s hard to imagine we’re in a bubble when all around us is an indefinitely growing ecosystem of tools, technologies and investment. After all, as we’re well aware, nothing bad happened after heaps of money were piled into mortgage-backed securities in the early 2000s, and oil speculators have made a killing off of $5/gallon gas prices in 2016.

We kid, we kid. Of course there are illogical investments and industries that miss the mark, but we maintain our belief that there is astounding value in data. Not all companies have capitalized on it yet, but the problems, the dollars, and the benefits to society as a whole are real. Data science is here to stay.

With an ecosystem now overflowing with tools, approaches and technologies, how can we understand general market trends? What kinds of tools and technologies make up a typical company’s analytics “stack”? More importantly, where are the “first movers” moving and making investments to capitalize on data? To find out, we share general insights we’ve gleaned from talking with our customer base, a mix of Fortune 500 behemoths and data-driven start-ups.

Here’s what the 2015 analytics stack looked like:

[Image: The 2015 data analytics stack]

Let’s take an outside-in approach, beginning with the raw data and getting closer and closer to the end user.

Data preparation – The cleansing layer of the ecosystem, where raw streams of data are prepped for storage or analysis. Ex., Informatica, Pentaho, Tamr, Trifacta

Data management – The data storage and management layer of the ecosystem, where data sits either structured, semi-structured or unstructured. Ex., ArcSight, Cloudera, Greenplum, Hortonworks, MapR, Oracle, Splunk, Sumo Logic, Teradata Aster, Vertica

Visualization – The visualization and dashboarding layer of the ecosystem, where business analysts can interact with, and “see”, their data and track KPIs. Ex., Microstrategy, Qlik, Tableau

Statistical – The statistical layer of the ecosystem, where statisticians and data scientists can build analytical and predictive models to predict future outcomes or dissect how a system/process “works” to make strategic changes. Ex., Cognos, H2O, Python, R, RapidMiner, SAS, SPSS

Simple enough, right? The most data-savvy organizations make it look like a cakewalk. But take a closer look, and you’ll notice there’s a significant difference between the outer two “orbits” and the inner orbit: the inner orbit is fragmented. This does not fit with the smooth flow of the rest of the solar system.

Why are two systems occupying the same space? Because they’re both end-user analyst and data science tools that aim to deliver answers to the business team. Nutonian’s bashfully modest vision is to occupy the entire inner sphere of how people extract answers from data, with the help of “machine intelligence”. While Nutonian’s AI-powered modeling engine, Eureqa, plays nicely with statistical and visualization tools via our API, we’re encouraging companies who are either frustrated by their lack of data science productivity or who have greenfield projects to invest in Eureqa as their one-size-fits-almost-all answers machine.

Our vision is to empower organizations and individual users to make smart data-driven decisions in minutes. Eureqa automates nearly everything accomplished in the statistical layer and the visualization layer of the analytics stack – with the exception of the domain expert himself, who’s vital to guiding Eureqa in the right direction. The innovative “first movers” in 2016 are putting the data they’ve collected to good use, and consolidating the asteroid belt of tools and technologies banging together in the inner orbit of their solar systems. It’s the simple law of conservation of [data science] energy.

Topics: Analytics stack, Big data, Eureqa, Machine Intelligence

Eureqa vs the Kentucky Derby: Triple the Hat, Triple the Fun

Posted by Jess Lin

07.05.2016 01:07 PM

 

[Image: Kentucky Derby 2016]

After 2 years in a row of coming up roses, we’ve got our sights set on a 3rd year of success with the Kentucky Derby. We’ve got our handicapping data from Brisnet.com and we’ve prepped with plenty of mint juleps (drinks help you bet smarter, right?). Now we’ve spent the past couple of days combining Eureqa’s data discovery horsepower with the raw horse power on the track to find out who’ll be in the winner’s circle for the 142nd running of the Kentucky Derby.

Rather than skimming through the daily racing form before madly rushing the tellers with our bets, we turned to our tame AI-powered modeling engine, Eureqa, to automatically build, evaluate, and analyze billions of models to discover the most predictive factors. Eureqa’s machine intelligence lets us read and interpret the models, helping us steer the engine towards fruitful paths and away from red herrings. In the end, we found a model that combined the 5 key factors below (sketched in code after the list):

  • Standardized live odds probability
  • Speed over the past two races
  • Post position
  • Racing style
  • Track conditions
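Purely to illustrate how such factors might combine into a single score (the weights, encodings, and example inputs below are made up for illustration – the actual Eureqa model is not reproduced here), a toy version could look like this:

```python
# Toy scoring sketch for the five factors above. Weights, encodings, and
# inputs are invented for illustration; this is not the real model.
from dataclasses import dataclass

@dataclass
class HorseFeatures:
    live_odds_prob: float   # standardized implied win probability
    recent_speed: float     # normalized speed over the past two races
    post_position: int      # stall number
    racing_style: float     # e.g. early speed vs. closer, encoded numerically
    track_condition: float  # e.g. fast = 1.0, sloppy = 0.0

def score(h: HorseFeatures) -> float:
    """Toy linear score; higher means a stronger predicted finish."""
    return (2.0 * h.live_odds_prob + 1.2 * h.recent_speed
            - 0.1 * h.post_position + 0.5 * h.racing_style
            + 0.3 * h.track_condition)

# Made-up example inputs for two entrants.
field = {"Nyquist": HorseFeatures(0.31, 0.9, 13, 0.8, 1.0),
         "Gun Runner": HorseFeatures(0.12, 0.7, 5, 0.6, 1.0)}
print(sorted(field, key=lambda name: score(field[name]), reverse=True))
```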

So where does that leave us?

Eureqa’s Top 5:

  1. Nyquist
  2. Gun Runner
  3. Exaggerator
  4. Creator
  5. Mohaymen

Want to put your own data through its paces with Eureqa? Come talk to us — and in the meantime, check back with us after the race to see how our predictions panned out. With Eureqa at the wheel, we’re sure we’ll be riding “derby”.

Topics: Eureqa, Kentucky Derby
