Blog

Nutonian Shows Off Data Science Automation to Congress

Posted by Jay Schuren

14.08.2014 10:00 AM

In June, Nutonian was invited, along with leaders from national research facilities and companies, to present to Congress and showcase the scientific and technical advancements happening across the country. The event was organized by the National User Facility Organization (NUFO), which represents national research facilities open to the public.


As the event was focused on scientific innovation, we took the opportunity to show off how our machine intelligence software can be used to significantly improve the underlying performance of characterization equipment – cameras, medical imaging equipment, scanning electron microscopes, etc. – at the heart of many medical and scientific breakthroughs.

Instead of just looking for the common image artifacts or distortion present in imaging equipment (e.g., a fish eye lens), we targeted the harder questions: 1) Does changing the imaging equipment settings change your answer? 2) Can Eureqa identify the underlying relationships governing how the image is being distorted? 3) Can we use these relationships to significantly improve the accuracy of measurements made with the equipment? Working with both x-ray image data from the Advanced Photon Source at Argonne National Laboratory and scanning electron microscopy data, we showed that the answers to these questions are yes, yes, and absolutely (see Tuesday’s blog post for more detail). For the systems considered, we were able to reduce the measurement error by >10x.

We had captivating conversations with real decision-makers like Dr. Patricia Dehmer, Deputy Director of Science Programs for the Department of Energy, and a few Representatives, most notably Representative Bill Foster of Illinois, an entrepreneur turned physicist turned Congressman. All discussions started at the distortion tool we were showing off, but quickly moved on to our vision for the future: automated data science that lowers the barriers for anyone who wants to use complex analysis to understand their data. This could dramatically improve a variety of circumstances today, ranging from more targeted, lower-cost healthcare treatment to energy efficiency to crime reduction. The biggest impediment to making the world a “smarter planet” is our inability to quickly discern what the waterfalls of data actually mean and which factors actually matter. Flow that data through Eureqa, however, and the most important elements will automatically come to light.

As a society, what if we could isolate the true drivers of obesity? What if we could distill the most effective compounds to treat harmful illnesses, become more precise about electricity generation, or predict and eradicate a famine – before locals even knew it was imminent? Problems like the distortion issue we presented often go unsolved, and worse, unidentified. By harnessing the power of machine intelligence, we can accelerate the next wave of discovery, not only for businesses, but for people. Happy modeling.

Topics: Big data, Eureqa, nutonian, U.S. Congress

Nutonian Piercing the Veil of Distortion Over Mission-Critical Images

Posted by Jay Schuren

12.08.2014 10:00 AM

Imaging and advanced characterization are at the heart of a range of industries across aircraft component inspections, medical imaging, CPU manufacturing and more. As society continues to push the envelope in technological innovation, demand for the best quantitative characterization possible is always present. Distortion, or warping of image data, arises from the imaging equipment itself; common examples are fish eye lenses or fun house mirrors. A little-discussed issue is that changing the instrument settings often changes the distortion – limiting peak characterization performance across fields, reducing accuracy and escalating time and effort for discovery.

The majority of the innovation in imaging systems has been focused on showing ever-smaller features. But those expensive high-magnification machines still have distortion issues that no one has addressed. Inspired by work with the Air Force Research Laboratory, Nutonian has developed tools able to dynamically eliminate image distortion. Applying Nutonian’s data science automation engine, Eureqa, allows users to rapidly identify specific relationships between the instrument settings/environmental conditions and the limiting distortions.


Now, instead of pretending that image distortion either doesn’t exist or remains static across different instrument settings, companies can computationally model, predict and calculate behavior at a high degree of accuracy based on the measurements they take from an image. Understanding these causal relationships gives users the ability to “unwarp” image distortions, enabling accurate insight and peak performance when it really matters. The implications could mean saved lives and >10x improvements in quantitative measurements for systems such as scanning electron microscopes, realizing peak performance even in outdated equipment.
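To make the idea of setting-dependent “unwarping” concrete, here is a minimal sketch in Python. It is not Nutonian’s tool or Eureqa’s method: it assumes a simple one-coefficient radial distortion whose strength varies linearly with a single instrument setting, and it uses synthetic calibration data generated inside the script. The function names, the linear form, and all numbers are assumptions made for illustration; the post describes Eureqa identifying the actual relationship from data rather than assuming one.

```python
# Sketch: distortion strength k depends on an instrument setting; we fit that
# dependence from calibration images of a known grid, then unwarp a new image.
# All values are synthetic and hypothetical.

def distort(point, k):
    """Apply one-coefficient radial distortion to an (x, y) point."""
    x, y = point
    factor = 1.0 + k * (x * x + y * y)
    return (x * factor, y * factor)

def fit_k(true_pts, observed_pts):
    """Least-squares estimate of k from matched true/observed points."""
    num = den = 0.0
    for (tx, ty), (ox, oy) in zip(true_pts, observed_pts):
        r2 = tx * tx + ty * ty
        for t, o in ((tx, ox), (ty, oy)):
            basis = t * r2              # observed - true = k * basis
            num += basis * (o - t)
            den += basis * basis
    return num / den

def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys): returns slope, intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def undistort(point, k, iterations=20):
    """Invert the radial distortion by fixed-point iteration."""
    ox, oy = point
    x, y = ox, oy
    for _ in range(iterations):
        factor = 1.0 + k * (x * x + y * y)
        x, y = ox / factor, oy / factor
    return (x, y)

def true_k(setting):
    """Synthetic ground truth used only to generate the fake calibration data."""
    return 0.02 + 0.01 * setting

# Known calibration grid (5 x 5 points in normalized image coordinates).
grid = [(x / 2.0, y / 2.0) for x in range(-2, 3) for y in range(-2, 3)]

# Step 1: measure the distortion coefficient at a few instrument settings.
settings = [1.0, 2.0, 4.0]
fitted = [fit_k(grid, [distort(p, true_k(s)) for p in grid]) for s in settings]

# Step 2: model how the coefficient changes with the setting.
slope, intercept = fit_line(settings, fitted)

# Step 3: predict the distortion at an unseen setting and unwarp the image.
new_setting = 3.0
k_pred = intercept + slope * new_setting
warped = [distort(p, true_k(new_setting)) for p in grid]
recovered = [undistort(p, k_pred) for p in warped]
max_error = max(abs(rx - x) + abs(ry - y)
                for (rx, ry), (x, y) in zip(recovered, grid))
print(f"predicted k = {k_pred:.4f}, max residual after unwarping = {max_error:.2e}")
```

The point of the sketch is the third step: once the dependence on the setting is modeled, an image taken at a setting that was never calibrated can still be corrected.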

Whether companies are looking for cracks in aircraft turbine blades or tumors in a mammogram, current limits of characterization systems govern the status quo for early identification. Applying Eureqa to a Scanning Electron Microscope and a mammography detector over a range of conditions resulted in >10x improvements in quantitative measurements. Gain competitive advantage with access to improved detection systems that will save lives, reduce costs and accelerate the development of next generation products.

Topics: Advanced Techniques, Big data, Case study, nutonian, U.S. Air Force

#DataDrivenNYC: Entrepreneurs should strive for simplicity, clarity, and customer proximity

Posted by Jon Millis

24.07.2014 10:00 AM

Data Driven NYC is a community of tech enthusiasts passionate about big data, data technologies, and data-driven products and businesses. The community hosts monthly meet-ups featuring presentations from start-ups and entrepreneurs. At its most recent get-together, Chris Lynch, a serial-entrepreneur-turned-VC, sat down with the group to speak about current tech market conditions and how to get an idea funded. 

“If you build it, he will come.” The classic quote from Phil Alden Robinson’s Field of Dreams also doubles as an inspiring metaphor for entrepreneurs seeking VC funding: build something new, and the money will flow like wine.

The problem, according to Chris Lynch, a big data industry luminary, is that “too few people are building real products from their science projects, and those who are building products are focusing on the wrong market”. Now a partner at Atlas Venture in Cambridge, Lynch, to say the least, has a well-rounded perspective on what it takes to build market-disrupting technologies. Before Atlas, he was CEO at Vertica (acquired by HP) and Acopia (acquired by F5) and SVP of Sales and Marketing at ArrowPoint (IPO and acquired by Cisco).


Lynch told Data Driven NYC members that massive opportunity exists for big data start-ups that can deliver on three things the current market craves:

Opportunity #1: Simple analytics

“It all starts with being able to solve a problem,” Lynch said. “Don’t start another NoSQL database. If you’re thinking about a new platform, don’t waste your time. It’s been built. If you can take the power of big data from the one-percenters and drive it to the dummies like me, if you can put the power of insight and analytics into your smart phone and push the dummy button, that’s when big data can be transformational.”

Opportunity #2: A clear value proposition

“At Vertica, we had to simplify our message so that the greatest number of people could absorb it. I pushed the engineering team to help me understand why a column-store really mattered. It boiled down to: it was faster. It was the first real-time database for people to make decisions that made or saved money for their companies. From there, we re-messaged Vertica around being a real-time database and what that meant for life sciences, for retail, etc.”

Opportunity #3: Customer proximity 

While the big data market has gotten crowded, Lynch said, the focus has been almost entirely on infrastructure instead of applications.

“Most of what gaming company Zynga does – identifying mavens and selling them virtual goods – is done using Vertica. But because we didn’t build an application that touched the user, we were dis-intermediated in a big way. The closer you can get to the customer, the further you can move up the stack, the more opportunity there is for monetization, the more opportunity there is for value.”

And while he believes some start-up valuations are overheated, Lynch brushes off skeptics who think that big data is just a flash in the pan.

“I’ve heard the term ‘hype’ used as related to big data. I don’t think there’s a hype cycle around big data. Big data is absolutely for real. It’s not hype. You think about it, anything with an on/off switch is generating data today – that’s a fact. Now, how we create value from that, utility from it, and cure cancer and do real stuff with it, that’s a different story, and that’s just promise unfulfilled. That’s not hype.”

Among Lynch’s favorite investments? A Somerville-based technology company called Nutonian…where data science is bundled up into one easy-to-use, customer-facing application.

Topics: Big data, Data Driven NYC, Meetup, nutonian

How we beat stats guru Nate Silver – with only 2 days and a $7.56 budget

Posted by Jon Millis

15.04.2014 10:00 AM

April is the pinnacle of excitement for American sports fans. Baseball stadiums lift the tarps off freshly groomed fields, NBA and NHL teams begin their playoff quests for championship trophies, and of course: the college basketball world witnesses the last stretch of thrilling (and seemingly unpredictable) postseason play.

The “unpredictability” of the NCAA tournament is one of the primary reasons it’s so entrancing. While sports networks like ESPN employ college basketball “gurus” who form predictions based on some combination of gut feeling and statistics, their historical picks suggest they aren’t much better than the average American in picking tournament outcomes before the games tip off. 

One man who has drawn much fanfare for his pure statistics-based approach to predicting everything from player performance in Major League Baseball to pristine state-by-state forecasts for the 2012 presidential election (which put most analysts to shame) is Nate Silver. Silver began to get in on the fun of the NCAA tournament by unveiling his first stats-powered bracket in 2011. He has participated in each year since and has not failed to impress, correctly predicting the winners in both 2012 and 2013. Last month, Silver self-deprecatingly quipped that “this year’s NCAA basketball tournament is designed to make me look dumb. There aren’t any favorites.”

But despite the anticipated madness, we decided to see how we’d fare against the best of the best, Mr. Silver himself, using Nutonian’s own Eureqa, a cognitive computing platform that automatically discovers cause and effect relationships within complex data. We spent a few hours Googling for publicly available team statistics, expert rankings, and computerized rankings, and fed the data into Eureqa to model the “physics” of what determines the winner of a postseason college basketball game.

As noted in last week’s blog post, Eureqa ran for two hours on nine cloud servers, returning a model with 75% predictive accuracy. After adding a few additional parameters, we were up to 80%. Our equation pinpointed six variables with significant predictive power: assist-to-turnover ratio (regular season), average scoring differential (regular season), field goal percentage (regular season), three-point field goal attempts (regular season), active win/loss streak, and tournament seed. For each match-up in the 2014 bracket, we ran a simple “symmetric” simulation, plugging in statistics for both schools and deeming the team with the higher output the winner.
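For readers who want to picture the “symmetric” simulation, here is a minimal Python sketch. It assumes the model produces a single rating per team from the six variables named above and that the higher rating wins; that is one reading of the description, not the actual Eureqa equation, which the post does not give. The linear form, the weights, and both teams’ statistics are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TeamStats:
    name: str
    assist_to_turnover: float     # regular season
    scoring_differential: float   # average, regular season
    field_goal_pct: float         # regular season, as a fraction
    three_point_attempts: float   # regular season total
    win_streak: int               # active streak; negative means losing
    seed: int                     # tournament seed (1 is best)

# Hypothetical weights. The real model's form and coefficients are not published
# in the post; these exist only so the example runs.
WEIGHTS = {
    "assist_to_turnover": 1.0,
    "scoring_differential": 0.1,
    "field_goal_pct": 5.0,
    "three_point_attempts": 0.01,
    "win_streak": 0.05,
    "seed": -0.1,  # a lower seed number should raise the rating
}

def rating(team: TeamStats) -> float:
    """Evaluate the (assumed linear) model for one team."""
    return sum(weight * getattr(team, field) for field, weight in WEIGHTS.items())

def predict_winner(a: TeamStats, b: TeamStats) -> TeamStats:
    """Run the same model on both teams; the higher output wins.

    Ties go to the first argument, an arbitrary choice for this sketch.
    """
    return a if rating(a) >= rating(b) else b

# Made-up teams, not real 2014 tournament data.
team_a = TeamStats("Team A", 1.4, 10.0, 0.48, 600, 5, 2)
team_b = TeamStats("Team B", 1.0, 4.0, 0.44, 550, 1, 7)

winner = predict_winner(team_a, team_b)
print(f"{winner.name} advances "
      f"({rating(team_a):.2f} vs {rating(team_b):.2f})")
```

Because each team is scored independently with the same function, swapping the order of the two teams does not change the predicted winner (except in an exact tie), which is what makes the simulation symmetric in this sketch.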

How did we do? In spite of the wildest tournament in recent memory, including the lowest-combined seeds ever to reach the championship, Eureqa did impressively well. Mr. Silver’s model, which perhaps required weeks of manpower and leveraged hundreds of variables, correctly predicted 4/8 teams to reach the “Elite Eight”, 1/4 teams to reach the “Final Four”, and 0/2 teams to reach the “Championship”. Eureqa, which required two days of casual work by one of our software developers (and for the sake of time excluded many potentially interesting variables included by Silver, such as pre-season rankings, player injuries, and geography), correctly predicted 4/8 “Elite Eight” teams, 2/4 “Final Four” teams, and 0/2 “Championship” teams.
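The comparison above can be tallied in a few lines of Python. The counts are taken directly from the paragraph; the variable and function names are just for this sketch.

```python
# Teams correctly predicted to reach each late round, out of the available slots.
SLOTS = {"Elite Eight": 8, "Final Four": 4, "Championship": 2}

CORRECT = {
    "Silver": {"Elite Eight": 4, "Final Four": 1, "Championship": 0},
    "Eureqa": {"Elite Eight": 4, "Final Four": 2, "Championship": 0},
}

def total_correct(model: str) -> int:
    """Sum the correct picks for one model across the three rounds."""
    return sum(CORRECT[model].values())

total_slots = sum(SLOTS.values())

for model in CORRECT:
    per_round = ", ".join(
        f"{CORRECT[model][rnd]}/{SLOTS[rnd]} {rnd}" for rnd in SLOTS
    )
    print(f"{model}: {per_round} (total {total_correct(model)}/{total_slots})")
```

On these counts the two brackets tie in the Elite Eight and the Championship, and the margin comes from a single additional Final Four team.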

Without having to know anything about our input variables, their relative importance, or even the sport itself, we became basketball gurus equipped with an informative mathematical model that identified the core “drivers” of tournament wins, and beat out one of the most prominent statisticians in the world. The total cost? $7.56 for compute time in the cloud and a handful of chocolate-covered almonds to keep Dylan happy while he input data.

If Eureqa has this sort of predictive and prescriptive capability for something as volatile as college basketball games, imagine the impact it could have in your business: possessing the ability to not only understand what will happen, but when it will happen and why it will happen. The same approach carries over from “Wisconsin will beat Arizona in the tournament, because it will excel at these specific aspects of the game” to “We should charge Jon this amount for an insurance premium, because here are the 12 variables that truly matter in assessing his risk.” Or, “We should keep these products in stock next month to maximize revenue, because here are our most important sales drivers during unseasonably warm winters.”

Mr. Silver, we’ll see you at the next tournament. In the meantime, download your free trial of Eureqa and let us know what you think.

Topics: Big data, Eureqa, March Madness, Nate Silver

March Madness Meets Moneyball

Posted by Jess Lin

17.03.2014 10:30 AM

This Thursday, 3/17, Nutonian will be travelling to Chicago for an entertaining Meetup where we’ll take on Warren Buffett’s Billion Dollar March Madness Challenge. In case you haven’t heard, Warren Buffett is offering one billion dollars ($1,000,000,000) to anyone who fills out a perfect NCAA basketball championship bracket.

We will be building our NCAA bracket with Eureqa™ using data from Kaggle’s March Machine Learning Mania competition, as well as a number of other sources. Using Eureqa™ to analyze the data allows us to combine domain expertise with the power to automatically derive meaning from the massive quantity of data available. Leave subjective opinions out and uncover the hidden keys for filling out a perfect bracket!

If you’re in the Chicago area, come out and join us! Otherwise, stay tuned for our submitted bracket and let us know what you think.

To join the Chicago meetup, register here.


Topics: Big data, Eureqa, March Madness
