NSF Uses Artificial Intelligence to Tackle Illegal Tiger Poaching

Posted by Jon Millis

26.04.2016 09:23 AM

AI to the rescue. Forget doomsday scenarios of robots transforming humans into paperclips; this time the news is positive. The National Science Foundation (NSF) announced it has turned to artificial intelligence as a critical weapon in the fight against poaching.

Whether killed for their skins, for “medicine”, or as trophies, tigers have been devastated by poaching, which has driven the population of wild tigers down from 60,000 in the early 1900s to just 3,200 today. And with protection relying heavily on human capital and resources that simply aren’t there, governments and nonprofits have to get smarter about how they enforce the rule of law before tigers (along with other species, forests, and coral reefs) disappear.


Currently, ranger patrol routes are mostly “reactive”: rangers keep tabs on the areas that have been hit hard before and prevent what they can. An NSF-funded team at the University of Southern California, however, has built an AI-driven application called Protection Assistant for Wildlife Security (PAWS) that makes patrolling more predictive, and hence more effective. PAWS combines data on past patrols, evidence of poaching, and complex terrain information such as topography to determine the patrol routes most likely to intercept poaching activity while minimizing elevation changes, saving rangers time and energy. As it receives more data, the system “learns” and improves its patrol planning. The application also randomizes patrol routes so that poachers cannot anticipate predictable patterns.
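PAWS itself uses far more sophisticated game-theoretic planning, but the two ideas above – favor high-risk routes, penalize wasted climbing, and randomize so patrols stay unpredictable – can be illustrated with a toy sketch. All route names, risk scores, and elevation figures below are invented for illustration:

```python
import random

# Hypothetical routes: predicted poaching risk (higher = more worth
# patrolling) and the elevation gain a ranger would have to climb.
routes = {
    "ridge_trail": {"risk": 0.8, "elevation_gain_m": 600},
    "river_loop":  {"risk": 0.6, "elevation_gain_m": 100},
    "border_road": {"risk": 0.4, "elevation_gain_m": 50},
}

def route_score(route, elevation_penalty=0.0005):
    """Trade predicted risk against the energy cost of elevation gain."""
    return route["risk"] - elevation_penalty * route["elevation_gain_m"]

def pick_route(routes, rng=random):
    """Sample a route with probability proportional to its score, so
    patrols favor high-risk areas without becoming predictable."""
    names = list(routes)
    weights = [max(route_score(routes[n]), 0.01) for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Over many days, the patrol mix skews toward high-scoring routes
# while still covering every route occasionally.
counts = {name: 0 for name in routes}
for _ in range(1000):
    counts[pick_route(routes)] += 1
print(counts)
```

Note the randomization: a deterministic planner would send rangers down the single best-scoring route every day, which poachers would quickly learn to avoid; weighted sampling keeps coverage biased toward risk but unpredictable.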

The NSF said that since 2015, the non-governmental organizations Panthera and Rimbat have used PAWS to protect forests in Malaysia. The research won the Innovative Applications of Artificial Intelligence award for deployed application, recognizing it as one of the best AI applications with measurable benefits.

This is not the first instance of AI being leveraged for good, but the public is bombarded with negative depictions of the technology, with stories like targeted online ads and Facebook’s almost eerie knowledge of its user base dominating the headlines. That’s because negativity sells. The truth is, like any technological advancement, the power of AI is in the hands of its users. AI can vastly improve human productivity, raise living standards, solve problems, and drive new breakthroughs. As more applications like PAWS come to light, we hope more people will see the incredible good that can come from the power of data supplementing human expertise, driving toward solutions for the most pressing social, economic, and environmental issues of our day.

Topics: Artificial intelligence

The Perils of Black Box Machine Learning: Baseball, Movies, and Predicting Deaths in Game of Thrones

Posted by Jon Millis

22.04.2016 10:17 AM

Making predictions is fun. I was a huge baseball fan growing up. There was nothing quite like chatting with my dad and my friends, crunching basic statistics and watching games, reading scouting reports, and finally, expressing my opinion on what would happen (the Braves would win the World Series) and why things were happening (Manny Ramirez was on a hot streak because he was facing inexperienced left-handed pitchers). I was always right…unless I was wrong.*

One of the reasons business people, scientists, and hobbyists like predictive modeling is that in many cases it sharpens our predictions. There’s a reason Moneyball became a best-selling book: it was one of the first widely publicized examples of applying analytics to gain a competitive advantage, in this case by identifying the player statistics that most directly translate into winning baseball games. Predictive modeling was the engine that drove the Oakland A’s from small-budget cellar-dweller to perennial championship contender. By understanding the components of a valuable baseball player – not merely predicting his statistics – the A’s held on to a valuable advantage for years.


High five, Zito! You won 23 games for the price of a relief pitcher!

The A’s were ahead of their time, focusing both on forecasting wins and on diagnosing the “why”. With this dual-pronged approach, they could make targeted tweaks to change future outcomes. But predictive modeling often takes a different form: “black box” predictions. “Leonardo DiCaprio will win an Oscar,” or, “Your farm will yield 30 trucks’ worth of corn this season.” That’s nice to hear if you’re confident your system will be right 100% of the time, and sometimes you don’t need to know the “why”; you just need an answer. But more often, if you want to trust a prediction, you need to understand not only what will happen but why it will happen. How did the model come to that conclusion? What are the different inputs and variables at play?

If, for example, a machine learning algorithm predicts that Leonardo DiCaprio will win an Oscar – but one of the deciding variables is that survival-movie stars always win the award when they wear a black bow tie to the ceremony – we would want to know this, so we could tweak our model and remove the spurious correlation, as it likely has nothing to do with the outcome. We might then be left with a model that includes only box office sales, the number of star actors, the type of movie, and the number of stars given by Roger Ebert. This is a model we can be more confident in, because as movie buffs we’re mini “domain experts” who can confirm that it makes sense. And to boot, we have full insight into why Leonardo will win the Oscar, so we can place a more confident bet in Vegas. (You know…if that’s your thing.) Operating within the black-box confines of most machine learning approaches would make that kind of iteration impossible.
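The kind of sanity check described above only works when the model exposes how each feature relates to the outcome. As a loose sketch (every movie and number below is invented, and real models are fit very differently), here is how a transparent per-feature view lets a domain expert spot that the bow-tie variable carries no signal:

```python
# Toy data, entirely invented: (box_office_$M, star_actors,
# wore_black_bow_tie, won_oscar) for six hypothetical films.
movies = [
    (150, 3, 1, 1),
    (200, 4, 0, 1),
    ( 40, 1, 1, 0),
    ( 60, 2, 1, 0),
    (180, 3, 1, 1),
    ( 30, 1, 0, 0),
]

def correlation(xs, ys):
    """Pearson correlation, computed by hand (no libraries needed)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

features = ["box_office", "star_actors", "black_bow_tie"]
outcome = [m[3] for m in movies]
for i, name in enumerate(features):
    r = correlation([m[i] for m in movies], outcome)
    print(f"{name}: r = {r:+.2f}")
# A movie buff scanning this output sees that box office and star power
# track wins, while the bow tie contributes nothing, and prunes it.
```

A black-box model would bake all three features into its prediction with no such readout, which is exactly why the pruning step the paragraph describes becomes impossible.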


I don’t always win the Oscars…but when I do, I wear a black bow tie.

That’s why my head continues to spin at the mistakenly godlike, magic-bullet appeal of black-box modeling. It’s fantastic for certain applications: specifically, answering the question, “What will happen?” when you don’t care how the answer is derived, so long as it’s right. One example could be a high-frequency trading application that executes trades because its algorithm can predict with 95% accuracy when a stock will appreciate.

But for most things, the value of a prediction lies in understanding both the “what will happen” and the “why”. I almost shook my computer with frustration this morning when I read that a team of researchers at the Technical University of Munich had used artificial intelligence (more specifically, machine learning) to predict that Tommen Baratheon would be the first character killed off in the upcoming season of Game of Thrones – without giving any indication of how or why that will happen. It’s because the algorithm said so. Are you kidding me, guys?! That’s like saying, “Jon, you will eat a horrendous dinner tonight that will verrrrry likely leave you violently ill for days, buuuuuut you’ll have to come back later to find out how it happened or where you ate, because we just don’t feel like telling you.” What good is a prediction without context and understanding? Will I get sick from the spinach in my fridge, from bad meatloaf at a restaurant, or from a coworker who decided to come over and sneeze on my food as I finish up this blog post far too late on a Thursday night?? (Stay away from my food, Michael. STAY AWAY!!!) Without that context, I can’t make any change to improve my odds of avoiding a week home sick as a dog.

There’s a reason people see artificial intelligence and machine learning as fairy dust. A lot of the time it works, but it’s hard to use, requires technical expertise, and frequently operates as a total black box. I like to understand my predictions. That’s why, as a 10-year-old kid, I decided I’d work on bringing machine intelligence – the automated creation and interpretation of models – to the world and join Nutonian. Well…that may not be entirely true. More likely, I was trying to predict how well I’d have to hit a curveball to make it to the MLB.


*Sayings like this always remind me of one of my all-time favorite off-the-field personalities, Yogi Berra, a Hall of Famer known as much for his wit and turns of phrase as for his talent:


Topics: Baseball, Game of Thrones, Machine Intelligence, Machine learning

Industry Trends at the Bio-IT World Conference & Expo

Posted by Jon Millis

15.04.2016 09:30 AM

In our market ascent, Nutonian has noticed that many of our most “cutting-edge” customers are biotech and healthcare companies. From drug demand forecasting to clinical trials analysis, Eureqa has been a critical cog in helping companies go from raw data to pattern detection, pinpoint forecasts, and root cause discoveries.

On the heels of this success, we posted up at booth 136 at Bio-IT World, a massive showcase of technologies enabling the biomedical, pharmaceutical and healthcare industries. A few lessons learned from the realms of artificial intelligence and data science…

There’s a problem with the status quo

…And it centers on analytics. Almost every attendee we talked to, from analysts to department heads, had massive amounts of data and deep domain expertise about their problem, but there was a glaring gap in their ability to extract equally deep understanding of what their data meant. They were like parents who were preeeeetty sure their kids were sneaking out at night but had no proof of anything – and the traps they were setting (manual, time-intensive statistical analyses) were taking months to confirm or deny their suspicions. It’s particularly painful to have this discussion with scientists who are spinning their wheels on problems whose solutions could immediately improve lives. If drug effectiveness for a particular disease can be accurately forecast or optimized early in the R&D process, there is tangible value for both the drug maker and its patients.

R is still the modeling tool of choice

Based on an entirely unscientific survey of the people who stopped by to chat, a majority are using R for their data science needs. R is, of course, not a biotech-specific tool, and its users face a grueling time-to-answer cycle and, at times, questionable model accuracy. Before a five-minute Eureqa conversation, many had no idea how far behind the times they really were. Shameless plug: Eureqa, with its academic roots in the artificial intelligence lab at Cornell, was originally invented to accelerate scientific research initiatives and automatically unravel how scientific systems behave. It has since become popular in the private sector, unlocking the “rules of business.”

Big data is still a buzzword

Some of the smartest people in the world, with more of the right data than almost anyone else, have yet to incorporate much analytic technology into their processes. A common question we got was, “Are there data discovery techniques we can use to be more targeted in our research?”

Many times, the most overwhelming part of data science is simply figuring out which data are relevant. Multiple prominent universities and corporations were stymied by feature creation: they had so many thousands of variables and inputs in play during their research that even the idea of analytical modeling felt intimidating and unrealistic, and they were working off trial-and-error hunches. Another shameless plug: Eureqa automates the entire process of feature engineering. Feed it all of your data at once, relevant or irrelevant, and it will tell you within minutes what matters and what doesn’t.
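Eureqa’s actual approach (symbolic regression) is far more powerful, but the basic idea of screening thousands of candidate inputs down to the handful that matter can be sketched crudely. This toy example (synthetic data, invented coefficients) ranks 20 candidate features by how strongly each tracks the target, surfacing the two that actually drive it:

```python
import random

# Synthetic data: 200 samples of 20 candidate features, where only
# features 3 and 7 actually drive the outcome; the rest are pure noise.
rng = random.Random(42)
n_samples, n_features = 200, 20
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * row[3] - 2.0 * row[7] + rng.gauss(0, 0.5) for row in X]

def abs_corr(xs, ys):
    """Absolute Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return abs(cov / (sx * sy))

# Rank every candidate feature by its absolute correlation with y.
ranked = sorted(range(n_features),
                key=lambda j: abs_corr([row[j] for row in X], y),
                reverse=True)
print("top features:", ranked[:2])
```

Simple marginal screening like this misses interactions and nonlinear effects, which is precisely the gap that automated feature-engineering tools aim to close, but it shows why handing a machine all of your columns at once beats trial-and-error hunches.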

We’ll keep the community posted as interesting new use cases from the show materialize into Nutonian customers.

Topics: Biotech, Eureqa, Healthcare

The Road to Recovery: Automated data science will drive important discoveries and efficiencies in healthcare

Posted by Jon Millis

11.11.2015 10:00 AM

Healthcare drives people mad. There’s nothing neutral or uncontroversial about it. Politicians argue about it, countries overspend on it, citizens fret about it. Providers walk a fine line between delivering quality care for the sake of positive patient outcomes and routinely being incentivized to package quantity over quality. Insurers are customer-focused corporations trying to provide a universal good while also acting as “rational players” seeking to line their pockets. Patients, meanwhile, are the ones with true skin in the game. After all, as the cabaret of politics, insurers, lobbyists, and pharmaceutical companies spins and twirls, what many seem to forget is that healthcare is inherently about the patient. And at some point in our lives, we’re all patients.

It seems logical, then, that a majority of us – you, me, grandma – want quality care at an affordable price (setting aside the healthcare providers and manufacturers who stand to benefit far more than we do from high prices). In a country where many decisions are based on the will of the people, how is it that the United States sees such exorbitant healthcare spending, totaling 17% of GDP, while also consistently enduring overflowing waiting rooms, less procedural coverage, and less transparency than an alarming number of other developed countries? We’re the leaders of the free, prosperous world, and we can’t even get the system that monitors our own physical well-being close to right.

Healthcare costs in the U.S. eclipse other developed countries in nearly every procedure.

Like you and like many others reading this, we don’t have the answers. We’re not policy-driven, we’re not experts in the nuances and systematic understanding of medicine, and we don’t have employees dedicated to figuring it out. But we do know that public policy aside, two of the most important things that will enable the U.S. to dramatically improve healthcare “return on investment” for patients are: 1) data transparency, and 2) smart analytics.

Data transparency sounds simple, but as any analytics professional will tell you – particularly in healthcare – it’s not. Data transparency, in our working definition, refers to the ability to collect, store, and access clean data sets, and to have the necessary permissions to use them for analysis. This is an issue that goes beyond our understanding of the healthcare and legal realms and cannot possibly be solved in a blog post. Perhaps reward programs will incentivize patients to share anonymized data. Perhaps payers and providers will better articulate the potential future cost benefits to patients who let them collect and use their information anonymously. Perhaps patients will be given rebates in exchange for their consent. However it gets done, it has to get done. The amount of public and private resources being dedicated to healthcare is plainly not sustainable, and better answers and better problem-solving capabilities all start with data. As W. Edwards Deming once said, “In God we trust. All others, bring data.”

Now, let’s assume we have the data. This is our ideal world as analytics folks. What’s the bridge to our end goal of more efficient pricing, lower costs, better diagnoses, better drugs, and better doctor-patient matches? Understanding. For the past few decades, the engine for understanding has been a cobbling-together of infrastructure, visualization tools, statistics packages, and smart, expensive business and technical specialists. While some problems can certainly be solved this way, the explosion in the volume and velocity of data will ultimately make this melting pot obsolete. It’s expensive, slow, specialized, and complicated. With data sets as massive and complex as healthcare’s, it’s also likely to be impossible. If I’m a hospital trying to determine a patient’s risk of coronary heart disease, and I know thousands of things about that patient, it’s infeasible to rely on human ability alone to arrive at an accurate prediction. And how that patient is cared for depends almost entirely on that prognosis.

More spending does not equate to better results.

Source: Huffington Post

So if scale, speed, accuracy and simplicity of predictive capabilities are all impediments to an exceptional U.S. healthcare system, what if, similar to beefing up a car manufacturing plant with robots, we could automate these analyses? What if we didn’t even need a degree in statistics, or a PhD in computer science, or even a deep understanding of our data, to pull incredibly meaningful, powerful, system-changing information from our data? What if I were an insurance company, and I could identify the true variables that determine health risks, so I didn’t have to overcharge all of my customers to compensate for an understanding I just don’t have? What if I had thousands of patient characteristics from millions of patients, and I could unravel the most important biological relationships that drive Parkinson’s disease, so that I could develop a better drug to treat it? What if I had doctor and patient outcomes from thousands of different cases, and I could match the right doctor to the right patient at the right time? Even more captivatingly…what if I could do these things with an astonishingly small amount of human capital?

This is Nutonian’s vision in pioneering the machine intelligence market. Through all the noise about big data nearing the “trough of disillusionment”[1] – being promise unfulfilled, accessible only to the technically elite and to giant corporations seeking to maximize ad revenue and purchases – we have a vision of democratizing insights that not only serves large companies but enables better outcomes for our local world in Cambridge, Massachusetts, for New England, and for the world. IBM talks about building a smarter planet. Watson makes for catchy cocktail party conversation, but who gives a —- about machine learning if technology like that isn’t accessible to the average person? Is that a smarter planet, or is that just another weapon for the top 1% of tech-savvy programmers?

Disruptive technologies and trends have pros and cons. Unfortunately, in the case of big data, the cons have, in the minds of many, been the first to surface prominently: privacy concerns, targeted advertising, dynamic “individualized” pricing for plane tickets, and so on. But what if that’s because the power of big data has, until recently, been accessible only to the technical elite – to nimble start-ups shoveling out stock to hire data scientists to maximize profits? This assertion is guided by nothing more than speculation and personal experience, but today, the talent required to leverage big data is more often found staring at a computer screen inside a business than at a healthcare hack-a-thon.

Eureqa is leading the way in extracting simple, powerful answers from healthcare data.

And this is where the whole game may change. “Automated data science” tools like Eureqa remove the complexity and scalability bottlenecks from big data – and even that undersells their power. This is the first time we have an application that, given your data – sensor data, internally generated data, external data, etc. – will tell you how the “system” works without your having to weave together 15 queries. What are the six variables accounting for 90% of the variation in this data, and how can we change the outcome to the one we want? An “interactive explainer” concisely articulates this for you. Want to actually see your data, to get a holistic appreciation for what’s happening and why? A visualization layer creates beautiful canvases showing how the data speaks: its underlying patterns, its driving attributes and features. We help you comprehensively understand the roots of your successes and challenges, so you can either water them or weed them. And we do it all automatically.

We’re not going to solve a national healthcare problem by saturating the market with Eureqa. What we will do is provide a self-driving car for people who want to start navigating an incredibly daunting road. Give it all your data, and it will get you to your destination. You might notice the cost of petrol drop along the way.



Topics: Eureqa, Healthcare

Machine Intelligence Strips Off Our Data Science Blinders

Posted by Guest Author

07.10.2015 10:00 AM

by Dan Woods

In our increasingly digital lives, we have been trained to trust the way that technology works. That is, right up until it doesn’t.

Consider GPS navigation. A lot of powerful technology goes into computing an optimal route. Few people understand why their GPS chooses the routes it does, but we’ve come to accept its directions because they tend to be good enough. Even when the predicted route doesn’t work – say, it prompts you to turn the wrong way down a one-way street, or you run into construction and need a detour – we have corrective mechanisms in place to override its instructions.

However, accepting blinders on data-driven solutions can be dangerous: the higher the cost of a mistake, the graver the consequences of false positives and false negatives. Have you ever gone internet sleuthing and found a symptom checker declaring that your runny nose and painful headache meant you had cancer? Instead of being gently let down by your exasperated doctor the next morning, imagine if the hospital immediately enrolled you in chemotherapy based solely on that output. This is an extreme example, but outsourcing too much responsibility to machines could lead to mistakes just as costly.

A fundamentally new approach to data science is needed – one that allows humans and machines to communicate ideas and strategies to each other as equals, rather than one side dictating the constraints of the connection. That approach is machine intelligence, and its driving philosophy is that the partnership between man and machine is greater than the sum of its parts.

Nutonian’s machine intelligence system, Eureqa, doesn’t put blinders on users. In fact, the system purposefully shows its work, surfaces user-friendly ways to reach advanced results, and encourages rapid iteration to incorporate the user’s domain expertise into the results. Regardless of technical expertise, users all across the organization can use Eureqa to discover new business strategies, while retaining the ability to audit and correct sub-optimal paths before committing to them.

The abundance of data in the business world needs more than a one-sided discussion. Use machine intelligence to open up a new horizon of possibilities in the golden age of analytics.



Dan Woods is CTO and founder of CITO Research. He has written more than 20 books about the strategic intersection of business and technology. Dan writes about data science, cloud computing, mobility, and IT management in articles, books, and blogs, as well as in his popular column on

Topics: Eureqa, Golden Age of Analytics, Machine Intelligence
