The Pulse of Politics and Science
The midterm elections are (mostly) behind us. Phew! I must admit that I like political drama and love watching returns come in and (nerd-alert) comparing those returns to pre-election polling data. Polling aggregators like fivethirtyeight.com forecasted an expected gain of 39 seats for Democrats in the House and one seat for Republicans in the Senate. They also hedged their bets with a reported 80% confidence interval around those mean estimates of 21 to 59 Democratic seats in the House and -2 to +4 Republican seats in the Senate. As of November 13 (with 13 House and two Senate seats undecided), forecasters expect Democrats to gain 38 House seats and Republicans to gain two Senate seats, which would align well with the pre-election forecasts.
What does all this have to do with Product Stewardship? Well, one of the functions of product stewardship is to aggregate scientific opinion to gain the best understanding possible of the potential for a chemical, product, or other commercial activity to harm human health or the environment. The result of an individual scientific study testing a hypothesis of harm attributable to some chemical exposure is like an individual pre-election poll result. The study points in a direction but, like a single poll, it comes with a significant margin of error. By aggregating information across the many studies that make up a single literature, product stewards gain a far more accurate sense of the likelihood of harm.
It’s a whole lot easier to extract results from a poll, though, than it is to interpret the results of a scientific study. I don’t need to tell you that the task of aggregating scientific opinion is slow going. Indeed, given the rapid pace of scientific research these days, I would say the task is nearly intractable at human scale.
But what if we could train machines to aggregate scientific literatures for us? This is precisely what we’ve been working on at Praedicat. We use machine-learning algorithms to identify articles that address a causal hypothesis of chemical exposure and injury and then extract key metadata like study design, effect size, and confidence interval along with measures of study influence. We then aggregate all that information into a single score that tells us whether the literature is evolving toward or away from a consensus of harm.
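To make the aggregation idea concrete, here is a minimal sketch of how effect sizes and confidence intervals from several studies can be combined into one pooled estimate. It is not the scoring model described above. It is the textbook fixed-effect (inverse-variance) pooling used in meta-analysis, and it assumes each study reports a log-scale effect estimate with a symmetric 95% confidence interval. The function name `pool_fixed_effect` and the three studies are invented for illustration.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def pool_fixed_effect(studies):
    """Pool (estimate, ci_low, ci_high) tuples by inverse-variance weighting.

    Each tuple is assumed to be a log-scale effect estimate with its
    symmetric 95% confidence interval.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for estimate, ci_low, ci_high in studies:
        # Recover the standard error from the width of the interval.
        se = (ci_high - ci_low) / (2 * Z95)
        weight = 1.0 / se ** 2  # more precise studies count for more
        weighted_sum += weight * estimate
        total_weight += weight
    pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled - Z95 * pooled_se, pooled + Z95 * pooled_se


# Hypothetical log relative risks, made up for this example only.
studies = [
    (0.30, -0.10, 0.70),
    (0.10, -0.20, 0.40),
    (0.25, 0.05, 0.45),
]

pooled, low, high = pool_fixed_effect(studies)
print(f"pooled log RR = {pooled:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

As with a polling average, the pooled interval comes out narrower than the interval of any single study, which is the sense in which aggregation sharpens the picture. A real literature would also call for handling differences in study design and between-study heterogeneity, which this sketch ignores.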
What’s more, like polling aggregators, we can then publish a probabilistic forecast of where this literature will be years from today. Our model, for example, calculates a 70 percent chance that support for the hypothesis that perfluorohexane sulfonic acid (a PFOS substitute) causes developmental injury will continue to grow, but a negligible chance that there will be additional support for the hypothesis that propylparaben increases the risk of breast cancer. Perhaps a human team can make assessments of this nature for a few such hypotheses. At machine scale, though, thousands of hypotheses can be scored and updated as rapidly as scientists can publish new results.
If you like tracking political opinion, then I dare say you’re going to love tracking scientific opinion. Come visit us at praedicat.com to learn more about how we’re using human-guided knowledge to help product stewardship organizations keep their fingers on the pulse of scientific discovery.