Is it important to measure the unknown?
Finding statistically significant ways to compare the world, for your reading pleasure.
Hello friends,
Today I want to talk about a scientific paper that was recommended to me. It's a heavy paper, with formulas understood only by a select few and words that require Google searches.
But I read it and found it interesting, and if you bear with me, I hope I can present it in a fair and useful manner to you all.
The paper is about statistical significance.
Right there I should stop and bring people up to speed.
A statistic is something measured in a way that tells us about something unknown. Data and the unknown are what I named this newsletter after, and statistics, in essence, is about measuring the unknown.
So a statistic is a measurement of the unknown. Okay, got it?
Significance is a noun defined by Google as "the quality of being worthy of attention; importance."
Put the two together and the question becomes: is this measurement of the unknown important?
If the answer to the above question is 'yes,' then we can take action based on this knowledge. To fund one project or another, to choose one material for a car over another; the list is almost endless. But it starts with a simple question: did I measure something important?
The paper, titled "Overlapping confidence intervals or standard error intervals: What do they mean in terms of statistical significance?" and published in the Journal of Insect Science, looks to answer this question. It was written by Mark E. Payton, Matthew H. Greenstone, and Nathaniel Schenker.
A quick disclaimer: even when we measure the unknown, something uncertain can still happen. Black swan events occur even when they are said to have a low probability.
The paper uses computer simulations as a way of testing the unknown. These simulations create data by imitating random sampling, which is the main tool for understanding here. Measuring random samples of the unknown is the best practice for finding out whether it is important.
A measurement of something known isn't important, since we already know what it is. Measuring random parts of the unknown allows us to take a more accurate picture.
Once the measurements are taken and some fancy calculations are done, the researchers can say, "We are X percent certain that we accurately measured the unknown." That is good enough for lots of projects.
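To make those "fancy calculations" a little more concrete, here is a minimal sketch in Python of the most common version: a 95% confidence interval around a sample mean. The beetle-weight numbers are made up for illustration; they are not from the paper.

```python
import math
import random
import statistics

random.seed(42)

# Pretend the "unknown" is the true average weight of a beetle population.
# We secretly set it to 10.0 mg, then take a random sample of 50 beetles.
true_mean = 10.0
sample = [random.gauss(true_mean, 2.0) for _ in range(50)]

mean = statistics.mean(sample)
sem = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error of the mean

# 95% confidence interval: the estimate plus or minus 1.96 standard errors
lo, hi = mean - 1.96 * sem, mean + 1.96 * sem
print(f"estimate: {mean:.2f} mg, 95% CI: ({lo:.2f}, {hi:.2f})")
```

The "95% certain" language in the paragraph above corresponds to that `(lo, hi)` range: if we repeated the sampling over and over, about 95% of the intervals built this way would contain the true value.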
A note on "good enough": driving a car is not 100% certain to cause no harm, but airbags, seatbelts, and traffic laws are good enough for us, and everyone else, to go about our day. So when a researcher says, "We are 95% certain our measurement of the unknown is true," we can say "great," go eat our Frosted Flakes, and be sure they won't kill us.
In the case of insect deaths from a new insecticide, the researchers were searching for that magical number where they are statistically confident it kills an insect. But because of the way the measuring process works, one researcher may look at the data and conclude it works, while another may look at the same data and conclude it doesn't. How does a group of researchers come to a consensus?
The paper argues that eyeballing a visualization of the measurement's accuracy is not good enough; it leads to conservative conclusions. If all you do is look to see whether two intervals overlap, you will only call the hugely apparent differences statistically significant. In other words, if a difference has to be big enough for you to see it, you are erring on the conservative side.
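The paper's point can be shown with a toy calculation. The numbers below are invented (not the paper's insect data), but they show how two 95% confidence intervals can overlap while the standard test still calls the difference significant:

```python
import math

# Hypothetical summary statistics for two insecticide treatments,
# chosen deliberately to land in the "overlapping but significant" zone.
mean_a, se_a = 0.0, 1.0   # group A: mean and standard error
mean_b, se_b = 3.0, 1.0   # group B: mean and standard error

# 95% confidence intervals: mean plus or minus 1.96 standard errors
ci_a = (mean_a - 1.96 * se_a, mean_a + 1.96 * se_a)
ci_b = (mean_b - 1.96 * se_b, mean_b + 1.96 * se_b)
overlap = ci_a[1] > ci_b[0]  # A's upper end reaches past B's lower end

# The usual two-sample z statistic for the difference in means
z = (mean_b - mean_a) / math.sqrt(se_a**2 + se_b**2)
significant = z > 1.96  # significant at the 5% level

print(f"intervals overlap: {overlap}")           # they do
print(f"difference significant: {significant}")  # and yet, it is
```

The eyeball rule would call this "no difference" because the intervals overlap, while the formal test disagrees; that gap between the two verdicts is exactly the conservatism the authors are describing.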
Which I think is a pretty interesting metaphor for life. If you only act on problems or differences once they are big enough to see, then you are going to be playing on the conservative side of things.
Whether that is a right or wrong way to live depends, I think, on your situation. Take sharing this newsletter: do you have to see the benefits of it before you share it, or will you do it simply because I ask you to, which is much more subtle?
I will leave it up to you.
All the best,
Greg