What drives the gulf stream?

I’m not much of a fan of today’s kids’ science books because they don’t teach science, IMHO. They have nice pictures and a few numbers, almost no equations, and lots of words. You can’t do science that way. On the odd occasion that they give the right answer to some problem, the lack of math means the kid has no way of understanding the reasoning, and no reason to believe the answer. Professional science articles on the web are bad in the opposite direction: too many numbers, and for math they rely on supercomputers. No human can understand the outcome. I like to use my blog to offer science with insight, the type you’d get in an old “everyman science” book.

In previous posts, I gave answers to why the sky is blue, why it’s cold at the poles, why it’s cold on mountains, how tornadoes pick stuff up, and why hurricanes blow the way they do. In this post, we’ll try to figure out what drives the Gulf Stream. The main argument will be deduction: disproving things that are not driving the Gulf Stream, to leave us with one or two that could. Deduction is a classic method of science, well presented by Sherlock Holmes.

The gulf stream. The speed in the white area is ≥ 0.5 m/s (1.1 mph.).

For those who don’t know, the Gulf Stream is a massive river of water that runs within the Atlantic Ocean. As shown at right, it starts roughly at the end of Florida, runs north to the Carolinas, and then turns dramatically east towards Spain. Flowing east, it’s about 150 miles wide, but it’s only about 62 miles (100 km) wide when flowing along the US coast. According to some of the science books of my youth, this massive flow was driven by temperature; according to others, by salinity (whatever that means); and according to yet others, by wind. My conclusion: they had no clue.

As a start to doing the science here, it’s important to fill in the numerical information that the science books left out. The Gulf Stream is roughly 1000 meters deep, with a typical speed of 1 m/s (2.2 mph). The maximum speed is found in the surface water as the stream flows along the US coast; it is about 2.5 meters per second (5.6 mph), see map above.

From the size and the speed of the Gulf Stream, we conclude that land rivers are not driving the flow. The Mississippi is a big river with an outflow point near the headwaters of the Gulf Stream, but the volume of flow is vastly too small. The volume flow of the Gulf Stream is roughly

Q = wdv = 100,000 m × 1000 m × 0.5 m/s = 50 million m³/s ≈ 1.8 billion cubic feet/s.

This is nearly 3000 times the volume flow of the Mississippi, 18,000 m³/s. The great difference in flow suggests the Mississippi could not be the driving force. The map of flow speeds (above) also suggests rivers do not drive the flow: the Gulf Stream does not flow at its maximum speed near the mouth of any river. We now look for another driver.
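For readers who like to check arithmetic, here is a minimal Python sketch of the flow comparison above. It uses only the round numbers quoted in this post; the variable names and the cubic-feet conversion factor are my own additions.

```python
# Volume flow of the Gulf Stream versus the Mississippi,
# using the round numbers quoted in the post.
width_m = 100_000            # stream width along the US coast (100 km)
depth_m = 1_000              # stream depth
speed_m_per_s = 0.5          # representative flow speed
mississippi_m3_per_s = 18_000

CUBIC_FEET_PER_M3 = 35.3147  # standard unit conversion

gulf_stream_m3_per_s = width_m * depth_m * speed_m_per_s
gulf_stream_ft3_per_s = gulf_stream_m3_per_s * CUBIC_FEET_PER_M3
ratio = gulf_stream_m3_per_s / mississippi_m3_per_s

print(f"Gulf Stream flow: {gulf_stream_m3_per_s:.2e} m3/s")
print(f"                = {gulf_stream_ft3_per_s:.2e} cubic feet/s")
print(f"Ratio to the Mississippi: {ratio:.0f}")
```

Change the width, depth, or speed and the ratio stays in the thousands, which is the point of the argument.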

Moving on to temperature. Temperature drives the whirl of hurricanes. The logic for temperature driving the Gulf Stream is as follows: it’s warm by the equator and cold at the poles; warm things expand, and since water flows downhill, the poles will always be downhill from the equator. Let’s put some math in here or my explanation will be lacking. First let’s consider how much height difference we might expect to see. The thermal expansivity of water is about 2 × 10⁻⁴ m/m°C (.0002/°C) in the desired temperature range. To calculate the amount of expansion we multiply this by the depth of the stream, 1000 m, and by the temperature difference between two points, e.g. the end of Florida and the Carolina coast. I estimate this difference as 5°C (9°F). I calculate the temperature-induced seawater height as:

∆h (thermal) ≈ 5° x .0002/° x 1000m = 1 m (3.3 feet).

This is a fair amount of height. It’s only about 1/100 the height driving the Mississippi river, but it’s something. To see if 1 m is enough to drive the Gulf flow, I’ll compare it to the velocity-head. Velocity-head is a concept that’s useful in plumbing (I ran for water commissioner). It’s the potential energy height equivalent of any kinetic energy, typically of a fluid flow. The kinetic energy for any velocity v and mass of water m is ½mv². The potential energy equivalent is mgh. Equate the two and cancel the mass terms, and we have:

∆h (velocity) = v²/2g,

where g is the acceleration of gravity. Let’s consider v = 1 m/s and g = 9.8 m/s²; then ∆h (velocity) = 1²/(2 × 9.8) ≈ 0.05 m ≈ 2 inches. This is far less than the 1 m of thermal head calculated above. We have about 20x more driving head than we need, but there is a problem: why isn’t the flow faster? And why does the Mississippi move so slowly when it has 100 times more head?
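Here is a short Python sketch of the two heads being compared, assuming the same inputs as above (5°C, 1000 m depth, expansivity .0002/°C, v = 1 m/s). The function names are mine, chosen only for illustration.

```python
G = 9.8  # acceleration of gravity, m/s^2

def thermal_head(delta_t_c, depth_m, expansivity_per_c=2e-4):
    """Height gained when a water column of depth_m warms by delta_t_c."""
    return delta_t_c * expansivity_per_c * depth_m

def velocity_head(speed_m_per_s):
    """Height equivalent of the kinetic energy of flow, v^2 / 2g."""
    return speed_m_per_s ** 2 / (2 * G)

h_thermal = thermal_head(5, 1000)   # about 1 m
h_velocity = velocity_head(1.0)     # about 0.05 m
head_ratio = h_thermal / h_velocity

print(f"thermal head:  {h_thermal:.2f} m")
print(f"velocity head: {h_velocity:.3f} m")
print(f"ratio:         {head_ratio:.1f}")
```

With these inputs the thermal head comes out close to twenty velocity heads, so the interesting question is where the rest of the head goes. That is friction, which comes next.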

To answer the above questions, and to check if heat could really drive the Gulf Stream, we’ll check if the flow is turbulent (it is). A measure of turbulence is the Reynolds number, Re#; it’s the ratio of kinetic energy to viscous loss in a fluid flow. Flows are turbulent if this ratio is more than 3000 or so:

Re# = vdρ/µ.

In the above, v is velocity, say 1 m/s; d is depth, 1000 m; ρ is density, 1000 kg/m³ for water; and µ is the viscosity of water, 0.00133 Pa∙s. Plug in these numbers, and we find Re# = 750 million: this flow will be highly turbulent. Assuming a friction factor of 1/20 (.05), we’d expect complete mixing every 20 depths, or 20 km. That is, we need one velocity head to drive every 20 km of flow up the US coast. If the distance to the Carolina coast is 1000 km, then at 1 m/s we need 1000 × .05 m/20 = 2.5 m, which is more than the 1 m of thermal head. At 0.5 m/s the velocity head is only about 0.013 m, and we need about 0.6 m, which the thermal head can supply. Temperature is thus a plausible driving force for 0.5 m/s, though not likely for the faster 2.5 m/s flow seen in the center of the stream. Turbulent flow is a big part of figuring the mpg of an automobile; it becomes rapidly more important at high speeds.
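The friction estimate can be sketched in a few lines of Python. This follows the simple model used in this post (one velocity head lost per 20 depths of travel, i.e. a friction factor of .05); the function names and default arguments are my own, and a real ocean model would be far more involved.

```python
def reynolds_number(speed, depth, density=1000.0, viscosity=0.00133):
    """Re# = v d rho / mu, with SI units throughout."""
    return speed * depth * density / viscosity

def friction_head(speed, distance_m, depth_m=1000.0,
                  friction_factor=0.05, g=9.8):
    """Head needed to push turbulent flow over distance_m, assuming
    one velocity head is used up every depth_m / friction_factor."""
    mixing_length_m = depth_m / friction_factor      # 20 km here
    velocity_head = speed ** 2 / (2 * g)
    return (distance_m / mixing_length_m) * velocity_head

re_number = reynolds_number(1.0, 1000.0)
head_at_1 = friction_head(1.0, 1_000_000)      # 1000 km at 1 m/s
head_at_half = friction_head(0.5, 1_000_000)   # 1000 km at 0.5 m/s

print(f"Re# = {re_number:.2e}")
print(f"head needed at 1.0 m/s: {head_at_1:.2f} m")
print(f"head needed at 0.5 m/s: {head_at_half:.2f} m")
```

With these inputs the sketch gives roughly 2.5 m of head at 1 m/s and roughly 0.6 m at 0.5 m/s, to be compared with the 1 m of thermal head. Because the needed head scales as v², halving the speed cuts the requirement by four.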

World sea salinity. The maximum and minimum are in the wrong places.

What about salinity? For salinity to work, the salinity would have to be higher at the end of the flow. As a model of the flow, we might imagine that we freeze arctic seawater, and thus concentrate salt in the seawater just below the ice. The heavy, saline water would flow down to the bottom of the sea, and then flow south to an area of low salinity and low pressure. Somewhere in the south, the salinity would be reduced by rains. If evaporation were to exceed the rains, the flow would go in the other direction. Sorry to say, I see no evidence of any of this. For one, the end of the Gulf Stream is not that far north; there is no freezing. For two, there are major rains in the Caribbean, and rains too in the North Atlantic. Finally, the salinity head is too small. Each ppm of salinity adds about 0.0001 g/cc, and the salinity difference in this case is less than 1 ppm, let’s say 0.5 ppm.

∆h (salinity) = .0001 × 0.5 × 1000 m = 0.05 m.

I don’t see a case for northern-driven Gulf-stream flow caused by salinity.

Surface level winds in the Atlantic. Trade winds in purple, 15-20 mph.

Now consider winds. The wind velocities are certainly enough to produce 5+ miles per hour flows, and the path of flows is appropriate. Consider, for example, the trade winds. In the southern Caribbean, they blow steadily from east to west, slightly above the equator, at 15-20 mph. This could certainly drive a circulation flow of 4.5 mph north. Out of the Caribbean basin and along the eastern US coast the trade winds blow at 15-50 mph north and east. This too would easily drive a 4.5 mph flow. I conclude that a combination of winds and temperature is the most likely driver of the Gulf Stream flow. To quote Holmes, once you’ve eliminated the impossible, whatever remains, however improbable, must be the truth.

Robert E. Buxbaum, March 25, 2018. I used the thermal argument above to figure out how cold it had to be to freeze the balls off of a brass monkey.

Yogurt making for kids

Yogurt making is easy, and is a fun science project for kids and adults alike. It’s cheap, quick, easy, reasonably safe, and fairly useful. Like any real science, it requires mathematical thinking if you want to go anywhere really, but unlike most science, you can get somewhere even without math, and you can eat the experiments. Yogurt making has been done for centuries, and involves nothing more than adding some yogurt culture to a glass of milk and waiting. To do this the traditional way, you wait with the glass sitting outside of any refrigeration (they didn’t have refrigeration in the olden days). After a few days, you’ll have tasty yogurt. You can get tastier yogurt if you add flavors. In one of my most successful attempts at flavoring, I added 1/2 ounce of “skinny syrup” (toffee flavor) to a glass of milk. The results were most satisfactory, IMHO.

My latest batch of home-made flavored yogurt, made in a warm spot behind this coffee urn.

Now to turn yogurt-making into a science project. We’ll begin with a hypothesis. I generally tell people to not start with a hypothesis, (it biases your thinking), but here I will make an exception as I have a peculiarly non-biased hypothesis to suggest. Besides, most school kids are told they need one. My hypothesis is that there must be better ways to make yogurt and worse ways. A hypothesis should be avoided if it contains any unfounded assumptions, or if it points to a particular answer — especially an answer that no one would care about.

As with all science you’ll want to take numerical data of cause and effect. I’d suggest that temperature data is worth taking. The main yogurt-making bacteria are called Streptococcus thermophilus and Lactobacillus bulgaricus, and the names suggest that warm temperatures will be good (lact = milk in Latin, thermophilus = heat-loving). Also making things interesting is the suspicion that if you make things too warm, you’ll cook your organisms and you won’t get any yogurt. I’ve had this happen, both with over-heat and under-heat. My first attempt was to grow yogurt in the refrigerator, but I got no results. I then tried the kitchen counter and got yogurt, and then I heated things a bit more by growing next to a coffee urn, and got better yogurt; yet more heat and nothing.

For a science project, you might want to make a few batches of yogurt, at least 5, and these should be made at 2-3 different temperatures. If temperature is a cause for the yogurt to come out better or worse, you’ll need to be able to measure how much “better.” You may choose to study taste, and that’s important, but it’s hard to quantify, so that should not be the whole experiment. I would begin by testing thickness, or the time to get some fixed degree of thickness; I’d measure thickness by seeing if a small weight sinks. A penny is a cheap, small weight, and I know it sinks in milk, but not in yogurt. You’ll want to wash your penny first, or no one will eat the yogurt. I used hot water from the urn to clean and sterilize my pennies.

Another thing that is worth testing is the effect of using different milks: whole milk, 2%, 1% or skim; goat milk, or almond milk. You can also try adding stuff to it, or starting with different starter cultures, or different amounts. Keep numerical records of these choices, then keep track of how they affect how long it takes for the gel to form, and how the stuff looks or tastes to you. Before you know it, you’ll have some very good product at half the price of the stuff in the store. If you really want to move forward fast, you might apply semi-random statistics to your experimental choices. Good luck.

Robert Buxbaum, March 2, 2018. My latest observation: what happens if you leave the yogurt to mold too long? It doesn’t get moldy, perhaps the lactic acid formed kills germs (?), but the yogurt separated into curds and whey. I poured off the whey, the unappealing, bitter yellow liquid. The thick white remainder is called “Greek” yogurt. I’m not convinced this tastes better, or is healthier, BTW.

How Tesla invented, I think, Tesla coils and wireless chargers.

I think I know how Tesla invented his high frequency devices, and thought I’d show you, while also explaining the operation of some devices that develop from it. Even if I’m wrong in historical terms, at least you should come to understand some of his devices, and something of the invention process. Either can be the start of a great science fair project.

The start of Tesla’s invention process, I think, was a visual similarity: I’m guessing he noticed that the physics symbol for a spring was the same as for an electrical induction coil, as shown at left. A normal person would notice the similarity, perhaps think about it for a few seconds, get nowhere, and think of something else. If he or she had a math background (necessary to do most any science) they might look at the relevant equations and notice that they’re different. The equation describing the force of a spring is F = -kx (I’ll define these letters in the bottom paragraph). The equation describing the voltage in an induction coil is not very similar-looking at first glance: V = L di/dt. But there is a key similarity that could appeal to some math aficionados: both equations are linear. A linear equation is one where, if you double one side, you double the other. Thus, if you double F, you double x, and if you double V, you double di/dt, and that’s a significant behavior; the equation z = at² is not linear, see the difference?

Another linear equation is the key equation for the motion of a mass, Newton’s second law, F = ma = m d²x/dt². This equation is quite complicated looking, since the latter term is a second derivative, but it is linear, and a mass is the likely thing for a spring to act upon. Yet another linear equation can be used to relate current to the voltage across a capacitor: V = -(1/C) ∫i dt. At first glance, this equation looks quite different from the others since it involves an integral. But Nikola Tesla did more than a first glance. Perhaps he knew that linear systems tend to show resonance: vibrations at a fixed frequency. Or perhaps that insight came later.

And Tesla saw something else, I imagine, something even less obvious, except in hindsight. If you take the derivative of the two electrical equations, you get dV/dt = L d²i/dt², and dV/dt = -(1/C) i. These equations are the same as for the spring and mass, just replace F and x by dV/dt and i. That the derivative of the integral is the thing itself is something I demonstrate here. At this point it becomes clear that a capacitor-coil system will show the same sort of natural resonance effects as shown by a spring and mass system, or by a child’s swing, or by a bouncy bridge. Tesla would have known, like anyone who’s taken college-level physics, that a small input at the right, resonant frequency will excite such systems to great swings. For a mass and spring,

Basic Tesla coil. A switch set off by magnetization of the iron core insures resonant frequency operation.

resonant frequency = (1/2π) √(k/m).

Children can make a swing go quite high, just by pumping at the right frequency. Similarly, it should be possible to excite a coil-capacitor system to higher and higher voltages if you can find a way to excite long enough at the right frequency. Tesla would have looked for a way to do this with a coil-capacitor system, and after a while of trying and thinking, he seems to have found the circuit shown at right, with a spark gap to impress visitors and keep the voltages from getting too far out of hand. The resonant frequency for this system is 1/(2π√(LC)), an equation form that is similar to the above. The voltage swings should grow until limited by resistance in the wires, or by the radiation of power into space. The fact that significant power is radiated into space will be used as the basis for wireless phone chargers, but more on that later. For now, you might wish to note that power radiation is proportional to dV/dt.

A more modern version of the above, excited by AC current. In this version, you achieve resonance by adjusting the coil, capacitor and resistance to match the forcing frequency.

The device above provides an early, simple way to excite a coil-capacitor system. It’s designed for use with a battery or other DC power source. There’s an electromagnetic switch to provide resonance with any capacitor and coil pair. An alternative, more modern device is shown at left. It too achieves resonance, without the switch, through the use of input AC power, but you have to match the AC frequency to the resonant frequency of the coil and capacitor. If wall current is used, 60 cps, the coil and capacitor must be chosen so that 1/(2π√(LC)) = 60 cps. Both versions are called Tesla coils and either can be set up to produce very large sparks (sparks make for a great science fair project; you need to put a spark gap across the capacitor, or better yet use the coil as the low-voltage part of a transformer).
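If you’re building one of these for a science fair, the matching condition is easy to play with in Python. The sketch below is only an illustration: the 1 henry coil is a made-up value, and the function names are my own. It uses the two resonance formulas given in this post and nothing else.

```python
import math

def spring_mass_frequency(stiffness_k, mass_m):
    """Resonant frequency of a mass on a spring, (1/2 pi) sqrt(k/m)."""
    return math.sqrt(stiffness_k / mass_m) / (2 * math.pi)

def lc_frequency(inductance_h, capacitance_f):
    """Resonant frequency of a coil-capacitor pair, 1/(2 pi sqrt(LC))."""
    return 1.0 / (2 * math.pi * math.sqrt(inductance_h * capacitance_f))

def capacitance_for(inductance_h, frequency_hz):
    """Capacitance that makes a given coil resonate at frequency_hz."""
    return 1.0 / (inductance_h * (2 * math.pi * frequency_hz) ** 2)

coil_h = 1.0                                # assumed coil, 1 henry
cap_f = capacitance_for(coil_h, 60.0)       # match 60 cps wall current
check_hz = lc_frequency(coil_h, cap_f)

print(f"capacitor needed: {cap_f * 1e6:.2f} microfarads")
print(f"resonant frequency check: {check_hz:.1f} cps")
```

For a 1 henry coil the capacitor comes out near 7 microfarads. A smaller coil needs a proportionally bigger capacitor, since only the product LC matters.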

Another use of this circuit is as a transmitter of power into space. The coil becomes the transmission antenna, and you have to set up a similar device as a receiver, see picture at right. The black thing at left of the picture is the capacitor. One has to make sure that the coil-capacitor pair is tuned to the same frequency as the transmitter. One also needs to add a rectifier; the rectifier chosen here is designated 1N4007. This fairly standard-size rectifier allows you to sip DC power to the battery, without fear that the battery will discharge on every cycle. That’s all the science you need to charge an iPhone without having to plug it in. Designing one of these is a good science fair project, especially if you can improve on the charging distance. Why should you have to put your iPhone right on top of the transmitter battery? Why not allow continuous charging anywhere in your home? Tesla was working on long-distance power transmission till the end of his life. What modifications would that require?

Symbols used above: a = acceleration = d²x/dt², C = capacitance of the capacitor, dV/dt = the rate of change of voltage with time, F = force, i = current, k = stiffness of the spring, L = inductance of the coil, m = mass of the weight, t = time, V = voltage, x = distance of the mass from its rest point.

Robert Buxbaum, October 2, 2017.

summer science: a toad or turtle terrarium

Here’s an easy summer science project, one I just made: a toad habitat. It’s similar to a turtle terrarium (I’ll show how to make that too). I’d made the turtle terrarium ten years ago for my 8-year-old daughter (here’s some advice I gave her on her 16th birthday).

For this project you’ll need: a large flower-pot, fish tank, or plastic clothes bin. You’ll need some dirt for the bottom, and a small plastic bin, jar, or Tupperware for toad (or turtle) transport. You’ll also need a smallish plastic dish or tub (~6″ by 1″ deep) to serve as a lake in the toad habitat. For the turtle version you don’t need the lake, but will need a rock or brick. And that’s all, besides your toad or turtle. The easy way to get your pet is to find one by a river. If that doesn’t work, go to a pet-store and get one that is native to your area of the country. Local fauna (fauna = animals) will be hardier and cheaper, and will allow you to keep your terrarium outside if you choose. Keeping my toad outside means he (or she) can catch bugs without me having to buy them all the time. It also seems more “natural” to study animals in their natural temperature cycles. I caught my toads three weeks ago, in mid April after the last frost. I plan to set one free in the fall; the other I gave away.

Some good toad hunting spots in Keego Harbor MI

The first place I went was the banks of the Rouge river near Lawrence Tech. Sorry to say, the area showed no signs of toads, frogs, turtles, or even fish. There was an illegally connected drain, though; not good. I plan to bring the illegal drain up with the “Friends of the Rouge” (good group). I then went to an oak swamp on the Rouge. The area was beautiful and scenic, but there was no oxygen in the water and so no fish or toads; oxygen is important for the health of a river; without it, you’ve got a swamp. I finally hit pay-dirt in Keego Harbor, MI, see map, a rural community 10 miles away from my home. In Keego Harbor I found American toads aplenty: jumping all over, and big, hollow toad-mounds by the river. The locals were friendly too. Toad catching is a good conversation starter. I put two toads in my bin with some lake water and took them home to the terrarium, see movie.

My neighbor got the other toad and put him/her in a fish-tank terrarium in his bathroom. His terrarium has a screen on top with holes small enough to keep the toad and his food from escaping. He is feeding his toad meal worms, but I don’t have a movie. Apparently they like it.

The main difference between this project and the turtle terrarium I’d made is that the turtle terrarium was mostly water, with a brick, and this is mostly mud with a lake. I made the turtle terrarium in a laundry bin, a bigger environment, and flooded it except for the brick. I bought the turtles (a red-ear and a snapping turtle) and fed them chicken bits and dandelion leaves. As with this terrarium, I kept the turtles outside through the spring, summer, and fall, but I brought the turtles in for the winter. They lasted that way for about 8 years. Toads only live for 2-3 years, and mine may be a year or two old already. I won’t be too surprised if it croaks on my watch. For now, she seems safe and hoppy.

Robert Buxbaum, May 3, 2017. Here are some other science fair projects, chemical, and biological.

if everyone agrees, something is wrong

I thought I’d try to semi-derive, and explain, a remarkable mathematical paper that was published last month in The Proceedings of the Royal Society A (see full paper here). The paper demonstrates that too much agreement about a thing is counter-indicative of the thing being true. Unless an observation is blindingly obvious, near 100% agreement suggests there is a hidden flaw or conspiracy, perhaps unknown to the observers. This paper has broad application, but I thought the presentation was too confusing for most people to make use of, even those with a background in mathematics, science, or engineering. And the popular press versions didn’t even try to be useful. So here’s my shot:

Figure 2 from the original paper. For a method that is 80% accurate, you get your maximum reliability at 3-5 witnesses. More agreement suggests a flaw in the people or procedure.

I will discuss only one specific application, the second one mentioned in the paper: crime (read the paper for others). Let’s say there’s been a crime with several witnesses. The police line up a half-dozen, equal (?) suspects, and show them to the first witness. Let’s say the first witness points to one of the suspects. The police will not arrest on this because they know that people correctly identify suspects only about 40% of the time, and incorrectly identify perhaps 10% (they say they don’t know or can’t remember the remaining 50% of the time). The original paper includes the actual fractions here; they’re similar. Since the witness pointed to someone, you already know he/she isn’t among the 50% who don’t know. But you don’t know if this witness is among the 40% who identify right or the 10% who identify wrong. Our confidence that this is the criminal is thus .4/(.4 + .1) = .8, or 80%.

Now you bring in the second witness. If this person identifies the same suspect, your confidence increases, to roughly (.4)²/(.4² + .1²) = .941, or 94.1%. This is enough to make an arrest, but let’s say you have ten more witnesses, and all identify this same person. You might first think that this must be the guy, with a confidence of (.4)¹⁰/(.4¹⁰ + .1¹⁰) = 99.9999%, but then you wonder how unlikely it is to find ten people who identify correctly when, as we mentioned, each person has only a 40% chance. The chance of all ten witnesses identifying a suspect right is small: (.4)¹⁰ = .000105, or 0.01%. This fraction is smaller than the likelihood of having a crooked cop or a screwed-up line-up (only one suspect had the right jacket, say). If crooked cops and systemic errors show up 1% of the time, and point to the correct fellow only 15% of these, we find that the chance of being right if ten out of ten agree is (0.0015 + (.4)¹⁰)/(.01 + .4¹⁰ + .1¹⁰) = .16, or 16%. Total agreement on guilt suggests the fellow is innocent!

The graph above, the second in the paper, presents a generalization of the math I just presented: n identical tests of 80% accuracy and three different likelihoods of systemic failure. If this systemic failure rate is 1% and the chance of the error pointing right or wrong is 50/50, the chance of being right is P = (.005 + .4ⁿ)/(.01 + .4ⁿ + .1ⁿ), and is the red curve in the graph above. The authors find you get your maximum reliability when there are two to four agreeing witnesses.
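The witness arithmetic is easy to reproduce. Below is a minimal Python sketch of the formula as I’ve written it in this post (it is my simplified version, not the paper’s full treatment); the function and parameter names are mine.

```python
def confidence(n, p_right=0.4, p_wrong=0.1, p_fail=0.0, fail_right=0.5):
    """Chance the suspect is guilty when n witnesses all pick him.

    p_right / p_wrong: chance one witness identifies right / wrong.
    p_fail: chance of a systemic failure (crooked cop, bad line-up).
    fail_right: chance that a systemic failure points at the guilty man.
    """
    all_right = p_right ** n
    all_wrong = p_wrong ** n
    return (p_fail * fail_right + all_right) / (p_fail + all_right + all_wrong)

one_witness = confidence(1)                      # no systemic failure
two_witnesses = confidence(2)
ten_crooked = confidence(10, p_fail=0.01, fail_right=0.15)

# Red-curve case: 1% systemic failure, pointing right or wrong 50/50.
curve = {n: confidence(n, p_fail=0.01) for n in range(1, 11)}
best_n = max(curve, key=curve.get)

print(f"1 witness: {one_witness:.3f}, 2 witnesses: {two_witnesses:.3f}")
print(f"10 of 10 with 1% systemic failure: {ten_crooked:.3f}")
for n, p in curve.items():
    print(n, f"{p:.3f}")
print("most reliable number of agreeing witnesses:", best_n)
```

With these numbers the confidence peaks at three agreeing witnesses (about 92%) and falls off steadily after that. Try a different failure rate to see how quickly the peak moves.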

Confidence of guilt as related to the number of judges that agree and the integrity of the judges.

The Royal Society article went on to approve of a feature of Jewish capital-punishment law. In Jewish law, capital cases are tried by 23 judges. To convict, a super majority (13) must find guilty, but if all 23 judges agree on guilt the court pronounces innocent (see chart, or an anecdote about Justice Antonin Scalia). My suspicion, by the way, is that more than 1% of judges and police are crooked or inept, and that the same applies to scientific analysis of mental diseases, like diagnosing ADHD or autism, and to predictions about stocks or climate change. (Do 98% of scientists really agree independently?) Perhaps there are so many people in US prisons because of excessive agreement and inaccurate witnesses, e.g. Rubin Carter. I suspect the agreement among climate experts is a similar sham.

Robert Buxbaum, March 11, 2016. Here are some thoughts on how to do science right. Here is some climate data: can you spot a clear pattern of man-made change?

Patterns in climate; change is the only constant

There is a general problem when looking for climate trends: you have to look at weather data. That’s a problem because weather data goes back thousands of years, and it’s always changing. As a result it’s never clear what start year to use for the trend. If you start too early or too late the trend disappears. If you start your trend line in a hot year, like in the late Roman period, the trend will show global cooling. If you start in a cold year, like the early 1970s, or the little ice age (1500-1800), you’ll find global warming: perhaps too much. Begin 10-15 years ago, and you’ll find no change in global temperatures.

Ice coverage data shows the same problem: take the Canadian Arctic Ice maximums, shown below. If you start your regression in 1980-83, the record ice years (green), you’ll see ice loss. If you start in 1971, the year of minimum ice (red), you’ll see ice gain. It might also be nice to incorporate physics through a computer model of the weather, but this method doesn’t seem to help. Perhaps that’s because the physics models generally have to be fed coefficients calculated from the trend line. Using the best computers and a trend line showing ice loss, the US Navy predicted, in January 2006, that the Arctic would be ice-free by 2013. It didn’t happen; a new prediction is 2016, something I suspect is equally unlikely. Five years ago the National Academy of Sciences predicted global warming would resume in the next year or two; it didn’t either. Garbage in, garbage out, as they say.

Arctic Ice in Northern Canada waters, 1971-2014, from the Canadian Ice Service. 2014 is not totally in yet, but is likely to exceed 2013. If you are looking for trends, in what year do you start?

The same trend problem appears with predicting sea temperatures and El Niño, a Christmastime warming current in the Pacific ocean. This year, 2013-14, was predicted to be a super El Niño, an exceptionally hot, stormy year with exceptionally strong sea currents. Instead, there was no El Niño, and many cities saw record cold (Detroit by 9 degrees). The Antarctic ice hit record levels, stranding a ship of anti-warming activists. There were record few hurricanes. As I look at the Pacific sea temperature from 1950 to the present, below, I see change, but no pattern or direction: El Nada (the nothing). If one did a regression analysis, the slope might be slightly positive or negative, but r squared, the significance, would be near zero. There is no real directionality, just noise if 1950 is the start date.

El Niño and La Niña since 1950. There is no sign that they are coming more often, or stronger. Nor is there clear evidence that the ocean is warming.

This appears to be as much a fundamental problem in applied math as in climate science: when looking for a trend, where do you start, how do you handle data confidence, and how do you prevent bias? A thought I’ve had is to try to weight a regression in terms of the confidence in the data. The Canadian ice data shows that the Canadian Ice Service is less confident about their older data than the new; this is shown by the grey lines. It would be nice if some form of this confidence could be incorporated into the regression trend analysis, but I’m not sure how to do this right.
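One standard approach, offered here only as a sketch and not as the last word, is weighted least squares: each point is weighted by 1/σ², where σ is the uncertainty in that point, so shaky old data pulls on the line less than solid new data. The Python below is plain textbook math with no libraries. The five data points and their σ values are made up purely to show the effect; they are not ice data.

```python
def weighted_line_fit(xs, ys, sigmas):
    """Weighted least-squares line; each point weighted by 1/sigma^2."""
    ws = [1.0 / s ** 2 for s in sigmas]
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return slope, intercept

# Hypothetical data: the first (oldest) point is off the line and uncertain.
years = [0, 1, 2, 3, 4]
values = [5.0, 3.0, 5.0, 7.0, 9.0]   # last four lie on y = 2x + 1
equal_sigma = [1, 1, 1, 1, 1]
honest_sigma = [4, 1, 1, 1, 1]       # oldest point is 4x less certain

plain_slope, _ = weighted_line_fit(years, values, equal_sigma)
weighted_slope, _ = weighted_line_fit(years, values, honest_sigma)

print(f"unweighted slope: {plain_slope:.2f}")
print(f"weighted slope:   {weighted_slope:.2f}")
```

In this toy case the unweighted slope is 1.2, while the weighted slope moves most of the way to the slope of 2 shown by the trustworthy points. Weighting doesn’t settle the start-year question, but it does reduce how much one doubtful early point can swing the trend.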

It’s not so much that I doubt global warming, but I’d like a better explanation of the calculation. Weather changes: how do you know when you’re looking at climate, not weather? The president of the US claimed that the science is established, and Prince Charles of England claimed climate skeptics were headless chickens, but it’s certainly not predictive, and that’s the normal standard of knowledge. Neither country has any statement of how one would back up their statements. If this is global warming, I’d expect it to be warm.

Robert Buxbaum, Feb 5, 2014. Here’s a post I’ve written on the scientific method, and on dealing with abnormal statistics. I’ve also written about an important recent statistical fraud against genetically modified corn. As far as energy policy, I’m inclined to prefer hydrogen over batteries, and nuclear over wind and solar. The president has promoted the opposite policy — for unexplained, “scientific” reasons.

It’s always nice when a study is retracted, especially so if the study alerts the world to a danger that is found to not exist. Retractions don’t happen often enough, I think, given that false positives should occur in at least 5% of all biological studies. Biological studies typically use 95% confidence limits, a confidence limit that indicates there will be false positives 5% of the time for the best-run versions (or 10% if both 5% tails are taken to be significant). These false positives will appear in 5-10% of all papers as an expected result of statistics, no matter how carefully the study is done, or how many rats are used. Still, one hopes that researchers will check for confirmation from other researchers and other groups within the study. Neither check was done in a well-publicized, recent paper claiming genetically modified foods cause cancer. Worse yet, the experiment design was such that false positives were almost guaranteed.

Séralini published this book, “We are all Guinea Pigs,” simultaneously with the paper.

As reported in Nature, the journal Food and Chemical Toxicology retracted a 2012 paper by Gilles-Eric Séralini claiming that eating genetically modified (GM) maize causes cancerous tumors in rats, despite “no evidence of fraud or intentional misrepresentation.” I would not exactly say no evidence. For one, the choice of rats and length of the study was such that 30% of the rats would be expected to get cancer and die even under the best of circumstances. Also, Séralini failed to mention that earlier studies had come to the opposite conclusion about GM foods. Even the same journal had published a review of 12 long-term studies, between 90 days and two years, that showed no harm from GM corn or other GM crops. Those reports didn’t get much press because it is hard to get excited at good news; still, you’d have hoped the journal editors would demand that their own review, at least, be referenced in a paper stating the contrary.

A wonderful book on understanding the correct and incorrect uses of statistics.

The main problem I found is that the study was organized to virtually guarantee false positives. Séralini took 200 rats and divided them into 20 groups of 10. Taking two groups of ten (one male, one female) as a control, he fed the other 18 groups of ten various doses of genetically modified grain, either alone or mixed with Roundup, a pesticide often used with GM foods. Based on pure statistics, and 95% confidence, you should expect that, out of the 18 groups fed GM grain, there is a 1 − 0.95^18 chance (60%) that at least one group will show cancer increase, and a similar 60% chance that at least one group will show cancer decrease at the 95% confidence level. Séralini’s study found both these results: one group, the female rats fed with 10% GM grain and no Roundup, showed cancer increase; another group, the female rats fed 33% GM grain and no Roundup, showed cancer decrease; both at the 95% confidence level. Séralini then dismissed the observation of cancer decrease, and published the inflammatory article and a companion book (“We are all Guinea Pigs,” pictured above) proclaiming that GM grain causes cancer. Better editors would have forced Séralini to acknowledge the observation of cancer decrease, or demanded he analyze the data by linear regression. If he had, Séralini would have found no net cancer effect. Instead he got to publish his bad statistics, and (since none of the counter studies were mentioned) unleashed a firestorm of GM grain products pulled from store shelves.
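For readers who want to check the 60% figure, here is a minimal sketch in Python. It assumes the 18 comparisons are independent and that each has a 5% chance of a false positive in a given direction; the function name is my own, for illustration only.

```python
def chance_of_at_least_one_false_positive(n_groups: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_groups independent comparisons
    crosses the significance threshold by luck alone."""
    # Chance that every group stays below the threshold is (1 - alpha)^n_groups.
    return 1.0 - (1.0 - alpha) ** n_groups


# 18 test groups, each compared to the control at 95% confidence.
p_any = chance_of_at_least_one_false_positive(18)
print(f"{p_any:.3f}")  # prints 0.603
```

The same arithmetic applies separately to a chance "increase" and a chance "decrease", which is why seeing one of each is not surprising.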

Did Séralini knowingly design a research method aimed to produce false positives? In a sense, I’d hope so; the alternative is pure ignorance. Séralini is a long-time anti-GM activist. He claims he used few rats because he was not expecting to find any cancer, since no previous tests on GM foods had suggested a cancer risk!? But this is misdirection; no matter how many rats are in each group, if you use 20 groups this way, there is a 60% chance you’ll find at least one group with a cancer increase at the 95% confidence limit. (This is Poisson-type statistics; see here.) My suspicion is that Séralini knowingly gamed the experiments in an effort to save the world from something he was sure was bad. That he was a do-gooder twisting science for the greater good.

It’s important to cite previous work and aspects of the current work that may undermine the story you’d like to tell; BC Comics, Johnny Hart.

This was not the only major retraction of the month, by the way. The Harrisburg Patriot & Union retracted its 1863 review of Lincoln’s Gettysburg Address, a speech the editors originally panned as “silly remarks”, deserving “a veil of oblivion….” In a sense, it’s nice that they reconsidered, and “…have come to a different conclusion…” My guess is that the editors were originally motivated by do-gooder instinct; they hoped to shorten the war by panning the speech.

There is an entire blog devoted to retractions, by the way: http://retractionwatch.com. A good friend, Richard Fezza, alerted me to it. I went to high school with him, then through under-grad at Cooper Union, and to grad school at Princeton, where we both earned PhDs. We’ll probably end up in the same old-age home. Cooper Union tried to foster a skeptical attitude against group-think.

Robert Buxbaum, Dec 23, 2013. Here is a short essay on the correct way to do science, and how to organize experiments (randomly) to make biased analysis less likely. I’ve also written on nearly normal statistics, and near-Poisson statistics. Plus on other random stuff in the science and art world: Time travel, anti-matter, the size of the universe, Surrealism, Architecture, Music.

The 2013 hurricane drought

News about the bad weather that didn’t happen: there were no major hurricanes in 2013. That is, there was not one storm in the Atlantic Ocean, the Caribbean Sea, or the Gulf of Mexico with a maximum wind speed over 110 mph. None. As I write this, we are near the end of the hurricane season (it officially ends Nov. 30), and we have seen nothing like what we saw in 2012; compare the top and bottom charts below. Barring a very late, very major storm, this looks like it will go down as the most uneventful season in at least 2 decades. Our monitoring equipment has improved over the years, but even with improved detection, we’ve seen nothing major. The last time we saw this lack was 1994 — and before that 1986, 1972, and 1968.

Hurricanes 2012-2013. This year there were only two hurricanes, and both were category 1. The last time we had this few was 1994. By comparison, in 2012 we saw 5 category 1 hurricanes, 3 category 2s, and 2 category 3s, including Sandy, the most destructive hurricane to hit New York City since 1938.

In the Pacific, major storms are called typhoons, and this year has been fairly typical: 13 typhoons, 5 of them super, the same as in 2012. Weather tends to be chaotic, but it’s nice to have a year without major hurricane damage or death.

In the news, a lack of major storms led to the lack of destruction of the boats, beaches, and stately homes of the North Carolina shore.

The reason you have not heard of this before is that it’s hard to write a story about events that didn’t happen. Good news is as important as bad, and 2013 had been predicted to be one of the worst seasons on record, but then it didn’t happen and there was nothing to write about. Global warming is supposed to increase hurricane activity, but global warming has taken a 16 year rest. You didn’t hear about the lack of global warming for the same reason you didn’t hear about the lack of storms.

Here’s why hurricanes form in fall and spin so fast, plus how they pick up stuff (an explanation from Einstein). In other good weather news, the ozone hole is smaller, and arctic ice is growing (I suggest we build a northwest passage). It’s hard to write about the lack of bad news; still, good science requires an open mind to the data, as it is, or as it isn’t. Here is a simple way to do abnormal statistics, plus why 100 year storms come more often than once every 100 years.

Robert E. Buxbaum. November 23, 2013.

Why random experimental design is better

In a previous post I claimed that, to do good research, you want to arrange experiments so there is no pre-hypothesis of how the results will turn out. As the post was long, I said nothing direct on how such experiments should be organized, but only alluded to my preference: experiments should be organized at randomly chosen conditions within the area of interest. The alternative, shown below, is that experiments should be done at the cardinal points in the space, or at corner extremes: the Wilson Box and Taguchi design of experiments (DoE), respectively. Doing experiments at these points implies a sort of expectation of the outcome, generally that results will be linearly and orthogonally related to causes; in such cases, the extreme values are the most telling. Sorry to say, this usually isn’t how experimental data will fall out.

First experimental test points according to a Wilson Box, a Taguchi, and a random experimental design. The Wilson box and Taguchi are OK choices if you know or suspect that there are no significant non-linear interactions, and where experiments can be done at these extreme points. Random is the way nature works; and I suspect that’s best — it’s certainly easiest.

The first test-points for experiments according to the Wilson Box method and Taguchi method of experimental designs are shown on the left and center of the figure above, along with a randomly chosen set of experimental conditions on the right. Taguchi experiments are the most popular choice nowadays, especially in Japan, but as Taguchi himself points out, this approach works best if there are “few interactions between variables, and if only a few variables contribute significantly.” Wilson Box experimental choices help if there is a parabolic effect from at least one parameter, but are fairly unsuited to cases with strong cross-interactions.

Perhaps the main problems with doing experiments at extreme or cardinal points are that these experiments are usually harder than at random points, and that the results from these difficult tests generally tell you nothing you didn’t know or suspect from the start. The minimum concentration is usually zero, and the minimum temperature is usually one where reactions are too slow to matter. When you test at the minimum-minimum point, you expect to find nothing, and generally that’s what you find. In the data sets shown above, it will not be uncommon that the two minimum W-B data points, and the 3 minimum Taguchi data points, will show no measurable result at all.

Randomly selected experimental conditions are the experimental equivalent of Monte Carlo simulation, and are the method evolution uses. Set out the space of possible compositions, morphologies and test conditions as with the other methods, and perhaps plot them on graph paper. Now, toss darts at the paper to pick a few compositions and sets of conditions to test, and do a few experiments. Because nature is rarely linear, you are likely to find better results and more interesting phenomena than you would at any of the extremes. After the first few experiments, when you think you understand how things work, you can pick experimental points that target an optimum extreme point, or that give a more interesting or representative survey of the possibilities. In any case, you’ll quickly get a sense of how things work, and how successful the experimental program will be. If nothing works at all, you may want to cancel the program early; if things work really well, you’ll want to expand it. With random experimental points you do fewer worthless experiments, and you can easily increase or decrease the number of experiments in the program as funding and time allow.
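The dart-tossing can be done on a computer too. Here is a minimal sketch, assuming each variable has a simple lower and upper limit and that a uniform draw between them is acceptable; the variable names and ranges are made up for illustration and are not from any real program of mine.

```python
import random


def random_design(bounds, n_points, seed=None):
    """Pick n_points random experimental conditions inside the given limits.

    bounds maps each variable name to a (low, high) pair.
    """
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    return [
        {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
        for _ in range(n_points)
    ]


# Hypothetical ranges, for illustration only.
bounds = {
    "temperature_C": (20.0, 200.0),
    "concentration_pct": (0.0, 50.0),
    "pressure_atm": (1.0, 10.0),
}
points = random_design(bounds, n_points=5, seed=1)
for point in points:
    print(point)
```

Expanding or shrinking the program is just a matter of changing n_points; the earlier points stay valid.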

Consider the simple case of choosing a composition for gunpowder. The composition itself involves only 3 or 4 components, but there is also morphology to consider, including the gross structure and fine structure (degree of grinding). Instead of picking experiments at the extreme compositions (100% saltpeter, 0% saltpeter, grinding to sub-micron size, etc.), as with Taguchi, a random methodology is to pick random, easily do-able conditions: 20% sulfur and 40% saltpeter, say. These compositions will be easier to ignite, and the results are likely to be more relevant to the project goals.

The advantages of random testing get bigger the more variables and levels you need to test. Testing 9 variables at 3 levels each takes 27 Taguchi points, but only 16 or so if the experimental points are randomly chosen. To test if the behavior is linear, you can use the results from your first 7 or 8 randomly chosen experiments, derive the vector that gives the steepest improvement in n-dimensional space (a weighted sum of all the improvement vectors), and then do another experimental point that’s as far along in the direction of that vector as you think reasonable. If your result at this point is better than at any point you’ve visited, you’re well on your way to determining the conditions of optimal operation. That’s a lot faster than starting with 27 hard-to-do experiments. What’s more, if you don’t find an optimum, congratulate yourself: you’ve just discovered a non-linear behavior, something that would be easy to overlook with Taguchi or Wilson Box methodologies.
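Here is one way the steepest-improvement step might be sketched in Python. The weighting is my assumption for this sketch: each pair of experiments contributes the vector between them, weighted by the change in result per squared distance, and the next point is taken from the best result so far. The response function and its slopes are hypothetical stand-ins for real lab results.

```python
import math
import random


def steepest_improvement_direction(points, results):
    """Unit vector: weighted sum of the improvement vectors between all pairs."""
    total = [0.0] * len(points[0])
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            step = [b - a for a, b in zip(points[i], points[j])]
            dist_sq = sum(s * s for s in step)
            if dist_sq == 0.0:
                continue  # repeated condition: no direction information
            # Positive if j beat i, negative if i beat j, so the term
            # always points from the worse result toward the better one.
            weight = (results[j] - results[i]) / dist_sq
            total = [t + weight * s for t, s in zip(total, step)]
    norm = math.sqrt(sum(t * t for t in total))
    if norm == 0.0:
        raise ValueError("the results show no improvement direction")
    return [t / norm for t in total]


def next_point(points, results, step_size):
    """Move step_size from the best point so far along the improvement direction."""
    best = points[max(range(len(points)), key=lambda k: results[k])]
    direction = steepest_improvement_direction(points, results)
    return [b + step_size * d for b, d in zip(best, direction)]


# Hypothetical linear response in 3 variables, standing in for the lab.
true_slopes = [2.0, -1.0, 0.5]


def hypothetical_response(point):
    return sum(a * x for a, x in zip(true_slopes, point))


rng = random.Random(0)
points = [[rng.uniform(0.0, 1.0) for _ in true_slopes] for _ in range(8)]
results = [hypothetical_response(p) for p in points]
direction = steepest_improvement_direction(points, results)
candidate = next_point(points, results, step_size=0.2)
print(direction, candidate)
```

If the real result at the candidate point is not better than the best so far, that is the hint that the behavior is not linear.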

The basic idea is one Sherlock Holmes pointed out (Scandal in Bohemia): “It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” And (Case of Identity): “Life is infinitely stranger than anything which the mind of man could invent.”

Robert E. Buxbaum, September 11, 2013. A nice description of the Wilson Box method is presented in Perry’s Handbook (6th ed). Since I had trouble finding a free, on-line description, I linked to a paper by someone using it to test ingredient choices in baked bread. Here’s a link for more info about random experimental choice, from the University of Michigan, Chemical Engineering dept. Here’s a joke on the misuse of statistics, and a link regarding the Taguchi Methodology. Finally, here’s a pointless joke on irrational numbers, that I posted for pi-day.

The Scientific Method isn’t the method of scientists

A linchpin of middle school and high-school education is teaching ‘the scientific method.’ This is the method, students are led to believe, that scientists use to determine Truths, facts, and laws of nature. Scientists, students are told, start with a hypothesis of how things work or should work, they then devise a set of predictions based on deductive reasoning from these hypotheses, and perform some critical experiments to test the hypothesis and determine if it is true (experimentum crucis in Latin). Sorry to say, this is a path to error, and not the method that scientists use. The real method involves a few more steps, and follows a different order and path. It instead follows the path that Sherlock Holmes uses to crack a case.

The actual method of Holmes, and of science, is to avoid beginning with a hypothesis. Isaac Newton claimed: “I never make hypotheses.” Instead, as best we can tell, Newton, like most scientists, first gathered as much experimental evidence on a subject as possible before trying to concoct any explanation. As Holmes says (Study in Scarlet): “It is a capital mistake to theorize before you have all the evidence. It biases the judgment.”

Holmes barely tolerates those who hypothesize before they have all the data: “It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” (Scandal in Bohemia).

Then there is the goal of science. It is not the goal of science to confirm some theory, model, or hypothesis; every theory probably has some limited area where it’s true. The goal for any real-life scientific investigation is to explain something specific and out of the ordinary, or to do something cool. Similarly, with Sherlock Holmes, the start of the investigation is the arrival of a client with a specific, unusual need, one that seems a bit outside of the normal routine. Similarly, the scientist wants to do something: build a bigger bridge, understand global warming, or how DNA directs genetics; make better gunpowder, cure a disease, or Rule the World (mad scientists favor this). Once there is a fixed goal, it is the goal that should direct the next steps: it directs the collection of data, and focuses the mind on the wide variety of types of solution. As Holmes says: “it’s wise to make one’s self aware of the potential existence of multiple hypotheses, so that one eventually may choose one that fits most or all of the facts as they become known.” It’s only when there is no goal that any path will do.

In gathering experimental data (evidence), most scientists spend months in the less-fashionable sections of the library, looking at the experimental methods and observations of others, generally from many countries, collecting any scrap that seems reasonably related to the goal at hand. I used 3 x 5″ cards to catalog this data and the references. From many books and articles, one extracts enough diversity of data to be able to look for patterns and to begin to apply inductive logic. “The little things are infinitely the most important” (Case of Identity). You have to look for patterns in the data you collect. Holmes does not explain how he looks for patterns, but this skill is innate in most people to a greater or lesser extent. A nice, set approach to inductive logic is called the Baconian Method; it would be nice to see schools teach it. If the author of a book or article is still alive, a scientist will try to contact him or her to clarify things. In every SH mystery, Holmes does the same and is always rewarded. There is always some key fact or observation that this turns up: key information unknown to the original client.

Based on the facts collected, one begins to create the framework for a variety of mathematical models: mathematics is always involved, but these models should be pretty flexible. Often the result is a tree of related mathematical models, each highlighting some different issue, process, or problem. One then may begin to prune the tree, trying to fit the known data (facts and numbers collected) into a mathematical picture of relevant parts of this tree. There usually won’t be quite enough for a full picture, but a fair amount of progress can usually be had with the application of statistics, calculus, physics, and chemistry. These are the key skills one learns in college, but usually high-schoolers and middle schoolers have not learned them very well at all. If they’ve learned math and physics, they’ve not learned it in a way to apply it to something new, quite yet (it helps to read the accounts of real scientists here, e.g. The Double Helix by J. Watson).

Usually one tries to do some experiments at this stage. Holmes might visit a ship or test a poison, and a scientist might go off to his equally smelly laboratory. The experiments done there are rarely experimenta crucis where one can say they’ve determined the truth of a single hypothesis. Rather one wants to eliminate some hypotheses and collect data to be used to evaluate others. An answer generally requires that you have both a numerical expectation and that you’ve eliminated all reasonable explanations but one. As Holmes says often, e.g. in Sign of the Four, “when you have excluded the impossible, whatever remains, however improbable, must be the truth”. The middle part of a scientific investigation generally involves these practical experiments to prune the tree of possibilities and determine the coefficients of relevant terms in the mathematical model: the weight or capacity of a bridge of a certain design, the likely effect of CO2 on global temperature, the dose response of a drug, or the temperature and burn rate of different gunpowder mixes. Though not mentioned by Holmes, it is critically important in science to aim for observations that have numbers attached.

The destruction of false aspects and models is a very important part of any study. Francis Bacon calls this act destruction of idols of the mind, and it includes many parts: destroying commonly held presuppositions, avoiding personal preferences, avoiding the tendency to see a closer relationship than can be justified, etc.

In science, one eliminates the impossible through the use of numbers and math, generally based on your laboratory observations. When you attempt to fit the numbers associated with your observations to the various possible models, some will take the data well, some poorly, and some will not fit the data at all. Apply the deductive reasoning that is taught in schools: logical, Boolean, step by step; if some aspect of a model does not fit, it is likely the model is wrong. If we have shown that all men are mortal, and we are comfortable that Socrates is a man, then it is far better to conclude that Socrates is mortal than to conclude that all men but Socrates are mortal (Occam’s razor). This is the sort of reasoning that computers are really good at (better than humans, actually). It all rests on the inductive pattern searches (similarities and differences) that we started with, and very often we find we are missing a piece, e.g. we still need to determine that all men are indeed mortal, or that Socrates is a man. It’s back to the lab; this is why PhDs often take 5-6 years, and not the 3-4 that one hopes for at the start.

More often than not we find we have a theory or two (or three), but not quite all the pieces in place to get to our goal (whatever that was), but at least there’s a clearer path, and often more than one. Since science is goal oriented, we’re likely to find a more efficient path than we first thought. E.g. instead of proving that all men are mortal, show it to be true of Greek men, that is, for all two-legged, fairly hairless beings who speak Greek. All we must show is that few Greeks live beyond 130 years, and that Socrates is a Greek.

Putting numerical values on the mathematical relationship is a critical step in all science, as is the use of models, mathematical and otherwise. The path to measure the life expectancy of Greeks will generally involve looking at a sample population. A scientist calls this a model. He or she will analyze this model using a statistical model of average and standard deviation, and will derive his or her conclusions from there. It is only now that you have a hypothesis, but it’s still based on a model. In health experiments the model is typically a sample of animals (experiments on people are often illegal and take too long). For bridge experiments one uses small wood or metal models; and for chemical experiments, one uses small samples. Numbers and ratios are the key to making these models relevant in the real world. A hypothesis of this sort, backed by numbers, is publishable, and is as far as you can go when dealing with the past (e.g. why Germany lost WW2, or why the dinosaurs died off), but the gold standard of science is predictability. Thus, while we are confident that Socrates is definitely mortal, we’re not 100% certain that global warming is real; in fact, it seems to have stopped though CO2 levels are rising. To be 100% sure you’re right about global warming we have to make predictions, e.g. that the temperature will have risen 7 degrees in the last 14 years (it has not), or Al Gore’s prediction that the sea will rise 8 meters by 2106 (this seems unlikely at the current time). This is not to blame the scientists whose predictions don’t pan out: “We balance probabilities and choose the most likely. It is the scientific use of the imagination” (Hound of the Baskervilles). The hope is that everything matches; but sometimes we must look for an alternative; that’s happened rarely in my research, but it’s happened.

You are now at the conclusion of the scientific process. In fiction, this is where the criminal is led away in chains (or not, as with “The Woman,” “The Adventure of the Yellow Face,” or “The Blue Carbuncle,” where Holmes lets the criminal free: “It’s Christmas”). For most research the conclusion includes writing a good research paper: “Nothing clears up a case so much as stating it to another person” (Memoirs). For a PhD, this is followed by the search for a good job. For a commercial researcher, it’s a new product or product improvement. For the mad scientist, that conclusion is the goal: taking over the world and enslaving the population (or not; typically the scientist is thwarted by some detail!). But for the professor or professional research scientist, the goal is never quite reached; it’s a stepping stone to a grant application to do further work, and from there to tenure. In the case of the Socrates mortality work, the scientist might ask for money to go from country to country, measuring life-spans to demonstrate that all philosophers are mortal. This isn’t as pointless and self-serving as it seems. Follow-up work is easier than the first work since you’ve already got half of it done, and you sometimes find something interesting, e.g. about diet and life-span, or diseases, etc. I did some 70 papers when I was a professor, some on diet and lifespan.

One should avoid making some horribly bad logical conclusion at the end, by the way. It always seems to happen that the mad scientist is thwarted at the end; the greatest criminal masterminds are tripped by some last-minute flaw. Similarly the scientist must not make that last misstep. “One should always look for a possible alternative, and provide against it” (Adventure of Black Peter). Just because you’ve demonstrated that iodine kills germs, and you know that germs cause disease, please don’t conclude that drinking iodine will cure your disease. That’s the sort of science mistake that was common in the middle ages, and that shows up far too often today. In the last steps, as in the first, follow the inductive and quantitative methods of Paracelsus to the end: look for numbers (not a Holmes quote), and check how quantity and location affect things. In the case of antiseptics, Paracelsus noticed that only external cleaning helped and that the help was dose sensitive.

As an example in the 20th century, don’t just conclude that, because bullets kill, removing the bullets is a good idea. It is likely that the trauma and infection of removing the bullet is what killed Lincoln, Garfield, and McKinley. Theodore Roosevelt was shot too, but decided to leave his bullet where it was, noticing that many shot animals and soldiers lived for years with bullets in them; and Roosevelt lived for 8 more years. Don’t make these last-minute missteps: though it’s logical to think that removing guns will reduce crime, the evidence does not support that. Don’t let a leap of bad deduction at the end ruin a line of good science. “A few flies make the ointment rancid,” said Solomon. Here’s how to do statistics on data that’s taken randomly.

Dr. Robert E. Buxbaum, scientist and Holmes fan wrote this, Sept 2, 2013. My thanks to Lou Manzione, a friend from college and grad school, who suggested I reread all of Holmes early in my PhD work, and to Wikiquote, a wonderful site where I found the Holmes quotes; the Solomon quote I knew, and the others I made up.