Category Archives: math

math jokes and cartoons

Parallel lines have so much in common.

It’s a shame they never get to meet.



sometimes education is the removal of false notions.


pi therapy


Robert E. Buxbaum, January 4, 2017. Aside from the beauty of math itself, I’ve previously noted that, if your child is interested in science, the best route for development is math. I’ve also noted that Einstein did not fail at math, and that calculus is taught wrong, and it probably is.

The game is rigged and you can always win.

A few months ago, I wrote a rather depressing essay based on the work of Nobel laureate Kenneth Arrow and the paradox of de Condorcet. It can be shown mathematically that you cannot make a fair election, even if you wanted to, and no one in power wants to. The game is rigged.

To make up for that insight, I’d like to show from the work of John Forbes Nash (A Beautiful Mind) that you, personally, can win, basically all the time, if you can get someone, anyone to coöperate by trade. Let’s begin with an example in Nash’s first major paper, “The Bargaining Problem,” the one Nash is working on in the movie— read the whole paper here.  Consider two people, each with a few durable good items. Person A has a bat, a ball, a book, a whip, and a box. Person B has a pen, a toy, a knife, and a hat. Since each item is worth a different amount (has a different utility) to the owner and to the other person, there are almost always sets of trades that benefit both. In our world, where there are many people and everyone has many durable items, it is inconceivable that there are not many trades a person can make to benefit him/her while benefiting the trade partner.
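Mutually beneficial barter is easy to demonstrate in code. The sketch below uses made-up utility numbers for the items above (they are not from Nash's paper) and finds one-for-one swaps that leave both people better off:

```python
# Hypothetical utilities (illustration only, not from Nash's paper):
# how much each person values each item, on a 0-10 scale.
utility_A = {"bat": 2, "ball": 1, "book": 5, "whip": 1, "box": 3,
             "pen": 4, "toy": 2, "knife": 6, "hat": 2}
utility_B = {"bat": 5, "ball": 3, "book": 2, "whip": 4, "box": 1,
             "pen": 2, "toy": 1, "knife": 3, "hat": 4}
owns_A = ["bat", "ball", "book", "whip", "box"]
owns_B = ["pen", "toy", "knife", "hat"]

# A one-for-one swap benefits both sides when each person values the
# item received more than the item given up.
def beneficial_swaps(owns_a, owns_b, u_a, u_b):
    return [(give, get) for give in owns_a for get in owns_b
            if u_a[get] > u_a[give] and u_b[give] > u_b[get]]

for give, get in beneficial_swaps(owns_A, owns_B, utility_A, utility_B):
    print(f"A trades the {give} for B's {get}: both gain")
```

With these numbers several swaps improve both utilities at once, which is Nash's point: unless you are already on the boundary, a gainful trade exists.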

Figure 3, from Nash’s, “The bargaining problem.” U1 and U2 are the utilities of the items to the two people, and O is the current state. You can improve by barter so long as your current state is not on the boundary. The parallel lines are places one could reach if money trades as well.

Good trades are even more likely when money or non-durables are involved. A person may trade his or her time for money, that is, work, and any half-normal person will have enough skill to be of some value to someone. If one trades some money for durables, particularly tools, one can become rich (slowly). If one trades work for items that bring happiness (food, entertainment), one becomes happier. There are just two key skills: knowing what something is worth to you, and being willing to trade. It’s not that easy for most folks to figure out what their old sofa means to them, but it’s gotten easier with garage sales and eBay.

Let us now move to the problem of elections, e.g. in this year 2016. There are few people who find the person of their dreams running for president this year. The system has fundamental flaws, and has delivered two thoroughly disliked individuals. But you can vote for a generally positive result by splitting your ticket. American society generally elects a mix of Democrats and Republicans. This mix either delivers the outcome we want, or we vote out some of the bums. Americans are generally happy with the result.

A Stamp Act stamp. The British used these to tax every transaction, making it impossible for the ordinary person to benefit by small trades.


The mix does not have to involve different people; it can involve different periods of time. One can elect a Democratic president this year, and a Republican four years later. Or take the problem of time management for college students. If a student had to make a one-time choice, they’d discover that you can’t have good grades, good friends, and sleep. Instead, most college students figure out you can have everything if you do one or two of these now, and switch when you get bored. And this may be the most important thing they learn.

This is my solution to Israel’s classic identity dilemma. David Ben-Gurion famously noted that Israel had the following three choices: they could be a nation of Jews living in the land of Israel, but not democratic. They could be a democratic nation in the land of Israel, but not Jewish; or they could be Jewish and democratic, but not (for the most part) in Israel. This sounds horrible until you realize that Israel can elect politicians to deliver different pairs of the options, and can have different cities that cater to these options too. Because Jerusalem does not have to look like Tel Aviv, Israel can achieve a balance that’s better than any pure solution.

Robert E. Buxbaum, July 17-22, 2016. Balance is all, and pure solutions are a doom. I’m running for water commissioner.

Weir dams to slow the flow and save our lakes

As part of explaining why I want to add weir dams to the Red Run drain, and to some other of our Oakland County drains, I posed the following math/engineering problem: if a weir dam is used to double the depth of water in a drain, show that this increases the residence time by a factor of 2.8 and reduces the flow speed to 1/2.8 of its former value. Here is my solution.


A series of weir dams on Blackman Stream, Maine. Mine would be about as tall, but wider and further apart. The dams provide oxygenation and hold back sludge.

Let’s assume the shape of the bottom of the drain is a parabola, e.g. y = x², and that the dams are spaced far enough apart that their volume is small compared to the volume of water. We now use integral calculus to calculate how the volume of water per mile, V, is affected by water height: V = 2XY − ∫y dx = 2XY − (2/3)X³ = (4/3)Y√Y. Here, capital Y is the height of water in the drain, and capital X is the horizontal distance of the water’s edge from the drain centerline, so that Y = X². For a parabolic-bottomed drain, if you double the height Y, you increase the volume of water per mile by 2√2. That’s 2.83, or about 2.8 once you allow some volume for the dams.

To find how this affects residence time and velocity, note that the dam does not affect the volumetric flow rate, Q (gallons per hour). If we measure V in gallons per mile of drain, we find that the residence time per mile of drain (hours) is V/Q and that the speed (miles per hour) is Q/V. Increasing V by 2.8 increases the residence time by 2.8 and decreases the speed to 1/2.8 of its former value.
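The 2.8 factor can be checked numerically. This sketch integrates the parabolic cross-section y = x² directly and compares the water area at depth Y = 1 and Y = 2; the 2√2 ≈ 2.83 ratio drops out:

```python
# Numerical check of the 2.83 factor, assuming the idealized parabolic
# drain bottom y = x^2 and neglecting the dams' own volume.
def water_area(Y, n=100_000):
    # Cross-sectional area of water of depth Y over y = x^2:
    # A = 2*X*Y - integral of x^2 from -X to X, done by midpoint rule.
    X = Y ** 0.5
    dx = 2 * X / n
    area = 0.0
    for i in range(n):
        x = -X + (i + 0.5) * dx
        area += (Y - x * x) * dx   # water depth above the parabola at x
    return area

A1, A2 = water_area(1.0), water_area(2.0)
print(A1)        # analytic value: 4/3
print(A2 / A1)   # analytic value: 2*sqrt(2), about 2.83
```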

Why is this important? Decreasing the flow speed by even a little decreases the soil erosion by a lot. The hydrodynamic lift pressure on rocks or soil is proportional to flow speed-squared. Also, the more residence time and the more oxygen in the water, the more bio-remediation takes place in the drain. The dams slow the flow and promote oxygenation by the splashing over the weirs. Cells, bugs and fish do the rest; e.g. -HCOH- + O2 –> CO2 + H2O. Without oxygen, the fish die of suffocation, and this is a problem we’re already seeing in Lake St. Clair. Adding a dam saves the fish and turns the run into a living waterway instead of a smelly sewer. Of course, more is needed to take care of really major flood-rains. If all we provide is a weir, the water will rise far over the top, and the run will erode no better (or worse) than it did before. To reduce the speed during those major flood events, I would like to add a low bicycle path and some flood-zone picnic areas: just what you’d see on Michigan State’s campus, by the river.

Dr. Robert E. Buxbaum, May 12, 2016. I’d also like to daylight some rivers, and separate our storm and toilet sewage, but those are longer-term projects. Elect me water commissioner.

if everyone agrees, something is wrong

I thought I’d try to semi-derive, and explain, a remarkable mathematical paper that was published last month in The Proceedings of the Royal Society A (see full paper here). The paper demonstrates that too much agreement about a thing is counter-indicative of the thing being true. Unless an observation is blindingly obvious, near 100% agreement suggests there is a hidden flaw or conspiracy, perhaps unknown to the observers. This paper has broad application, but I thought the presentation was too confusing for most people to make use of, even those with a background in mathematics, science, or engineering. And the popular press versions didn’t even try to be useful. So here’s my shot:


Figure 2 from the original paper. For a method that is 80% accurate, you get your maximum reliability at 3-5 witnesses. More agreement suggests a flaw in the people or procedure.

I will discuss only one specific application, the second one mentioned in the paper: crime (read the paper for others). Let’s say there’s been a crime with several witnesses. The police line up a half-dozen, equal (?) suspects, and show them to the first witness. Let’s say the first witness points to one of the suspects. The police will not arrest on this because they know that people correctly identify suspects only about 40% of the time, and incorrectly identify perhaps 10% (they say they don’t know or can’t remember the remaining 50% of the time). The original paper includes the actual fractions here; they’re similar. Since the witness pointed to someone, you already know he/she isn’t among the 50% who don’t know. But you don’t know if this witness is among the 40% who identify right or the 10% who identify wrong. Our confidence that this is the criminal is thus .4/(.4 + .1) = .8, or 80%.

Now you bring in the second witness. If this person identifies the same suspect, your confidence increases, to roughly (.4)²/(.4² + .1²) = .941, or 94.1%. This is enough to make an arrest, but let’s say you have ten more witnesses, and all identify this same person. You might first think that this must be the guy, with a confidence of (.4)¹⁰/(.4¹⁰ + .1¹⁰) = 99.99999%, but then you wonder how unlikely it is to find ten people who identify correctly when, as we mentioned, each person has only a 40% chance. The chance of all ten witnesses identifying a suspect right is small: (.4)¹⁰ = .000105, or about 0.01%. This fraction is smaller than the likelihood of having a crooked cop or a screwed-up line-up (only one suspect had the right jacket, say). If crooked cops and systemic errors show up 1% of the time, and point to the correct fellow in only 15% of those cases, we find that the chance of being right when ten out of ten agree is (.0015 + .4¹⁰)/(.01 + .4¹⁰ + .1¹⁰) = .16, or 16%. Total agreement on guilt suggests the fellow is innocent!

The graph above, the second in the paper, presents a generalization of the math I just presented: n identical tests of 80% accuracy and three different likelihoods of systemic failure. If this systemic failure rate is 1% and the chance of the error pointing right or wrong is 50/50, the chance of being right is P = (.005 + .4ⁿ)/(.01 + .4ⁿ + .1ⁿ), and is the red curve in the graph above. The authors find you get your maximum reliability when there are two to four agreeing witnesses.
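These confidence numbers are easy to reproduce. The sketch below uses the crime-example rates (40% right, 10% wrong) with an assumed systemic-failure rate pf that points to the same suspect with probability pr:

```python
# Confidence of guilt after n agreeing witnesses, allowing for a
# possible systemic failure (crooked cop, biased line-up).
# Rates follow the text: 40% correct IDs, 10% wrong IDs; a failure
# occurs with probability pf and points at this suspect with
# probability pr.
def confidence(n, pf=0.01, pr=0.5):
    right, wrong = 0.4 ** n, 0.1 ** n
    return (pf * pr + right) / (pf + right + wrong)

print(confidence(1, pf=0.0))   # 0.8: one witness, no systemic failure
print(confidence(2, pf=0.0))   # about 0.941: two agreeing witnesses
print(confidence(10))          # with pf = 1%, ten agreeing witnesses
```

With pf = 1% the confidence peaks at about three agreeing witnesses, then falls back toward pr as more and more agree, which is the paper's point.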


Confidence of guilt as related to the number of judges that agree and the integrity of the judges.

The Royal Society article went on to approve of a feature of Jewish capital-punishment law. In Jewish law, capital cases are tried by 23 judges. To convict, a super-majority (13) must find guilt, but if all 23 judges agree on guilt, the court pronounces the defendant innocent (see chart, or an anecdote about Justice Antonin Scalia). My suspicion, by the way, is that more than 1% of judges and police are crooked or inept, and that the same applies to scientific analysis of mental diseases like diagnosing ADHD or autism, and to predictions about stocks or climate change. (Do 98% of scientists really agree independently?) Perhaps there are so many people in US prisons because of excessive agreement and inaccurate witnesses, e.g. Rubin Carter. I suspect the agreement among climate experts is a similar sham.

Robert Buxbaum, March 11, 2016. Here are some thoughts on how to do science right. Here is some climate data: can you spot a clear pattern of man-made change?

Marie de Condorcet and the tragedy of the GOP


Marie Jean is a man’s name. This is not he, but his wife, Sophie de Condorcet. Marie Jean was executed for being a Republican in Revolutionary France.

During the French Revolution, Marie Jean de Condorcet proposed a paradox with significant consequence for all elective democracies: It was far from clear, de Condorcet noted, that an election would choose the desired individual — the people’s choice — once three or more people could run. I’m sorry to say, this has played out often over the last century, usually to the detriment of the GOP, the US Republican party presidential choices.
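The paradox is easy to exhibit directly. Below, three hypothetical voters rank three candidates; every pairwise election has a clear majority winner, yet the majorities form a cycle, so there is no single "people's choice":

```python
# A minimal Condorcet cycle: three voters, three candidates.
# Each tuple ranks candidates from most to least preferred.
ballots = [("A", "B", "C"),   # voter 1: A > B > C
           ("B", "C", "A"),   # voter 2: B > C > A
           ("C", "A", "B")]   # voter 3: C > A > B

def majority_prefers(x, y):
    # True if most voters rank x above y.
    votes = sum(1 for b in ballots if b.index(x) < b.index(y))
    return votes > len(ballots) / 2

print(majority_prefers("A", "B"))  # True: A beats B, 2-1
print(majority_prefers("B", "C"))  # True: B beats C, 2-1
print(majority_prefers("C", "A"))  # True: C beats A, 2-1 -- a cycle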

The classic example of Condorcet’s paradox occurred in 1912. Two Republican candidates, William H. Taft and Theodore Roosevelt, faced off against a less-popular Democrat, Woodrow Wilson. Despite the electorate preferring either Republican to Wilson, the two Republicans split the GOP vote, and Wilson became president. It’s a tragedy, not because Wilson was a bad president, he wasn’t, but because the result was against the will of the people and entirely predictable given who was running (see my essay on tragedy and comedy).

The paradox appeared next a half-century later, in 1964. The incumbent Democrat, Lyndon B. Johnson (LBJ), was highly unpopular. The war in Vietnam was going poorly and our cities were in turmoil. Polls showed that Americans preferred any of several moderate Republicans over LBJ: Henry Cabot Lodge, Jr., George Romney, and Nelson Rockefeller. But no moderate could beat the others, and the GOP nominated its hard-liner, Barry Goldwater. Barry was handily defeated by LBJ.

Then, in 1976, as before, the incumbent, Gerald Ford, was disliked. Polls showed that Americans preferred Democrat Jimmy Carter over Ford, but preferred Ronald Reagan over either. But Ford beat Reagan in the Republican primary, and the November election was as predictable as it was undesirable.

Voters prefer Bush to Clinton, and Clinton to Trump, but Republicans prefer Trump to Bush.


And now, in 2015, the GOP has Donald Trump as its leading candidate. Polls show that Trump would lose to Democrat Hillary Clinton in a two-person election, but that America would elect any of several Republicans over Trump or Clinton. As before, unless someone blinks, the GOP will pick Trump as its champion, and Trump will lose to Clinton in November.

At this point you might suppose that Condorcet’s paradox is only a problem when there are primaries. Sorry to say, this is not so. The problem shows up in all versions of elections, and in all versions of decision-making. Kenneth Arrow demonstrated that these unwelcome, undemocratic outcomes are unavoidable as long as there are more than two choices and you can’t pick “all of the above.” It’s one of the first great applications of high-level math to economics, and Arrow got the Nobel prize for it in 1972. A mathematical truth: elective democracy can never be structured to deliver the will of the people.

This problem also shows up in business situations, e.g. when a board of directors must choose a new location and there are 3 or more options, or when a board must choose to fund a few research projects out of many. As with presidential elections, the outcome always depends on the structure of the choice. It seems to me that some voting systems must be better than others, more immune to these problems, but I don’t know which is best, nor which are better than which. A thought I’ve had (that might be wrong) is that reelections and term limits help remove de Condorcet’s paradox by opening up the possibility of choosing “all of the above” over time. As a result, I suspect many applications of de Condorcet’s paradox are wrong. Terms and term limits create a sort of rotating presidency, and that, within limits, seems to be a good thing.

Robert Buxbaum, September 20, 2015. I’ve analyzed the Iran deal, marriage vs a PhD, and (most importantly) mustaches in politics; Taft was the last of the mustached presidents. Roosevelt, the second to last.

An approach to teaching statistics to 8th graders

There are two main obstacles students must overcome to learn statistics: one mathematical, one philosophical. The math is difficult and will be new to a high schooler, and (philosophically) it is rarely obvious what is the true, underlying cause and what is the random accident behind the statistical variation. This philosophical confusion (cause and effect, essence and accident) lurks in the background of the work of even the greatest minds. Accepting and dealing with it is at the root of the best research, separating it from blind formula-following, but it confuses the young who try to understand the subject. The young student (especially the best ones) will worry about these issues, compounding the difficulty posed by the math. Thus, I’ll try to teach statistics with a problem or two where the distinction between essential cause and random variation is uncommonly clear.

A good case to get around the philosophical issue is gambling with crooked dice. I show the class a pair of normal-looking dice and a caliper and demonstrate that the dice are not square; virtually every store-bought die is uneven, so finding an uneven one is not a problem. After checking my caliper, students will readily accept that after enough tests some throws will show up more often than others, and will also accept that there is a degree of randomness in the throw, so that any few throws will look pretty fair. I then justify the need for statistics as an attempt to figure out if the dice are loaded in a case where you don’t have a caliper, or are otherwise prevented from checking the dice. The evenness of the dice is the underlying truth, the random part is in the throw, and you want to grasp them both.

To simplify the problem, mathematically, I suggest we just consider a crooked coin throw with only two outcomes, heads and tails, not that I have a crooked coin; you’re to try to figure out if the coin is crooked, and if so how crooked. A similar problem appears with political polling: trying to figure out who will win an election between two people (Mr Head, and Ms Tail) from a sampling of only a few voters. For an honest coin or an even election, on each throw, there is a 50-50 chance of throwing a head, or finding a supporter of Mr Head. If you do it twice, there is a 25% chance of two heads, a 25% chance of throwing two tails and a 50% chance of one of each. That’s because there are four possibilities and two ways of getting a Head and a Tail.


Pascal’s triangle

After we discuss the process for a while, and I become convinced they have the basics down, I show the students Pascal’s triangle. Pascal’s triangle shows the various outcomes and the number of ways each can be arrived at. Thus, for example, we see that by the time you’ve thrown the coin 6 times, or called 6 people, there are 64 distinct outcomes, of which 20 (about 1/3) are the expected, even result: 3 heads and 3 tails. There is also only one way to get all heads and one way to get all tails. Thus, it is more likely than not that an honest coin will not come up even after 6 (or more) throws, and a poll in an even election will likely not come up even after 6 (or more) calls. The lack of an even result is thus hardly convincing evidence that the coin is crooked, or that the election has a clear winner. On the other hand, there is only a 1/32 chance of getting all heads or all tails (2/64). If you call 6 people, and all claim to be for Mr Head, it is likely that Mr Head is the favorite. Similarly, in a sport where one side wins 6 out of 6 times, there is a good possibility of a real underlying cause: a crooked coin, or one team really being better than the other.
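The triangle itself takes only a few lines to build; row 6 reproduces the counts above:

```python
# Pascal's triangle row n gives the number of ways to get each
# head-count in n coin tosses.
def pascal_row(n):
    row = [1]
    for _ in range(n):
        # Each new entry is the sum of the two entries above it.
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row

row6 = pascal_row(6)
print(row6)          # [1, 6, 15, 20, 15, 6, 1]
print(sum(row6))     # 64 possible outcomes
print(row6[3] / 64)  # chance of exactly 3 heads: 20/64, about 0.31
print(2 / 64)        # chance of all heads or all tails: 1/32
```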

And now we get to how significant is significant. If you threw 4 heads and 2 tails out of 6 throws, we can accept that this is not significant because there are 15 ways to get this outcome (or 30 if you also include 2 heads and 4 tails) and only 20 ways to get the even outcome of 3-3. But what if you threw 5 heads and one tail? In that case the ratio is 6/20, and the odds of this being significant are better; similarly if you called potential voters and found 5 Head supporters and 1 for Tail. What do you do? I suggest you take the ratio as 12/20, the ratio of both ways to get this outcome to the count for the most likely outcome. Since 12/20 = 60%, you could say there is a 60% chance that this result is random, and a 40% chance it is significant. Statisticians call this “suggestive,” at slightly over one standard deviation. A standard deviation, also known as σ (sigma), is a minimal standard of significance: it is where the one-tailed value is 1/2 of the most likely value. In this case, where 6 tosses come in as 5 and 1, we find the ratio to be 6/20. Since 6/20 is less than 1/2, we meet this very minimal standard for “suggestive.” A more normative standard is when the value is 5%. Clearly 6/20 does not meet that standard, but 1/20 does; for you to conclude that the coin is likely fixed after only 6 throws, all 6 have to come up heads or tails.


From xkcd. It’s typical in science to say that <5% chances, p < .05, are significant. If things don’t quite come out that way, you redo.

If you graph the possibilities from a large Pascal’s triangle, they will resemble a bell curve; in many real cases (not all) your experimental data variation will also resemble this bell curve. From a larger Pascal’s triangle, or a large bell curve, you will find that the 5% value occurs at about σ = 2, that is, at about twice the distance from the average as where σ = 1. Generally speaking, the number of observations you need is inversely proportional to the square of the difference you are looking for. Thus, if you think a two-headed coin is in use, it will only take six or seven observations; if you think the coin is loaded by 10%, it will take some 600 throws to show it.

In many (most) experiments, you can not easily use Pascal’s triangle to get sigma, σ. Thus, for example, if you want to see if 8th graders are taller than 7th graders, you might measure the heights of the students in both classes and take an average for each, but you might wonder what sigma is, so you can tell if the difference is significant or just random variation. The classic mathematical approach is to calculate sigma as the square root of the average of the square of the difference of the data from the average. Thus, if the average is <h> = ∑h/N, where h is the height of a student and N is the number of students, we can say that σ = √(∑(<h> − h)²/N). This formula is found in most books. Significance is either specified as 2 sigma, or some close variation. As convenient as this is, my preference is for this graphical version. It also shows if the data is normal, an important consideration.
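As a worked example of the formula (the heights below are made-up numbers, in inches):

```python
# Standard deviation computed directly from the formula in the text:
# sigma = sqrt(mean of squared deviations from the mean).
heights = [58.0, 60.0, 61.0, 59.0, 62.0, 60.0]   # hypothetical data

mean = sum(heights) / len(heights)
sigma = (sum((mean - h) ** 2 for h in heights) / len(heights)) ** 0.5

print(mean)    # 60.0
print(sigma)   # about 1.29
```

A difference between two class averages smaller than about 2 sigma would then be hard to call significant.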

If you find the data is not normal, you may decide to break the data into sub-groups. E.g. if you look at heights of 7th and 8th graders and you find a lack of normal distribution, you may find you’re better off looking at the heights of the girls and boys separately. You can then compare those two subgroups to see if, perhaps, only the boys are still growing, or only the girls. One should not pick a hypothesis and then test it, but collect the data first and let the data determine the analysis. This was the method of Sherlock Holmes, a very worthwhile read.

Another good trick for statistics is to use a linear regression. If you are trying to show that music helps to improve concentration, try to see if more music improves it more. You want to find a linear relationship, or at least a plausible curved relationship. Generally there is a relationship if the correlation coefficient relating y − <y> to x − <x> is 0.9 or so. A discredited study where the author did not use regressions, but should have, and did not report sub-groups, but should have, involved cancer and genetically modified foods. The author found cancer increased in one sub-group, and publicized that finding, but didn’t mention that cancer didn’t increase in nearby sub-groups of different doses, and decreased in a nearby sub-group. By not including the sub-groups, and not doing a regression, the author misled people for two years, perhaps out of a misguided attempt to help. Don’t do that.
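A sketch of such a check, with made-up music-vs-concentration data; r here is the standard correlation coefficient, the usual measure of how nearly linear a relationship is:

```python
# Least-squares check for a linear relationship. Data are invented
# for illustration: x might be hours of music, y a concentration score.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)

r = sxy / (sxx * syy) ** 0.5   # correlation coefficient, near 1 if linear
slope = sxy / sxx              # least-squares slope
print(r)
print(slope)
```

For this invented data, r comes out close to 1, which would support a real relationship; an r well below 0.9 would not.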

Dr. Robert E. Buxbaum, June 5-7, 2015. Lack of trust in statistics, or of understanding of statistical formulas should not be taken as a sign of stupidity, or a symptom of ADHD. A fine book on the misuse of statistics and its pitfalls is called “How to Lie with Statistics.” Most of the examples come from advertising.

Zombie invasion model for surviving plagues

Imagine a highly infectious, people-borne plague for which there is no immunization or ready cure, e.g. leprosy or smallpox in the 1800s, or bubonic plague in the 1500s, assuming the carrier was fleas on people (there is a good argument that people-fleas were the carrier, not rat-fleas). We’ll call these plagues zombie invasions to highlight that there is no way to cure these diseases or protect against them aside from quarantining the infected or killing them. Classical leprosy was treated by quarantine.

I propose to model the progress of these plagues to know how to survive one, should it arise. I will follow a recent paper out of Cornell that highlighted a fact, perhaps forgotten in the 21st century, that population density makes a tremendous difference in the rate of plague-spread. In medieval Europe, plagues spread fastest in the cities because a city dweller interacted with far more people per day. I’ll attempt to simplify the mathematics of that paper without losing any of the key insights. As often happens when I try this, I’ve found a new insight.

Assume that the density of zombies per square mile is Z, and the density of susceptible people is S in the same units, susceptible population per square mile. We define a bite transmission likelihood, ß so that dS/dt = -ßSZ. The total rate of susceptibles becoming zombies is proportional to the product of the density of zombies and of susceptibles. Assume, for now, that the plague moves fast enough that we can ignore natural death, immunity, or the birth rate of new susceptibles. I’ll relax this assumption at the end of the essay.

The rate of zombie increase will be less than the rate of susceptible population decrease because some zombies will be killed or rounded up. Classically, zombies are killed by shot-gun fire to the head, by flame-throwers, or removed to leper colonies. However zombies are removed, the process requires people. We can say that, dR/dt = kSZ where R is the density per square mile of removed zombies, and k is the rate factor for killing or quarantining them. From the above, dZ/dt = (ß-k) SZ.

We now have three coupled, non-linear differential equations. As a first step to solving them, we set the derivatives to zero and calculate the end result of the plague: what happens as t –> ∞. Using just the first equation and setting dS/dt = 0, we see that, since ß ≠ 0, the end result is SZ = 0. Thus, there are only two possible end-outcomes: either S = 0 and we’ve all become zombies, or Z = 0 and the zombies are all dead or rounded up. Zombie plagues can never end in mixed live-and-let-live situations. Worse yet, rounded-up zombies are dangerous.

If you start with a small fraction of infected people Z0/S0 <<1, the equations above suggest that the outcome depends entirely on k/ß. If zombies are killed/ rounded up faster than they infect/bite, all is well. Otherwise, all is zombies. A situation like this is shown in the diagram below for a population of 200 and k/ß = .6


Fig. 1, Dynamics of a normal plague (light lines) and a zombie apocalypse (dark) for 199 uninfected and 1 infected. The S and R populations are shown in blue and black respectively. Zombie and infected populations, Z and I, are shown in red; k/ß = 0.6 and τ = tNß. With zombies, the S population disappears. With normal infection, the infected die and some S survive.

Sorry to say, things get worse for higher initial ratios, Z0/S0 >> 0. In these cases, you can kill zombies faster than they infect you, and the last susceptible person will still be infected before the last zombie is killed. To analyze this, we create a new parameter, P = Z + (1 – k/ß)S, and note that dP/dt = 0 for all S and Z; the path of possible outcomes will always be along a path of constant P. We already know that, for any zombies to survive, S = 0. We now use algebra to show that the final concentration of zombies will be Z = Z0 + (1 – k/ß)S0. Free zombies survive so long as the following ratio is non-zero: Z0/S0 + 1 – k/ß. If Z0/S0 = 1, a situation that could arise if a small army of zombies breaks out of quarantine, you’ll need a high kill ratio, k/ß > 2, or the zombies take over. It’s seen to be harder to stop a zombie outbreak than to stop the original plague. This is a strong motivation to kill any infected people you’ve rounded up, a moral dilemma that appears in some plague literature.
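A crude Euler integration of the three equations (the rates and step size below are chosen for illustration, with k/ß = 0.6 as in the figure) confirms the algebra: S goes to zero, and the surviving free-zombie density comes out to Z0 + (1 – k/ß)S0 = 80.6 for 199 susceptibles and 1 zombie:

```python
# Euler integration of the model above: dS/dt = -b*S*Z,
# dZ/dt = (b - k)*S*Z, dR/dt = k*S*Z. Here b stands for ß.
def simulate(S0=199.0, Z0=1.0, b=0.001, k=0.0006, dt=0.05, steps=40_000):
    S, Z, R = S0, Z0, 0.0
    for _ in range(steps):
        bite = b * S * Z * dt    # susceptibles bitten this step
        cull = k * S * Z * dt    # zombies killed or quarantined
        S -= bite
        Z += bite - cull
        R += cull
    return S, Z, R

S, Z, R = simulate()
print(round(S, 4))   # 0.0: no susceptibles survive
print(round(Z, 2))   # 80.6, matching Z0 + (1 - k/b)*S0
```

Note that the scheme conserves Z + (1 – k/ß)S exactly at each step, so the final zombie count matches the algebraic prediction regardless of step size.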

Figure 1, from the Cornell paper, gives a sense of the time necessary to reach the final state of S = 0 or Z = 0. For k/ß of 0.6, we see that it takes a dimensionless time τ of 25 or so to reach this final, steady state of all zombies. Here, τ = tNß and N is the total population; it takes less real time to reach τ = 25 if N is high than if N is low. We find that the best course in a zombie invasion is to head for the country, hoping to find a place where N is vanishingly low, or (better yet) where Z0 is zero. This was the main conclusion of the Cornell paper.

Figure 1 also shows the progress of a more normal disease, one where a significant fraction of the infected die on their own or develop a natural immunity and recover. As before, S is the density of the susceptible and R is the density of the removed + recovered, but here I is the density of those infected by the non-zombie disease. The time-scales are the same, but the outcome is different. As before, by τ = 25 the infected are entirely killed off or isolated, I = 0, even though ß > k. Some non-infected, susceptible individuals survive as well.

From this observation, I now add a new conclusion, not from the Cornell paper. It seems clear that more immune people will be in the cities. I’ve also noted that τ = 25 will be reached faster in the cities, where N is large, than in the country where N is small. I conclude that, while you will be worse off in the city at the beginning of a plague, you’re likely better off there at the end. You may need to get through an intermediate zombie zone, and you will want to get the infected to bury their own, but my new insight is that you’ll want to return to the city at the end of the plague and look for the immune remnant. This is a typical zombie story-line; it should be the winning strategy if a plague strikes too. Good luck.

Robert Buxbaum, April 21, 2015. While everything I presented above was done with differential calculus, the original paper showed a more-complete, stochastic solution. I’ve noted before that difference calculus is better. Stochastic calculus shows that, if you start with only one or two zombies, there is still a chance to survive even if ß/k is high and there is no immunity. You’ve just got to kill all the zombies early on (gun ownership can help). Here’s my statistical way to look at this. James Sethna, lead author of the Cornell paper, was one of the brightest of my Princeton PhD chums.

Brass monkey cold

In case it should ever come up in conversation, only the picture at left shows a brass monkey; the other is a bronze statue of some sort of primate. A brass monkey is a rack used to stack cannon balls into a face-centered pyramid. A cannon crew could fire about once per minute, and an engagement could last 5 hours, so you could hope to go through a lot of cannon balls in an engagement (assuming you survived).


Small brass monkey. The classic monkey might have 9 x 9 or 10×10 cannon balls on the lower level.


Bronze sculpture of a primate playing with balls — but look what the balls are sitting on: it’s a dada art joke.

But brass monkeys typically show up in conversation in terms of it being cold enough to freeze the balls off of a brass monkey, and if you imagine an ornamental statue, you’d never guess how cold that could be. Well, for a cannonball holder, the answer has to do with the thermal expansion of metals. Cannon balls were made of iron, and the classic monkey was made of brass, an alloy with a much greater thermal expansion than iron. As the temperature drops, the brass monkey contracts more than the iron balls. When the drop is large enough, the balls fall off and roll around.

The thermal expansion coefficient of brass is 18.9 × 10^-4/°C, while that of iron is 11.7 × 10^-4/°C. The difference, 7.2 × 10^-4/°C, is what determines the key temperature. Now consider a large brass monkey, one with 10 × 10 holes on the lower level, 81 at the second, and so on. Though it doesn’t affect the result, we’ll consider a monkey that holds 12 lb cannon balls, a typical size from 1750 to 1830. Each 12 lb ball is 4.4″ in diameter at room temperature, 20°C in those days. At 20°C, this monkey is about 44″ wide. The balls will fall off when the monkey shrinks more than the balls do by about 1/3 of a ball diameter, 1.5″.

We can calculate ∆T, the temperature change, °C, that is required to lower the width-difference by 1.5″ as follows:


-1.5″ = ∆T × 44″ × 7.2 × 10^-4/°C

We find that ∆T = -47°C. The temperature where this happens is 47 degrees cooler than 20°C, or -27°C. That’s about -17°F, not an unheard-of temperature on land, e.g. in Detroit, but at sea the temperature is rarely much colder than 0°C or 32°F, the temperature where water freezes. If it gets to -17°F at sea, something is surely amiss. To avoid this problem, land-based army cannon crews used a smaller brass monkey — e.g. the 5×5 shown. This stack holds 1/7 as many balls, but holds them down to -74°C, a really cold temperature.
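For what it’s worth, the arithmetic can be checked in a few lines; the code below simply re-uses the numbers from the text (the expansion-coefficient difference, the 4.4″ ball, and the 1/3-diameter fall-off criterion):

```python
# Re-check of the brass monkey temperatures, using the values given in
# the text: expansion difference 7.2e-4 per degC, 4.4-inch 12 lb balls.

d_alpha = 7.2e-4       # brass-minus-iron expansion difference, per degC
ball_d = 4.4           # ball diameter in inches at 20 degC
shrink = 1.5           # ~1/3 of a diameter; balls roll off past this

temps = {}
for n in (10, 5):                     # balls per side of the bottom layer
    width = n * ball_d                # monkey width in inches at 20 degC
    dT = -shrink / (width * d_alpha)  # temperature change needed, degC
    temps[n] = round(20 + dT, 1)      # temperature where the balls fall
print(temps)   # {10: -27.3, 5: -74.7}
```

The wider the monkey, the smaller the temperature drop needed to shed the balls, which is why the small 5×5 rack is the safer choice on land.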

Robert E. Buxbaum, February 21, 2015. Some fun thoughts: Convince yourself that the key temperature is independent of the size of the cannon balls — that is, that I didn’t need to choose 12-pounders. A bit more advanced: what is the equation for the number of balls on any particular base-size monkey? Show that the packing density is no more efficient if the bottom layer were an equilateral triangle rather than a square. If you liked this, you might want to know how much wood a woodchuck chucks if a woodchuck could chuck wood, or about the relationship between mustaches and WWII diplomacy.
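One of those fun thoughts has a tidy answer: a square-based monkey with n balls per side holds the square pyramidal number n(n+1)(2n+1)/6 of balls. A quick check, which also confirms the 1/7 ratio quoted above:

```python
# Balls on a square-pyramid stack with n balls per side of the base:
# 1^2 + 2^2 + ... + n^2 = n(n+1)(2n+1)/6.

def balls(n):
    return n * (n + 1) * (2 * n + 1) // 6

print(balls(10), balls(5))       # 385 55
print(balls(10) // balls(5))     # 7 -- the "1/7 as many balls" ratio
```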

Einstein failed high-school math – not.

I don’t know quite why people persist in claiming that Einstein failed high-school math. Perhaps it’s to put down teachers — who clearly can’t teach or recognize genius — or perhaps to stake a claim to a higher understanding that’s masked by ADHD, a disease Einstein is supposed to have had. But, sorry to say, it ain’t true. Here’s Einstein’s diploma, 1896. His math and physics scores are perfect. Only his French seems to have been lacking. He would have been 17 at the time.


Albert Einstein’s high school diploma, 1896.

Robert Buxbaum, December 16, 2014. Here’s Einstein relaxing in Princeton. Here’s something on black holes, and on High School calculus for non-continuous functions.

Patterns in climate; change is the only constant

There is a general problem when looking for climate trends: you have to look at weather data. That’s a problem because weather data goes back thousands of years, and it’s always changing. As a result, it’s never clear what start year to use for the trend. If you start too early or too late, the trend disappears. If you start your trend line in a hot year, like the late Roman period, the trend will show global cooling. If you start in a cold year, like the early 1970s or the Little Ice Age (1500-1800), you’ll find global warming: perhaps too much. Begin 10-15 years ago, and you’ll find no change in global temperatures.

Ice coverage data shows the same problem: take the Canadian Arctic ice maximums, shown below. If you start your regression in 1980-83, the record ice years (green), you’ll see ice loss. If you start in 1971, the year of minimum ice (red), you’ll see ice gain. It might also be nice to incorporate physics through a computer model of the weather, but this method doesn’t seem to help. Perhaps that’s because the physics models generally have to be fed coefficients calculated from the trend line. Using the best computers and a trend line showing ice loss, the US Navy predicted, in January 2006, that the Arctic would be ice-free by 2013. It didn’t happen; a new prediction is 2016 — something I suspect is equally unlikely. Five years ago the National Academy of Sciences predicted global warming would resume in the next year or two; it didn’t either. Garbage in, garbage out, as they say.
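The start-year effect is easy to demonstrate with made-up numbers shaped like the Canadian record — a minimum near 1971, a maximum near 1980. The data below are invented for illustration, not real ice measurements:

```python
# Made-up "ice extent" series with a 1971 minimum and a 1980 maximum.
# The fitted least-squares trend flips sign with the choice of start year.

years = [1971, 1975, 1980, 1983, 1990, 2000, 2010, 2014]
ice   = [8.0,  9.0,  10.5, 10.0, 9.2,  9.4,  9.1,  9.3]

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

up = slope(years, ice)            # start at the 1971 minimum: "ice gain"
i = years.index(1980)
down = slope(years[i:], ice[i:])  # start at the 1980 maximum: "ice loss"
print(up > 0, down < 0)           # True True
```

Same data, opposite headlines, depending only on where the regression begins.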


Arctic ice in northern Canadian waters, 1971-2014, from the Canadian Ice Service. 2014 is not totally in yet, but is likely to exceed 2013. If you are looking for trends, in what year do you start?

The same trend problem appears with predicting sea temperatures and El Niño, a Christmastime warming current in the Pacific Ocean. This year, 2013-14, was predicted to be a super El Niño: an exceptionally hot, stormy year with exceptionally strong sea currents. Instead, there was no El Niño, and many cities saw record cold — Detroit by 9 degrees. The Antarctic ice hit record levels, stranding a ship of global-warming activists. There were record few hurricanes. As I look at the Pacific sea temperature from 1950 to the present, below, I see change, but no pattern or direction: El Nada (the nothing). If one did a regression analysis, the slope might be slightly positive or negative, but r-squared, the measure of fit, would be near zero. There is no real directionality, just noise, if 1950 is the start date.


El Niño and La Niña since 1950. There is no sign that they are coming more often, or stronger. Nor is there clear evidence that the ocean is warming.

This appears to be as much a fundamental problem in applied math as in climate science: when looking for a trend, where do you start, how do you handle data confidence, and how do you prevent bias? A thought I’ve had is to weight the regression by the confidence in the data. The Canadian Ice Service, for example, is less confident about its older data than its new data, as shown by the grey lines in the chart above. It would be nice if some form of this confidence could be incorporated into the regression trend analysis, but I’m not sure how to do this right.
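One standard way to fold data confidence into a trend line is weighted least squares: weight each point by the inverse square of its uncertainty, so the shaky early data pulls the fitted line less. A minimal sketch with invented numbers (this is textbook weighted regression, not anything from the Ice Service):

```python
# Weighted least-squares slope: each point weighted by 1/sigma^2, so
# low-confidence (large-sigma) older data influences the trend less.
# Years, values, and uncertainties below are invented for illustration.

def wslope(xs, ys, ws):
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    num = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    den = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    return num / den

years = [1971, 1980, 1990, 2000, 2010]
ice   = [8.0, 10.5, 9.2, 9.4, 9.1]
sigma = [2.0, 1.5, 0.5, 0.3, 0.2]      # older data: larger uncertainty
ws = [1.0 / s ** 2 for s in sigma]

print(wslope(years, ice, ws))          # confidence-weighted trend
print(wslope(years, ice, [1.0] * 5))   # ordinary, unweighted trend
```

With equal weights this reduces to the ordinary regression; with inverse-variance weights, the uncertain 1970s points barely move the line. It answers the weighting question, though not the start-year question.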

It’s not so much that I doubt global warming, but I’d like a better explanation of the calculation. Weather changes: how do you know when you’re looking at climate and not weather? The president of the US claimed that the science is established, and Prince Charles of England called climate skeptics headless chickens, but the science is certainly not predictive, and prediction is the normal standard of knowledge. Neither offered a statement of how one might back up these claims. If this is global warming, I’d expect it to be warm.

Robert Buxbaum, Feb 5, 2014. Here’s a post I’ve written on the scientific method, and on dealing with abnormal statistics. I’ve also written about an important recent statistical fraud against genetically modified corn. As far as energy policy, I’m inclined to prefer hydrogen over batteries, and nuclear over wind and solar. The president has promoted the opposite policy — for unexplained, “scientific” reasons.