How Likely is a Lopsided League?

At various times, some divisions within sports leagues have been much stronger than others. But how likely is this phenomenon?

Sep 15, 2023

Welcome to Fiddler on the Proof! The Fiddler is the spiritual successor to FiveThirtyEight’s The Riddler column, which ran for eight years under the stewardship of myself and Ollie Roeder.

Each week, I present mathematical puzzles intended to both challenge and delight you. Beyond these, I also hope to share occasional writings about the broader mathematical and puzzle communities.

Puzzles come out Friday mornings (8 a.m. Eastern time). Most can be solved with careful thought, pencil and paper, and the aid of a calculator. Many include “extra credit,” where the analysis gets particularly hairy or where you might turn to a computer for assistance.

I’ll also give a shoutout to 🎻 one lucky winner 🎻 of the previous week’s puzzle, chosen randomly from among those who submit their solution before 11:59 p.m. the Monday after that puzzle was released. I’ll do my best to read through all the submissions and give additional shoutouts to creative approaches or awesome visualizations, the latter of which could receive 🎬 Best Picture Awards 🎬.

This Week’s Fiddler

From Dave Moran comes a look at the lopsided standings we often see in sports:

About halfway through the current Major League Baseball season, all five teams in the American League East division had better records (i.e., winning percentages, or percent of games won) than all five teams in the American League Central division.

Inspired by this surprising fact, suppose Fiddler League Baseball has six divisions, with five teams in each division. For simplicity, further suppose each team has a winning percentage chosen randomly, uniformly, and independently between zero percent and 100 percent.

Let’s look at two divisions: The Enigma League East division and the Enigma League Central division. What is the probability that every team in the Enigma League East division has a higher winning percentage than every team in the Enigma League Central division?
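If you want to sanity-check a pencil-and-paper answer, a quick simulation is one option. The following is a minimal Monte Carlo sketch under the puzzle's stated assumptions (independent, uniform winning percentages). The function name, trial count, and seed are illustrative choices, not part of the puzzle.

```python
import random

def east_beats_central(trials=200_000, teams=5, seed=0):
    """Estimate the chance that every East team outranks every Central team."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        east = [rng.random() for _ in range(teams)]
        central = [rng.random() for _ in range(teams)]
        # Every East team beats every Central team exactly when the
        # weakest East team beats the strongest Central team.
        if min(east) > max(central):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    print(east_beats_central())
```

Because the event is fairly rare, the estimate is noisy for small trial counts, so it is best treated as a rough check rather than a derivation.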


Extra Credit

Dave Moran also came up with some Extra Credit for this puzzle:

Among all six divisions in Fiddler League Baseball, what is the probability that there exist two divisions such that every team in one division has a higher winning percentage than every team in another division?

(Note that this includes cases where multiple divisions are better or worse than others, such as having two divisions that both have higher winning percentages than some third division.)
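The Extra Credit can be explored the same way. This is a hedged sketch that draws all 30 winning percentages and checks every ordered pair of divisions. The helper name and default parameters are assumptions made for illustration.

```python
import random
from itertools import permutations

def any_division_sweep(trials=100_000, divisions=6, teams=5, seed=0):
    """Estimate the chance that some division entirely outranks another."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        league = [[rng.random() for _ in range(teams)] for _ in range(divisions)]
        lows = [min(division) for division in league]
        highs = [max(division) for division in league]
        # Division i sweeps division j when i's worst team beats j's best team.
        if any(lows[i] > highs[j] for i, j in permutations(range(divisions), 2)):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    print(any_division_sweep())
```

Comparing only each division's minimum against each other division's maximum avoids checking all 25 team-versus-team matchups per pair.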


Making the Rounds

There’s so much more puzzling goodness out there, I’d be remiss if I didn’t share some of it here. This week, I’m sharing the latest TED-Ed riddle from Dan Finkel. Definitely check out the video, which (spoiler alert!) includes the solution at the end. Here’s a summarized version of the puzzle:

When you roll two fair dice, the outcomes for their sum (from two to 12) follow a triangular probability distribution, with seven being the most likely outcome.
Suppose you have two fair dice (i.e., all six faces come up with equal probability), but now it’s up to you to assign whole numbers for each die’s six faces. The faces on one die can be whole numbers from one to four—yes, this maximum is four. But the faces on the other die can be any whole numbers. Note that duplicate values are allowed on each die.
How can you assign values to these faces so that the probability distribution remains unchanged, with precisely the same triangular shape?
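If you'd like to test candidate dice without peeking at the video's solution, a small checker can help. The sketch below compares the sum distribution of any two six-sided dice with that of two standard dice. The function names are hypothetical, and the example at the end is a deliberately wrong guess.

```python
from collections import Counter
from itertools import product

def sum_distribution(die_a, die_b):
    """Count how often each total appears over all face pairings."""
    return Counter(a + b for a, b in product(die_a, die_b))

# Reference distribution for two standard dice (totals 2 through 12).
STANDARD = sum_distribution(range(1, 7), range(1, 7))

def matches_standard(die_a, die_b):
    """True if the relabeled dice reproduce the standard sum distribution."""
    if len(die_a) != 6 or len(die_b) != 6:
        return False
    # The first die is restricted to whole numbers from one to four.
    if not all(1 <= face <= 4 for face in die_a):
        return False
    return sum_distribution(die_a, die_b) == STANDARD

# A deliberately wrong guess: the largest possible total is only 10.
print(matches_standard([1, 2, 3, 4, 4, 4], [1, 2, 3, 4, 5, 6]))  # False
```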

This is a neat puzzle in its own right. But what’s more, I was delighted to hear that it was related to Dan by my colleague over at Amplify Education, Eda Aydemir. Great find, Eda!

Last Week’s Fiddler

Congratulations to the (randomly selected) winner from last week: 🎻 Ryan Lafitte 🎻, from Tucker, Georgia. I received 62 timely submissions, of which 50 were correct—good for an 81 percent solve rate.

Last week, you analyzed a pattern on a square weaving loom with N hooks on each side, evenly spaced from one corner to another (i.e., there were two hooks on the two corners and N−2 hooks between them). The hooks along one side were labeled A₁ through A_N, the hooks on the next clockwise side B₁ through B_N (with A_N and B₁ denoting the same hook), the hooks on the third clockwise side C₁ through C_N, and the hooks on the final side D₁ through D_N.

Next, pairs of hooks were connected via elastic bands. In particular, A₁ and B₁ were connected, as were A₂ and B₂, A₃ and B₃, and so on, up to A_N and B_N. When N was 100, the loom looked like this:

A square loom. Points along the left edge are connected via green lines to points along the bottom edge, forming a curved shape.

As N increased, what was the shape of the curve formed by the edges of the bands?

A few readers thought it was part of a circle or ellipse. Someone else thought it was a catenary. Six readers guessed it was a hyperbola, and that the left and bottom edges of the square were the two asymptotes. However, this couldn’t have been the case, since the curve passed through the top left and bottom right corners of the square, whereas hyperbolas never quite reach their asymptotes.

As it turned out, none of these guesses was correct. However, there were quite a few different ways to describe the correct answer. But before we get to those, let’s derive an equation for the curve.

For simplicity, let’s say the square is a unit square, so that its four vertices are the points (0, 0), (1, 0), (1, 1), and (0, 1). Each band connects a point (p, 0) to (0, 1−p) for some value of p between 0 and 1. Meanwhile, the points along such a band can be represented as a linear combination of those two endpoints: q·(p, 0) + (1−q)·(0, 1−p), where q ranges from 0 to 1. We can write this as a single point: (pq, (1−p)·(1−q)). So if you pick all the different pairs of values for p and q between 0 and 1, you’ll get the green region in the figure above.

Next, to generate the curve atop that green region, let’s look at the x-coordinate of this generic point, pq, and let’s call that quantity x. The corresponding y-coordinate is (1−p)·(1−q), which we can rewrite as (1−p)·(1−x/p), or 1−p−x/p+x. For a fixed value of x, a little calculus (setting the derivative with respect to p, which is −1+x/p², equal to zero) will confirm that this expression for y is maximized when p = √x. So the points along the topmost curve have the form (x, (1−√x)²), which means the curve can be represented by the equation y = (1−√x)².
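The maximization step can also be checked numerically. Here is a minimal sketch that, for a few values of x, brute-forces the tallest band over a grid of p values and compares it with (1−√x)². The function names and grid size are illustrative assumptions.

```python
import math

def band_height(p, x):
    """Height at horizontal position x of the band from (p, 0) to (0, 1 - p)."""
    return (1 - p) * (1 - x / p)

def envelope_height(x, steps=100_000):
    """Largest band height at x, searched over a grid of p values in [x, 1]."""
    # p must be at least x so that q = x / p stays between 0 and 1.
    best = 0.0
    for k in range(steps + 1):
        p = x + (1 - x) * k / steps
        best = max(best, band_height(p, x))
    return best

# Each row shows x, the brute-force maximum, and the closed form (1 - sqrt(x))^2.
for x in (0.1, 0.25, 0.5, 0.9):
    print(x, envelope_height(x), (1 - math.sqrt(x)) ** 2)
```

The last two columns should agree to many decimal places, which is consistent with the maximum occurring at p = √x.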

That was one way to represent the solution, but there were many others. You could have expanded the equation so it became y = 1+x−2√x. Or you could have written it more symmetrically as √x + √y = 1, which some solvers described as a squircle (I gave credit for this, although I usually think of squircles as involving exponents greater than 2).

But perhaps the simplest way to describe this curve was “parabola.” Of course, it didn’t look like the parabola you’d see in a high school algebra textbook because it was rotated by 45 degrees, as illustrated by last week’s winner, Ryan Lafitte, and 🎬 Tom Keith 🎬. Here was Tom’s diagram:

A zoomed out version of the original loom. The curve from the top left corner to the bottom right corner is shown to be part of a larger parabola. The bottom and left edges of the loom are tangent to the parabola.

To prove this, solver Peter Exterkate transformed the coordinate system, defining the new variables t = (x−y)/√2 and u = (x+y)/√2. According to Peter, the parabola had the equation u = (2t²+1)/(2√2).
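Peter's rotated equation is easy to spot-check. The sketch below samples points on y = (1−√x)², applies the change of variables to t and u, and measures how far each point falls from u = (2t²+1)/(2√2). The helper names and sample count are assumptions made for illustration.

```python
import math

def rotated(x):
    """Map a point on y = (1 - sqrt(x))^2 to the rotated (t, u) coordinates."""
    y = (1 - math.sqrt(x)) ** 2
    t = (x - y) / math.sqrt(2)
    u = (x + y) / math.sqrt(2)
    return t, u

def max_residual(samples=1000):
    """Largest gap between u and (2t^2 + 1) / (2 sqrt(2)) along the curve."""
    worst = 0.0
    for k in range(samples + 1):
        t, u = rotated(k / samples)
        worst = max(worst, abs(u - (2 * t * t + 1) / (2 * math.sqrt(2))))
    return worst

if __name__ == "__main__":
    print(max_residual())
```

A residual at the level of floating-point rounding is what you would expect if the curve really is that parabola.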

I think solver Josh Silverman’s haiku summed this puzzle up rather nicely:

Threads weave secrets tight,
Mystery curve in plain sight,
Shifted view, insight:
Parabola

Last Week’s Extra Credit

Congratulations to the (randomly selected) winner from last week: 🎻 Lowell Vaughn 🎻, from Bellevue, Washington. For the Extra Credit, I received 38 timely submissions, 36 of which were correct—a rather impressive 95 percent solve rate (for those bold enough to make the attempt).

For Extra Credit, there were four times as many bands placed on the weaving loom. In addition to the band connecting A₁ and B₁, other bands connected B₁ and C₁, C₁ and D₁, and D₁ and A₁. Similar quartets of bands were placed for all sets of hooks from 1 through N, for a total of 4N bands.

When N was 100, the loom looked like this:

The original loom, now with four such curves connecting each pair of adjacent sides. Each set of lines and curves is a different color. Between them all is a white region in the middle of the square.

As N increased, what fraction of the loom’s area lay between the four sets of bands? In other words, what fraction of the square above did the central white region make up?

Once you had the functional form of the four curves—or at least the form for one of them, since they were all rotations and reflections of each other—this became an exercise in integration. As with last week’s Fiddler, it was again useful to assume the loom was a unit square.

Now, there were many different ways to compute the area. Several solvers, like Michael Schubmehl, found the area of each of the four colored regions and then subtracted their overlap to find the total colored area. Finally, subtracting that from 1 (the total area of the square) gave the area of the white region.
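The "one minus the colored area" idea can be approximated without any integration. This is a rough grid-counting sketch, not Michael's actual method. It assumes each corner's colored region is the set of points on the corner side of that corner's curve, which for the corner at (0, 0) is √x + √y ≤ 1, and it reflects that condition for the other three corners.

```python
import math

def is_white(x, y):
    """True if (x, y) lies outside all four corner regions of the unit loom."""
    # The corner region at (0, 0) is sqrt(x) + sqrt(y) <= 1. The other three
    # follow by reflecting x -> 1 - x and/or y -> 1 - y.
    return all(
        math.sqrt(a) + math.sqrt(b) > 1
        for a in (x, 1 - x)
        for b in (y, 1 - y)
    )

def white_fraction(n=400):
    """Fraction of an n-by-n grid of cell midpoints that land in the white region."""
    hits = sum(
        is_white((i + 0.5) / n, (j + 0.5) / n)
        for i in range(n)
        for j in range(n)
    )
    return hits / (n * n)

if __name__ == "__main__":
    print(white_fraction())
```

A finer grid gives a better approximation at the cost of more computation.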

Here, I’ll share a more direct approach, adapted from Rohan Lewis’s write-up. We can divide the central white region into eight distinct, congruent regions, one of which is shown below in gray:

One eighth of the central white region of the loom is shaded gray.

This region occupies the space above the function f(x) = (1−√x)² and below the function g(x) = x, from x = 1/4 (where the two functions intersect) on the left to x = 1/2 on the right. Multiplying its area by 8 gives the area of the entire white region, which can be expressed as the following integral:

\(8\int_{\frac{1}{4}}^{\frac{1}{2}}(g(x)-f(x))\,dx=8\int_{\frac{1}{4}}^{\frac{1}{2}}(2\sqrt{x}-1)\,dx=\frac{8\sqrt{2}-10}{3}\)

That final expression was approximately 0.4379, which meant the central white region made up about 43.79 percent of the square.
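For anyone who would rather not integrate by hand, the value can be confirmed numerically. The following is a minimal sketch using a midpoint rule for the area between g(x) = x and f(x) = (1−√x)² from x = 1/4 to x = 1/2, multiplied by 8. The integrator name and step count are illustrative choices.

```python
import math

def f(x):
    """Lower boundary of the gray region: the loom curve."""
    return (1 - math.sqrt(x)) ** 2

def g(x):
    """Upper boundary of the gray region: the diagonal of the square."""
    return x

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] with n midpoint slices."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Area of one gray eighth, then scaled up to the full white region.
eighth = midpoint_integral(lambda x: g(x) - f(x), 0.25, 0.5)
print(8 * eighth, (8 * math.sqrt(2) - 10) / 3)
```

The two printed numbers should agree closely, both near 0.4379.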

Want to Submit a Puzzle Idea?

Then do it! Your puzzle could be the highlight of everyone’s weekend. If you have a puzzle idea, shoot me an email. I love it when ideas also come with solutions, but that’s not a requirement.

Jim

The problem states that team winning percentages are random, with an independent, uniform distribution between 0% and 100%. This problem statement is kind of nonsensical, because team winning percentages can't be independent since overall the league has to have an average winning percentage of 50%. Every time one team wins their opponent loses. You can't have a league where every team has a winning percentage over 50%, for example. Is independent really the correct problem statement?
