May 20, 2012

Will Today be the Day? Labor Predictor

One question I have on my mind a lot lately, as I’m sure every pregnant woman starts asking, is “what are the odds of my baby coming today?” or “in the next couple of days?”. The trouble is, it’s really hard to find any kind of answer to that question online. Some babies come early, some come late. Any that come between 37 weeks and 42 are considered ‘right on time’. Well, the math nerd in me wasn’t satisfied with that answer.

I previously found this chart online, which uses a normal distribution with a mean of 40 weeks (280 days) and a standard deviation of 10 days to estimate the probability of going into labor. Or N(280, 10²) for you statisticians out there. The normal distribution is symmetric, which would mean one’s odds of going into labor one day before one’s due date are the same as one day after. I suspect the actual probability distribution of spontaneous labor is closer to a left-skewed (negatively skewed) normal than to a plain normal distribution. A negative skew would mean one’s odds of a very premature labor are greater than one’s odds of a very postmature labor. After all, you could go into labor at 34 weeks (x = -42 days). According to the normal distribution N(280, 10²), 6 in every 1 million babies would be born at 34 weeks, and another 6 in every 1 million at 46 weeks (x = 42 days). Given that 4 million babies are born each year in the US, that would mean 24 babies born at 46 weeks gestation per year in the US alone! Of course these days doctors tend not to let women go more than 42 weeks due to health risks, so it’s impossible to say how far those women would have gone in their pregnancies. Still, I doubt 24 of them would have made it to 46 weeks. Baby’s got to run out of room eventually, and at some point the female body just can’t handle it anymore!
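The "6 in a million" and "24 babies" figures can be checked with a few lines of Python. This is a minimal sketch that assumes the figures come from reading the N(280, 10²) density at a single day as a per-day probability; the function name `normal_pdf` is made up for illustration.

```python
import math

def normal_pdf(x, mean=280.0, sd=10.0):
    """Density of N(mean, sd^2) at x, read here as probability per day."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

births_per_year = 4_000_000  # US figure quoted in the post

p_34_weeks = normal_pdf(280 - 42)  # 42 days before the due date
p_46_weeks = normal_pdf(280 + 42)  # 42 days after; identical by symmetry

print(f"per-day density at 34 weeks: {p_34_weeks:.2e}")
print(f"per-day density at 46 weeks: {p_46_weeks:.2e}")
print(f"implied births at 46 weeks per year: {births_per_year * p_46_weeks:.0f}")
```

The two densities are equal because the normal curve is symmetric about day 280, which is exactly the property being questioned here.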

A skewed normal and a normal distribution are very similar when you’re close to the middle (i.e. close to the due date). The two distributions diverge as you get further from the middle (i.e. further from the due date). I was really interested in knowing how likely labor was TODAY, approximately 6 weeks before my due date, so the normal distribution wasn’t going to cut it.

I wanted to estimate a skewed distribution, but how to do that without any data? Fortunately cites several studies which indicate the true likelihood is approximately normal, so I needed a skewed normal distribution that is close to N(280, 10²) – characteristic one. Our doctor also told us 10% of babies are delivered prematurely (before 37 weeks) – characteristic two. (The normal distribution N(280, 10²) predicts only 3% of babies will be delivered prematurely.) We also know that roughly half of pregnant women go into labor before their due date, and half afterwards – characteristic three. Skewed distributions have three parameters (location, scale and shape), so all I had to do was tweak these parameters until I had a distribution with all three characteristics. Should be easy, right?
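For reference, the usual skew-normal density and the three targets can be written out as below. The symbols are assumptions taken from the standard textbook parameterization, not from the post: ξ for location, ω for scale, α for shape, φ and Φ for the standard normal density and CDF, and F for the skew-normal CDF. Days are counted so that 259 is 37 weeks and 280 is the due date.

```latex
f(x;\xi,\omega,\alpha)
  = \frac{2}{\omega}\,
    \phi\!\left(\frac{x-\xi}{\omega}\right)
    \Phi\!\left(\alpha\,\frac{x-\xi}{\omega}\right)

% Characteristic one: close to the N(280, 10^2) density
f(x;\xi,\omega,\alpha) \approx \frac{1}{10}\,\phi\!\left(\frac{x-280}{10}\right)

% Characteristic two: about 10% of births before 37 weeks
F(259;\xi,\omega,\alpha) \approx 0.10

% Characteristic three: about half of births before the due date
F(280;\xi,\omega,\alpha) \approx 0.50
```

Setting α = 0 recovers the ordinary normal distribution, and a negative α pushes the longer tail to the left.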

Five hours later…

I wanted to create my model using Excel, rather than Matlab or R, two programs especially designed for statistics. I haven’t touched either in a while, and didn’t want to re-learn them. Excel has support for normal distributions, but nothing for skewed normal. That meant I had to implement the functions on my own, and my calculus skills are only slightly less rusty than my Matlab or R skills. At some point I probably should have given up and switched over to Matlab, but I was stubborn and determined to get it! It was a matter of pride.

In the end I came up with a skewed normal with location 295, scale 21 and shape -4. This distribution shows approximately 10% of babies will be premature, half of all pregnancies will end early while half will run late, and the squared error between the two distributions is less than 2 × 10⁻³. As another sanity check, it gives a mean delivery date of 279 days, or 39 weeks 6 days.
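These sanity checks can be roughly reproduced outside Excel. The sketch below assumes the standard skew-normal density with the post's parameters and gets the CDF by simple numerical integration; the helper names are made up for illustration, and this is not the original spreadsheet logic. The squared-error figure is left out because the post does not say how it was computed.

```python
import math

LOCATION, SCALE, SHAPE = 295.0, 21.0, -4.0  # parameters reported in the post

def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def skew_pdf(x):
    """Standard skew-normal density (assumed form, not confirmed by the post)."""
    z = (x - LOCATION) / SCALE
    return 2.0 / SCALE * std_normal_pdf(z) * std_normal_cdf(SHAPE * z)

def skew_cdf(x, steps=20000):
    """Midpoint-rule integral of the density from deep in the left tail up to x."""
    lower = LOCATION - 10.0 * SCALE  # density is negligible below this point
    width = (x - lower) / steps
    return width * sum(skew_pdf(lower + (i + 0.5) * width) for i in range(steps))

premature = skew_cdf(259.0)   # share of births before 37 weeks
before_due = skew_cdf(280.0)  # share of births before the due date

# Closed-form mean of a skew-normal: location + scale * delta * sqrt(2 / pi)
delta = SHAPE / math.sqrt(1.0 + SHAPE ** 2)
mean_day = LOCATION + SCALE * delta * math.sqrt(2.0 / math.pi)

print(f"premature: {premature:.1%}")
print(f"before due date: {before_due:.1%}")
print(f"mean delivery day: {mean_day:.1f}")
```

The printed shares come out in the neighborhood of the targets rather than exactly on them, which fits the "approximately" in the description above.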

My model (blue) as compared to the normal distribution (red). I plotted them both assuming ‘0 days’ as the due date instead of 280 to make it easier to read.

Interesting side note: while the model shows half of women go into labor before their due date, the day with the highest probability of spontaneous labor is 7 days after the due date, which matches conventional wisdom!
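The peak day can be found by scanning whole days and keeping the one with the highest density. This is a sketch assuming the standard skew-normal density with the post's parameters; the result lands about a week past the due date, and the exact day can shift by one depending on how a "day" is binned.

```python
import math

def skew_pdf(x, location=295.0, scale=21.0, shape=-4.0):
    """Standard skew-normal density with the post's parameters as defaults."""
    z = (x - location) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * big_phi

DUE_DAY = 280
candidate_days = range(DUE_DAY - 60, DUE_DAY + 43)  # whole days to scan

# Day with the highest density, then expressed relative to the due date
peak_day = max(candidate_days, key=skew_pdf)
print(f"most likely day: due date {peak_day - DUE_DAY:+d} days")
```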

So what does this mean for me? Given that zippy isn’t here yet, I have a 0.1% chance of going into labor today and a 1.36% chance of going into labor in the next seven days! That’s 30 times higher than the prediction I was getting with the normal distribution!
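"Given that the baby isn't here yet" is a conditional probability: the chance of labor in a window, divided by the chance of still being pregnant at the start of it. The sketch below compares the skewed model with N(280, 10²). The value of `today` is a hypothetical stand-in (exactly six weeks before the due date), so the output will not necessarily match the figures quoted above, and the helper names are made up for illustration.

```python
import math

DUE_DAY = 280

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mean=280.0, sd=10.0):
    return std_normal_cdf((x - mean) / sd)

def skew_pdf(x, location=295.0, scale=21.0, shape=-4.0):
    z = (x - location) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return 2.0 / scale * phi * std_normal_cdf(shape * z)

def skew_cdf(x, steps=20000):
    """Midpoint-rule integral of the skewed density up to x."""
    lower = 295.0 - 10.0 * 21.0  # far enough left that the density is negligible
    width = (x - lower) / steps
    return width * sum(skew_pdf(lower + (i + 0.5) * width) for i in range(steps))

def chance_within(days, today, cdf):
    """P(labor within the next `days` days | no labor before `today`)."""
    still_pregnant = 1.0 - cdf(today)
    return (cdf(today + days) - cdf(today)) / still_pregnant

today = DUE_DAY - 42  # hypothetical: exactly six weeks before the due date

skew_today = chance_within(1, today, skew_cdf)
skew_week = chance_within(7, today, skew_cdf)
normal_week = chance_within(7, today, normal_cdf)

print(f"skewed model, today: {skew_today:.2%}")
print(f"skewed model, next 7 days: {skew_week:.2%}")
print(f"normal model, next 7 days: {normal_week:.3%}")
```

This far out in the left tail the skewed model gives a much larger weekly chance than the normal one, which is the effect described above.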

Of course this is just an estimate, and all meant to be in good fun. Without data, my model is only a guesstimate. Nevertheless, my math nerd itch has been scratched.

You can try the tool out for yourself here.


Posted in Life


  1. I really enjoyed your post on moving the distribution to a skewed one around the due date. I live in Australia, and in my efforts to find some statistical data on Australia I came across a document filled with lovely data on many different aspects of childbirth. One table, table 27b (page 48 of the document, page 51 of the PDF), shows that for single births 92.6% of births occur between week 37 and week 41. This supports skewing the distribution. The document can be found at

    Thanks for your post.

    • Excellent! I’ll have to update my model when I have a chance. Thanks for sharing.

  2. Thank you for linking to me and it’s always nice to read about other moms who don’t mind diving into the statistics 🙂

    However, it seems like you first assumed that the distribution should be a skewed distribution, then made a model around the assumption. This doesn’t really line up with the studies I found, or the survey I did that still has the most common days of labor at 40 weeks, plus or minus a few days:

    And I have this page that really dives into the internet rumor that 41 weeks is the most common day to give birth… a rumor that I think has gotten entirely out of hand:

    Both these pages back up the assumption that 41 weeks is a perfectly fine and quite common day to have a baby. But it’s not the most popular day, or the average.

    • You are correct, I started with the assumption that the distribution was skewed. I was mainly interested in the tails, particularly the left side (premature births) since I was still about a month and a half away from my due date at the time :).

      I didn’t use any assumptions about the most common day to give birth when coming up with my model, although that did end up being a side effect of my model. I used the fact that 10% of babies are delivered prematurely, that the distribution is approximately normal around 40 weeks, and that half of all babies are delivered early (median 40 weeks).

      I’m surprised that you say your data doesn’t look skewed. To me your data has a clear albeit small skew, especially at the extremes.


