Survival Times and Probabilities

My acquaintance gwern had a question: given the current total lifetime of a certain web service, how to infer the probable future lifetime?

Quick Estimate

A quick and dirty intuitive method of calculating this uses Laplace’s rule of succession: treat survival or failure as a Bernoulli trial per unit time, so that given that the subject has survived for N days without a closure on any of those days, we could say that the probability of surviving the next day is as given:

p(N)=1N+2

In that case the probability that we survive at least the next n days could be pn, or

P(n|N)=(1N+2)n

But of course that’s not really right, because after the first of those n days we have one more day of information, so we’re a little bit more likely to survive the next. Therefore it can’t just be pn. We really want:

P(n|N)=d=0n1(1N+d+2)

Problem?

The above solutions have two big problems. The first is somewhat obvious: the second, more correct solution requires n to be an integral multiple of your unit of time. You can’t ask about n=1.618 days, because you can’t take the product of discrete steps from d=0 to 1.618.

The second problem is more subtle: neither of these solutions are invariant with respect to the base unit of time. This is easiest to see with the first solution – changing from measuring in days to months turns (1N+2)n into (1N/30+2)(n/30) which is not identically equal. This is a problem because our units of time are an arbitrary convention and should make no difference to the result.

A True Solution

The resolution to both those problems lies in observing that the rule of succession isn’t really aware of time as a continuous (as opposed to discrete) notion. We need a proper model. As it turns out, the usual model for ‘lifetimes’ of things is a negative exponential:

P(T=t|a)=aeat

P(T>t|a)=eat

This could perhaps be called the continuous generalization of a Bernoulli trial. The event has a constant probability per unit time given that it hasn’t happened yet, described by a, which is called the “hazard rate”. a has units of inverse time, or “events per unit time”, as you’d expect.

In our case, a is an unknown variable. The only input we’ve been given is the survival time so far, Y, from which we must calculate the distribution for the future survival time, X (switching to variables X and Y here rather than n,N since we’re no longer thinking discretely). Our goal is to calculate the posterior distribution:

P(X=x|Y=y)

or

P(T=(x+y)|T>y)

We can start by calculating the posterior distribution on a:

P(a|T>y)=P(T>y|a)P(a)P(T>y)=P(T>y|a)P(a)da P(T>y|a)P(a)

A reasonable choice of prior P(a) is also a negative exponential P(a)=wewa. We can take the limit w0 for the flat (improper) prior of maximum ignorance. Then,

P(T>y|a)P(a)=eaywewa=wea(y+w)P(T>y)=w0ea(y+w) da=wy+w

So we have our posterior distribution on a:

P(a|T>y)=(y+w)ea(y+w)

With this we can calculate the posterior pdf on X, by marginalizing over a:

P(T=(x+y)|T>y)=0aeaxP(a|T>y) da=0aeax(y+w)ea(y+w) da=(y+w)0aea(w+y+x) da=w+y(w+y+x)2

And the posterior reverse cumulative distribution:

P(T>(x+y)|T>y)=0eax (y+w)ea(y+w) da=0(y+w)ea(w+y+x) da=w+yw+y+x

Notice that w (the prior) seems to affect the result in the same way as would w units of previously observed survival time. Taking the ignorance prior w=0, the results are:

P(a|Y=y)=yeayP(X=x|Y=y)=y(y+x)2P(X>x|Y=y)=yy+x

The expected value of a is 1/y which is quite reasonable, but notice that since xP(X=x|Y=y) goes as 1/x asymptotically, the expectation E[x|y] doesn’t converge.

Also notice that these results are properly invariant with respect to the unit of time. P(X>x|Y=y) is a unitless probability, and therefore is completely invariant as it should be, while P(X=x|Y=y) and P(a|Y=y) are probability densities with respect to time and inverse time, and therefore have units of inverse time / time as they should.