Skip to content

Optimize log1mexp(log1mexp(x)) -> x #471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
NathanielF opened this issue Oct 9, 2023 · 4 comments · May be fixed by #474
Open

Optimize log1mexp(log1mexp(x)) -> x #471

NathanielF opened this issue Oct 9, 2023 · 4 comments · May be fixed by #474

Comments

@NathanielF
Copy link

NathanielF commented Oct 9, 2023

Describe the issue:

As discussed in pymc-devs/pymc-examples#580 (comment)

There are problems invoking the pm.Censored functionality when trying to building Weibull AFT models using pm.Censored instead of the hand-crafted likelihood pattern using Potentials.

Reproduceable code example:

import pymc as pm
import numpy as np
import pandas as pd

X = pd.DataFrame(np.random.normal(size=(3000, 5)))
y = np.random.choice(np.arange(1, 12, 1), size=3000)
cens = np.random.choice([0, 1], size=3000, p=[0.7, 0.3])

#### This model runs the Weibull samples
with pm.Model(coords=coords) as model:
        X_data = pm.MutableData('X_data_obs', X)
        beta = pm.Normal("beta", 0.0, 1, shape=5)
        mu = pm.Normal('mu', 0, 1)

        s = pm.HalfNormal('s', 5.)
        eta = pm.Deterministic('eta', pm.math.dot(beta, X_data.T))
        reg = pm.Deterministic('reg', pt.exp(-( (mu + eta)) / s))
        y_obs = pm.Weibull("y_obs", beta=reg, alpha=s, observed=y)
        #weibull_dist = pm.Weibull.dist(beta=reg, alpha=s)
        #y_obs = pm.Censored('y_obs', weibull_dist, lower=None, upper=np.where(cens, y, y+1), observed=y)
        idata = pm.sample(target_accept=0.95)
        idata.extend(pm.sample_posterior_predictive(idata))


### This model fails using pm.Censored

with pm.Model(coords=coords) as model:
        X_data = pm.MutableData('X_data_obs', X)
        beta = pm.Normal("beta", 0.0, 1, shape=5)
        mu = pm.Normal('mu', 0, 1)

        s = pm.HalfNormal('s', 5.)
        eta = pm.Deterministic('eta', pm.math.dot(beta, X_data.T))
        reg = pm.Deterministic('reg', pt.exp(-( (mu + eta)) / s))
        #y_obs = pm.Weibull("y_obs", beta=reg, alpha=s, observed=y)
        weibull_dist = pm.Weibull.dist(beta=reg, alpha=s)
        y_obs = pm.Censored('y_obs', weibull_dist, lower=None, upper=np.where(cens, y, y+1), observed=y)
        idata = pm.sample(target_accept=0.95)
        idata.extend(pm.sample_posterior_predictive(idata))

Error message:

{
	"name": "SamplingError",
	"message": "Initial evaluation of model at starting point failed!\nStarting values:\n{'beta': array([ 0.34306535,  0.08951002,  0.0309349 ,  0.88065407, -0.20477662]), 'mu': array(0.28941877), 's_log__': array(1.41547689)}\n\nLogp initial evaluation results:\n{'beta': -5.07, 'mu': -0.96, 's': -0.76, 'y_obs': -inf}\nYou can call `model.debug()` for more details.",
	"stack": "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mSamplingError\u001b[0m                             Traceback (most recent call last)\n\u001b[1;32m/Users/nathanielforde/Documents/Github/pymc-examples/examples/survival_analysis/frailty_models.ipynb Cell 18\u001b[0m line \u001b[0;36m1\n\u001b[1;32m     <a href='vscode-notebook-cell:/Users/nathanielforde/Documents/Github/pymc-examples/examples/survival_analysis/frailty_models.ipynb#Y140sZmlsZQ%3D%3D?line=12'>13</a>\u001b[0m weibull_dist \u001b[39m=\u001b[39m pm\u001b[39m.\u001b[39mWeibull\u001b[39m.\u001b[39mdist(beta\u001b[39m=\u001b[39mreg, alpha\u001b[39m=\u001b[39ms)\n\u001b[1;32m     <a href='vscode-notebook-cell:/Users/nathanielforde/Documents/Github/pymc-examples/examples/survival_analysis/frailty_models.ipynb#Y140sZmlsZQ%3D%3D?line=13'>14</a>\u001b[0m y_obs \u001b[39m=\u001b[39m pm\u001b[39m.\u001b[39mCensored(\u001b[39m'\u001b[39m\u001b[39my_obs\u001b[39m\u001b[39m'\u001b[39m, weibull_dist, lower\u001b[39m=\u001b[39m\u001b[39mNone\u001b[39;00m, upper\u001b[39m=\u001b[39mnp\u001b[39m.\u001b[39mwhere(cens, y, y\u001b[39m+\u001b[39m\u001b[39m1\u001b[39m), observed\u001b[39m=\u001b[39my)\n\u001b[0;32m---> <a href='vscode-notebook-cell:/Users/nathanielforde/Documents/Github/pymc-examples/examples/survival_analysis/frailty_models.ipynb#Y140sZmlsZQ%3D%3D?line=14'>15</a>\u001b[0m idata \u001b[39m=\u001b[39m pm\u001b[39m.\u001b[39;49msample(target_accept\u001b[39m=\u001b[39;49m\u001b[39m0.95\u001b[39;49m)\n\u001b[1;32m     <a href='vscode-notebook-cell:/Users/nathanielforde/Documents/Github/pymc-examples/examples/survival_analysis/frailty_models.ipynb#Y140sZmlsZQ%3D%3D?line=15'>16</a>\u001b[0m idata\u001b[39m.\u001b[39mextend(pm\u001b[39m.\u001b[39msample_posterior_predictive(idata))\n\nFile \u001b[0;32m~/mambaforge/envs/pymc_examples_new/lib/python3.9/site-packages/pymc/sampling/mcmc.py:619\u001b[0m, in \u001b[0;36msample\u001b[0;34m(draws, tune, chains, cores, random_seed, progressbar, step, nuts_sampler, initvals, init, jitter_max_retries, n_init, trace, discard_tuned_samples, compute_convergence_checks, keep_warning_stat, return_inferencedata, idata_kwargs, nuts_sampler_kwargs, callback, mp_ctx, model, **kwargs)\u001b[0m\n\u001b[1;32m    617\u001b[0m ip: Dict[\u001b[39mstr\u001b[39m, np\u001b[39m.\u001b[39mndarray]\n\u001b[1;32m    618\u001b[0m \u001b[39mfor\u001b[39;00m ip \u001b[39min\u001b[39;00m initial_points:\n\u001b[0;32m--> 619\u001b[0m     model\u001b[39m.\u001b[39;49mcheck_start_vals(ip)\n\u001b[1;32m    620\u001b[0m     _check_start_shape(model, ip)\n\u001b[1;32m    622\u001b[0m \u001b[39m# Create trace backends for each chain\u001b[39;00m\n\nFile \u001b[0;32m~/mambaforge/envs/pymc_examples_new/lib/python3.9/site-packages/pymc/model.py:1784\u001b[0m, in \u001b[0;36mModel.check_start_vals\u001b[0;34m(self, start)\u001b[0m\n\u001b[1;32m   1781\u001b[0m initial_eval \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mpoint_logps(point\u001b[39m=\u001b[39melem)\n\u001b[1;32m   1783\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mall\u001b[39m(np\u001b[39m.\u001b[39misfinite(v) \u001b[39mfor\u001b[39;00m v \u001b[39min\u001b[39;00m initial_eval\u001b[39m.\u001b[39mvalues()):\n\u001b[0;32m-> 1784\u001b[0m     \u001b[39mraise\u001b[39;00m SamplingError(\n\u001b[1;32m   1785\u001b[0m         \u001b[39m\"\u001b[39m\u001b[39mInitial evaluation of model at starting point failed!\u001b[39m\u001b[39m\\n\u001b[39;00m\u001b[39m\"\u001b[39m\n\u001b[1;32m   1786\u001b[0m         \u001b[39mf\u001b[39m\u001b[39m\"\u001b[39m\u001b[39mStarting values:\u001b[39m\u001b[39m\\n\u001b[39;00m\u001b[39m{\u001b[39;00melem\u001b[39m}\u001b[39;00m\u001b[39m\\n\u001b[39;00m\u001b[39m\\n\u001b[39;00m\u001b[39m\"\u001b[39m\n\u001b[1;32m   1787\u001b[0m         \u001b[39mf\u001b[39m\u001b[39m\"\u001b[39m\u001b[39mLogp initial evaluation results:\u001b[39m\u001b[39m\\n\u001b[39;00m\u001b[39m{\u001b[39;00minitial_eval\u001b[39m}\u001b[39;00m\u001b[39m\\n\u001b[39;00m\u001b[39m\"\u001b[39m\n\u001b[1;32m   1788\u001b[0m         \u001b[39m\"\u001b[39m\u001b[39mYou can call `model.debug()` for more details.\u001b[39m\u001b[39m\"\u001b[39m\n\u001b[1;32m   1789\u001b[0m     )\n\n\u001b[0;31mSamplingError\u001b[0m: Initial evaluation of model at starting point failed!\nStarting values:\n{'beta': array([ 0.34306535,  0.08951002,  0.0309349 ,  0.88065407, -0.20477662]), 'mu': array(0.28941877), 's_log__': array(1.41547689)}\n\nLogp initial evaluation results:\n{'beta': -5.07, 'mu': -0.96, 's': -0.76, 'y_obs': -inf}\nYou can call `model.debug()` for more details."
}

Context for the issue:

Trying to use pm.Censored pattern for survival regression models in a PyMC example rather than the pm.Potential pattern

@NathanielF NathanielF added the bug Something isn't working label Oct 9, 2023
@ricardoV94
Copy link
Member

Looks like numerical precision issues:

import pymc as pm
import numpy as np
import pandas as pd
import pytensor.tensor as pt

rng = np.random.default_rng(42)
X = pd.DataFrame(rng.normal(size=(30, 5)))
y = rng.choice(np.arange(1, 12, 1), size=30)
cens = rng.choice([0, 1], size=30, p=[0.7, 0.3])

with pm.Model() as model:
        X_data = pm.MutableData('X_data_obs', X)
        beta = pm.Normal("beta", 0.0, 1, shape=5)
        mu = pm.Normal('mu', 0, 1)
        s = pm.HalfNormal('s', 5.)
        eta = pm.math.dot(beta, X_data.T)
        reg = pt.exp(-( (mu + eta)) / s)
        weibull_dist = pm.Weibull.dist(beta=reg, alpha=s)
        pot1 = pm.Potential("pot1", pt.log1mexp(pm.logcdf(weibull_dist, y)))
        pot2 = pm.Potential("pot2", -((y / reg) ** s))

a, b = model.compile_logp([pot1, pot2], sum=False)(model.initial_point())
list(zip(list(a), list(b)))
[(-inf, -7775.999999999987),
 (-31.99999999999998, -31.99999999999998),
 (-inf, -1023.9999999999987),
 (-inf, -16806.99999999997),
 (-inf, -1023.9999999999987),
 (-31.99999999999998, -31.99999999999998),
 (-inf, -32767.999999999938),
 (-inf, -32767.999999999938),
 (-31.99999999999998, -31.99999999999998),
 (-inf, -1023.9999999999987),
 (-242.99999999999977, -242.99999999999977),
 (-inf, -32767.999999999938),
 (-inf, -59048.99999999988),
 (-inf, -32767.999999999938),
 (-inf, -161050.99999999965),
 (-inf, -59048.99999999988),
 (-inf, -1023.9999999999987),
 (-31.99999999999998, -31.99999999999998),
 (-inf, -16806.99999999997),
 (-inf, -161050.99999999965),
 (-inf, -161050.99999999965),
 (-242.99999999999977, -242.99999999999977),
 (-242.99999999999977, -242.99999999999977),
 (-1.0, -1.0),
 (-31.99999999999998, -31.99999999999998),
 (-inf, -16806.99999999997),
 (-inf, -16806.99999999997),
 (-inf, -3124.9999999999955),
 (-inf, -1023.9999999999987),
 (-inf, -99999.9999999998)]

@ricardoV94
Copy link
Member

Funnily enough the logcdf is just pt.log1mexp(logccdf), and the two aren't canceled out by PyTensor.

x = pt.scalar("x")
y = pt.log1mexp(pt.log1mexp(x))
y.eval({x: -1000})  # array(-inf)

Should be easy to add a "useless rewrite"

@ricardoV94 ricardoV94 transferred this issue from pymc-devs/pymc Oct 9, 2023
@ricardoV94 ricardoV94 changed the title BUG: Issue with pm.Censored for Weibull distribution Optimize log1mexp(log1mexp(x)) -> x Oct 9, 2023
@ricardoV94 ricardoV94 added enhancement New feature or request graph rewriting beginner friendly and removed bug Something isn't working labels Oct 9, 2023
@NathanielF
Copy link
Author

Not at all familiar with pytensor but is that as easy as it sounds?

@ricardoV94
Copy link
Member

The optimization is easy. Would look something like this

# Case for exp(log1mexp(x)) -> 1 - exp(x)

This won't fix the issue on the PyMC model because there's currently a switch between the two log1mexp to check if the parameters a,b are valid. But if you set Model(check_bounds=False) and this rewrite exists then it would fix it.

We could get rid of this switch by checking it must be positive but that's a story for another time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants