Nested model and `get_fit_data` proposal #32

21ch216 · 2025-02-21T16:53:03Z

Condensed Proposal for Issues #31, #28, and #27

Brief Context

To start, here is how the object types currently relate in the pipeline that simulates data for fitting a model (here ParametricLifetimeModel):

[ LifetimeModel ] --> used in --> [ Policy ]

[ Policy ].sample --> generates --> [ PolicySample ]

[ PolicySample ].to_fit  --> returns -->  [ time ], [ event ], [ entry ]

I think this pipeline is incorrect when the policy involves a second model, model1, applied to the first cycle (as discussed toward the end of issue #28).

For clarity going forward, I propose renaming:

The to_fit method to get_fit_data, and
The CountData class to PolicySample.

Proposed Changes:

We cannot expect a PolicySample object to have a get_fit_data method. This functionality should belong to models (e.g., LifetimeModel).
Objects that need a fit method (e.g., LifetimeModel, NHPP, Cox, GammaProcess, etc.) should also provide a get_fit_data method.
Rename the fit method in Policy objects to optimize, to distinguish optimization of policy parameters (like a_r) from model parameter estimation (fit).
PolicySample objects should be used only for visualizing sampled data and performing empirical cost computations.

A New `get_fit_data` Method

As proposed in issue #28, generating fit data should not be handled by a policy object but instead directly by the relevant model. Each LifetimeModel (or derivative) should support a get_fit_data method.
Example usage:

# Generate fit data directly from the model
model: LifetimeModel
time, event, entry = model.get_fit_data(size=..., t0=..., tf=...)

# When t0 != 0, left truncations are applied
time, event, entry = model.get_fit_data(size=..., t0=..., tf=...)  # t0 != 0

Here:

t0 and tf define the observation window.

Similarly, NHPP models, which have fit methods to estimate intensity function parameters, should also implement this:

nhpp: NHPP
ages, assets, a0, af = nhpp.get_fit_data(size=..., t0=..., tf=...)

Fluent Interface for Nested Models

A natural question arises: how do we generate data with age replacement if this functionality is removed from Policy and PolicySample?
The solution lies in using wrapper models like AgeReplacementModel, enhanced with a get_fit_data method. However, introducing multiple classes like NHPPAgeReplacementModel for every potential model type (e.g., LifetimeModel, NHPP, etc.) would be cumbersome and inconvenient for users.
Instead, I propose a fluent interface that provides unified methods to instantiate transformed versions of existing models.

Example:

# Replace at a defined age to handle age replacement
model = model.replace_at_age(ar)  # Returns an AgeReplacementModel
time, event, entry = model.get_fit_data(
    size=..., t0=..., tf=...
)  # Lifetimes are censored based on `ar`

# Apply left truncation
model = model.left_truncate_by(a0)  # Returns a LeftTruncatedModel
time, event, entry = model.get_fit_data(
    size=..., t0=..., tf=...
)  # Lifetimes are left-truncated by `a0`

# Fluent interface for NHPP
nhpp = nhpp.replace_at_age(ar)  # Returns AgeReplacementNHPP
ages, assets, a0, af = nhpp.get_fit_data(
    size=..., t0=..., tf=...
)  # Generates data with `af = ar` or `tf`

Benefits:

1. This uniform interface supports transforming models (e.g., applying age replacement or left truncation). Users need to remember only two main methods, replace_at_age and left_truncate_by, no matter the model type.
2. Unlike the current approach, this provides a unified way to manage transformations (e.g., applying ar and/or a0) across model types like LifetimeModel, RenewalProcess, and NHPP.
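One way the fluent interface could be structured is a shared base class whose two methods return wrapper models, so wrappers can themselves be wrapped. The sketch below is illustrative only: FluentModel, Exponential, and sample_lifetime are hypothetical names, and left truncation is assumed to mean conditioning the total lifetime on exceeding a0.

```python
import random


class FluentModel:
    """Hypothetical base class giving every model the two fluent methods."""

    def sample_lifetime(self, rng):
        raise NotImplementedError

    def replace_at_age(self, ar):
        return AgeReplacementModel(self, ar)

    def left_truncate_by(self, a0):
        return LeftTruncatedModel(self, a0)


class Exponential(FluentModel):
    """Toy baseline model used only for this illustration."""

    def __init__(self, rate):
        self.rate = rate

    def sample_lifetime(self, rng):
        return rng.expovariate(self.rate)


class AgeReplacementModel(FluentModel):
    def __init__(self, baseline, ar):
        self.baseline, self.ar = baseline, ar

    def sample_lifetime(self, rng):
        # A lifetime longer than ar is cut short by the replacement at ar
        return min(self.baseline.sample_lifetime(rng), self.ar)


class LeftTruncatedModel(FluentModel):
    def __init__(self, baseline, a0):
        self.baseline, self.a0 = baseline, a0

    def sample_lifetime(self, rng):
        # Rejection sampling: keep only lifetimes that exceed a0
        while True:
            lifetime = self.baseline.sample_lifetime(rng)
            if lifetime > self.a0:
                return lifetime


# Wrappers inherit the fluent methods, so transformations can be chained
nested = Exponential(0.1).left_truncate_by(2.0).replace_at_age(5.0)
```

Because each wrapper only depends on the baseline's sampling method, the same two wrapper classes would serve any model type that exposes it, which is the point of avoiding one class per combination.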

Improved `Policy` Interface

Finally, here is the proposed structure for the Policy interface. Policy objects would define the following:

Methods for calculating expected costs (see issue #25, "Simpler method names for Policy class", for simplified function names).
An optimize method to compute and set the optimal parameter (e.g., ar).
A sample method to generate PolicySample objects. These are invisible to the user but contain the sampled data.
Cost functions for sampled data: once sampling is performed, these become accessible.
Plotting: visualization methods, which require sampling to have been performed before they can be used.

Note: Expected costs work without sampling, while empirical cost functions and plotting require prior sampling.

Code Example:

policy = AgeReplacementPolicy(...)

timeline: NDArray[np.float64]

# Expected costs (work without sampling if `ar` is set during initialization or via optimize)
policy.expected_cost(timeline)
policy.expected_eac(timeline)

# Empirical cost functions (requires sampling)
policy.mean_cost(timeline)  # Error, since no sample was generated
policy.sample(size=..., tmax=...)  # `tmax` is the horizon time
policy.mean_cost(timeline)  # Works: sample was performed

# Plotting sampled data
policy.plot.nb_event_count(timeline)  # Plots the sampled number of events
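The "requires sampling" behaviour could be enforced by keeping the sample as private state and checking for it in the empirical methods. The following is a rough sketch under simplifying assumptions, not the proposed implementation: AgeReplacementPolicySketch, its constructor arguments (rate, cf, cp), the seed parameter, and the exponential lifetimes are all hypothetical.

```python
import random


class AgeReplacementPolicySketch:
    """Hypothetical policy showing the sample-before-mean_cost contract."""

    def __init__(self, rate, cf, cp, ar=None):
        self.rate, self.cf, self.cp, self.ar = rate, cf, cp, ar
        self._sample = None  # hidden from the user, filled by sample()

    def sample(self, size, tmax, seed=None):
        if self.ar is None:
            raise ValueError("set ar (or call optimize) before sampling")
        rng = random.Random(seed)
        paths = []
        for _ in range(size):
            t, events = 0.0, []
            while True:
                lifetime = rng.expovariate(self.rate)
                failed = lifetime < self.ar
                t += lifetime if failed else self.ar
                if t > tmax:
                    break
                # Failure costs cf, preventive replacement at ar costs cp
                events.append((t, self.cf if failed else self.cp))
            paths.append(events)
        self._sample = paths

    def mean_cost(self, timeline):
        if self._sample is None:
            raise RuntimeError("call sample() before mean_cost()")
        n = len(self._sample)
        # Average cumulative cost over all sampled paths at each time point
        return [
            sum(c for path in self._sample for (t, c) in path if t <= horizon) / n
            for horizon in timeline
        ]
```

Raising a clear error when no sample exists keeps the distinction between expected and empirical quantities visible at the call site.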

Implications

Setting ar as an object attribute
I see one implication, and it is not a troubling one. In AgeReplacementPolicy, ar is optimized and is passed as an argument to the function being optimized. The workaround is simply to set the current ar value on the model object each time the optimizer calls that function. The same is already done in LikelihoodFromLifetimes with model.params. There is no major loss in run time.
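The workaround can be sketched as follows. Everything here is an illustration rather than the library's code: AgeReplacementModelSketch, cost_rate, and this optimize function are hypothetical, the objective is the standard long-run cost rate of age replacement, and a hand-written golden-section search stands in for whatever optimizer is actually used.

```python
import math


class AgeReplacementModelSketch:
    """Hypothetical Weibull wrapper that stores ar as an attribute."""

    def __init__(self, shape, scale, ar=None):
        self.shape, self.scale, self.ar = shape, scale, ar

    def sf(self, t):
        return math.exp(-((t / self.scale) ** self.shape))

    def mean_cycle_length(self, steps=2000):
        # Trapezoidal integral of the survival function over [0, ar]
        h = self.ar / steps
        inner = sum(self.sf(i * h) for i in range(1, steps))
        return h * (0.5 * (self.sf(0.0) + self.sf(self.ar)) + inner)


def cost_rate(model, cf, cp):
    """Long-run cost per unit time; reads ar from the model, not an argument."""
    s = model.sf(model.ar)
    return (cf * (1.0 - s) + cp * s) / model.mean_cycle_length()


def optimize(model, cf, cp, lo, hi, tol=1e-6):
    """Golden-section search standing in for the real optimizer."""

    def objective(ar):
        model.ar = ar  # the workaround: overwrite the attribute on every call
        return cost_rate(model, cf, cp)

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if objective(c) < objective(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    model.ar = (a + b) / 2.0  # leave the optimum set on the model
    return model.ar
```

The only extra work per optimizer call is one attribute assignment, which is consistent with the claim that the run-time cost is negligible.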

21ch216 assigned 21ch216, e4f2149b and tomguillon Feb 21, 2025

This was referenced Feb 24, 2025

Set ar as an attribute in AgeReplacementModel class #31

Open

to_fit data in NHPP/NHPPAgeReplacementPolicy #28

Open

Simplify class names #29

Open
