Nested model and get_fit_data proposal #32

Open
21ch216 opened this issue Feb 21, 2025 · 0 comments

21ch216 commented Feb 21, 2025

Condensed Proposal for Issues #31, #28, and #27

Brief Context

To start, here is how the object types currently relate to one another in the pipeline that simulates data for fitting a model (here, ParametricLifetimeModel):

[ LifetimeModel ] --> used in --> [ Policy ]

[ Policy ].sample --> generates --> [ PolicySample ]

[ PolicySample ].to_fit  --> returns -->  [ time ], [ event ], [ entry ] 

I think this pipeline is incorrect when the policy applies a second model, model1, to the first cycle (as discussed toward the end of issue #28).

For clarity going forward, I propose renaming:

  • The to_fit method to get_fit_data, and
  • The CountData class to PolicySample.

Proposed Changes:

  1. We cannot expect a PolicySample object to have a get_fit_data method. This functionality should belong to models (e.g., LifetimeModel).
  2. Objects that need a fit method (e.g., LifetimeModel, NHPP, Cox, GammaProcess, etc.) should also provide a get_fit_data method.
  3. Rename the fit method in Policy objects to optimize, to distinguish optimization of policy parameters (like ar) from model parameter estimation (fit).
  4. PolicySample objects should be used only for visualizing sampled data and performing empirical cost computations.
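
The split of responsibilities in the list above can be sketched with a structural protocol. This is only an illustration: Fittable, ToyModel and ToyPolicySample are hypothetical names, and the get_fit_data signature simply mirrors the one used later in this proposal.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Fittable(Protocol):
    """Hypothetical protocol: whatever can be fitted can also simulate its fit data."""

    def fit(self, *data: Any) -> Any: ...

    def get_fit_data(self, size: int, t0: float, tf: float) -> tuple: ...


class ToyModel:
    """Placeholder model; it satisfies the protocol structurally."""

    def fit(self, *data: Any) -> "ToyModel":
        return self

    def get_fit_data(self, size: int, t0: float, tf: float) -> tuple:
        return ([], [], [])  # empty (time, event, entry), for illustration only


class ToyPolicySample:
    """Holds sampled data only; deliberately has no get_fit_data method."""

    def __init__(self, events: list) -> None:
        self.events = events


print(isinstance(ToyModel(), Fittable))
print(isinstance(ToyPolicySample([]), Fittable))
```

A runtime-checkable protocol only checks that the methods exist, not their signatures, so it is a light way to express "models provide fit and get_fit_data, samples do not" without forcing a common base class.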

A New get_fit_data Method

As proposed in issue #28, generating fit data should not be handled by a policy object but instead directly by the relevant model. Each LifetimeModel (or derivative) should support a get_fit_data method.
Example usage:

# Generate fit data directly from the model
model: LifetimeModel
time, event, entry = model.get_fit_data(size=..., t0=..., tf=...)

# When t0 != 0, left truncations are applied
time, event, entry = model.get_fit_data(size=..., t0=..., tf=...)  # t0 != 0

Here:

  • t0 and tf define the observation window.
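
To make the role of the observation window concrete, here is a minimal sketch of what such a method could do. It rests on assumptions not stated in this issue: ToyWeibull is a made-up class, assets that failed before t0 are dropped (left truncation), and assets still alive at tf are right-censored.

```python
import random


class ToyWeibull:
    """Toy lifetime model (not the library's class); parameters are illustrative."""

    def __init__(self, shape: float, scale: float, seed: int = 0) -> None:
        self.shape = shape
        self.scale = scale
        self._rng = random.Random(seed)

    def get_fit_data(self, size: int, t0: float, tf: float):
        """Return (time, event, entry) for `size` assets observed on [t0, tf]."""
        time, event, entry = [], [], []
        while len(time) < size:
            # random.weibullvariate takes (scale, shape) in that order
            lifetime = self._rng.weibullvariate(self.scale, self.shape)
            if lifetime <= t0:
                continue  # failed before the window opened: never observed
            time.append(min(lifetime, tf))
            event.append(lifetime <= tf)  # False means right-censored at tf
            entry.append(t0)  # a non-zero entry marks a left truncation
        return time, event, entry


model = ToyWeibull(shape=2.0, scale=10.0)
time, event, entry = model.get_fit_data(size=5, t0=2.0, tf=12.0)
print(time, event, entry)
```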

Similarly, NHPP models, which have fit methods to estimate intensity function parameters, should also implement this:

nhpp: NHPP
ages, assets, a0, af = nhpp.get_fit_data(size=..., t0=..., tf=...)
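
As a rough sketch of the NHPP case, the block below simulates a power-law process by time transformation. Several things here are assumptions rather than part of the proposal: the class name ToyPowerLawNHPP, the power-law cumulative intensity (t / scale) ** shape, and the reading of the outputs as event ages, the asset index of each event, and the first and last observed age of each asset.

```python
import random


class ToyPowerLawNHPP:
    """Toy NHPP with cumulative intensity (t / scale) ** shape (illustrative only)."""

    def __init__(self, shape: float, scale: float, seed: int = 0) -> None:
        self.shape = shape
        self.scale = scale
        self._rng = random.Random(seed)

    def get_fit_data(self, size: int, t0: float, tf: float):
        """Return (ages, assets, a0, af) for `size` assets observed on [t0, tf]."""
        ages, assets = [], []
        for asset_id in range(size):
            s = (t0 / self.scale) ** self.shape  # cumulative intensity at t0
            while True:
                s += self._rng.expovariate(1.0)  # unit-rate gap in transformed time
                age = self.scale * s ** (1.0 / self.shape)  # map back to real age
                if age > tf:
                    break
                ages.append(age)
                assets.append(asset_id)
        a0 = [t0] * size  # first observed age of each asset
        af = [tf] * size  # last observed age of each asset
        return ages, assets, a0, af


nhpp = ToyPowerLawNHPP(shape=1.5, scale=3.0)
ages, assets, a0, af = nhpp.get_fit_data(size=3, t0=1.0, tf=10.0)
print(len(ages), a0, af)
```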

Fluent Interface for Nested Models

A natural question arises: how do we generate data with age replacement if this functionality is removed from Policy and PolicySample?
The solution lies in using wrapper models like AgeReplacementModel, enhanced with a get_fit_data method. However, introducing multiple classes like NHPPAgeReplacementModel for every potential model type (e.g., LifetimeModel, NHPP, etc.) would be cumbersome and inconvenient for users.
Instead, I propose a fluent interface that provides unified methods to instantiate transformed versions of existing models.

Example:

# Replace at a defined age to handle age replacement
model = model.replace_at_age(ar)  # Returns an AgeReplacementModel
time, event, entry = model.get_fit_data(
    size=..., t0=..., tf=...
)  # Lifetimes are censored based on `ar`

# Apply left truncation
model = model.left_truncate_by(a0)  # Returns a LeftTruncatedModel
time, event, entry = model.get_fit_data(
    size=..., t0=..., tf=...
)  # Lifetimes are left-truncated by `a0`

# Fluent interface for NHPP
nhpp = nhpp.replace_at_age(ar)  # Returns AgeReplacementNHPP
ages, assets, a0, af = nhpp.get_fit_data(
    size=..., t0=..., tf=...
)  # Generates data with `af = ar` or `tf`

Benefits:

  1. This uniform interface supports transforming models (e.g., applying age replacement or left truncation). Users need to remember only two main methods, replace_at_age and left_truncate_by, no matter the model type.
  2. Unlike the current approach, this provides a unified way to manage transformations (e.g., applying ar and/or a0) across model types like LifetimeModel, RenewalProcess, and NHPP.
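
One way the fluent methods could be wired is sketched below. Only replace_at_age and left_truncate_by come from this proposal; the Toy* classes, the sample_one helper and the exponential baseline are hypothetical, and the t0/tf arguments of get_fit_data are left out to keep the sketch short.

```python
import random


class ToyModel:
    """Base class carrying the two fluent methods."""

    def sample_one(self, rng):
        """Return (time, event, entry) for a single asset."""
        raise NotImplementedError

    def replace_at_age(self, ar):
        return ToyAgeReplacementModel(self, ar)

    def left_truncate_by(self, a0):
        return ToyLeftTruncatedModel(self, a0)

    def get_fit_data(self, size, seed=0):
        rng = random.Random(seed)
        rows = [self.sample_one(rng) for _ in range(size)]
        time, event, entry = (list(col) for col in zip(*rows))
        return time, event, entry


class ToyExponential(ToyModel):
    def __init__(self, rate):
        self.rate = rate

    def sample_one(self, rng):
        return rng.expovariate(self.rate), True, 0.0


class ToyAgeReplacementModel(ToyModel):
    def __init__(self, baseline, ar):
        self.baseline, self.ar = baseline, ar

    def sample_one(self, rng):
        time, event, entry = self.baseline.sample_one(rng)
        if time > self.ar:
            return self.ar, False, entry  # replaced before failure: censored at ar
        return time, event, entry


class ToyLeftTruncatedModel(ToyModel):
    def __init__(self, baseline, a0):
        self.baseline, self.a0 = baseline, a0

    def sample_one(self, rng):
        while True:  # keep only assets that survived past a0
            time, event, entry = self.baseline.sample_one(rng)
            if time > self.a0:
                return time, event, max(entry, self.a0)


model = ToyExponential(rate=0.1).left_truncate_by(2.0).replace_at_age(8.0)
time, event, entry = model.get_fit_data(size=5)
print(type(model).__name__, time, event, entry)
```

Because every wrapper is itself a ToyModel, the transformations compose in any order without a dedicated class per combination.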

Improved Policy Interface

Finally, here is the proposed structure for the Policy interface. Policy objects would define the following:

  1. Methods for calculating expected costs (see issue #25, "Simpler method names for Policy class", for simplified function names).
  2. An optimize method to compute and set the optimal parameter (e.g., ar).
  3. A sample method to generate PolicySample objects. These are invisible to the user but contain the sampled data.
  4. Cost functions for sampled data: once sampling is performed, these become accessible.
  5. Plotting: visualization methods, which require sampling to have been performed before they can be used.

Note: Expected costs work without sampling, while empirical cost functions and plotting require prior sampling.

Code Example:

policy = AgeReplacementPolicy(...)

timeline: NDArray[np.float64]

# Expected costs (work without sampling if `ar` is set during initialization or via optimize)
policy.expected_cost(timeline)
policy.expected_eac(timeline)

# Empirical cost functions (requires sampling)
policy.mean_cost(timeline)  # Error, since no sample was generated
policy.sample(size=..., tmax=...)  # `tmax` is the horizon time
policy.mean_cost(timeline)  # Works: sample was performed

# Plotting sampled data
policy.plot.nb_event_count(timeline)  # Plots the sampled number of events
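
The "sampling first" rule for empirical costs could be enforced with a simple guard, as in this sketch. Everything in it is an assumption for illustration: the ToyAgeReplacementPolicy class, its exponential lifetimes, the cost names cp (preventive) and cf (failure), and the choice of RuntimeError.

```python
import random


class ToyAgeReplacementPolicy:
    """Toy policy with exponential lifetimes; all names are illustrative."""

    def __init__(self, ar, cp, cf, rate, seed=0):
        self.ar, self.cp, self.cf, self.rate = ar, cp, cf, rate
        self._rng = random.Random(seed)
        self._sample = None  # filled by sample(); never handed to the user

    def sample(self, size, tmax):
        """Simulate `size` assets up to the horizon `tmax` and store the events."""
        assets = []
        for _ in range(size):
            t, events = 0.0, []
            while True:
                lifetime = self._rng.expovariate(self.rate)
                failed = lifetime < self.ar
                t += lifetime if failed else self.ar
                if t > tmax:
                    break
                events.append((t, self.cf if failed else self.cp))
            assets.append(events)
        self._sample = assets
        return self

    def mean_cost(self, timeline):
        """Average cumulative cost per asset at each time in `timeline`."""
        if self._sample is None:
            raise RuntimeError("no sample available: call sample() first")
        return [
            sum(c for events in self._sample for (t, c) in events if t <= horizon)
            / len(self._sample)
            for horizon in timeline
        ]


policy = ToyAgeReplacementPolicy(ar=5.0, cp=1.0, cf=5.0, rate=0.1)
try:
    policy.mean_cost([10.0, 20.0])
except RuntimeError as err:
    print(err)  # sampling has not been performed yet
policy.sample(size=100, tmax=20.0)
print(policy.mean_cost([10.0, 20.0]))
```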

Implications

Setting ar as an object attribute
I see one implication, and it is not a serious one. In AgeReplacementPolicy, ar is the quantity being optimized and is passed as an argument to the function being optimized. The workaround is simply to set the current ar value on the model object each time the optimizer calls that function. LikelihoodFromLifetimes already does the same with model.params. There is no major loss in run time.
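
A minimal sketch of that workaround, under loud assumptions: ToyAgeReplacementPolicy is a made-up class, lifetimes are Weibull, the objective is the classical age-replacement cost rate, and a plain grid search stands in for whatever optimizer the library actually uses.

```python
import math


class ToyAgeReplacementPolicy:
    """Toy policy; cp and cf are assumed preventive and failure costs."""

    def __init__(self, shape, scale, cp, cf, ar=None):
        self.shape, self.scale, self.cp, self.cf = shape, scale, cp, cf
        self.ar = ar

    def _reliability(self, t):
        return math.exp(-((t / self.scale) ** self.shape))

    def expected_cost_rate(self):
        """Cost per unit time; reads self.ar instead of taking it as an argument."""
        n = 200
        h = self.ar / n
        # Trapezoidal rule for the mean cycle length: integral of R(t) on [0, ar]
        cycle = h * (
            0.5 * self._reliability(0.0)
            + sum(self._reliability(i * h) for i in range(1, n))
            + 0.5 * self._reliability(self.ar)
        )
        r = self._reliability(self.ar)
        return (self.cp * r + self.cf * (1.0 - r)) / cycle

    def _objective(self, ar):
        self.ar = ar  # the workaround: store the candidate value on the object
        return self.expected_cost_rate()

    def optimize(self, lower, upper, steps=400):
        """Grid search over [lower, upper]; sets self.ar to the best candidate."""
        grid = (lower + i * (upper - lower) / steps for i in range(steps + 1))
        self.ar = min(grid, key=self._objective)
        return self


policy = ToyAgeReplacementPolicy(shape=2.0, scale=10.0, cp=1.0, cf=5.0)
policy.optimize(lower=1.0, upper=20.0)
print(policy.ar, policy.expected_cost_rate())
```

The attribute write in _objective is a single assignment per evaluation, which is why the run-time cost is negligible next to the integration.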
