Bootstrap with data.ss_new

Erin Bohaboy, modified 5 Years ago.

Greetings, everyone,

I have an SS3.24 model and tried to run a bootstrap using the starter-file option that creates bootstrap datasets in data.ss_new. The original model I used to create the bootstrap datasets (attached files with _original) applies variance adjustments to the length and age data in the control file. When I ran SS on each bootstrap dataset, I reset all variance adjustments to null (see attached files with _runboot; the data file is one of the bootstrap datasets). Apparently something went wrong with the bootstrap, because the bootstrap distribution of parameter values is quite different from the likelihood profile from the base model (see the figure showing steepness as an example). Note that I am estimating sigmaR in the base model, but fixing sigmaR (in the base model or in the bootstrap runs) makes no difference. Hopefully I missed something simple. Does anyone have any ideas?

Regards,

Erin

Richard Methot, modified 6 Years ago.

RE: Bootstrap with data.ss_new

Hi Erin.  This looks like a good investigation.  I will try to look at your configuration more closely later, but here is an initial thought regarding the impact of process error (model misspecification) and measurement error.

Your original data came from a multitude of actual processes that are more complex than what we can represent with an SS configuration. We try, but all models are approximations. When we do a parametric bootstrap, we use one possible configuration and parameter set to create a population, and then sample, completely at random, from that population. So I do not think it is surprising that the distribution of estimates from bootstrap samples is narrower than the distribution from a profile using the original data.

Another consequence of the "all models are approximations" axiom is that the fit to the original data probably has some pattern in the residuals, sometimes called data conflicts. These patterns mean that there are some unresolved gradients in the final model fit, but the model lacks enough flexibility to resolve them. When we then fit the model to bootstrap data, there are no inherent data conflicts (all data are random), so the average set of parameters among the bootstrap runs can differ from the original set of parameters, which is based on real data. Such a difference is informative about the existence of data conflicts. I assume you have seen the papers by Hui-hua Lee, Kevin Piner, Mark Maunder and me on this topic.
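To make the parametric bootstrap idea concrete, here is a minimal toy sketch in Python. It is not Stock Synthesis and does not read data.ss_new; the straight-line model, the sine-wave "misspecification", and every name in it (`fit_line`, `lag1`, `boot_slopes`, and so on) are assumptions made up for illustration. It fits a deliberately too-simple model to "real" data, simulates new datasets from the fitted model, refits each one, and compares the residual pattern of the original fit with that of the bootstrap fits.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "real" data: a linear trend plus a slow wave that the
# fitted straight-line model cannot represent (the misspecification).
x = np.linspace(0.0, 10.0, 40)
y_obs = 2.0 + 0.5 * x + 0.8 * np.sin(x) + rng.normal(0.0, 0.3, x.size)


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, residuals)."""
    b, a = np.polyfit(x, y, 1)  # coefficients come highest degree first
    return a, b, y - (a + b * x)


def lag1(resid):
    """Lag-1 autocorrelation, a crude measure of residual pattern."""
    r = resid - resid.mean()
    return np.sum(r[:-1] * r[1:]) / np.sum(r * r)


# Fit the simple model to the "real" data.
a_hat, b_hat, resid_obs = fit_line(x, y_obs)
sigma_hat = resid_obs.std(ddof=2)  # two estimated parameters

# Parametric bootstrap: simulate from the fitted model, then refit.
n_boot = 500
boot_slopes = np.empty(n_boot)
boot_lag1 = np.empty(n_boot)
for i in range(n_boot):
    y_sim = a_hat + b_hat * x + rng.normal(0.0, sigma_hat, x.size)
    _, b_i, resid_i = fit_line(x, y_sim)
    boot_slopes[i] = b_i
    boot_lag1[i] = lag1(resid_i)

print(f"slope from original fit:       {b_hat:.3f}")
print(f"bootstrap slope mean and sd:   {boot_slopes.mean():.3f}, {boot_slopes.std():.3f}")
print(f"residual lag-1, original fit:  {lag1(resid_obs):.2f}")
print(f"residual lag-1, bootstrap avg: {boot_lag1.mean():.2f}")
```

In this toy setup the residuals from the original fit should be clearly autocorrelated, because the sine wave is left unexplained, while the residuals from the bootstrap fits should not be, because each simulated dataset is generated by the fitted model itself. The bootstrap spread therefore reflects only the random sampling error assumed by the model, not the structural misfit present in the original data.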

I hope this helps. The difference in spread for the steepness parameter does seem greater than I would have expected, so I hope we can find something that helps resolve this difference or at least lets us understand it more fully.

Rick