-
-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework income process using IndexDistribution #1024
Conversation
TODOS:
|
Yes, thank you, I like this. There is an additional failing test:
I didn't write a https://www.geeksforgeeks.org/__getitem__-and-__setitem__-in-python/ The seed logic in HARK is quite complex and I'm not sure I fully understand all the motivations for it. |
# Conflicts: # HARK/ConsumptionSaving/ConsIndShockModel.py # HARK/core.py
I am reviving this and will work on it during the next few weeks. It is part of a broad goal of making the income-process construction code more flexible. However, the next step is to update the current code using newer tools that have been added, like A big issue with this kind of update is the RNG. For reasons yet to be determined (maybe the order in which shocks are drawn or some seed that needs to be passed) using Before I dive into it, I want to summon the RNG maestros to ask if they have any idea or opinion on how this issue should be resolved. Do I edit the tests to check for the new values? Do I tweak the new implementation until matches the current tests? Do I look for a happy middle? Have the team's views on simulation tests and how they should be done changed? |
Hooray. Glad you're looking into this. I don't think this addresses the whole issue, but one way to tackle the problem is to rewrite tests so that rather than targeting a specific numerical value, they tests a functional relationship: Here's an example: This is actually a much better test, since it tests the functionality of the code, and not some accidental combination of the code and a random seed. |
That test is evaluating a property of the solution. I think we have a fair bit of those. I still see value in testing simulation results. Imagine for instance that some indexing error in SimOnePeriod made us draw the wrong shocks. Or that, as it happened 2 years ago, some handling of a seed makes agents always draw the same shock. Tests to catch that are great. So I am happy that we have both kinds, but I wonder if in your discussions there has been any development in how we think we should carry out simulation tests going forward. |
Something reassuring is that in the current state of the PR, tests that deal with the solution itself (not the simulation history) pass, so we know the current machinery is leaving the distributions that the agents get in the solution process unchanged. We know the miss-matches are generated when we draw from the |
These are great points. This is an example of a test of simulation functionality that does not depend on the random seed: I am not aware of any new consensus for doing simulation tests. |
Is it possible that there is an error in IndexDistribution though? |
I am 98% sure that:
|
@sbenthall I have a clue of what might be happening. If you run
You will find that Notice:
The key of why these two approaches give the same result is that Line 67 in f338c97
Lines 19 to 20 in f338c97
and then uses that random-number generator to draw the seeds of the time-specific distributions Lines 72 to 78 in f338c97
That is why the two previous approaches yield the same shocks. However, notice that representing the time-varying income distribution with an
The difference is that under the current approach the seeds for the time-specific distributions are drawn directly from the agent's RNG HARK/HARK/ConsumptionSaving/ConsIndShockModel.py Lines 2737 to 2742 in f338c97
Thus, without The extra step must make simulations not exactly match (but it might not be the only thing). |
A solution might be to let |
@sbenthall, I'd like to ask why you think I think the above issue is being generated by the fact that the current implementation thinks of the |
@sbenthall Just confirmed this (mostly) works. Passes all of IndShock's tests except Harmenberg (it fails most with the current implementation). |
Codecov Report
@@ Coverage Diff @@
## master #1024 +/- ##
==========================================
+ Coverage 73.66% 73.73% +0.06%
==========================================
Files 69 69
Lines 10568 10577 +9
==========================================
+ Hits 7785 7799 +14
+ Misses 2783 2778 -5
Continue to review full report at Codecov.
|
This PR finally passes tests! I managed to resolve all the simulation discrepancies. It does kick the can of simulation tests down the road, but there is already an issue for that #1105 and I feel getting to an unified solution and a single fix in a different PR would be better. It was also satisfying to get it to match: I now know there is no bug in The actual change that is introduced by this pr is not that transcendental: it simply uses I am re-requesting review. |
This is looking very good. There's one change I'd suggest. According to its documentation, the I wonder whether it would make sense to make these engine methods into model-specific subclasses of Distribution. Basically, you would just be putting the substance of your engine methods into the |
Thanks Seb, Just to verify that I follow, you'd like me to move my ***Shk_engine methods to the inside of the Agent type? so that they are called by self.***Shk_engine? I think that'd make sense, since these are specifically tailored for the given model. |
No, that's not what I mean. I mean something more like:
|
To elaborate on this @Mv77 ... The single-period income distributions are used by several models (in the current HARK architecture, via inheritance). But where I think we should move to is that these distributions should be define in model configuration, not hard coded into model code. See issue #620 In the FramedAgent code, this is precisely what happens: the That is is then used by the simulation code to know that that variable must be sampled as a shock. https://github.com/econ-ark/HARK/blob/master/HARK/frame.py#L257-L265 At the very least, if you defined the IncShk engine in terms of classes that are subclasses of Distribution (or, DiscreteDistribution, which may be more appropriate), then I could use that same distribution definition in the PortfolioFrameModel. That's an indication of how this would support code reuse. |
Thanks Seb. I initially thought of this as type puritanism! But the more I think of it, the more it makes sense. It pushes us in the direction of income distributions being their own object, modular entities easily switched around. I am fully onboard; will revise over the weekend. |
Not that type puritanism is bad; it is just one of those things that have a huge social return but are annoying to enforce in one's own life. |
…umbing/income_process # Conflicts: # HARK/ConsumptionSaving/ConsIndShockModel.py
Seb, I've now made engines their own class. Tests still pass. This is ready for your review again. Thank you! |
A small suggestion: Many different choices could be made for the |
Agreed. I've now gone with
|
Great work, @Mv77 . Thanks for the careful design and implementation. :) |
Thank you for the reviews! :) |
@sbenthall recently introduced
IndexDistribution
to deal (among other things) with distributions that time-varying indices.Seb, this is a first attempt at using that class in the income-process constructor. The current code works and I quite like the change, I'd like your feedback.
The big issue is that, as we discussed before, we need to think a bit more about what is the appropriate thing to do about RNG seeds with this change. I've not taken care of it and that's why tests are failing.