-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Edits] Support for restoring serialized InstancePayload
as instance state
#345
[Edits] Support for restoring serialized InstancePayload
as instance state
#345
Conversation
🦋 Changeset detectedLatest commit: 4a9a048 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
InstancePayload
as instance state
036cb96
to
3e76a16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to give another pass on Monday, but submitting some feedback.
Question: Have you test the performance of the Repeat? Just for sanity :)
packages/common/src/test/fixtures/xform-dsl/PROPOSED_XMLLiteralXFormsElement.ts
Show resolved
Hide resolved
@@ -1,24 +1,28 @@ | |||
import type { XFormsElement } from '@getodk/common/test/fixtures/xform-dsl/XFormsElement.ts'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix:
This file is +1230 lines, which is too big. What can be extracted? Even if it's little, maybe some functions or definitions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before I address the specific concern, I want to make sure we have a shared understanding of some of the context of what's going on with this particular file.
There are some hard-ish constraints on the size of this file which aren't entirely up to us because, like the XForms DSL, Scenario
is an interface we share with JavaRosa. While the languages are not exactly 1:1, I think we can take the current state of JavaRosa's equivalent as a decent estimate of a practical floor on its potential line count: 934.
We have already extracted some supporting logic into separate modules, where it's been practical to do (for a variety of reasons, though I'm not sure file size has been a primary motivation). And we've also, by necessity in supporting the interface as defined in Java, had to inflate some aspects of this module because: JavaScript the runtime language doesn't have method overloading semantics... which are used a fair bit by JavaRosa's equivalent. (And TypeScript's signature overloading doesn't introduce runtime semantics.)
The additive changes in this PR make up almost the entire delta between our line count and JavaRosa's. Those additive changes are almost all comments. Coming to understand those comments' contents, and writing them for posterity and future reference, was crucial for me to understand the semantics of existing Scenario
usage (i.e. ported from JavaRosa) in existing test coverage (again, ported from JavaRosa) affected by this change/feature. Gaining that understanding was also invaluable for me to understand the approach to exercising core semantics of the feature for which I couldn't find existing coverage.
The remaining additive changes in this PR:
- Add a proposed API with clearer semantics and intention for the feature under test: 2 SLOC
- Update an existing method (which we skipped in the original porting effort) to call into that added API: 1 assertion (which Prettier expanded to 4 lines so a failure can produce a helpful message) + 1 SLOC
- Improve clarity of a still-unimplemented function: +0 SLOC (Prettier added whitespace)
- Add a few properties to support all of the new test surface area, all of which necessarily goes through
Scenario
or implies a significant rethink on our approach to engine integration testing
I'm sure we could get creative about extracting some aspects of Scenario
into other modules, but I want to seriously consider whether the downsides that would entail are worth the goal of a smaller file size in this case.
There's a very small amount of code which is not directly part of the Scenario
class. Some of it those are lines which are inherently part of the class's interface and don't belong anywhere else. There are a couple of helper functions we could hypothetically extract to... somewhere. Most of that potential reduction in SLOC would be countered by adding to the already massive stack of imports at the top of the module.
And I really think it's worth highlighting that massive stack of imports: quite a lot of Scenario
, the class and its behavior, is already being imported from other modules.
So with all of that context...
Can you help me understand how you've determined that this file is too big? What is the maximum number of lines you'd like to see in a file, and what is the absolute consequence of exceeding it? How important is this relative to cohesion and comprehension of the module itself? How should we weigh these against potential synchronization overhead (if it should occur) where a module's interface is shared with another project?
I want to also be explicit that I could have probably answered this with the last two paragraphs in the first section. But I think the context in this case is important. And I think it's just as important to ask that we try to consider context (and seek it out) when we think about introducing new (especially arbitrary) code style rules which don't match the current project state. (An oft-cited name for this principle is Chesterston's fence.)
Lastly, I really don't want this to come across as dismissive. If we do want to set a cap on module size, I'm open to that... with exceptions as appropriate, and I think this module is likely to be one such exception. Line count aside, maybe there are other ways we can help make this module more accessible for you. For instance, I've found tools like the TS Language Server incredibly helpful, even indispensable, for navigating large modules (and large numbers of small modules too!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed context and reasoning behind Scenario’s current state. I agree that ensuring the module’s cohesion is important, and I can see how the comments and interface requirements have driven the line count up. Your questions about my concerns are fair, so I’ll try to address them and explain why I think shortening the file could still be worth exploring.
To me, the file feels “too big” at 1,230 lines because it’s harder to skim or grasp quickly compared to smaller modules—especially for someone jumping in without the full history (like a new dev or even myself in a few months). I don’t have a hard maximum line count in mind, but something closer to 800 lines feels more manageable based on other projects I’ve worked on. The consequence of exceeding that isn’t catastrophic—it’s more about cumulative cognitive load, onboarding ease, and keeping the codebase approachable as it grows. I’d prioritize comprehension and team productivity over a strict size rule, but I think we can achieve both with few tweaks.
My suggestion to extract code wasn’t about changing the current structure wholesale—it’s more about finding low-hanging fruit to lighten the file without losing its integrity. For example, could some of the helper functions or test-specific properties live in a ScenarioUtils
or ScenarioTestSupport
? The massive import stack is a valid concern, but I’d argue a few extra imports are less daunting than scrolling through 300+ extra lines in one file. The comments are super helpful (and I’d hate to lose their value), but maybe we could move the detailed ones to a separate doc or distill them into shorter inline notes?
I’m not stuck to a specific solution. My goal isn’t to overcomplicate things but to make this module a bit more approachable for the team long-term. Let me know what you think!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for elaborating on your concerns. I am totally on board for reducing unnecessary cognitive load. That's a worthy goal. I would like to consider opening an issue about:
- addressing module size as a general factor affecting cognitive load
- how best to balance that general concern in specific cases (such as
Scenario
) where we have collaboration goals outside of this repo
I think we can probably come up with a lot of good ideas on both fronts, but I'm cautious about trying to address them in this PR.
That caution is partly motivated by the sense of urgency we all feel to land edit functionality and unblock Central. Also, since so much of the additive change here is in the form of JSDoc, a big part of my motivation for being thoughtful about addressing it is to try to keep documentation context close to what it's documenting as much as reasonably possible. A lot of the Scenario
interface carries a lot of semantic weight that isn't necessarily as obvious where it's used, and JSDoc provides invaluable leverage for exactly that: editors can/do also make that information available directly where the interface is accessed.
I've found that leverage valuable enough, especially for Scenario
and similarly semantics-heavy interfaces, that I'd be more inclined to extract code in ways that might seem awkward, if it helps keep the relationship between interface/documentation intact.
So my proposal is to file an issue with something like the two bullet points above, and hold addressing this module's size for this PR. Does that sound okay to you for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So my proposal is to file an issue with something like the two bullet points above, and hold addressing this module's size for this PR. Does that sound okay to you for now?
Yes, it sounds good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked in #350
* Some tests were made to pass because they exercise other functionality (and | ||
* the previous top-level comment lagged that fact in confusing ways). | ||
* | ||
* At time of writing, the the following changes and observations are now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix:
* At time of writing, the the following changes and observations are now | |
* At time of writing, the following changes and observations are now |
* for this concept as it applies to the scope of serializing and | ||
* deserializing instance state)? | ||
*/ | ||
describe('Restoring serialized instance state', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion:
These tests can be extracted to different files. At least restore suite if not too granular is prefered; so something similar to: packages/scenario/test/restoreSerializedInstance.test.ts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so I included that comment (the one directly above this line) because I agree it's probably best to split out another suite. The intent of the comment is to prompt some discussion about a good structure. Again, this is stuff we at least nominally share with JavaRosa, so I think it's worth putting some thought into that.
Without that consideration, my thinking would go something like:
- I think it would make sense to continue grouping these concepts (serialization/deserialization) together since they have a mostly inverse relationship ("mostly" because edit will go beyond the direct overlap soon enough). They could be grouped in a subdirectory rather than a single module (for instance, but...)
- A common term for grouping these concepts is the abbreviated portmanteau "serde", but that has all the downsides of abbreviations and portmanteaus.
- We could also cash in on some of the recent gains we've made clarifying terminology in [Edits] Naming revisions: "submission" -> "instance" #334, with a shared prefix rather than a directory, e.g.:
serialization.test.ts
->instance-output.test.ts
- [New tests] ->
instance-input.test.ts
- [And hypothetically soon?]
instance-edit.test.ts
, although I don't expect there to be a ton of test surface area there. Edit functionality will just be restore (or "instance input") +deprecatedId
+ I/O
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the instance-input/instance-output naming idea. Could we try that and see how it feels?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel pretty good about it. I was hoping to be able to do it in a single commit without losing the history of the original file at its new name, but that didn't pan out. So I ended up doing it in three commits to make sure it happened explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's okay, thank you!
<details> | ||
<summary>Instance cache</summary> | ||
|
||
<p>This is an in-memory demo simulating storage and restoration of instance state, which might be similar to the experience of a Web Forms offline mode.</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix:
Based on this explanation. Since it's a simulation, it feels right to make it clear in the component name:
InstanceCacheSimulation.vue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish I'd had the energy to write up PR notes before EOD yesterday. But since I didn't... apologies for not explaining sooner:
Unless folks feel strongly otherwise, I was planning to use the last (Vue UI) commit as a temporary internal demo, and drop the commit before we merge this. I included it because I thought it was valuable to be able to show:
- The functionality working visibly within the product
- Engine/client API integration in practice, in the UI context
- A little hint of how the functionality overlaps with other pending features (e.g. offline, restoring incomplete instance state, etc)
But I don't think this demo is something we want to land on main
. If we do want it to be part of this scope, I can spend a few minutes writing up what I think it still needs to be suitable for that. Because there's more to address than the (great!) feedback you've already left here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This demo seems valuable to me, so it’s fine to remove it from the main
branch. However, I think it would be helpful to include it in a documentation file as a reference for others. A commit is not easily discoverable, but having it in the documentation would make it a much better experience for others to find and use as an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it's probably valuable, but I also expect we'll be doing a very similar integration the right way with the intent of supporting Central integration of #247 and #316. While those integration points won't directly exercise restoreInstance
right away, I think the editInstance
call site would be an excellent place to link directly to the commit.
If you think that sounds like a reasonable way to keep a reference it, I'd be happy to (er...) commit to making sure that happens as soon as there's an editInstance
API to call (likely today).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh. Or even simpler/more immediate: I could see linking to the commit in JSDoc on both the createInstance
and restoreInstance
client interface methods in this PR. That way it’s linked right away, without any need to follow up afterwards, right where someone would want to start looking into integrating them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see linking to the commit in JSDoc on both the createInstance and restoreInstance client interface methods in this PR. That way it’s linked right away, without any need to follow up afterwards, right where someone would want to start looking into integrating them.
This sounds good to me, thanks!
.items, | ||
.items li { | ||
display: block; | ||
list-style: none; | ||
} | ||
|
||
.items { | ||
margin: 0; | ||
padding: 0; | ||
} | ||
|
||
.items li { | ||
margin: 0; | ||
padding: 0.25rem; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix:
.items, | |
.items li { | |
display: block; | |
list-style: none; | |
} | |
.items { | |
margin: 0; | |
padding: 0; | |
} | |
.items li { | |
margin: 0; | |
padding: 0.25rem; | |
} | |
.items, | |
.items li { | |
display: block; | |
list-style: none; | |
margin: 0; | |
} | |
.items { | |
padding: 0; | |
} | |
.items li { | |
padding: 0.25rem; | |
} |
const isInstantiableFormResult = ( | ||
formResult: LoadFormResult | null | ||
): formResult is InstantiableFormResult => { | ||
return formResult != null && formResult.status !== 'failure'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix:
This status ('failure') is safer coming from the xforms-engine, one source for these string constants. For example:
return formResult != null && formResult.status !== 'failure'; | |
return formResult != null && formResult.status !== FORM_STATE_STATUS_FAILURE; |
I know we said about doing it at once in another PR, but if this is a low-hanging fruit, it's worth doing it now.
Since this status is being used outside the engine, I feel extra cautious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't be super concerned about addressing this case in this PR if that turns out to be important. However...
-
We already merged the engine side in [Edits] Separate form load and instance creation in engine entrypoints #335
-
As I mentioned above, I'm not sure we want to land this demo in its current state anyway
-
I'm much more concerned that this suggestion of a constant doesn't reflect the approach we decided to take when we discussed it a couple days ago, nor [TASK] Apply code style for "enum" like string constants #338 which captured that decision. I get the sense that we may need to revisit that decision. Which is probably all the more reason not to be hasty about adopting a new approach in this PR.
-
I'm even more concerned about the impact that continued and growing focus on code style during review is having on productivity, and may have on functionality. The former is something I think we've all been feeling in one way or another. The latter is always a concern: in my experience, review focused on code style very often leads to more mistakes in behavior/approach slipping through, sometimes fairly big ones with significant impact. This is a big part of why I advocate for automated tools as a primary solution: I've seen the before/after, and it's pretty stark!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with dismissing this suggestion now that I have a better grasp of the context.
My concern isn’t really about code style but about how to alert clients to typos in hardcoded strings like entity statuses ('failed' or 'success') or to changes in those strings. For instance, if the engine could expose these statuses, it might prevent some headaches down the line.
But this is something we can address in that ticket: #338
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if that was your concern!
I have some good news, and some bad news (in theory only).
Good news
-
Here's what happens if there's a typo in one of our clients: https://github.com/getodk/web-forms/actions/runs/13908790302/job/38918623833
-
Here's what happens if we change the enumerated string in the engine, but forget to update one of our clients to reflect the change: https://github.com/getodk/web-forms/actions/runs/13908796680/job/38918133935
Bad news (in theory only)
In both cases I emphasized one of our clients, because the reason we can expect those CI failures is that we know we use TypeScript on both sides of the @getodk/xforms-engine
package boundary.
In theory, someone else might develop another client, directly integrating the engine without going through the project's primary interface (@getodk/web-forms
).
- If someone were to do that; AND
- If they didn't use TypeScript; AND
- If they didn't use an editor with a TypeScript language server_...
...then they might not immediately realize they've made a typo. That's a lot of "if"s!
While the first "if" is something we do explicitly support, I think it's a lot more likely that we'll develop other clients before anyone else even wants to. Even if we assume the first "if", the second and third "if"s (combined) are increasingly rare.
I suspect the more likely failure mode in a hypothetical third-party client would be a correct initial integration, followed by a breaking change in the engine1, followed by the downstream integration doing a semver-major upgrade without integrating the breaking change.
Footnotes
-
Which we MUST document. And if we don't already have an issue to automate detection of breaking interface changes at the package boundary, I plan to file one right away! ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for testing the assumptions. It gives reassurance
think it's a lot more likely that we'll develop other clients before anyone else even wants to
Really looking forward to this!
): Promise<FormInstanceState | null> => { | ||
const formResult = await initializeFormResultState(formXML, options); | ||
|
||
if (formResult.status === 'failure') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix:
Same here. If not, at least this can be a function, so the hardcoded string is in one place.
const isFormResultFailure(formResult) => formResult.status === 'failure';
@@ -2,6 +2,11 @@ import type { INSTANCE_FILE_NAME, InstanceFile } from './InstanceFile.ts'; | |||
|
|||
export interface InstanceData extends FormData { | |||
get(name: INSTANCE_FILE_NAME): InstanceFile; | |||
|
|||
/** | |||
* @todo Can we guarantee (both in static types and at runtime) that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question:
What do we need to resolve this question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little bit of time and effort. I wrote the comment so I wouldn't forget to apply some of both when it comes up again and when I can afford the cognitive overhead of a context switch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that coming in the following PR? Or more like another day sometime in the future?
My question is in the line to know if we should open a GitHub ticket for this and get it scheduled with the team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll want to look into it for either #247 or #66.
Is your concern about having TODOs in code comments generally? Because if that's the case, I have some really bad news1.
Or is this specific TODO of particular concern for you? (Maybe as relates to #66 and/or
#346?) If that's the case, I'd be happy to put some thought into it as soon as you're ready to address it.
Or maybe some other concern I haven't thought of?
Footnotes
-
And it's much less theoretical than the enum concern: we have a ton of TODOs in code comments already. For what it's worth, I've been wanting to add linting for them so they can be tracked more intentionally in some way. It just hasn't become a priority yet, since we've had pretty decent TODO comment hygiene thus far. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is your concern about having TODOs in code comments generally?
I'm fine with TODO comments. I just wanted to understand what’s needed to address this one—whether I can help now or if we need to create a GitHub ticket and prioritize with the team (project management). But that’s all I had in mind.
3e76a16
to
f526548
Compare
…n WIP) **Note on client interface:** I’d hoped that restoring instance state would be synchronous (from `LoadFormResult` and its implementations). In hindsight it makes sense that this is not possible: instance XML is stored there as a `File` in `InstanceData` (`FormData`). Reading `File` data synchronously would likely need to go through [`FileReaderSync`](https://developer.mozilla.org/en-US/docs/Web/API/FileReaderSync), which can’t be used on the main thread. This isn’t a huge deal in any case, as we can expect the non-restore edit case to be async as well. Open question: is it better, for _consistency_ to make the synchronous `LoadFormResult.createInstance` signature return `Promise<CreatedFormInstance>`? Asynchrony there would be superfluous, but it might be nice for clients to have symmetry across calls to these similar APIs.
This is so simple because `InstanceNode`s (from `PrimaryInstance` on down) now decouple their instance state input from `NodeDefinition` (`RootDefinition` on down). This is naive because it doesn’t account for: - missing (i.e. omitted non-relevant) nodes - nuances of relevance as applied to repeats - mismatched repeat instance count with `jr:noAddRemove` - ?
…L literal This is mostly meant as a convenience to unify flows where a signature overload allows both raw XML and XForms DSL input. Will be used to help tame some incoming complexity in `scenario`, keeping it closer to inherent complexities.
…/commentary The vast majority of this change is best described by comments added/updated in the commit. One substantive exception worth mentioning here, but which will hopefully be self-documenting enough post-review, is streamlining some of `scenario`’s use of Solid (largely for its own internal state, but affecting integration with the engine for Solid-specific reasons). There’s enough other complexity in the change that I wanted to at least pare down some incidental complexity, especially around Solid reactive scoping.
A note on this implementation (because I struggled with a few earlier attempts!): This approach is deceptively simple. It relies on a new set of assumptions: - There is only one `NodeDefinition` for every nodeset in a form definition - There can and **should** be only one “template” for every `NodeDefinition` - An **instance** (`StaticDocument`) is provided as input under all circumstances Earlier attempts to address this got deep in the weeds of trying to restore non-relevant nodes from any subtree, at any position of its parent. But that was based on flawed assumptions, contrary to those above. My thinking was that it makes sense for e.g. a non-relevant `/data/repeat[2]/foo` to be restored with the model’s definition for `/data/repeat[2]/foo`. But I think the _much more likely_ expectation is that it should be restored from `/data/repeat[@jr:template]/foo`! So how this works is: for every subtree of a form, determine all of the _nodesets_ which should produce a child node. For each of those nodesets: - If a node with that nodeset exists in the instance, it is used to produce the child’s initial state and that of its descendants (recursively following this same logic). This logic is applied to an instance node from any source, i.e. regardless of whether the instance is restored (or soon, edited), or directly from the form’s model definition (where it’s tautological that the nodeset will correspond to a node). - If the instance does not have a child for that nodeset (and since this is recursive, if the parent has no instance node), the `NodeDefinition.template` is consulted for initial state instead. This works **because** an instance (`StaticDocument`) is **always provided as input** to `PrimaryInstance` (and its descendants walked to provide input to other `InstanceNode`s all the way down the subtree). We can ensure that `/data/repeat[2]/foo` is populated by the **form-defined** node with that nodeset because the model instance is now decoupled from its corresponding `NodeDefinition`s and their `template`s. (Note: I think this is also roughly consistent with the mental model @lognaturel has described in JavaRosa. Confirmation of this would be appreciated!)
Note: restore works as expected, but a related test is added showing that **serialization** prior to restore exhibits a bug! TODO(?): we might consider splitting the behavior of `Scenario.proposed_serializeAndRestoreInstanceState` (on “And”), where we could combine the related tests into one coherent flow. If we did that, and until we fix the serialization bug, we’d probably want to parameterize that combined test so we can show the working restore behavior regardless.
More nuanced tests will follow, but it felt valuable to commit this now so we can see that the basic functionality does work as expected. In particular, this is valuable for demonstrating that the underlying assumptions of the deceptively simple solution to restore non-relevant nodes **DOES NOT** break expections for node-sets which may have N>1 node (i.e. repeat instances _and their descendants_).
Again, more nuanced tests will follow. Intent is to gradually escalate the factors under test until gaps are discovered so they can be fixed.
These input cases are not likely **as expressed**, but they largely reflect the expected behavior of various shapes of non-relevant omission we might encounter (without requiring a combinatory explosion of fixtures to set up those more realistic cases). This also more or less demonstrates current “healing” behavior we could expect, e.g. for form additive changes to a form definition.
…oAddRemove` Tests currently fail without special handling for `jr:noAddRemove`. Engine fix will come in next commit! May want to come back to this and add sensible defaults in a template. Leaving the NaN cases in place for now in case we want to discuss the expected behavior of XPath expressions like `2 * /foo` where `/foo = ‘’`.
Test fails pending warning, fixed in next commit
Note: this is rework of several earlier efforts to warn under the same conditions (and skip warning under the same excluded conditions). This revision of the change addresses: 1. The fact that upstream commits have vastly simplified some other repeat-related changes where the logic was introduced in prior iterations. 2. The fact that when warning occurs (and why that when applies) was much less clear than it could have been, especially once integrated with much simpler upstream changes. (At least in the source; it’s a bit more clear in the tests…)
Generalizes the concept for both cases, reduces surface area of cleanup (don’t want to overstep if we have e.g. global mocking of timezone *ahem*). As previously for `jr:noAddRemove`, engine fix is incoming for the `jr:count` case
…nput Also corrects several tests which had already mismatched these without the engine enforcing it (not today copypasta!). Engine fix incoming in next commit (which helped catch said copypasta mistakes).
16691da
to
be77c40
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did another quick pass, and it looks good. I think all the conversations has been discussed. I'm approving now since I won't be able to do it tomorrow, to unblock and you can merge it
Thanks!
…ut (1/3) This is a prep commit, ensuring each subset is grouped clearly before splitting.
…ut (2/3) Extract newly added instance input tests from serialization.test.ts to instance-input.test.ts.
…ut (3/3) With `serialization.test.ts` only testing instance output, rename it now to `instance-output.test.ts`. This ensures that history of the file can easily be tracked from its new name to old.
b600f52
to
4a9a048
Compare
Part 4/4 in preparation for #247.
Branched from #336 (diff).
I have verified this PR works in these browsers (latest versions):
Once again, I don't think there's anything browser specific here. But I'd be happy to do a cross-browser test. Maybe worth a look to get a sense of where we stand on performance since the question was raised. I do have some notes on that below.
What else has been done to verify that this works as intended?
Almost the whole change is testing:
Scenario
deferred in the original porting pass because we didn't have support for serialization/deserializationWhy is this the best possible solution? Were any other approaches considered?
The functionality implemented in this PR was almost entirely laid out in #334/#335/#336. In fact, it began with a 5-line change (5902211) for the naive solution to initialize state from instance input rather than the form-defined instance. That's because the bulk of the real work to accept instance state input was already done in #336.
The real meat of this change is around subtleties beyond that naive solution, in two specific areas:
jr:count
andjr:noAddRemove
)Even so, after some simplification in #336 during review, the only change needed to support the latter was adding a warning when controlled repeat instances are dropped from instance input—which occurs when the count of repeat instances from instance input exceed their computed (or fixed) count on load.
So really this comes down to how non-relevance is handled. And that turned out to be simpler than I expected going into this, largely due to simplifications in #336 (both as part of the original dev pass and as surfaced during review).
I'll describe the approach here so I can describe the other approach and contrast.
✅ Reconcile non-relevant node state at the node level during initialization
In #336, the concept of "definition" and "instance input" were split so that we could even accept instance input from any source other than a form definition. That's why the naive solution was so simple: we just had plug instance state from another source into the same interface. And that change established instance state as input to each node on construction.
The solution in this change, to handle non-relevance, is to treat every1 node's instance input as optional. In this way, the way we model instance input is almost2 exactly consistent with what we expect the actual input to be. On that basis (and with foundations from #336), reconciling a non-relevant node—i.e. a node which was omitted from the serialized instance XML—turned out to be surprisingly3 simple: if a node isn't present, look up the form definition node for the same nodeset and... use that node instead!
And the approach works for:
It works because of any insight I wish I'd had in #336, but feels just as fitting here: if a
NodeDefinition
is 1:1 with a nodeset, it's also inherently 1:1 with a single node in the form definition. We have a formal concept of this for repeat instances (i.e. "template"). In this change I applied the same name to the same concept for allNodeDefinition
s. That's fundamentally how now model it, and it's really helpful (or at least I found it to be!) to have a common name for the common concept.❌ Merge instance state as a tree
The most likely alternative approach is probably best described by how it's done in Enketo. Which is roughly...
Given:
On form load:
This approach both felt unnecessarily complicated and objectively was a significant source of many bugs. Maybe that would be solvable by taking another stab at the same approach with various lessons learned. But what I imagined that looking like was a sprawling web of cross-references into a variety of parsed information, all to reconcile something upfront which ... doesn't need to be reconciled upfront!
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
I don't think there's a significant regression risks in this PR; the regression risk in the "edit" feature work thus far is probably isolated to #334, #335, and #336.
The biggest risk (generally) is that there are a few behaviors we've intentionally deferred, as described by test stubs I added this morning.
Do we need any specific form for testing your changes? If so, please attach one.
N/A
What's changed
Engine: restore instance state from input
This was addressed under Why is this the best possible solution? better than I could here!
Test coverage (
scenario
)This represents the bulk of the changes in this PR. A huge subset of that is commentary, particularly around expanding [my, maybe others'] understanding of pertinent
Scenario
APIs which are now implemented. The rest is focused on the nuances described under Why is this the best possible solution? above.UI DemoThis commit has now been dropped, and a link to it has been added in JSDoc for each of the key methods which it integrated.
Notes on performance
Since @latin-panda asked about performance, I will mention that I've been watching it pretty closely from #334 on. Here's what I've observed:
This is totally unsurprising because the we're doing "more work"4: instead of parsing a form-defined primary instance once (DOM), we're parsing it twice (DOM -> "static-dom"). By "slight", what I've seen on my machine (on average; individual test runs, your machine, CI all may vary!) is that our slowest test (by far: the child-vaccination smoke test) has gone from ~12.x seconds to ~13 seconds. I've seen sub-50ms increases to the next slowest tests (WHO, form-validity).
I have not seen any measurable/notable performance regressions in:
bench9-alt
,nigeria_wards_external
)Footnotes
Well, not quite every node. We can be 100% certain that any instance input, from any source, has a "document" (1:1 with
PrimaryInstance
) and "root element" (1:1 withRoot
). If either of those is not present, several aspects of parsing will fail well before we attempt to initialize any aspect of state from that input. ↩In theory we could model it more precisely, i.e. by pairing the concepts of "this node or one of its ancestors has a
relevant
expression" with "this node accepts optional instance input". That would probably also produce stricter validation of instance input. But that would involve substantially more complexity at the type level—complexity I don't think we need. And my hunch is that if I asked "should we validate this aspect of instance input" the answer would be a pretty firm "no". @lognaturel please feel free to correct me if that hunch is wrong! ↩I say this was surprising because when I started working on this feature, my mental model of "form definition node" still matched the data model we had prior to [Edits] Decouple form ("definition") and instance (initial structure/state) as initialization inputs #336... a 1:~1 mapping between
//*
(i.e. each node in the form-defined primary instance) and*Definition
. So I expected to have to identify the form-defined node based on repeat depth and position, and I expected that to get very complicated and very confusing! ↩I've now opened Eliminate remaining engine use of DOM APIs #351 to track doing less work in this regard. When we address that, we won't need to double parse XML to produce "static-dom" structures, which will probably more than make up for this small performance hit. ↩