Arrays: Representation of single members of an array #826
Comments
The difference between array:split and array:members is essentially a choice on how to represent an array member: in one case we do it with a "value record" and in the other we do it with a singleton array. In recent work I have experimented with both, and I have to say I'm not happy with either. Neither really works well when you attempt a transformation based on a recursive tree walk using pattern matching. I'd like to consider going back to my original idea of splitting an array into "parcels" (or building an array from parcels), where a parcel is a zero-arity function carrying the annotation %parcel; calling the function delivers the contents of the array member. This is about as close as we can get to an encapsulated representation of the concept without actually extending the data model.

I've just re-read your email summarising feedback from BaseX users. It's a very useful contribution, but I think it's very much an XQuery users' perspective. It doesn't feel to me that these users are struggling with the challenge of doing complex structural transformations of JSON documents.
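The parcel idea above can be approximated in plain XQuery 3.1. The following is only a sketch: `local:parcel`, `local:parcels` and `local:of-parcels` are hypothetical helper names, not proposed function names, and the `%parcel` annotation is left out because XQuery 3.1 offers no portable way to test for it.

```xquery
xquery version "3.1";

(: Hypothetical helper: wrap an arbitrary value in a zero-arity function. :)
declare function local:parcel($value as item()*) as function() as item()* {
  function() { $value }
};

(: Hypothetical helper: split an array into one parcel per member. :)
declare function local:parcels($array as array(*)) as (function() as item()*)* {
  for $i in 1 to array:size($array)
  return local:parcel($array($i))
};

(: Hypothetical helper: rebuild an array from a sequence of parcels.
   The square constructor keeps each unwrapped value as a single member. :)
declare function local:of-parcels($parcels as (function() as item()*)*) as array(*) {
  array:join(
    for $p in $parcels
    return [ $p() ]
  )
};

(: Reverse the members of an array by way of parcels;
   the members are (1, 2), the empty sequence, and "a". :)
local:of-parcels(reverse(local:parcels([ (1, 2), (), "a" ])))
```

Because the members travel as function items, an empty member or a multi-item member survives sequence operations such as `reverse` without being flattened.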
But I do agree that at the XQuery and XPath level, "for member $x in $array" and "for key $k value $v in $map" are nicer; and I'm inclined to (revert to) proposing something similar for XSLT:
In each case allowing the "loop body" part of the expression to be either a sequence constructor or a select attribute. For join operations there's definitely a benefit in being able to bind range variables rather than the context item. |
We use something like parcels for our current Java bindings: Java objects, in particular those that have no obvious XDM type, are wrapped into function items, and can explicitly be converted to XDM types by invoking them. It’s pretty convenient.
I completely agree that this discussion is driven by XQuery, and I haven't considered generic map/array updates at all. In our world, complex updates on JSON are usually done with XQUF (sometimes verbose, and custom to our JSON XML representation, but definitely powerful and versatile):

    '{ "one": 1, "due": 2, "three": 3 }'
    ! json:parse(.)
    ! (json update {
      delete node ./three,
      rename node ./one as 'uno'
    })
    ! json:serialize(.)
My side note can be ignored; the equivalent expression looks alright. |
I believe we absolutely need to find other names for
If the return type will be In any case, we may need to find and document more uses for these two functions, and cases where |
We need a mechanism to split an array into its parts (members) and to reassemble those parts in a different way. The question is, what is the best way of representing the parts? array:split and array:join represent the parts as an array of arrays, and that is certainly one way of doing it; array:members and array:of-members represent the parts as "value records" and that is another way of doing it.

When we're doing a rule-based tree-walking transformation in the XSLT style, we want to write rules that process the parts of the array and transform them. That means we need to match them, which means we need to distinguish them from other kinds of value. The challenge is therefore to find a representation that makes these "parts of an array" easily recognisable as such. Splitting into "value records" serves that purpose rather better than splitting into sub-arrays, though it is by no means perfect.

When we work with XML, intermediate data values can be made very easily recognizable by choosing distinctive element names. Working with maps and arrays is much more difficult because there are no element names to match. Perhaps annotations can fill the gap.
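To make the recognisability problem concrete, here is a small XQuery 3.1 sketch of the kind of type-based dispatch a tree walk would rely on. The function `local:kind` and its labels are hypothetical. The sketch assumes only the 3.1 type tests, which have no dedicated test for a "value record".

```xquery
xquery version "3.1";

(: Hypothetical classifier, standing in for the match patterns of a tree walk. :)
declare function local:kind($item as item()) as xs:string {
  typeswitch ($item)
    case map(*)                return "map"
    case array(*)              return "array"
    case function() as item()* return "zero-arity function"
    default                    return "other"
};

(
  local:kind([ (1, 2) ]),               (: a part represented as a singleton array :)
  local:kind([ 1, 2 ]),                 (: an ordinary array: same classification :)
  local:kind(map { "value": (1, 2) }),  (: a part represented as a value record :)
  local:kind(map { "name": "x" }),      (: an ordinary map: same classification :)
  local:kind(function() { (1, 2) })     (: a part represented as a zero-arity function :)
)
```

With these tests, a singleton array cannot be told apart from any other array, and a value record cannot be told apart from any other map without inspecting its keys. Maps and arrays are functions of arity one, so the zero-arity function test matches neither of them.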
There was discussion today about deep lookup and deep update, and both of these would benefit from being able to talk about the "leaf values" in a map or array as something that's more than just a sequence of items. Rather in the same way that a text node is more than just a string. Related: when we talk about key-value pairs in a map, I often find it awkward that the word "value" is used both to mean "any XDM value; a sequence", and to mean one part of a map entry. Things would get much easier if we could improve the terminology:
It would be nice to think of a deep-lookup returning a set of members, in the same way as a path expression selects a set of nodes, which is then implicitly flattened/atomized if the context requires a flat sequence. This still leaves all the options open for how "members" are represented. |
Maybe values of map entries could be called members, and…
…instead of |
The proposed functions could also be used to convert arrays to maps, and vice versa:

    $array
    => array:entries() => map:merge()
    => map:entries() => array:merge()

    (: Result: [ (), (), 'III', (), 'V' ] :)
    array:merge((map { 3: 'III' }, map { 5: 'V' }))
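For readers who want to try the conversion without the proposed functions, the following XQuery 3.1 sketch emulates it. The helper names `local:array-to-map` and `local:map-to-array` are hypothetical, and the sketch assumes that positions missing from the map become empty members, as in the result shown above.

```xquery
xquery version "3.1";

(: Hypothetical helper: key each member of the array by its position. :)
declare function local:array-to-map($array as array(*)) as map(xs:integer, item()*) {
  map:merge(
    for $i in 1 to array:size($array)
    return map:entry($i, $array($i))
  )
};

(: Hypothetical helper: build an array whose size is the largest key;
   positions without an entry become empty members. :)
declare function local:map-to-array($map as map(xs:integer, item()*)) as array(*) {
  let $size := xs:integer(max((0, map:keys($map))))
  return array:join(
    for $i in 1 to $size
    return [ $map($i) ]
  )
};

let $array := [ "a", (), ("b", "c") ]
return (
  (: a sparse map: positions 1, 2 and 4 become empty members :)
  local:map-to-array(map { 3: 'III', 5: 'V' }),
  (: round trip: array to map and back again :)
  deep-equal(local:map-to-array(local:array-to-map($array)), $array)
)
```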
As a result of #1331, we now have formal equivalents for all array and map functions, with the exception of map:find, which depend only on other map/array functions, or on a small set of primitive constructors and accessors defined in XDM. Moreover, these equivalents have now been tested, at least to the extent that they run all the array/map function examples successfully. For the array functions, I have typically used As an internal mechanism for defining the array/map functionality, these functions have proved very useful. Whether they are equally useful for "real users" is an open question. But I see no reason to make them private.
I believe we should continue this discussion and look at some questions more broadly. I’ll probably open a new issue for it.
Just to understand: Those are equivalent writings for

    $array => array:split() => reverse() => array:join()
    $array => array:members() => reverse() => array:of-members()

Why would I think the most confusing thing about
The key problem is when you get confused about whether the members of the array have been "parcelled" or not. If you pass a "parcelled" value to a function that's not expecting it, it's nice to get a type error. Ideally for this purpose a "parcel" would be a completely separate data type. Short of that, I think the "value record" representation is more likely to trigger a type error than the "array" representation. The other option, which probably scores better than either of the above, is to parcel the members as zero-arity function items. |
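The difference in error behaviour can be observed in XQuery 3.1 by passing each representation to a function that expects atomic values. This is a sketch under assumptions: `local:try-sum` is a hypothetical helper, and the three representations are built by hand rather than with the functions under discussion.

```xquery
xquery version "3.1";

declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: Hypothetical helper: sum the items, or report the error code if that fails. :)
declare function local:try-sum($items as item()*) as item() {
  try { sum($items) }
  catch * { "error: " || local-name-from-QName($err:code) }
};

let $array := [ 1, 2, 3 ]
let $as-arrays  := for $i in 1 to array:size($array) return [ $array($i) ]
let $as-records := for $i in 1 to array:size($array) return map { "value": $array($i) }
let $as-parcels := for $i in 1 to array:size($array) return function() { $array($i) }
return (
  (: atomization flattens the singleton arrays, so the sum is computed silently :)
  local:try-sum($as-arrays),
  (: maps cannot be atomized, so an error is reported :)
  local:try-sum($as-records),
  (: other function items cannot be atomized either, so an error is reported :)
  local:try-sum($as-parcels)
)
```

Under the 3.1 atomization rules, only the singleton-array representation slips through unnoticed. That is the convenience mentioned for array:split elsewhere in this thread, and also the reason it is less likely to trigger a type error.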
Thanks. Yes, I agree we might really need a custom type to improve typing. Otherwise, code like… { "v": 1 } => reverse() => array:of-members() …might raise errors like “Cannot convert map(*) to record(value)”, which doesn’t really remind one of array operations. On the other hand, typing may not be too important at all if we primarily want to use the functions to present equivalencies in the spec. Maybe you have seen #1338, in which I have proposed to use
When introducing the new array features to some users, the `for member` syntax was welcomed by everyone. However, there was some confusion (again, see my past feedback to the mailing list) about what the QT4 group considers to be “members of an array”, and about value records.

In particular, the “value record” representation of arrays led to questions that I didn’t have a good answer for. People didn’t understand why an array member was returned as a map, and why that map is (again) called “array member” or “value record” – a term no one associated with arrays (at least for now, which is not really surprising, as it has just been introduced).
Next, due to atomization (as mentioned before), `array:split` allows us to omit the explicit `?value` lookups that are required for `array:members`:

I suppose I have been biased in my presentation, but I’ve failed to give good arguments to justify the current solution in the spec. The questions that I think need to be answered are:

`array:members` and `array:of-members` instead of using the existing `array:join` function, combined with the new `array:split` function?

Out of interest, I have rewritten the formal equivalencies for the array functions with `array:split`/`array:join`:

| Function or expression | With `array:members`/`array:of-members` | With `array:split`/`array:join` |
| --- | --- | --- |
| `array:append` | `array:of-members((array:members($array), map{'value':$member}))` | `array:join((array:split($array), array { $member }))` |
| `array:build` | `array:of-members($input ! map { 'value': $action(.) })` | `array:join($input ! array { $action(.) })` |
| `array:filter` | `array:of-members(array:members($array) => filter(function($m) { $predicate($m?value) }))` | `array:join(array:split($array) => filter(function($m) { $predicate($m?*) }))` |
| `array:for-each` | `array:of-members(array:members($array) ! map { 'value': $action(?value) })` | `array:join(array:split($array) ! array { $action(?*) })` |
| `array:for-each-pair` | | |
| `array:insert-before` | `array:of-members(array:members($array) => insert-before($position, map{'value':$member}))` | `array:join(array:split($array) => insert-before($position, array { $member }))` |
| `array:remove` | `array:of-members(array:members($array) => remove($positions))` | `array:join(array:split($array) => remove($positions))` |
| `array:reverse` | `array:of-members(array:members($array) => reverse())` | `array:join(array:split($array) => reverse())` |
| `array:slice` | `array:of-members(array:members($array) => slice($start, $end, $step))` | `array:join(array:split($array) => slice($start, $end, $step))` |
| `array:sort` | `array:of-members(array:members($array) => sort($collation, function($x) { $key($x?value) }))` | `array:join(array:split($array) => sort($collation, function($x) { $key($x?*) }))` |
| `array:subarray` | `array:of-members(array:members($array) => subsequence($start, $length))` | `array:join(array:split($array) => subsequence($start, $length))` |
| `array { $sequence }` | `array:of-members($sequence ! map { 'value': . })` | `array:join($sequence ! array { . })` |
| `[E1, E2, E3, ..., En]` | `array:of-members((map { 'value': E1 }, map { 'value': E2 }, map { 'value': E3 }, ... map { 'value': En }))` | `array:join((array { E1 }, array { E2 }, array { E3 }, ... array { En }))` |
| `$array?*` | `array:members($array) ! ?value` | `array:split($array) ! ?*` |
| `$array?$N` / `$array($N)` | `array:members($array)[$N]?value` | `array:split($array)[$N]?*` (or `array:get($array, $N)`) |

As a side note, I noticed that the equivalence given for `array:join` must be buggy:

In conclusion, if I could choose, I would tend to drop `array:members` and `array:of-members` and rename `array:split` to `array:members`.
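
A few of the `array:split`/`array:join` equivalences can be checked in XQuery 3.1 by emulating `array:split` with a user-defined function. The sketch below assumes that `array:split` returns one singleton array per member; `local:split` is a hypothetical stand-in, not the proposed function itself.

```xquery
xquery version "3.1";

(: Hypothetical stand-in for array:split: one singleton array per member. :)
declare function local:split($array as array(*)) as array(*)* {
  for $i in 1 to array:size($array)
  return [ $array($i) ]
};

let $array := [ (1, 2), (), "a", 3 ]
return (
  (: array:reverse :)
  deep-equal(array:reverse($array),
             array:join(reverse(local:split($array)))),
  (: array:append; the new member is wrapped with a square constructor :)
  deep-equal(array:append($array, ("x", "y")),
             array:join((local:split($array), [ ("x", "y") ]))),
  (: array:remove :)
  deep-equal(array:remove($array, 2),
             array:join(remove(local:split($array), 2))),
  (: $array?* :)
  deep-equal($array?*, local:split($array) ! ?*)
)
```

Each comparison should yield true. The square constructor `[ ... ]` is used for wrapping because the curly constructor `array { ... }` creates one member per item, so a member that is a multi-item sequence would be split up.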