[sc34wg3] TMCL issue: regex?

Patrick Durusau patrick at durusau.net
Sat Nov 7 10:39:56 EST 2009


Lars,

Lars Marius Garshol wrote:
>
> * Patrick Durusau
>>
>> Well, I made the connection because we just went through this in ODF 
>> and realized you were mis-quoting the XML Schema Appendix, which 
>> reads: [...]
>
> Hmmmm. Actually, the draft isn't quoting XML Schema at all. The draft 
> says what it means for a string (s) to match a regular expression (r).
>
> It needs to say this because throughout the draft there are statements 
> like this one:
>
>> For each instance i of t the set of its subject identifiers which 
>> match r is referred to as S.
>
>
> To understand what this means, you need to see the definition in 3.8, 
> which says that a subject identifier (s) matches (r) if (s) "is a 
> member of the set of strings L(r) denoted by r as defined in [XML 
> Schema-2]." In other words, XML Schema is what defines the matching.
>
>> "A ·regular expression· 
>> <http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#dt-regex> /R/ is 
>> a sequence of characters that denote a *set of strings*  /L(R)/."
>
> Yep. So L(R) is the set of all strings which match (R). Therefore, if 
> our string (s) is a member of L(R) it matches (R). So the text works 
> just fine.
>
>> Defining what it means to match a regular expression isn't the same 
>> thing as defining a regular expression language.
>
> True, but XML Schema does both.
>
True, but irrelevant. We aren't working on XML Schema. We are supposed to 
be defining TMCL. So far you have told me which strings must match, and 
you use XML Schema-2 to define that requirement. You have not said how I 
specify those strings, other than by implication.
>> If TMCL processor must/should support XML Schema Part 2, Appendix F, 
>> then let's simply say that.
>
> Clause 11 says you are conformant if you can "validate topic maps 
> against valid TMCL schemas as defined in Clause 4", and there's no way 
> to do that without supporting XML Schema Part 2, Appendix F, because 
> that's where the regexp syntax we use is defined, and clause 3.8 says 
> that it's defined there.
>
Sorry, that is *not* what 3.8 says. In full:

"A string /s/ matches a regular expression /r/ if the string is a member 
of the set of strings |L(r)| denoted by /r/ as defined in /[XML Schema-2 
<http://www.isotopicmaps.org/tmcl/tmcl.html#XSD2>]/. "

There are any number of regular expression syntaxes that I could use to 
denote an equivalent "set of strings L(r) denoted by r...."

In other words you are confusing (I think) the set of strings with the 
means for specifying a set of strings. Not the same thing. And all 3.8 
talks about is matching a set of strings as specified by XML Schema-2. 
It does not say how those strings are specified in TMCL.
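
To make the distinction concrete, here is a minimal sketch in Python. It 
is only an illustration under stated assumptions: Python's "re" syntax 
and shell-style globs (the "fnmatch" module) stand in for two arbitrary 
pattern syntaxes, neither of them is the XML Schema-2 syntax, and the 
example.org identifiers are made up. Two differently written patterns 
pick out the same set of strings, while one and the same pattern text 
picks out different sets depending on which syntax it is read in.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical patterns: both are meant to denote "every string that
# starts with http://example.org/", written in two different syntaxes.
GLOB_PATTERN = "http://example.org/*"      # shell-style glob syntax
REGEX_PATTERN = r"http://example\.org/.*"  # Python "re" syntax


def matches_glob(s: str, pattern: str = GLOB_PATTERN) -> bool:
    """True if s is in the set of strings denoted by the glob pattern."""
    return fnmatchcase(s, pattern)


def matches_regex(s: str, pattern: str = REGEX_PATTERN) -> bool:
    """True if the whole of s is in the set denoted by the regex."""
    return re.fullmatch(pattern, s, re.DOTALL) is not None


# Made-up subject identifiers used only to compare the two notations.
SAMPLES = [
    "http://example.org/",
    "http://example.org/topic/1",
    "http://example.com/topic/1",
    "http://exampleXorg/topic/1",
    "ftp://example.org/topic/1",
]


def same_set_on(samples) -> bool:
    """True if both notations accept exactly the same sample strings."""
    return all(matches_glob(s) == matches_regex(s) for s in samples)


# Same pattern text, two syntaxes, two different sets of strings:
# "." is a literal dot in a glob but "any character" in a regex.
AMBIGUOUS = "a.c"
abc_as_glob = matches_glob("abc", AMBIGUOUS)
abc_as_regex = matches_regex("abc", AMBIGUOUS)

if __name__ == "__main__":
    print("same set on samples:", same_set_on(SAMPLES))
    print("'abc' matches 'a.c' read as glob: ", abc_as_glob)
    print("'abc' matches 'a.c' read as regex:", abc_as_regex)
```

Agreement on a handful of samples is of course not a proof that the two 
sets are equal; the point is only that knowing the set does not tell you 
which notation was used to write it down.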

It is your *assumption* that implementers will use the syntax specified 
by XML Schema-2. No such requirement exists in the current draft.
> We could say it again in clause 11, but personally I feel that would 
> be redundant.
>
> If this is the key point (whether to say in clause 11 that 
> implementations must support that regexp language) I can add an issue.
>
No, the issue is whether TMCL is going to specify a syntax for regular 
expressions. It does not at this point.

Hope you are having a great weekend!

Patrick

-- 
Patrick Durusau
patrick at durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)


