[sc34wg3] TMCL issue: regex?

Patrick Durusau patrick at durusau.net
Sat Nov 7 10:39:56 EST 2009


Lars Marius Garshol wrote:
> * Patrick Durusau
>> Well, I made the connection because we just went through this in ODF 
>> and realized you were mis-quoting the XML Schema Appendix, which 
>> reads: [...]
> Hmmmm. Actually, the draft isn't quoting XML Schema at all. The draft 
> says what it means for a string (s) to match a regular expression (r).
> It needs to say this because throughout the draft there are statements 
> like this one:
>> For each instance i of t the set of its subject identifiers which 
>> match r is referred to as S.
> To understand what this means, you need to see the definition in 3.8, 
> which says that a subject identifier (s) matches (r) if (s) "is a 
> member of the set of strings L(r) denoted by r as defined in [XML 
> Schema-2]." In other words, XML Schema is what defines the matching.
>> "A ·regular expression· 
>> <http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#dt-regex> /R/ is 
>> a sequence of characters that denote a *set of strings*  /L(R)/."
> Yep. So L(R) is the set of all strings which match (R). Therefore, if 
> our string (s) is a member of L(R) it matches (R). So the text works 
> just fine.
>> Defining what it means to match a regular expression isn't the same 
>> thing as defining a regular expression language.
> True, but XML Schema does both.
True but irrelevant. We aren't working on XML Schema. We are supposed to 
be defining TMCL. So far you have told me what strings must match and 
used XML Schema-2 to define that requirement. You have not said how I 
specify those strings, other than by implication.
>> If TMCL processor must/should support XML Schema Part 2, Appendix F, 
>> then let's simply say that.
> Clause 11 says you are conformant if you can "validate topic maps 
> against valid TMCL schemas as defined in Clause 4", and there's no way 
> to do that without supporting XML Schema Part 2, Appendix F, because 
> that's where the regexp syntax we use is defined, and clause 3.8 says 
> that it's defined there.
Sorry, that is *not* what 3.8 says. In full:

"A string /s/ matches a regular expression /r/ if the string is a member 
of the set of strings |L(r)| denoted by /r/ as defined in /[XML Schema-2 
<http://www.isotopicmaps.org/tmcl/tmcl.html#XSD2>]/. "

There are any number of regular expression syntaxes that I could use to 
denote an equivalent "set of strings L(r) denoted by r...."

In other words, you are confusing (I think) the set of strings with the 
means for specifying a set of strings. They are not the same thing. All 
3.8 talks about is whether a string belongs to a set of strings as 
defined by XML Schema-2. It does not say how those strings are specified 
in TMCL.
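
To illustrate the distinction, here is a minimal sketch in Python. It is 
not taken from the draft or from XML Schema-2. The two notations (a 
Python regular expression and a shell-style glob), the sample strings, 
and the helper names are assumptions chosen only for the example. It 
shows two different syntaxes denoting the same set of strings, which is 
why naming the set does not by itself name the syntax.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical helpers: each decides membership in the set of strings
# denoted by a pattern, but each reads a different pattern syntax.

def matches_regex(s, r):
    # Whole-string match using Python's regular expression syntax.
    return re.fullmatch(r, s) is not None

def matches_glob(s, g):
    # Whole-string, case-sensitive match using shell-style glob syntax.
    return fnmatchcase(s, g)

# Sample strings, chosen only for illustration.
candidates = ["a", "b", "c", "ab", ""]

# Two different notations for the same set of strings.
regex_set = {s for s in candidates if matches_regex(s, "a|b")}
glob_set = {s for s in candidates if matches_glob(s, "[ab]")}

# Both notations select the same members from the sample.
print(regex_set == glob_set)
```

A statement that a string must be a member of that set is satisfied 
under either notation, so a reader cannot infer from it which notation a 
schema author is expected to write.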

It is your *assumption* that implementers will use the syntax specified 
by XML Schema-2. No such requirement exists in the current draft.
> We could say it again in clause 11, but personally I feel that would 
> be redundant.
> If this is the key point (whether to say in clause 11 that 
> implementations must support that regexp language) I can add an issue.
No, the issue is whether TMCL is going to specify a syntax for regular 
expressions. It does not at this point.

Hope you are having a great weekend!


Patrick Durusau
patrick at durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
