[sc34wg3] CTM: IRI pattern contest

Lars Heuer heuer at semagia.com
Fri Nov 7 08:12:33 EST 2008

Hi again,

> Anyway, we need a good pattern for it, so any suggestion would be
> helpful. Currently I use the following pattern (assuming that we
> restrict the IRIs further as I've proposed):
>    schema-name ::= [a-zA-Z]+[a-zA-Z0-9\+\-\.]*
>    autodetectable-iri ::= schema-name '://'
> ((;|\.|\(|\)|,|:)*[^\s;\.\(\)\,:])+')

Should be:

    schema-name ::= [a-zA-Z]+[a-zA-Z0-9\+\-\.]*
    autodetectable-iri ::= schema-name '://' ([;\.\(\),:]*[^\s;\.\(\),:])+

Python pattern:
    pattern = re.compile(r'[a-zA-Z]+[a-zA-Z0-9\+\-\.]*://([;\.\(\),:]*[^\s;\.\(\),:])+')
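
A minimal sketch of how this pattern behaves on running text, assuming the goal is to leave trailing punctuation such as ";", "." or "," out of the detected IRI. The helper name find_iris and the sample strings are made up for illustration and are not part of the proposal:

```python
import re

# The Python pattern from above: scheme name, '://', then one or more
# groups of (optional punctuation followed by one non-punctuation,
# non-whitespace character).
pattern = re.compile(
    r'[a-zA-Z]+[a-zA-Z0-9\+\-\.]*://([;\.\(\),:]*[^\s;\.\(\),:])+'
)


def find_iris(text):
    """Return the full text of every match (hypothetical helper)."""
    # finditer + group(0) is used because findall would return only
    # the last repetition of the capturing group.
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    print(find_iris("See http://www.example.org/foo.html, and more."))
    # ['http://www.example.org/foo.html']
    print(find_iris("http://psi.example.org/topic; next"))
    # ['http://psi.example.org/topic']
    print(find_iris("http://example.org:8080/x."))
    # ['http://example.org:8080/x']
```

Punctuation inside the IRI (the dots in the host name, the colon before the port) is kept because it is followed by another IRI character, while punctuation at the very end is dropped.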

The other solution would work as well, but escaping the "," was
unnecessary, and I converted the 'or' pattern into a character set.
I'm not sure which is more readable. I think I'd prefer the character
set, since I already use one at the end of the pattern.
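
A quick way to check that the two spellings accept the same text is to run both over a few inputs. This is only a sketch: the 'or' form is taken from the quoted pattern with the stray trailing quote and parenthesis left out, and the names ALTERNATION, CHAR_SET, matched_text, same_matches and the sample strings are hypothetical:

```python
import re

SCHEME = r'[a-zA-Z]+[a-zA-Z0-9\+\-\.]*'

# 'or' form from the quoted pattern (assumed reading, see note above).
ALTERNATION = re.compile(SCHEME + r'://((;|\.|\(|\)|,|:)*[^\s;\.\(\)\,:])+')
# Character-set form from the corrected pattern.
CHAR_SET = re.compile(SCHEME + r'://([;\.\(\),:]*[^\s;\.\(\),:])+')

# Made-up sample inputs, not taken from any specification.
SAMPLES = [
    "See http://www.example.org/foo.html, and more.",
    "http://psi.example.org/topic; next",
    "x http://example.org/a(b).",
    "no iri here",
]


def matched_text(pattern, text):
    """Return the first full match in text, or None if there is none."""
    m = pattern.search(text)
    return m.group(0) if m else None


def same_matches(samples):
    """True if both patterns find the same first match in every sample."""
    return all(
        matched_text(ALTERNATION, s) == matched_text(CHAR_SET, s)
        for s in samples
    )


if __name__ == "__main__":
    print(same_matches(SAMPLES))
```

Agreement on a handful of samples is not a proof of equivalence, but each alternative in the 'or' group is a single character that also appears in the character set, so the two forms should describe the same strings.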

Best regards,

