[sc34wg3] Merging/Viewing subject proxies

Patrick Durusau sc34wg3@isotopicmaps.org
Tue, 26 Jul 2005 14:14:52 -0400


Jan,

Jan Algermissen wrote:

>
> On Jul 26, 2005, at 6:03 PM, Patrick Durusau wrote:
>
>> Jan,
>>
>> Does seeing it done count as proof of doable?
>
>
> If Versavant (haven't looked at it yet) works for the arbitrary case
> and for realistic data set sizes (thousands, or better millions, of
> proxies with dozens of properties), then yes, that would be
> sufficient, IMHO.
>
> The question remains though, what 'seeing it done' really means :o)
>
Versavant as it exists doesn't have great storage so I expect it would 
die rather quickly with millions of subject proxies. But, that is not 
really a test of the paradigm but simply a known weakness in the 
application.

As I said in my post to Sam on metrics and scalability, it really is a 
question of what you want to do. What disclosures do you want to support?

I can quite easily imagine rather limited disclosures that would be 
appropriate for MARC records, for example: optimized disclosures and 
applications that work only in that environment and simply have no 
requirement to allow any other disclosures.

So, would a limited disclosure (they all are in some sense) and a 
highly optimized application for MARC records prove feasibility?

Depends on what you are looking for. If you are looking for an 
application that accepts unbounded disclosure statements and millions of 
proxies with dozens of properties, probably not. If on the other hand, 
you want the potential that the TMRM offers for your particular 
disclosure, then I would say so.

I have a data set on hand right now that has approximately 700,000 
bibliographic records that I was toying with prior to the last TMRM 
drafting cycle. The number of subject proxies will depend on the view 
taken by the disclosure statement.

Assume I am a university tenure committee, so all I am really 
interested in is the person, the role (author/editor/reviewer), the date 
of publication and the journal. The title, the description of content 
and the subject treated by the publication are irrelevant to the task at 
hand. ;-) The resulting view would be far different from the view wanted 
by a researcher who is searching for literature on a particular subject. 
In fact, that researcher might want to be able to add disclosures that 
are optimized for other data sources.
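
To make the tenure committee versus researcher contrast concrete, here 
is a rough sketch in Python. It is only an illustration under loud 
assumptions: the records, the field names and the view() helper are all 
made up for this note, and a "disclosure" is reduced to nothing more 
than the list of properties it cares about. Nothing here comes from the 
TMRM draft or from Versavant.

```python
# Hypothetical bibliographic records; field names are illustrative only
# and are not MARC tags.
RECORDS = [
    {"person": "A. Smith", "role": "author", "year": 2003,
     "journal": "J. Doc.", "title": "Topic maps in libraries",
     "subject": "topic maps"},
    {"person": "A. Smith", "role": "reviewer", "year": 2004,
     "journal": "J. Doc.", "title": "Merging rules",
     "subject": "topic maps"},
    {"person": "B. Jones", "role": "author", "year": 2004,
     "journal": "Lib. Q.", "title": "Cataloguing rabbits",
     "subject": "rabbits"},
]


def view(records, properties):
    """Return the set of subject proxies that a disclosure yields.

    Assumption of this sketch: a disclosure is just the tuple of
    properties it pays attention to. Records that agree on all of those
    properties collapse into a single proxy.
    """
    return {frozenset((p, r[p]) for p in properties) for r in records}


# The tenure committee ignores title and subject.
tenure = view(RECORDS, ("person", "role", "year", "journal"))
# The researcher cares only about the subject treated.
research = view(RECORDS, ("subject",))

print(len(tenure), "proxies in the tenure view")
print(len(research), "proxies in the research view")
```

The same three records give a different number of proxies under each 
disclosure, which is all the sketch is meant to show.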

The point I am trying to make is that there is no one "right" disclosure 
and therefore no one "right" application. It depends on what you want 
to do.

For what it's worth, I think Versavant demonstrates the soundness of the 
paradigm. How difficult some disclosures will be to implement, and how 
well they will scale, are open questions.

Hope you are having a great day!

Patrick

> Jan
>
>
>
>>
>> If so, see www.versavant.com.
>>
>> Patrick
>>
>> Jan Algermissen wrote:
>>
>>
>>> Patrick,
>>>
>>> On Jul 26, 2005, at 4:22 PM, Patrick Durusau wrote:
>>>
>>>
>>>> hhh = { < name = "rabbit, coney" >,
>>>>         < webresource = "www.rabbitnetwork.net, en.wikipedia.org/wiki/Rabbit" >,
>>>>         < classification = "Oryctolagus cuniculus" > }
>>>>
>>>> Of course I am presuming that the disclosure for "name" allows the
>>>> creation of a list of names and provides that if any of the "names"
>>>> in the list match, further viewing with other subject proxies that
>>>> have either "rabbit" or "coney" for the name property will occur.
>>>>
>>>>
>>>
>>> Having spent about a year on implementing what happens when proxies
>>> merge and how the merged values demand further merges etc., and having
>>> especially tried to trim the algorithm for this stuff down to O(log N),
>>> I must say that the datatype magic you describe (here converting
>>> scalar to set as needed) is unlikely to be doable. The consequence,
>>> IMHO, is that most value types should come as sets in the first place
>>> (e.g. 'names' as opposed to 'name' in the example).
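
A rough Python sketch of the "sets in the first place" idea may help 
here. Everything in it is an assumption made for illustration: the 
merge_proxies() helper, the single rule "merge when the 'names' sets 
overlap", and the sample proxies (built from the rabbit example plus an 
invented "hare" proxy). It is a simple quadratic pass, not the 
O(log N) algorithm mentioned above, and it does not address proxies 
that appear as (parts of) values.

```python
def merge_proxies(proxies):
    """Merge proxies whose 'names' sets overlap.

    Each proxy is a dict mapping a property name to a set of values.
    The result list is kept pairwise disjoint on 'names', so absorbing
    one proxy can never create a new overlap with a proxy that was
    already checked; one pass per incoming proxy therefore also covers
    merges triggered by earlier merges.
    """
    merged = []
    for proxy in proxies:
        current = {key: set(values) for key, values in proxy.items()}
        rest = []
        for other in merged:
            if current["names"] & other["names"]:
                # Absorb every property of the overlapping proxy.
                for key, values in other.items():
                    current.setdefault(key, set()).update(values)
            else:
                rest.append(other)
        merged = rest + [current]
    return merged


# Hypothetical proxies: the first two share no name, the third bridges them.
PROXIES = [
    {"names": {"rabbit"}, "webresource": {"www.rabbitnetwork.net"}},
    {"names": {"coney"}, "classification": {"Oryctolagus cuniculus"}},
    {"names": {"rabbit", "coney"},
     "webresource": {"en.wikipedia.org/wiki/Rabbit"}},
    {"names": {"hare"}},
]

result = merge_proxies(PROXIES)
for proxy in result:
    print(sorted(proxy["names"]))
```

Because every value is already a set, merging is a plain set union and 
no scalar-to-set conversion is needed at merge time.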
>>>
>>> All this becomes really, really nasty when it comes to proxies being
>>> (parts of) values...
>>>
>>> This is not to say that the RM is not brilliant... I just think there
>>> is serious stuff in there that would need to be made explicit and
>>> proven as doable. (There might well be problems lurking in there that
>>> are not computable at all in finite time, dunno.)
>>>
>>> Jan
>>>
>>> ________________________________________________________________________
>>> Jan Algermissen, Consultant & Programmer
>>> http://jalgermissen.com
>>> Tugboat Consulting, 'Applying Web technology to enterprise IT'
>>> http://www.tugboat.de
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> sc34wg3 mailing list
>>> sc34wg3@isotopicmaps.org
>>> http://www.isotopicmaps.org/mailman/listinfo/sc34wg3
>>>
>>>
>>>
>>
>> -- 
>> Patrick Durusau
>> Patrick@Durusau.net
>> Chair, V1 - Text Processing: Office and Publishing Systems Interface
>> Co-Editor, ISO 13250, Topic Maps -- Reference Model
>> Member, Text Encoding Initiative Board of Directors, 2003-2005
>>
>> Topic Maps: Human, not artificial, intelligence at work!
>>
>

-- 
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work!