Familypedia
Forums: Index > Help desk > SMW





Archiving the first month of discussion; see /Archive 2009-05.
See /Archive 2009-07/ for most of the material that preceded the upgrading of most of the articles that used info pages.

This page is intended to be the main question and discussion forum for Familypedia:Semantic MediaWiki (which has shortcut link: "SMW"). It is starting with copies of extracts from existing discussions. Please feel free to start new sections about specific pages on their talk pages, but please come here for at least a link back if the discussion broadens at all or if it might interest other contributors.

It is preferable not to have user-talk pages containing more than the briefest of question-and-answer dialog(ue) on SMW. If your question could interest two or more contributors, please put it here (with heading and edit summary both meaningful) then use user talk pages merely to tell your respondent(s) that the question is here.

Main information pages (list to grow as their number does):


Categories

After SMW and the forms are implemented into people articles, what categories are we getting rid of and what categories are we keeping? -AMK152(talkcontribs) 21:44, 7 July 2009 (UTC)

That's a community choice. A property can generally do everything a category can do, plus a bit more, though there are some limitations. Properties are generally nicer than categories because they are expressed as a flat list, so contributors don't have to go poking around in subcats and sub-subcats to find all the hits. However, while the query engine understands superproperties (that is, you can search on death location because death county is a subproperty of death location), that understanding can't be used for everything you might expect. If you tell a form to autocomplete using a property, it doesn't use the subproperties (as of the time of this writing), just the values encoded using that specific property. For this reason, if we have a placename hierarchy tree of valid place names and their relationships to each other (this would be derived from Wikipedia for all languages), it would have to be a category if you wanted to use it for the autocomplete feature in Semantic Forms. However, if it turns out that we don't need to force people to know the classification of placenames, then this would be a flat namespace and we could just as easily use a property. Sorry for the slightly off-topic remarks, but I know you were interested in various containment hierarchies earlier, and although I was going in that direction, I am now thinking we may want to make it much simpler for the user. So long as they use disambiguated Wikipedia names, we can derive the containment hierarchy, and we don't need to force them to know obscure stuff like the fact that Moscow, as a "Federal city", is actually a subdivision1.
The short answer is that we are probably going to turn all the Married in/ Born in cats that used to be generated by {{info categories}} into properties and "concepts". ~ Phlox 01:05, 8 July 2009 (UTC)
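The difference between querying and autocompleting described above can be illustrated with a toy model. This is only a Python sketch of the behaviour as reported in this thread, not SMW or Semantic Forms code; the page names, the "death town" property and the function names are all made up for illustration.

```python
# Hypothetical property hierarchy: child property -> parent property.
SUBPROPERTY_OF = {
    "death county": "death location",
    "death town": "death location",
}

# Hypothetical annotations: (page, property, value).
FACTS = [
    ("Page A", "death county", "Kent"),
    ("Page B", "death location", "Kent"),
    ("Page C", "death town", "Dover"),
]


def covers(prop, wanted):
    """Return True if prop is wanted or a (transitive) subproperty of it."""
    while prop is not None:
        if prop == wanted:
            return True
        prop = SUBPROPERTY_OF.get(prop)
    return False


def query(wanted, value):
    """Query-engine style lookup: subproperties count as matches."""
    return [page for page, prop, val in FACTS
            if covers(prop, wanted) and val == value]


def autocomplete_values(wanted):
    """Autocomplete style lookup: only values set with that exact property."""
    return sorted({val for _, prop, val in FACTS if prop == wanted})


print(query("death location", "Kent"))        # ['Page A', 'Page B']
print(autocomplete_values("death location"))  # ['Kent'] ('Dover' is missed)
```

In this sketch the query finds Page A through the subproperty, while the autocomplete list never sees "Dover" because it was only recorded under a subproperty.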
Ditto for surnames. No need to categorize them anymore. Variations of names and noble houses can be concepts. I would argue that we should also create a property with the surname of the partner. rtol 07:07, 8 July 2009 (UTC)
I'd say we make a decision soon to speed up the servers. rtol 10:42, 9 July 2009 (UTC)
By the way, there are substantial Wikipedia categories that we may like to keep as categories in order to avoid constant translation back and forth between our content and theirs. I can see this might be the case with certain geographic categories that are necessary for Familypedia's functioning, but do not represent our core content. ~ Phlox 17:33, 9 July 2009 (UTC)

"Concept" (nee Category) Navigation

In place of the jumble of categories at the bottom of a person page, we may want to have a collapsible navbox that organizes Concepts thematically. For example, the births box in the following collapsible navbox would have a table like the one below, with each bullet point linking to a concept page with the stated contents.

Births crossreferences
By date
  • 19th century
    • 1870s
      • 1876
Jones births by date
  • 19th cent. Jones
    • 1870s Jones
      • 1876 Jones
Births in NSW by date
  • 19th century births in New South Wales
    • 1870s births in New South Wales
      • 1876 births in New South Wales
Jones births by location
  • Jones births in Australia
    • Jones births in New South Wales
      • Jones births in Sydney

• This navbox would also contain a default sort statement, since, as you will notice, Concept pages otherwise sort by the title of the person articles (first name order).
• Concept pages hold 200 pages at a time, as with categories.
• The concept statement text can be hidden with a span display none statement. For an example, see Concept:Born in New South Wales.

~ Phlox 02:40, 15 July 2009 (UTC)

I agree. I would show only three columns (by date, by location, by surname) rather than six or seven (by date, by location, by surname, by date and location, by date and surname, by location and surname, by date, location and surname). rtol 04:45, 15 July 2009 (UTC)
That's fine since we have 10K-odd persons total. I am projecting out a few more chess moves. If World Connect has a couple million names, and it is not an exaggeration to expect we will be doing the same, then it's not hard to see that our visitors will have a hard time keeping up with the tens of thousands of Joneses without the ability to narrow the search sets down considerably. Tables like this are one UI way of narrowing. Perhaps we will do a sidebar selector. That's a pretty familiar method, e.g. Google, where you just check off some boxes. IMHO we shouldn't trouble our visitors with the fact that they are querying a database. Wikipedia users certainly aren't interested that this is what they are doing on every page view. Not sure why we should expect our visitors to be weighed down with jargon that has no value to them. ~ Phlox 08:52, 16 July 2009 (UTC)
I stand corrected. Three basic columns with four hideable ones would be a good solution. rtol 09:08, 16 July 2009 (UTC)

Categories are not multilingual. Concepts are

Note also that the restriction that all categories be in English is now bypassed by Concepts. Concepts can be in any language and retrieve the exact same results as the English category or concept. For example, Concept:Mort en Austrasie (.fr) and Concept:Died in Austrasia. ~ Phlox 08:52, 16 July 2009 (UTC)

Info Categories slowdowns

Rtol has been working hard on a feature implemented using SMW that allows Familypedia to display things like very important ancestors automatically in all articles on descendants. There has been objection that these computations make the general article on individuals too slow. The discussion of this issue is on the Info categories talk page. For details on how that issue is being resolved, see that page; contributors are invited to contribute their opinions there.

Of general interest to SMW users is one possible generalized solution to these sorts of computationally intensive projects. Familypedia wants to welcome these features, while not slowing down the browsing experience of visitors to the site. One way we could do this is to ask such projects to create a subpage to the main article where the calculations can then be made. Genealogical relationships present problems, most of which take exponentially growing time to calculate. What does this mean in layman's terms? It means that calculating descendants costs two times as much for the next generation as for the first, then four times (2^2) for the generation after that, then eight (2^3), 2^4, 2^5 and so on. Get out to 16 generations and the calculation is potentially around 65,000 times slower than the first-generation calculation. Although most everyone's tree of known descendants is typically more sparse, and there are nowhere near this number of operations, these calculations tend to get progressively slower for each added generation. This applies to other kinds of family history relationships, such as Six degrees of separation / "human web" type features. For example, "My great great great grandfather was best friends with your great great grandfather". Anyway, if such calculations are performed at times other than when the viewer is looking at the main article, this calculation penalty is substantially reduced. The researcher may then add time-consuming calculations without being constrained by concerns of community objections about article rendering speed. Only viewers of that particular project page would experience the slowdown. ~ Phlox 16:44, 10 July 2009 (UTC)
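The doubling described above can be checked with a few lines of arithmetic. This is a minimal Python sketch that assumes the worst case in the post (every person contributes exactly two entries to the next generation); the function name and the `branching` parameter are made up for illustration, and real trees are usually much sparser.

```python
def worst_case_generation_sizes(generations, branching=2):
    """Worst-case number of people in each generation, assuming every
    person contributes `branching` entries to the next generation."""
    return [branching ** g for g in range(1, generations + 1)]


sizes = worst_case_generation_sizes(16)
print(sizes[:3])   # [2, 4, 8]
print(sizes[-1])   # 65536  (the "65,000" figure for generation 16)
print(sum(sizes))  # 131070 (everyone across all 16 generations)
```

The total across all generations is roughly double the size of the last one, so each added generation about doubles the work of everything that came before it.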

{{Show VIA}} is a simple query. This does not slow down anything. rtol 07:51, 12 July 2009 (UTC)

Ancestors and descendants

Note that the lists of ancestors and descendants are becoming too large for some people. This means that the pages of these people behave in peculiar ways. Working on it. rtol 07:27, 20 July 2009 (UTC)

Solved, actually, already a while ago. rtol 19:54, 4 August 2009 (UTC)

Questions

Sorry for the many questions, but I'm getting very confused here. Q: So when we use SMW, we won't need info pages?

A: We won't "need" them, but it is up to the community to keep using them if they wish. My opinion is that there is no justification for their continued existence and that they should be phased out entirely. -Phlox

Q: Will all the information just be stored on the page using the templates that are generated from using form data?

A: Yes, although the templates can be used separately from the forms, and edited directly as one edits infobox parameters. -Phlox

Q: Since it's hard for me to load the form page on my computer, will I be able to edit just using the templates without messing up the page?

A: Yep, that is what Thurstan has been doing. I'm not sure he even knows what the forms look like. -Phlox

Q: I read somewhere about the possibility of a search feature once the SMW system gets going. Will this allow us to eliminate some of the categories? If so, which ones?

A: We will be able to eliminate most of the categories we use for narrowing down subjects, e.g. Born in London, or Births / Deaths / Marriages / <you name the event> in the 1910s. This uses an SMW "Concept", which at this point is not much more than a fancy term for "persistent query". Technically, a Category is itself a persistent query. Concepts expose the ability to make aggregate categories from constituent parts. For example, Concept:Born in the 1750s creates a Category-like list. The major visible difference is that it is a flat list: there are no subcategories. This particular concept is formed by querying a single property, but a concept listing may also use categories in combination with properties (docs on queries here). The advantage of this is that people don't have to write a template that is used in every single article to create a new "category". Maybe some bloke is interested in ancestors born in Greene County, Ohio in the 1860s. With a single-line expression, he would have this on a concept page. I am of the opinion that the Navboxes I created for places will be upgraded to link to these sorts of concept pages. They could also be used to link to direct queries. An example of direct queries instead of concepts may be found in the navbox for The Hague article. Click on Births or Deaths. I am not happy with this design for a couple of reasons, but it is another option for an alternative to how we were formerly using info categories, and to the combinatorial problem we were sort of hoping would go away.
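To make the "persistent query" idea above concrete, here is a toy model in Python. It is only a sketch of the idea, not how SMW stores or evaluates concepts; the sample articles, the property names (`birth_decade`, `birth_county`) and `make_concept` are all made up for illustration.

```python
# Hypothetical sample articles, each a set of property -> value pairs.
PEOPLE = [
    {"title": "Ann Jones (1861-1930)", "birth_decade": "1860s",
     "birth_county": "Greene County, Ohio"},
    {"title": "Ben Jones (1755-1820)", "birth_decade": "1750s",
     "birth_county": "Greene County, Ohio"},
    {"title": "Cal Smith (1864-1901)", "birth_decade": "1860s",
     "birth_county": "Greene County, Ohio"},
    {"title": "Dee Smith (1868-1940)", "birth_decade": "1860s",
     "birth_county": "Clark County, Ohio"},
]


def make_concept(**conditions):
    """Store a set of property conditions and return a function that
    lists, as one flat sorted list, the pages matching all of them."""
    def members(pages):
        return sorted(
            page["title"] for page in pages
            if all(page.get(prop) == value
                   for prop, value in conditions.items())
        )
    return members


# One line defines the "concept"; no per-article template is needed.
born_1860s_greene = make_concept(birth_decade="1860s",
                                 birth_county="Greene County, Ohio")

print(born_1860s_greene(PEOPLE))
# ['Ann Jones (1861-1930)', 'Cal Smith (1864-1901)']
```

Because the stored conditions are re-evaluated against whatever pages exist at the time, new articles show up in the list without anyone editing the concept or the articles.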

Q: Once the SMW system for creating people pages is ready to be used on a large scale, will the quality of the resulting articles be as good as it was with the info page system? -AMK152(talkcontribs) 02:39, 3 August 2009 (UTC)

A: Much better. But it isn't at all ready yet, and it is fair for folks to be skeptical about how soon it will be ready and/or how desirable it will be when the real (as opposed to promised) features can be examined. At the risk of belaboring a point I have repeated often, I must emphasize that folks should not hold their breath thinking that the SMW replacement of info pages is just around the corner and that there is no reason to create info pages. Info pages work now, and articles using them will convert directly over to SMW articles without much effort, so everyone should just behave as if SMW didn't exist. Folks are welcome to use the provisional SMW code that has been created, with the understanding that it is in a pre-alpha state, subject to wild changes in configuration that could break articles. SMW forms and articles may appear to be polished, but in fact the underlying design is undergoing substantial changes that will affect how useful they are as Familypedia grows to significant dimensions.
There are a number of limitations of the Info page system that are blown away by SMW pages, not the least of which is performance cost. I put many optimizations in when I built Info Pages, but they are inherently slow because they are entirely implemented in templates. From an engineering perspective, the key is that since indirection is extremely cheap, the expressive power is significantly greater. That is the core reason that SMW pages utterly leave info pages in the dust. Does this make categories obsolete? No. We need them for compatibility with imported WP articles, and there are some esoteric techniques that require the use of categories. For example, a category like Category:Valid name- locality will be needed for autocompletion in SMW forms. This is because autocompletion can use values from subcats if a category is used.
As usual, I have probably given too much information. Just try creating some Concepts, and I think it will become clear why the way we were doing info categories is no longer necessary. The docs are here. ~ Phlox 17:47, 3 August 2009 (UTC)
I think Phlox is too modest. SMW works better, is more finished and is more stable than one would get from the answers above. It is also downward compatible with info pages (but not with older formats). rtol 15:37, 4 August 2009 (UTC)
Right. The limitations I was speaking about are not those of the SMW extensions, but of the so called "facts pages" family of templates I have been working on since May as a comprehensive replacement to the "info page" family of templates. Sometimes the two get associated but folks are free to use SMW features as they wish separate from the way facts pages use the features. Rtol's ancestor and descendants templates in fact do just that. ~ Phlox 17:51, 4 August 2009 (UTC)
Exactly. While we may change the source of the information about "mother", and may introduce aliases like "Mutter" and "mere", Property:Mother is stable and {{getfact|mother}} is a lot quicker than {{get|key=mother}}. Similarly, Property:Surname is a lot more versatile than Category: (surname). rtol 19:49, 4 August 2009 (UTC)