Forum:SMW

'''This page is intended to be the main question and discussion forum for Genealogy:Semantic MediaWiki (which has shortcut link: "SMW"). It is starting with copies of extracts from existing discussions. Please feel free to start new sections about specific pages on their talk pages, but please come here for at least a link back if the discussion broadens at all or if it might interest other contributors.

It is preferable not to have user-talk pages containing more than the briefest of question-and-answer dialog(ue) on SMW. If your question could interest two or more contributors, please put it here (with heading and edit summary both meaningful) then use user talk pages merely to tell your respondent(s) that the question is here.'''

Main information pages (list to grow as their number does):
 * Help:Semantic MediaWiki
 * Query (redirect)
 * Unused properties
 * User:Phlox/2009 05 notes
 * Property talk pages
 * Type talk pages
 * Form talk pages
 * Concept talk pages

Requested extensions
I requested that wikia central dudes enable the following extension on genealogy. It is used by several other wikia including psychology:

Semantic Mediawiki.

They may ask you if it is ok. This may allow us to have forms that would help ordinary people access info pages. We will need it enabled to see if this is so or not. It also allows semantic searches, such as searching for a person located in a particular geographic area. Somewhat useful for us, no?

I also requested them to look into supporting extension "link attributes"- this is useful for microformats- allowing us theoretically to interact with other internet applications. Very experimental, but something may come of it. - ~  Ph l o x  01:08, 30 April 2009 (UTC)


 * Sounds like progress. See File:Smw-quick-reference.png. — Robin Patterson (Talk) 04:17, 30 April 2009 (UTC)


 * I played with it for a few hours and I couldn't break it very much.  More here.  Robin, I am very excited.  I was writing about the semantic wiki stuff a while back but you know, this may well be ready for prime time.  I am cautiously optimistic that we can move over to it in the coming months.  If folks agree that we want to make the jump, I'll do the bot runs and help folks convert templates.  There may be a few hiccups but well worth it.  Worst case, we revert everything and we are back to normal.  - ~  Ph l o x   08:56, 30 April 2009 (UTC)

Properties
I see you are creating properties. Very good. I guess these can be used to create couples. For instance, the current list shows people who married their first cousin, but not whom they married. It should really be displayed as Arie Korver x Anna Korver. If Arie has the property "married cousin" this can be done. rtol 10:39, 1 May 2009 (UTC)


 * You would be able to do this dynamically via a query. It would be a single #ask, so for all practical purposes it would look like a property anyway.  I left some breadcrumb links over in the help forum, in the "future of info pages" thread.  There are some example wikis that have used it and some good examples/discussions.


 * In db design the rule is not to redundantly code something that can be derived (easily). To do otherwise introduces database inconsistencies.  With that said, wikis are fundamentally unstructured databases.  Anyway, that is my first reaction to the cousin married thing.  We could also hard code stuff like property lived in New Hampshire.  I rather think that "had residence" whose property was "has location in state" is a more proper way to encode this kind of structure.


 * Everything I am doing now is very makeshift, so don't assume too much about what is happening with it. I haven't formulated strong opinions yet.  I plan to be doing a few approaches and then will release a tool so that everyone can convert over stuff as they please.  I am mostly interested that we do this right, and that our decisions will establish familypedia as the premier semantic genealogy site.  Everyone else is playing the quantity game of piling on gedcoms.  We need to play with different rules if we are to dominate.  Our strength is collaboration and quality- that will in the end crush everyone.  So I am of the opinion that we should move towards having a very friendly version of a hard core evidence based system that operates from the notion that genealogies represent multiple and often conflicting believed realities.  All of our so called facts are really just assertions with varying degrees of certainty.  To add a formal semantic layer over fundamentally contradictory knowledge is just GIGO (garbage in, garbage out).  Anyway, that is my thinking as of today.  - ~  Ph l o x   16:20, 1 May 2009 (UTC)


 * Got it, and look forward to these enhancements. rtol 18:26, 1 May 2009 (UTC)

Property names
I know that it is early days, but I have a suggestion about "Property names": there is an asymmetry in using "Spouse", "Spouse2", "Spouse3" etc which complicates the templates. I would prefer the series to be "Spouse1", "Spouse2", "Spouse3" etc, and so on for the various "multivalued" properties (eg AFN). Thurstan 04:18, 2 May 2009 (UTC)
 * Yeah, I did that ordinal numbering thing in a clumsy way. I am on it. - ~  Ph l o x   05:31, 2 May 2009 (UTC)

How the new thing works
This article employs the SMW extensions for all input: George Spencer Geer (1836). I don't have support for all the fields yet, but this will give you an idea of the kind of UI that is possible. Note the "edit with form" on the toolbar. Click that and you can edit all the fields known to man for that person. At the bottom of the page, I have a collapsed edit box that allows the user to edit just selected items (like if they were only researching baptism records, they'd click baptism in that box). When an article is first created, the user is not bombarded with every choice possible.  Instead I give them a pared down list.  Click on the wife or children redlinks to see an example of that.  At the bottom of the page you will see a Facts list with the equivalent of all the info page fields displayed.  This can be turned off if desired on a per page basis.

As I said, not done yet by any stretch of the imagination, but this gives the general idea of how this mechanism works. It's a little bit of voodoo, but the docs on the semantic mediawiki site will explain what I am up to - ~  Ph l o x  10:16, 9 May 2009 (UTC)

Pioneer George of New York
Page looks impressive, to say the least. I see no toolbar as described, but I saw the thing that opens a can of yummy spaghetti.

I've posted his page to my profile on Facebook, with the following introduction:
 * Semantic MediaWiki is moving Familypedia up a gear! (This page is a "Beta" version but pretty impressive.) Swiss ancestry certainly has its good points.

— Robin Patterson (Talk) 11:01, 9 May 2009 (UTC)
 * By edit bar, I meant the wikia bar that says edit this page, leave a message, history, etc. Before the "Edit this page" is the Edit with forms, so you obviously found it.


 * The thing I found was a show/hide thing about half-way down. I'll try top left next time. — Robin Patterson (Talk) 13:48, 10 May 2009 (UTC)


 * Editing with this long form is probably not the preferred mode for users. Perhaps they will learn to click just on the bit that they want to change, like birth, wedding, children, and what not.


 * Perhaps you did not see, but on each of the events is a place for images. Adding an item here guides the user through an upload, and the resulting file name is stored with the article for the contributor.  Further, the size of the image can be set by default, so they don't have to know anything about how to jam it into an infobox.  In similar ways, we can control the decoration to names of counties, surnames and so on.  I am not going to mess with the minutiae yet.  I just want to press to make sure it can do all the hard stuff, or whether there will emerge some hidden obstacle that blocks us from using SMW.  There is a bunch of stuff I find annoying, but all in all, much more usable for normal contributors, and for sophisticated users, I think they will welcome the departure from the frail and impenetrable obscurity of my Info templates code.


 * - ~  Ph l o x  16:04, 9 May 2009 (UTC)


 * Please don't disparage the info templates code. Great work, which now is well enough documented to get most new users diving in successfully without specific invitation. You will have to persuade them that there's a better way. — Robin Patterson (Talk) 13:48, 10 May 2009 (UTC)


 * If the three wikis you've been trialling or discussing SMW on are not enough, try http://thirdturn.wikia.com/wiki/User_talk:DaNASCAT#blank_pages for a starter into that real-life subject to see if maybe Uberfuzzy can help you get some value from their experience. I listed a couple of other possibles on the Help:SMW page. — Robin Patterson (Talk) 13:48, 10 May 2009 (UTC)


 * Well, as a doting parent of course I am proud of the info pages approach. They are powerful and I am pleased with their success with new users.  SMW encoding may be an unwelcome transition for some and of course it is up to the community whether this radical departure from info pages merits attention.  - ~  Ph l o x   17:33, 10 May 2009 (UTC)

More on George Geer
After just one save, I saw much of what I had just done on the display. — Robin Patterson (Talk) 10:02, 12 May 2009 (UTC)

Autocompletion

 * Note that Image and "had people" fields autocomplete with existing files from those namespaces. Similarly, if you click on the picture of Geer, then edit with form for that image, you will see the county is autocompleted.


 * It's nice, but the way we have been using categories is not perfect for getting these lists of correct values. Anything that is a subcategory of a state or county is added, so what we need are categories whose only members are county articles, or only locality articles, and so on.  It doesn't matter how deeply they are nested, so we could have


 * County articles
 ** County articles- United States
 *** County articles- Texas
 *** Borough articles- Louisiana
 ** County articles- United Kingdom
 *** County articles- Scotland
 * These would be hidden categories added by bot. Hidden because we don't particularly need folks adding crud into them.
 * - ~  Ph l o x  23:56, 12 May 2009 (UTC)

Definition of "locality" etc
A locality is a general term that refers to a settlement, village, town, city, municipality or other local community or governmental unit. (Initial definition on page.)


 * That's artificially narrowing the meaning of "locality". The Nullarbor Plain and the North Magnetic Pole are localities in normal English. Births can occur there. One easy way to restore a fair semblance of correspondence to plain English would be to replace "refers to" by "includes". A better long-term way would be to cut it right down to a simple dictionary definition such as "the position or site of something" (Concise Oxford Dictionary, 10th ed.). — Robin Patterson (Talk) 07:02, 11 May 2009 (UTC)


 * Why "locality" (four syllables) instead of "place" (one)? — Robin Patterson (Talk) 07:02, 11 May 2009 (UTC)
 * A contributor could construe a county, province or landmark like a building as a place. The location hierarchy is
 * street address(multiple, including building names); (I really don't like this name- I want something that covers buildings and landmarks like public squares, but since building names are allowed in these lines by internet convention, I went along with it)
 * locality; (this is what the microformats community uses- and they picked it up from other internet standards to mean anything from a community like Riccarton in Christchurch that is not even a city or distinct settlement, to Shanghai/Moscow/New York, which are megalopolises that have consumed neighboring cities. Maybe multiples should be allowed?  eg: Brooklyn, New York City)
 * county;
 * "state"(province/oblast/department- "sub government unit"); (Isn't Canterbury on the South Island one of these?  It functions like a county, but there is nothing like a province level between it and the national government)
 * country
 * Syllables may not matter any more. Nobody will be typing "locality" because of the forms interface.  "Place" is ambiguous, and suggests it would be ok to put "Texas" in there.  I don't really care what we call "locality"- it just needs to be broad enough to cover something really small to something enormous, but still less than a county or province.  If you want to call it something different, play around with the names and suggest something else.  It isn't easy, and though locality is clumsy, it is better than the alternatives- eg town.  Shanghai is not a town, and neither is Riccarton. - ~  Ph l o x   17:39, 11 May 2009 (UTC)

OK, if nobody types it much if at all, syllables don't matter. "Locality" has more feeling of smallness than "place". But the idea that it must be inhabited won't do, for reasons I stated. Landmark Mount St Helens can have births on it now that the ash has cooled. So we need more terms or maybe more variety of terms in our definition. I guess we also want the user to enter something unique. Riccarton (which was a borough until 1989) and Canterbury (which has been a local govt region since then and is a recognized genealogy region) and Brooklyn (which could be in Wellington although the prominent one is a whole county of New York) are not..... — Robin Patterson (Talk) 15:35, 12 May 2009 (UTC)


 * Ok. We can add something for uninhabited places at any time.  Let me know the field name and I'll stick it in.  Locations have been thought about a lot in the community of folks who concern themselves with these various metadata schemes.  Of course we could introduce something like "landmark" that could handle buildings, mountains, squares and what not.  For precise locations, we can use the coordinates, and those are very nice because we will be able to view stuff like gravesites and former residences from 300 feet up using google earth and the other virtual earth programs.  - ~  Ph l o x   16:56, 12 May 2009 (UTC)

Other Wikia sites; forms extension
Which Wikia sites have progressed further than you have? — Robin Patterson (Talk) 15:35, 12 May 2009 (UTC)
 * I am not aware of what the other wikia are doing. My initial survey was that hardly any that had asked for the extension were using it much, and they had even neglected to ask for the forms extension.  The forms are not normal pages and their interaction with templates is unusual compared to normal MediaWiki functionality.


 * I need to tidy up the forms a bit- these are getting closer to the stage where some of the more technical dudes can figure out how to tweak them. It's just table fiddling- not too bad really.  - ~  Ph l o x   16:56, 12 May 2009 (UTC)

Querying
I have not done anything with querying yet. This is where the SMW stuff is really going to bring home the bacon for our stalwart genealogist and micro history contributors. Regarding promotion, I don't see any harm to it..... - ~  Ph l o x  16:56, 12 May 2009 (UTC)

That shouldn't have happened
Can you give me an example of the article where this occurred? - ~  Ph l o x  08:04, 15 May 2009 (UTC)
 * Never mind. I got it.  I will have it turned off shortly. Sorry.  - ~  Ph l o x   08:04, 15 May 2009 (UTC)
 * Should be back to normal now. Forms will be optional- clicked from the banner text above the edit window.  If there are any other oddities you notice, let me know as I am about to knock off for the night. - ~  Ph l o x   08:30, 15 May 2009 (UTC)

Ontology, i.e. categories etc
So SMW doesn't replace categories soon if ever but works with them?

It may reduce the value of producing lots of high-level, complex cats, such as "Migrants from Europe to Canada in the 1820s"?

— Robin Patterson (Talk) 12:06, 16 May 2009 (UTC)


 * Actually, it looks like they are categories with a bunch of extra features.
 * You can query on categories and properties, except you can add numeric or date range checks if they are properties
 * You assign the item to a property nearly like you do a category, except you use two colons instead of one. birth county::Gloucestershire
 * When you create a property page, the items marked with that property appear as a list.
 * except: sub properties are not listed (fixed in the SMW 1.2.1 release, so this limitation will go away soon)
 * except: lists are formatted differently - less compressed and with special features for examining the values.
 * You can declare a default form for a category, just as you can for a property
 * if you search for categories, it picks up articles from all subcats. The same is true of properties.  For instance, search on property state picks up death state, birth state, wedding1 state and so on.
 * Creating sub properties is similar, but the syntax is slightly different. With cats you simply mention the supercat on the subcategory's page. If counties of colorado were a property, the corresponding syntax would be subproperty of::Property:Counties of Colorado.
 * Categories and properties can be used interchangeably in queries. You can create a query eg one that combines Migrants from Europe with Migrants to Canada with Migration date before 1830 and after 1819.  Migrants from Europe and Migrants to Canada could be either categories or properties.  Migration date could not be a category, due to the range comparison.


 * So it is not so very odd after all. By the way, the combined query above can be saved as a "Concept".  There is a bit of hyperbole in calling it a concept, but the idea is that it is a saved query so instead of typing in all that query again, you can simply type Concept:Migrants from Europe to Canada in the 1820s .  You can use these in other queries, or on pages just as if it were a property.  Concept:1750s births implements the category:1750s births.  There may be performance reasons for using Categories rather than Concepts, but I have not determined this yet.  They may cache the results on the server for all I know, but I notice a lag when I use concepts versus categories.


 * Hope this helps. Maybe we should collect these remarks somewhere useful. - ~  Ph l o x   18:15, 16 May 2009 (UTC)
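
For anyone who finds code easier to follow than prose, here is a rough Python sketch of the combined query idea from the list above. It is only an illustration of the logic (category membership plus a range check on a property value), not SMW syntax and not how SMW evaluates queries. The Page class, the page titles and the years are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a wiki article with categories and properties."""
    title: str
    categories: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


def combined_query(pages, categories, prop, low, high):
    """Return titles of pages in all given categories whose property value is in [low, high]."""
    results = []
    for page in pages:
        value = page.properties.get(prop)
        in_all_categories = categories <= page.categories  # subset test
        if in_all_categories and value is not None and low <= value <= high:
            results.append(page.title)
    return results


# Made-up sample data, purely for illustration.
pages = [
    Page("Person A", {"Migrants from Europe", "Migrants to Canada"}, {"Migration date": 1824}),
    Page("Person B", {"Migrants from Europe", "Migrants to Canada"}, {"Migration date": 1841}),
    Page("Person C", {"Migrants from Europe"}, {"Migration date": 1822}),
]

print(combined_query(pages, {"Migrants from Europe", "Migrants to Canada"}, "Migration date", 1820, 1829))
```

The range check is the part that needs a typed value, which matches the point above that range checks only work when the item is a property.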


 * Thank you. Yes, we should collect them somewhere else. Not all of our users recognize that "Ontology" can be interesting. I'll give some thought to listing useful talk pages somewhere accessible and recognizable. And here it is! One idea already formed is that we can make subpages of the SMW-site-derived Help page so that individual components can be expounded, locally illustrated, and discussed more easily. — Robin Patterson (Talk) 06:55, 17 May 2009 (UTC)

Info pages in relation to SMW
..... Am I right in thinking info pages are a key feature of our use of SMW? — Robin Patterson (Talk) 04:51, 17 May 2009 (UTC)
 * The SMW stuff is temporarily dependent on info pages for their data. I put in a patch so that we have a mirror of many (but not all) of the fields in the info pages.  This sort of jury rig was necessary so that I could quickly do some volume tests.  The SMW approach is to have what we think of as info data on the same page as the article, as is the case on the George Spencer Geer (1836) prototype page.  The Geer article wikitext is what SMW pages will look like in the future.


 * As I was remarking this morning, the Geer article is what a largely tabular article would look like. However, it is possible to set the values inline in a text passage, as with the Robert Hester example I was referring to in the 2007 article.  Some folks may prefer this style, and that is in many respects preferable to the tabular form that tends to perform a life-sucking reduction of people's lives into a table of vital statistics.  - ~  Ph l o x   05:55, 17 May 2009 (UTC)


 * Thank you for all of that. I'd better check that I've read all your recent replies and essays before I ask too much more. Seems we have at least three of us working on this, with one doing over 99% but the others making an effort. It must be a good thing. I've mentioned it on two of my Squidoo lens updates now and copied to Twitter and Facebook. — Robin Patterson (Talk) 06:41, 17 May 2009 (UTC)
 * I am scouting out way way ahead of the main party on some of this, so if I lost you, just ignore me. I just want to make sure I don't set it up a certain way that will wind up coming back to bite us.  Speaking of which, did you follow what I was saying about the  migration kmz files?  ....., check out google earth and how it displays the notes.  Each one of the notes can have linkbacks to the genealogy wikia article.  - ~  Ph l o x   06:53, 17 May 2009 (UTC)

Nested and recursive queries
Hoping to build a query "What is the relationship between Ms Smith and Mr Jones?". As a first step, I'd need the intersection of the sets of ancestors of both. So, a single query returns my parents, but the nested version returns an error rather than my grandparents. Any hints? rtol 07:55, 17 May 2009 (UTC)


 * Your married cousin test answers my question. rtol 08:10, 17 May 2009 (UTC)
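
The "intersection of the sets of ancestors" step can be sketched outside the wiki as follows. This is a plain Python illustration under stated assumptions, not an SMW query: the parents dictionary is a hypothetical stand-in for whatever parent links the articles record, and the names in it are invented.

```python
def ancestors(person, parents):
    """Return the set of all known ancestors of person, walking parent links upward."""
    found = set()
    to_visit = list(parents.get(person, ()))
    while to_visit:
        current = to_visit.pop()
        if current not in found:
            found.add(current)
            to_visit.extend(parents.get(current, ()))
    return found


def common_ancestors(person_a, person_b, parents):
    """Return the ancestors shared by both people (empty set if none are recorded)."""
    return ancestors(person_a, parents) & ancestors(person_b, parents)


# Invented data: Ms Smith and Mr Jones are first cousins through Grandpa and Grandma.
example_parents = {
    "Ms Smith": ["Smith father", "Smith mother"],
    "Mr Jones": ["Jones father", "Jones mother"],
    "Smith mother": ["Grandpa", "Grandma"],
    "Jones father": ["Grandpa", "Grandma"],
}

print(sorted(common_ancestors("Ms Smith", "Mr Jones", example_parents)))
```

The walk is iterative rather than recursive, so the depth of the tree does not matter here; on the wiki itself the limiting factor is query nesting, as discussed further down.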

the cousin test
See discussion on the talk page of married cousin test. I am turning in to bed soon so if you have any questions, you better fire them off soon.

I am not sure we would do the test this way, because it would require us to redundantly store children lists in the articles of both parents. My patch to info pages isn't doing that now, so we just have the list in either the father or the mother- that's why the grandparent lists are usually just one rather than four even though all may have info pages. We might do this inelegant redundant storage scheme due to performance reasons. We'd rely on bots to keep the lists consistent, and this will work if they are easy enough to use. The bot option will likely be, with some custom plugin(s) that I build for it. So it will be a lot easier to use than other bot tools (pywikipedia). - ~  Ph l o x  08:43, 17 May 2009 (UTC)


 * See "" below. — Robin Patterson (Talk) 06:24, 22 May 2009 (UTC)

So I can't exactly put it in his article?
(I was impressed by Henry's lists of children and asked if I could put it in his article.) "Not exactly..." said the guru. So I tested it: http://genealogy.wikia.com/index.php?title=Henry_I,_King_of_England_(1068-1135)&action=history — Robin Patterson (Talk) 08:45, 17 May 2009 (UTC)
 * I don't see what's wrong, should be ok, but I have to turn in. I'll check it tomorrow.  - ~  Ph l o x   08:51, 17 May 2009 (UTC)

Test
Template:OrderCMTest9 works fine in general, but not for Emma de Limoges (c960-c991). I guess this is because she's a daughter from her father's second marriage. Thanks! rtol 11:39, 17 May 2009 (UTC)


 * The problem in OrderCMTest9 was Children-S2 v Children-s2. The latter works. rtol 16:51, 17 May 2009 (UTC)


 * Note that I also made children-s2 a subproperty of "family member". This may be wrong. rtol 17:01, 17 May 2009 (UTC)
 * Yes, wrong: should be a subprop of children. Not sure why you are explicitly probing children-s2 anyway.  Children-s2 is a subproperty of Children, so you should be picking up all children-s2, s3 etc.  If this is not the case, point me to a specific example.  Thanks.  Also, by the way, don't get too attached to these field names.  We are at the learning curve stage and so everything should be regarded as makeshift and experimental, and could be radically revamped with new names or different approaches.  OK?  I expect we should have things substantially sorted in the next month though.  For permanent work, no one should be encoding assuming the current SMW fields except for experiments.  If people use Info fields, they are safe.  Ok?  - ~  Ph l o x   17:03, 17 May 2009 (UTC)


 * Have a look at Emma de Limoges (c960-c991). "children-s2" returns her dad. "children" returns empty. I agree that "children" should be a superset of "children-s1", "children-s2" ... (just think of the test Married third cousin) but it is not at the moment. rtol 17:13, 17 May 2009 (UTC)
 * I'll look at that. Hopefully it is not a flaw in SMW, because the superprop stuff does work sometimes for these depth searches.
 * Problem fixed. Thanks! rtol 17:51, 17 May 2009 (UTC)
 * By the way... there is a bottom up search for cousins by following the parentage links rather than the children links. The weakness of this approach is that folks usually add bunches of children but don't bother declaring the child nodes.  Again, we could deal with this if AWB is easy enough to be used by adept folks such as yourself.  Right now, the latest release gets hung on wikia, at login, but I'll deal with that if the AWB devs don't get to it soon.  - ~  Ph l o x   17:22, 17 May 2009 (UTC)
 * Cousins is easy enough. 13th cousin twice removed is hard. rtol 17:51, 17 May 2009 (UTC)

(undent) Good. Yes. An issue to be sure, and exponential expansion problems are generic to this problem set. There is a caching approach that I am considering that will alleviate some of the server load issues. - ~  Ph l o x  17:55, 17 May 2009 (UTC)

Query depth
I created a "Married third cousin test". This apparently fails (or rather, always returns true) because I nest 4 queries. Is that too much? The "married second cousin test" works fine. rtol 20:16, 17 May 2009 (UTC)
 * There is a query maximum nesting depth setting for the extension. You could have run into it, but that doesn't seem like a lot of queries to me.  15 queries to get to each set of gg grandparents, so 30 queries max, right?  You are in new territory here.  Let me know if you come up with anything.  - ~  Ph l o x   01:21, 18 May 2009 (UTC)
 * Pls have a look at Isabella of Portugal (1397-1472) (experiments). The first query correctly returns Edward II. The second query returns an alphabetic list of everybody. Should be empty. Thanks! rtol 05:38, 18 May 2009 (UTC)
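
The "15 queries each, 30 max" estimate above can be checked with a little arithmetic. The sketch below assumes one "who are the parents of X?" lookup per person at each level, starting from the person and stopping just before the great-great-grandparents, which third cousins share. The function name is made up for the illustration.

```python
def parent_lookups(generations_up):
    """Count parent lookups needed for one person to reach ancestors generations_up levels above.

    Level 0 is the person (1 lookup), level 1 the parents (2 lookups),
    and so on, doubling with each generation.
    """
    return sum(2 ** level for level in range(generations_up))


# Third cousins share great-great-grandparents, 4 generations up.
per_person = parent_lookups(4)   # 1 + 2 + 4 + 8
both_spouses = 2 * per_person
print(per_person, both_spouses)  # prints "15 30"
```

By the same count a second cousin test (great-grandparents, 3 generations up) needs 7 lookups per person, and each extra generation roughly doubles the work.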

That's the end of the bulk-copying. Feel free to add queries or comments on specific items above, but most new additions should be below under specific headings even if the subject has been started earlier. A note on the earlier discussion can helpfully lead to the resumed discussion.

Pages for marital and extra-marital unions
Maybe it's time for these. See latest discussions on Forum:Semantic MediaWiki is here!!!!!. Having a page of this form: "Union:Joe Blow (YOB-YOD) and Jane Doe (?-1400)". Not at all unfamiliar to users of GEDCOM files.

Not unlike Children of Vivion Daniel and Elizabeth Vivion, but it would be able (as would that page) to have marriage/wedding etc details and a standard box for the offspring very like showinfo children. A single place to keep the children, avoiding the duplication that Thurstan and others prefer to avoid, while allowing direct (though one step longer) generational querying.

— Robin Patterson (Talk) 05:51, 22 May 2009 (UTC)

Maybe the number of steps can be reduced by adding to each union page a link to the union page on which each party to the union is a child. — Robin Patterson (Talk) 06:30, 22 May 2009 (UTC)

Ancestry and descent

 * (Extract from User:Phlox/2009 05 notes 21 May 2009)


 * General problem definition
 * Exponential expansions are generic to this problem domain, so optimizations will be necessary regardless of how strong our software engine is at any point in time. In general, we will be using local processes to offload this to the client machines in order to execute massive depth searches or network walking that would time out on the server.

SMW does not mean ancestors are reduced to statistics

 * (Extract from User:Phlox/2009 05 notes 25 May 2009)


 * It is not required to use forms to add structured genealogy data. This information can be added inline, rather than using forms and person infobox.  For an example, see Agnes Margaret Mucha (1893-1965).
 * The norm for many genealogy sites is to reduce ancestors to lists of tabular material, and much of this is driven by the way database software works. The obvious nexus between wikis and structured databases is Infoboxes, and that naturally has been the focus for microformats.  It also presents low hanging fruit for SMW, through use of the Semantic forms extension.  However, the core of SMW frees wikis from the tabular approach.  It is entirely natural for family members to present the story of their family as a story, and they will prefer Familypedia on that basis.  They may shun the tabular approach, and SMW entirely supports that: full narratives that happen to have structured data within them.
 * It may turn out that we want to suggest that everyone use an infobox for quality reasons (less chaotic look and feel). However, even in that circumstance we might have people putting some optional information inline so that the infobox is less cluttered.
 * Observation- this affects naming. From an engineering perspective, the data types are central.  Dates are a general type, with multiple forms but they are all dates.  Events simply are the variants eg date birth; date death; date wedding; ....  But from the user's perspective, the events are central, and the various details about them are the variants.  Birth date, birth county, birth state, birth people present, birth notes.  Maybe we keep the naming the same.  Today, 25 May, I think so.  Okay.  This would chuck the whole inversion thing.  It would be wedding1 date etc.  Hmmm.  I suppose the variant can go in the middle with not that much difficulty.  Code will look a little uglier, but what the heck.  Ease of inline naming is more important.  I just don't know that folks will do it that much, or that we want that to become a dominant way of stating things.  Hmmm.  If the community decides they like inline, then we would be painted in a corner if they wanted to rename all the properties because that would be really tough.  So let's name assuming ease of inline use, and just accept the slightly greater complexity in the code.  Face it, no one touches esoteric templates anyway, and in the grand scheme of things this "complexity" is trivial for an experienced template writer.

Naming
Phlox may be overstating the desire to have "birth date" rather than "date birth". Some languages don't use that word order: they put the adjectival expression "birth" after the noun, either adjectivally or with an "of"-type expression, as in French "de": jour de naissance. Even in English we have common expressions like that: YOB, DOB, "present at the birth" (not "birth people present"), "witnesses to the marriage". My version of FTM has a box called "Date born:" for each individual on the normal page. "Birth date, birth county, birth state," looks pretty tabular (as in the "Records" section of the abovementioned Agnes) and not the word order one would always prefer to include in a narrative. I feel that the ease of use will depend on what characters one has to include with the plain words for an inline SMW. I presume your Agnes example is to highlight expressions such as "was born in Birth state::Minnesota in 1893" but I'm not sure what that does beyond what would be achieved by "was born in Minnesota in 1893". Would "was born in state of birth::Minnesota in 1893" be a problem? Could you solve it with a redirect from "state of birth" to "state birth"? If there's not much difference in ease of use, there may be more value in keeping the coding and/or server load simpler. — Robin Patterson (Talk) 06:08, 26 May 2009 (UTC)

Standard Biographical Text
Would it be possible to take the vital statistics and insert them into a standard biographical text that would be generated by something like an infobox? That way, for the hundreds of ancestors for which you have nothing but vital statistics, a brief biographical sketch could be automatically generated (with the option, of course, of substituting your own biography if you wish). Bill Hunsicker 11:32, 26 May 2009 (UTC)


 * That would be a straightforward thing to do. You can pick up things from the info page (using Get|Key=), or from the SMW (using properties or queries). rtol 12:39, 26 May 2009 (UTC)


 * To be clear on what I was asking: would it be possible to create a template that would create the biographical sketch automatically? For instance, you could add a template called "biography" and a brief biographical sketch would be generated. Bill Hunsicker 17:10, 26 May 2009 (UTC)


 * Please have a look at Clifford Sheridan Hunsicker (1903-1976). Why don't you complete the experimental biography, then I'll turn this into a template. You can also have a go at the latter. Search for page Template:Biography. Click the red link, copy the text from the experimental biography, and save. The semantic stuff also allows us to add siblings in text, but that can be added later. rtol 17:41, 26 May 2009 (UTC)


 * Completed. See Template:Biography. Thank you for the lesson in how to create a template. Bill Hunsicker 02:40, 29 May 2009 (UTC)


 * The text about death is now conditional ... rtol 05:18, 29 May 2009 (UTC)


 * Is it possible to make the text prior to marriage "He" or "She" based on the gender of the person and make the whole marriage sentence conditional too? Bill Hunsicker 16:26, 29 May 2009 (UTC)
 * Update: I fooled around with it for a while but could not get it to choose "He" or "She". I did, however, make the marriage sentence conditional. Bill Hunsicker 21:59, 29 May 2009 (UTC)
 * Update: I made the parentage part conditional. Bill Hunsicker 23:11, 29 May 2009 (UTC)


 * Good work Bill! The marriage sentence now starts with He or She, with She as the default. rtol 06:46, 1 June 2009 (UTC)

I have started a Biography template for the SMW Person articles at Template:Biography SMW but I'm in over my head. A little assistance would be appreciated. Bill Hunsicker 12:41, 15 June 2009 (UTC)
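
For readers who want to see the logic of the thread above outside wiki markup, here is a minimal Python sketch. It is not the code of Template:Biography; the field names (name, sex, birth_year and so on) and the sentence wording are assumptions made for illustration. It shows the three points discussed: each sentence appears only when its data exists, the pronoun is chosen from the sex, and "She" is the default.

```python
def biography(facts: dict) -> str:
    """Build a short sketch from whichever vital statistics are present.

    `facts` uses hypothetical keys: name, sex, birth_year, birth_place,
    father, mother, spouse, death_year.
    """
    name = facts["name"]
    # "She" is the default, as in the thread; only an explicit "M" switches it.
    pronoun = "He" if facts.get("sex") == "M" else "She"
    parts = []

    def subject() -> str:
        # Use the full name in the first sentence, the pronoun afterwards.
        return name if not parts else pronoun

    birth = [facts.get("birth_year"), facts.get("birth_place")]
    if any(birth):
        sentence = f"{subject()} was born"
        sentence += "".join(f" in {value}" for value in birth if value)
        parts.append(sentence + ".")

    parents = [p for p in (facts.get("father"), facts.get("mother")) if p]
    if parents:
        parts.append(f"{subject()} was the child of {' and '.join(parents)}.")

    if facts.get("spouse"):
        parts.append(f"{subject()} married {facts['spouse']}.")

    if facts.get("death_year"):
        parts.append(f"{subject()} died in {facts['death_year']}.")

    return " ".join(parts)


print(biography({"name": "Example Person", "sex": "M",
                 "birth_year": "1903", "death_year": "1976"}))
```

A page with nothing but a name and a spouse would produce a single marriage sentence, which is the behaviour the conditional sentences are meant to give.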

Ahnentafel and related stuff
Phlox is convinced that the semantic ahnentafel works as it should. I therefore put the "set ahn" template in "info categories" so that all pages new and old will be included.

Note that the roll-out is slow. The ahnentafel works by taking the parents' ahnentafel and adding the person to it. Ahnentafel thus works from old to young. It requires an "edit and save page" for every person in the ahnentafel. Any time you add a distant ancestor, the entire tree needs to be reworked. Ahnentafel will therefore always lag behind the actual information on Familypedia.

The numbers in the ahnentafel are the actual ahnennumbers. If you click on the looking glass, relations show up (e.g., if Charlemagne is your maternal grandfather, you can look up all those who have CM as their maternal grandfather -- there is no English word for this particular type of cousin). This also specifies the #ask that could be used to build semantic version of the current template "show ahnentafel". The semantic version would be faster to compile and would not be restricted to six generations.
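
The "parents' ahnentafel plus the person" step can be sketched in Python as follows. This is an illustration only, not the "set ahn" template: it assumes the conventional ahnentafel numbering (the person is 1, the father of number n is 2n, the mother is 2n+1), and the function names and the dictionary layout are hypothetical.

```python
def shift_into_child(number: int, via_mother: bool) -> int:
    """Renumber an entry of a parent's ahnentafel as seen from the child."""
    # Size of the generation this entry sits in within the parent's table
    # (1 for the parent, 2 for grandparents, 4 for great-grandparents, ...).
    generation_size = 1 << (number.bit_length() - 1)
    # The father's table slides into the first half of each generation of the
    # child's table, the mother's table into the second half.
    return number + (2 * generation_size if via_mother else generation_size)


def build_ahnentafel(person: str, father_table: dict, mother_table: dict) -> dict:
    """Combine both parents' tables and add the person as number 1."""
    table = {1: person}
    for number, ancestor in father_table.items():
        table[shift_into_child(number, via_mother=False)] = ancestor
    for number, ancestor in mother_table.items():
        table[shift_into_child(number, via_mother=True)] = ancestor
    return table


# Placeholder names: F = father, MF = mother's father, and so on.
father = {1: "F", 2: "FF", 3: "FM"}
mother = {1: "M", 2: "MF"}
print(build_ahnentafel("P", father, mother))
```

Under this numbering the maternal grandfather is always number 6, so a query for everyone who has a given person as number 6 finds the group of cousins mentioned above. It also shows why the roll-out is slow: a child's table can only be as complete as the parents' tables were at the time of the last save.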

The ahnentafel works by storing on a person's page all those people from whom this person descends. Running a query on all those pages that list an individual as an ancestor means that you can use the ahnentafel to show all descendants of a person. (Twisted logic.) This is done in the template "show descendants". As this is based on a query, you can also use the ahnentafel to show the common descendants of two (or more, up to fifteen) people. This is done in the template "common descendants". Both templates are demonstrated at Louis the Pious (778-840)/descendants.
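
The "twisted logic" can be shown in a few lines of Python. This is a sketch under assumptions, not the "show descendants" or "common descendants" templates: the stored ancestor lists are modelled as a plain dictionary from page name to a set of ancestors, and the names are placeholders.

```python
def descendants(person: str, ancestors_of: dict) -> set:
    """Everyone whose stored ancestor list mentions `person`."""
    return {page for page, ancestors in ancestors_of.items() if person in ancestors}


def common_descendants(people: list, ancestors_of: dict) -> set:
    """Everyone whose stored ancestor list mentions all of `people`."""
    wanted = set(people)
    return {page for page, ancestors in ancestors_of.items() if wanted <= ancestors}


# Hypothetical data: each page stores the set of people it descends from.
ancestors_of = {
    "Child": {"Father", "Mother", "Grandfather"},
    "Cousin": {"Uncle", "Grandfather"},
    "Father": {"Grandfather"},
    "Stranger": set(),
}

print(sorted(descendants("Grandfather", ancestors_of)))
print(sorted(common_descendants(["Father", "Mother"], ancestors_of)))
```

Nothing is stored on the ancestor's own page; the descendant list is the result of asking every other page the same question, which is why it can be produced by a query.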

I'm working on the mirror of the ahnentafel: lists of descendants. This would allow for a template "show ancestors" and, more interestingly, "common ancestors" (which defines the family relationship between people). The nearest common ancestors in turn define the Coefficient of Inbreeding. Roll-out is slow and you may be part of that without knowing, but some results are still peculiar so please exercise some patience. rtol 12:39, 26 May 2009 (UTC)
 * Users that employ AWB can force these to update in an automated way. This is how I did volume testing.  Those interested should look at Help:AutoWikiaBrowser.  - ~  Ph l o x  00:15, 28 May 2009 (UTC)

Descendant list
I believe I fixed the main bug in the semantic list of descendants. I switched it back on for another large scale test. If you find anything peculiar, do drop me a note.

To understand the power of this, check Beatrijs van Vlaanderen (c1253-1296). SMW will tell you that she married her eleventh cousin — and indeed will give you the relationship between any two people. rtol 18:57, 28 May 2009 (UTC)


 * Impressive. I looked at Beatrijs and clicked everything in sight, noting that one could add methods of sorting. Didn't see anything mentioning eleventh cousins or any obvious way to work out her exact relationship to her husband. You could do the same for Prince Charles and his first wife, who had a similarly distant relationship (times about three). Or maybe one with fewer generations that's easier for beginners like me to follow: George V of the United Kingdom (1865-1936), and his wife, who were third-cousins once removed, I think. We seem to be ready for a page with a name like Help:Cousins and ancestors to point users to the ways they can (and/or the system will automatically) display relationships that involve two or more steps. — Robin Patterson (Talk) 04:51, 29 May 2009 (UTC)


 * I was too succinct. shows the common ancestors of person1 and person2. At the bottom of the page of the nearest common ancestor, there is a list of descendants including the generation number of person1 and person2.
 * Extracting those generation numbers (13, 13) and displaying it as "11th cousin" is beyond my SMW skills still. rtol 05:08, 29 May 2009 (UTC)


 * You were. That's much better. (Good morning!) I go again to Beatrijs, see all of the common ancestors, pick the youngest (Gertrud) because that's likely to be the nearest (and I briefly muse that it would be nice to have their birth years in a separate sortable column), and look at her descendants. Interesting list that seems to be in generation order at first but then breaks down. A few apostrophes produce gloomy red messages, but I know you guys have that problem on the list. Here and there an unbelievable generation number: (Louise Barbara Trump (1900-1985), 1039) +, .  Finally, after nearly giving up hope, I find Beatrijs near the bottom of the list showing generation 8. 8! But in all the excitement of the chase I've forgotten her husband's name!! Never mind, the system works. Great stuff. (What order are the descendants in?) — Robin Patterson (Talk) 05:32, 29 May 2009 (UTC)


 * Beatrijs' husband is Floris V, god of the peasants and bane of the West-Frisians. How can you not know that?


 * My schools didn't go into Dutch history much. The flood of post-war immigrants hadn't had a full impact before I left school. Must brush up my Dutch in all respects: my daughter is expecting a child who will be 37.5% Dutch by extraction... Keep on teasing me. (Have you got Frysk on your Babel chart?) — Robin Patterson (Talk) 14:02, 30 May 2009 (UTC)


 * We should have the generation numbers in a column and sort on that. The current order is by birth year, but this is messed up by birth years with letters in them (c865 comes before 914). This is a legacy effect that will disappear over time. Trump is an invention by Phlox. I don't know what she's doing there.
 * More seriously, did you really see a Gertrude of generation 8? Or is this just an example you made up? rtol 07:21, 29 May 2009 (UTC)


 * She was fifth and last (but the list is now four times as long): Gertrud von Sachsen (1033-1113). Didn't you click to see them all? I see Floris V, (9), so yes 11th-cousins but also 6th-cousins once removed. — Robin Patterson (Talk) 14:02, 30 May 2009 (UTC)
 * Order of descendants is not birth year, except in a few blocks. Here are Gertrud's last few: (Richard III, King of England (1452-1485), 14) +, → (William II, Count of Hainaut (1307-1345), 10)  +, → (Baudoin d&, Expression error: Unrecognised punctuation character "?")  +, → (Gui de Dampierre (1225-1305), 7)  +, → (Robert III of Flanders (1249–1322), 8)  +, → (Beatrijs van Vlaanderen (c1253-1296), 8)  +, → (Jan I van Holland (1284-1299), 9)  +, → (?, 13)  +, and → (?, 14)  +. — Robin Patterson (Talk) 14:02, 30 May 2009 (UTC)
 * (I think Ms Trump has a close relative with a similar 4-digit number. Real volume-testing - or error-checking. We'll see what Phlox has to say.) — Robin Patterson (Talk) 14:02, 30 May 2009 (UTC)


 * Ms Trump is a programming trick, not a relative.
 * Order is birth year, but birth year is a string rather than a number. So 1001 is before 999; and c1900 is after 1100.
 * I'll see whether I can reproduce the 6th cousin once removed, and the template to return that. rtol 14:09, 30 May 2009 (UTC)
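
The string-versus-number point in the reply above is easy to reproduce in Python. The sketch below only illustrates the comparison; it says nothing about how SMW actually stores or sorts the values, and the helper name and the choice to sort unknown years last are assumptions.

```python
import re


def birth_year_key(year: str) -> int:
    """Numeric sort key for values such as '999', 'c1900' or '?'."""
    match = re.search(r"\d+", year)
    # Unknown years sort last; that is an arbitrary choice for this sketch.
    return int(match.group()) if match else 10**6


years = ["1001", "999", "c1900", "1100"]

# Compared as strings: character by character, digits before letters.
print(sorted(years))                      # ['1001', '1100', '999', 'c1900']

# Compared as numbers, ignoring the "c" prefix.
print(sorted(years, key=birth_year_key))  # ['999', '1001', '1100', 'c1900']
```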


 * There you have it. A clear demonstration of the power of SMW. I had clicked through 2x13 generations to teach the machine that Bea and Floor are 11th cousins, and N iterations later it shows me that they are 6th cousins once removed as well. rtol 15:37, 30 May 2009 (UTC)
 * I don't see any demonstration related to iterations there. You manually found an 11th-cousin relationship (and were thinking of a way to get the machine to put it into a form showing cousin relationship). I looked a bit further in the same list and found a closer relationship. What's with iterations? — Robin Patterson (Talk) 02:31, 31 May 2009 (UTC)


 * I strongly disagree with your repeated assertion that the descendants are in birth-year-string order. Look at my example more carefully. 1452, 1307, 1225, 1249, c1253, 1284. — Robin Patterson (Talk) 02:31, 31 May 2009 (UTC)


 * and display a list of common ancestors reverse ordered by year-of-birth-as-string.
 * The list of Descendants at the bottom of the page displays as first child, first child of first child, first child of first child of first child, second child of first child of first child, second child of first child, third child of first child, first child of third child of first child, and so on.
 * On the iterations, the ahnentafel and descendant list are emergent features. With every save, they copy information from the parents' and the children's pages -- and that information is different every time the parents' and the children's pages are saved.
 * More concretely, when I first ran on Beatrijs' page, there was no relationship with her husband, the awful Floris. I knew that was not true, because some 300 years earlier another Count of Holland had married another daughter of a Count of Flanders and both dynasties were uninterrupted for a very long time. So I opened and saved the pages of the patrilineal ancestors of both Bea and Floor until  showed a relationship: 11th cousins. When you had a look, many hours later, someone had re-saved the page of Gertrud von Sachsen, who first married a Count of Holland and then a Count of Flanders. This information was added, and that is what you found (and what I at first thought was a bug).
 * So, while SMW is pretty powerful, it takes time for information to propagate, and answers to queries evolve over time. To go back to the story of Bea and Floor, both dynasties married into the royal family of France and into the ducal family of Brabant (and those two families intermarried as well). None of this information is up on Familypedia, but if and when it is uploaded, we will probably find a closer relationship between the two. rtol 06:53, 31 May 2009 (UTC)
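
Turning the two generation numbers into a cousin label, which was left open earlier in this thread, is simple arithmetic once the numbers are available. The Python sketch below assumes the numbering implied by the examples in the discussion, where the common ancestor counts as generation 1, so that (13, 13) gives 11th cousins and (8, 9) gives 6th cousins once removed. The function names are hypothetical, and relationships closer than first cousins are deliberately not handled.

```python
def ordinal(n: int) -> str:
    """1 -> '1st', 2 -> '2nd', 11 -> '11th', 23 -> '23rd'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def cousin_relationship(gen_a: int, gen_b: int) -> str:
    """Label the relationship from generation numbers below a common ancestor.

    Assumes the ancestor is generation 1, children 2, grandchildren 3, ...
    """
    degree = min(gen_a, gen_b) - 2   # grandchildren (generation 3) are 1st cousins
    removed = abs(gen_a - gen_b)     # difference in generations
    if degree < 1:
        raise ValueError("closer than first cousins; not handled in this sketch")
    label = f"{ordinal(degree)} cousin"
    if removed == 1:
        label += " once removed"
    elif removed == 2:
        label += " twice removed"
    elif removed > 2:
        label += f" {removed} times removed"
    return label


print(cousin_relationship(13, 13))  # 11th cousin
print(cousin_relationship(8, 9))    # 6th cousin once removed
```

When two people share several common ancestors, the label changes with the ancestor chosen, which is exactly what happened with Beatrijs and Floris once Gertrud's page had been re-saved.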

Lineage
The intersection between a list of ancestors and a list of descendants is, of course, a lineage. Now as a template. shows all lines of descent from Person1 to Person2. rtol 11:55, 30 May 2009 (UTC)
 * Very good. — Robin Patterson (Talk) 02:31, 31 May 2009 (UTC)
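
The idea that a lineage is the intersection of one person's descendants and another person's ancestors can be sketched in Python. This is an illustration, not the template itself: the family data is a hypothetical parent-to-children dictionary with placeholder names, and the result is the set of people on any line of descent rather than the formatted lines.

```python
def descendants(children: dict, person: str) -> set:
    """All people reachable from `person` by following parent -> child links."""
    found = set()
    stack = [person]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def ancestors(children: dict, person: str) -> set:
    """All ancestors of `person`: the same walk with the links reversed."""
    parents = {}
    for parent, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, []).append(parent)
    return descendants(parents, person)


def lineage(children: dict, ancestor: str, descendant: str) -> set:
    """People on some line of descent from `ancestor` to `descendant`."""
    below = descendants(children, ancestor) | {ancestor}
    above = ancestors(children, descendant) | {descendant}
    return below & above


# Hypothetical family: D descends from A twice, through B and through C.
children = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "X": ["E"]}
print(sorted(lineage(children, "A", "E")))  # ['A', 'B', 'C', 'D', 'E']
print(sorted(lineage(children, "X", "D")))  # [] because X is not an ancestor of D
```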

Summary of where we are at

 * copied from a user talk page

Currently, we are revising the way that we centrally store information, so everything you are learning about info pages and showinfo etc. will still work; it just will be obsolete. ... We will be encouraging people to use the new mechanism and will be converting data over for them.

We are using Semantic MediaWiki extensions, something I believe that WP will eventually have. The new templates and properties are not yet ready for prime time, but if you are curious, you can see them demonstrated in article George Spencer Geer (1836). Instead of the error-prone process of editing info pages (a mechanism of my creation), the new scheme uses forms. See the "edit with form" item in the menu bar. This stores information directly in MySQL relations so that we can do databasey things like search for people born in a given location, with birth date greater than some date and less than another date. This sort of thing is partly activated due to a hack I put into the info page mechanism. Those pages go away in the future, and pages will more resemble the Geer article. - ~  Ph l o x  01:21, 1 June 2009 (UTC)

Even clearer indication from Phlox

 * (Extracts from recent conversation on another forum)


 * What I and other "followers" would like is an assurance that the info put into info pages will not be wasted but can be converted to SMW facts etc by a bot or similar automatic process. — Robin Patterson (Talk) 02:37, 14 June 2009 (UTC)

Nothing will be wasted, and info pages have given us a huge head start because the data is directly transferable. ... everything will be transferred over from info pages. ... Mind you, some of the info page values have been used in odd ways, such as place names in date fields. People ... will really want to correct these problems to get the most out of the SMW features.

William, I think you will see/find that form entry is vastly superior to the error-prone info pages method. And the volume and detail of information one can enter has been increased as well. What is particularly attractive is that when you are entering place names, the Familypedia will autocomplete the entry for you. So if a county name is in 3 different states, you will be made aware of that as you are typing in. This autocompletion is also active for all file and article fields, so there is no problem with typos as with info pages. '''Some of this stuff is working and you can create your own articles using Form:Person. People are welcome to try creating their own articles using it.''' Understand that it is under construction and it is not ready for prime time yet... . Certainly if anyone would like to stay with the info pages, they are welcome to. The SMW can coexist perfectly well with info pages. Of course, contributors making that choice will not be able to access any of the new features such as a rich set of database queries, and dynamic categories. For example, one could have a table on one's Userpage that dynamically displays all changes to articles for all surnames the contributor is interested in since the last time the contributor logged in. I think everyone will be pleased with the features that SMW will deliver for Familypedia. - ~  Ph l o x  04:45, 14 June 2009 (UTC)


 * I intend to start testing the forms on new Person articles soon. Will the conversion from info pages be done automatically at some time in the near future and will contributors who wish to stay with info pages (not me right now - I want to check out the forms first) be able to indicate that they don't want their articles converted? Bill Hunsicker 05:27, 15 June 2009 (UTC)


 * As per Familypedia policy regarding bots, all site-wide automated conversions (this one included) take place only after there has been adequate public notice to the community and there has been opportunity for objections to be raised. To your question, the answer is yes: contributors will have the opportunity to opt out of this conversion phase for articles on which they have been the primary contributor.  I shall provide details when we are closer to the conversion phase.  Although a bot run could easily transfer all info pages in a few hours, the conversion will be gradual, rather than a single massive push.  This will allow us to notice any shortcomings in the conversion, and more importantly, in the SMW structures that they are being converted into.
 * SMW is not in beta test yet: Bill, I am glad you realize that this should be used for testing only at this point. The Form:Person may look polished, and the query pages may deliver interesting results but believe me, the SMW pages are very much a construction site with heavy equipment slamming into structures every now and then. If to fix something I need to rename some templates or parameters, that means any SMW test articles may partially break.  For example, as thurstan noted last night, birth-state was recently renamed to birth-subdivision1.  Well, that sort of thing is trivial to fix in test articles (just renaming a parameter in an article), but I don't have the time to go back and fix test articles other than the ones I use, so you are on your own until SMW goes into beta test.  We simply aren't there yet, and if you want to use it as your primary input means that is fine, but understand that you may have to go back and fixup articles, so it is not for the faint of heart. There are enormous numbers of things to tweak, but at this point what we need to understand is if there are any major structural errors- like missing the capture of some information, how we are doing events, or multilingual, and so on.
 * Regarding the timing of the conversion, I am in no big hurry and shall remain reluctant to do this until the structure is more tested and mature. I would have liked us to have a more complete geographic knowledgebase prior to the conversion, but I suppose we could do it as a post-process cleanup by marking the origin of the SMW pages and later going back to translate geographic names more properly.  For instance, Greene county ->Greene County, Alabama rather than Greene County, Arkansas.  In any case, we will need this database prior to going into beta, because while autocompletion sort of works now, it is using an unclean database that uses categories that were never intended for this purpose: you get suggestions like map of XX county for a county name autocompletion.  This geographic knowledgebase bot run shall not be modest and the main purpose is to generate autocompletion lists. A byproduct will be substantially more of the auxiliary articles on places.  I figure we should have all villages and towns mentioned in Wikipedia, not to mention the counties and subdivision1's.  - ~  Ph l o x  17:41, 15 June 2009 (UTC)

More wonders
I'm beginning to get the hang of this. See Descendants of Charlemagne (couples) for another demonstration of the powers of SMW. The same structure can be used to construct lists of, say, Franco-Italian couples. rtol 09:42, 14 June 2009 (UTC)

Listing in birth order
Rtol will see that I have indeed looked at the abovementioned Descendants of Charlemagne (couples). It lists children in birth order: great; but does that risk omitting some, or has that problem been fixed? — Robin Patterson (Talk) 14:07, 14 June 2009 (UTC)


 * Children in birth order works fine. There are problems with the first column only. rtol 18:32, 14 June 2009 (UTC)