Forum:How we encode our data

Some in the wiki and wider internet community advocate representing information in a form that computers can evaluate. This movement is relevant to genealogy researchers, since the inferences that can be drawn from such a representation deliver valuable results. Genealogy information is simpler than the general case of information representation, but it is not free of complications. For example, conflicting information, such as the two competing claims about Child Y's mother in the list below, is familiar to anyone who has dabbled even briefly in genealogy research. New information can alter the confidence in a particular view of a family history.
 * 1) Joe was married to Mary.
    * The date of this event was X.
 * 2) Child Y's mother was Mary.
    * The source of this idea is A.
 * 3) Child Y's mother was Jane.
    * The source of this idea is B.
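
The statements above could be held as machine-readable claims, each carrying its own source, so that a program can spot where two of them disagree. What follows is only a minimal Python sketch under stated assumptions: the `Claim` class, the `find_conflicts` helper, and the property names `spouse` and `mother` are hypothetical names chosen for illustration, not part of GEDCOM or any wiki extension.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """One statement about a subject, with an optional source (hypothetical model)."""
    subject: str
    prop: str
    value: str
    source: Optional[str] = None


def find_conflicts(claims, single_valued=("mother",)):
    """Group claims on single-valued properties and keep groups with differing values."""
    grouped = defaultdict(list)
    for claim in claims:
        if claim.prop in single_valued:
            grouped[(claim.subject, claim.prop)].append(claim)
    return {
        key: group
        for key, group in grouped.items()
        if len({c.value for c in group}) > 1
    }


claims = [
    Claim("Joe", "spouse", "Mary"),
    Claim("Child Y", "mother", "Mary", source="A"),
    Claim("Child Y", "mother", "Jane", source="B"),
]

conflicts = find_conflicts(claims)
for (subject, prop), group in conflicts.items():
    for c in group:
        print(f"{subject} {prop} = {c.value} (source {c.source})")
```

Keeping the source on each claim, rather than picking a winner up front, is what lets new evidence change the confidence in one view later on.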

GEDCOM has some of this information, but the goal of GEDCOM is to be a format for exporting and importing data between programs and internet sites, nothing more. The GEDCOM 6.0 (XML) format deliberately confines itself to that goal.

In the wiki community, persondata templates record information conforming to the hCard microformat. This sort of encoding covers a superset of what GEDCOM 6.0 does. If we wish to go in that direction, the Java program that converts GEDCOM 5.5 to GEDCOM 6.0-like XML, or alternatively to Resource Description Framework (RDF) format, might be of interest. Further information on the program, and discussion of the issues in semantic representation of genealogical information, may be found on Jay Askren's site.

Brushing aside all the gee-whiz applications of semantic databases, our genealogy wikia could benefit from the simple idea that it allows information to be shared. The Semantic MediaWiki extension supports encoding data in one central place that can then be accessed anywhere in the wiki. The annotation looks like normal wikitext. For example, a person article for Joseph Hester might have the text:

Joseph's parents were father is::Elias Hester (c1832.
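
Semantic MediaWiki writes such an annotation inline as `[[property::value]]`. As a rough illustration of why this is machine-readable, the Python sketch below pulls property/value pairs out of a line of wikitext with a regular expression. This is not how the extension itself parses pages: `extract_annotations` is a hypothetical helper, and the sample sentence uses a shortened page name as a stand-in for the real one.

```python
import re

# Matches [[property::value]] or [[property::value|display text]].
# Plain links such as [[Elias Hester]] contain no "::" and are skipped.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_annotations(wikitext):
    """Return a list of (property, value) pairs found in the wikitext."""
    return [
        (prop.strip(), value.strip())
        for prop, value in ANNOTATION.findall(wikitext)
    ]


# Hypothetical sample text; the page name is shortened for illustration.
sample = "Joseph's father was [[father is::Elias Hester]]."
print(extract_annotations(sample))
```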

Now, any time this information is updated, everyone who wants the change gets it. For example, I have a family tree, and for one of the cells I can hardcode the name Elias Hester or I simply put

Some of this stuff is working today (see the example for California at the ontology semantic wiki page). When it matures, it is surely something that future contributors to Genealogy wikia will want to begin to use. Note that it is just another wikitext operator and does not impose any radical demands on authors. It can be ignored by the majority of contributors, but I expect it will gradually gain many converts simply due to the time savings. It can be adopted in an evolutionary way, and I expect the transition will be fairly gradual, with a mixture of hardcoding and re-using data.

This will suit wikia managers very well, because the server load created by complex templates using such queries is not well understood. It could be that caching will make it a non-issue, but note that data dependencies are multiplied. Change the father is:: declaration for William the Conqueror, and you could potentially invalidate the cached copies of hundreds and hundreds of pages using this information. It is also impossible to predict what the issues with vandals will be. The same issue arose when Wikipedia first started (the objection was that users allowed great power will abuse it). Come to think of it, I think the nobility said the same thing about allowing the rabble to vote. Anyhow, a gradual transition allows everyone to learn and adapt.
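
The caching concern can be made concrete with a toy model. In the Python sketch below, each page records which facts it reads, and changing one fact reports every cached page that would go stale. `FactStore` and its methods are hypothetical, and the values are placeholders. It illustrates the dependency problem only and does not describe how MediaWiki or Semantic MediaWiki actually manages its caches.

```python
from collections import defaultdict


class FactStore:
    """Toy store of (page, property) -> value facts with reverse dependency tracking."""

    def __init__(self):
        self.facts = {}
        self.readers = defaultdict(set)  # fact key -> pages that queried it

    def set_fact(self, page, prop, value):
        """Store a fact and return the pages whose cached copy is now stale."""
        key = (page, prop)
        changed = self.facts.get(key) != value
        self.facts[key] = value
        return set(self.readers[key]) if changed else set()

    def query(self, reader_page, page, prop):
        """Look up a fact and remember that reader_page depends on it."""
        key = (page, prop)
        self.readers[key].add(reader_page)
        return self.facts.get(key)


store = FactStore()
store.set_fact("William the Conqueror", "father is", "value A")

# Three hypothetical descendant pages reuse the same fact.
for reader in ("Descendant 1", "Descendant 2", "Descendant 3"):
    store.query(reader, "William the Conqueror", "father is")

# Editing the shared fact invalidates every page that read it.
stale = store.set_fact("William the Conqueror", "father is", "value B")
print(sorted(stale))
```

With hundreds of readers instead of three, a single edit to a widely reused fact fans out in the same way, which is why the load is hard to predict in advance.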

Other explorations of interest:
 * Microformats and genealogy information
 * Inline queries using the Semantic MediaWiki extension
 * Meta's article on the extension: Semantic_MediaWiki

In the near term, we cannot predict how data representation formats will evolve and can only adapt along with them. At some point, Genealogy wikia will inevitably have a data mass sufficient to earn us a seat at the table, so that we may positively influence that evolution.

For the near term, we should encourage folks to encode information using standard templates such as Template:Person.