Familypedia

Revision as of 13:59, 8 October 2007

Forums: Index > Watercooler > Gedcom bot and Rtol's improvements to Yewenyi's process


Is there any interest in a Gedcom bot? The idea is that folks would place their Gedcom data in an otherwise blank article. It would get approved or disapproved for upload, then all those approved would get articles auto-created for them from the Gedcom data, using a standard article structure. -~ Phlox 00:08, 29 September 2007 (UTC)

Good idea! -AMK152 (Talk, Contributions) 01:25, 29 September 2007 (UTC)
OK, if anyone has proposals for standard components to add, chime in. I figure:
  1. Person template
  2. Children of each spouse (if multiples) in a subpage so that it can be included into each parent's page.
  3. By option, an ancestor tree using Ahnentafel template
  4. Surname template
  5. Notes section with <references/>
  6. Template:Stub-incomplete inviting folks to update/ fix the article.
We'll see if I can scrounge some time up for this, but it seems like this would help us bulk up pretty dang fast. No promises though. The wife has been packing around a list of things to do around the house...
~ Phlox 01:49, 29 September 2007 (UTC)
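To make the proposal concrete, here is a minimal Python sketch of the core step: read individuals out of Gedcom text and emit a standard article skeleton for each. It is only an illustration under loud assumptions. The template parameter names (`name=`, `birth=`) and the function names are hypothetical, and only the NAME and BIRT/DATE tags are handled.

```python
import re

# Tiny hand-written sample; a real upload would hold thousands of records.
SAMPLE_GEDCOM = """\
0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 1 JAN 1850
0 @I2@ INDI
1 NAME Mary /Jones/
0 TRLR
"""

# level, optional @XREF@, tag, optional value
LINE_RE = re.compile(r"^(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?:\s(.*))?$")


def parse_individuals(text):
    """Return {xref: {"name", "surname", "birth"}} for every INDI record."""
    people = {}
    current = None   # the individual being filled in, if any
    context = None   # the most recent level-1 tag (e.g. BIRT)
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue
        level, xref, tag = int(m.group(1)), m.group(2), m.group(3)
        value = m.group(4) or ""
        if level == 0:
            current, context = None, None
            if tag == "INDI" and xref:
                current = people.setdefault(
                    xref, {"name": "", "surname": "", "birth": ""})
        elif current is not None:
            if level == 1:
                context = tag
                if tag == "NAME":
                    # Gedcom marks the surname with slashes: John /Smith/
                    surname = re.search(r"/([^/]*)/", value)
                    current["surname"] = surname.group(1).strip() if surname else ""
                    current["name"] = " ".join(value.replace("/", " ").split())
            elif level == 2 and context == "BIRT" and tag == "DATE":
                current["birth"] = value
    return people


def render_article(person):
    """Build the article skeleton; template parameters are hypothetical."""
    lines = [
        "{{Person|name=%s|birth=%s}}" % (person["name"], person["birth"]),
        "{{Surname|%s}}" % person["surname"],
        "==Notes==",
        "<references/>",
        "{{Stub-incomplete}}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    for xref, person in parse_individuals(SAMPLE_GEDCOM).items():
        print(xref)
        print(render_article(person))
```

The children subpages, optional Ahnentafel tree and categories would hang off the same parsed records.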
Categories are a big one too. -AMK152 (Talk, Contributions) 02:15, 29 September 2007 (UTC)

Yes, of course, cats. I think an Ahnentafel can be generated for each individual, on a separate "Family tree" tab. The Ahnentafel template could be modified to autocollapse, but that will be a tedious task I will leave for a rainy day. It would be nice to record not just ancestors but descendants, with links from any node to the tree page of the other individuals; this could easily be generated by such a program.
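As a rough sketch of how a program could generate the Ahnentafel for any individual: in Ahnentafel numbering the subject is 1, and the father and mother of person n are 2n and 2n+1. The `parents` mapping and the `max_generations` cap below are assumptions for illustration; they are not part of any existing template or bot.

```python
from collections import deque


def ahnentafel(root, parents, max_generations=10):
    """Return {ahnentafel_number: person} for the ancestors of root.

    parents maps a person to a (father, mother) tuple; either entry may
    be None. max_generations caps the depth so that bad data containing
    a loop cannot run forever.
    """
    numbered = {}
    queue = deque([(1, root)])
    while queue:
        n, person = queue.popleft()
        # Generation g occupies numbers 2**(g-1) .. 2**g - 1.
        if person is None or n >= 2 ** max_generations:
            continue
        numbered[n] = person
        father, mother = parents.get(person, (None, None))
        queue.append((2 * n, father))      # father of n is 2n
        queue.append((2 * n + 1, mother))  # mother of n is 2n + 1
    return dict(sorted(numbered.items()))


if __name__ == "__main__":
    family = {
        "Ann": ("Bob", "Cat"),
        "Bob": ("Dan", None),
        "Cat": (None, "Eve"),
    }
    for number, person in ahnentafel("Ann", family).items():
        print(number, person)
```

Because the same function works from any starting person, each node in a tree page can link to the tree page generated for that individual.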

Building on the Past? Upcoming Gedcom changes

The Gedcom spec is surprisingly archaic, e.g.: "Logical GEDCOM record sizes should be constrained so that they will fit in a memory buffer of less than 32K. GEDCOM files with records sizes greater than 32K run the risk of not being able to be loaded in some programs." Jeez. Programs from which decade? I don't think we have to run on Robin's Atari (or was it Amiga?) computer anymore, do we? ~ Phlox 17:07, 4 October 2007 (UTC)


LDS announced that they are moving their sites and apps completely to XML-based Gedcom. That is nice because, at the most rudimentary level, character encoding will not use this weird non-Unicode thing that Gedcom is using. (The very latest draft 5.5.1 Gedcom spec proposed UTF-8, and it isn't even approved yet? Jeez, I guess all those folks using double-byte characters aren't interested in genealogies. Sheesh.) But the semantic representations are awfully primitive. Not only are there weaknesses in representing ambiguous information, but the way they represent the data does not leverage standards in content representation, such as the microformats that are being supported by search engines and browsers. Anyway, I would much rather build a parser for a format that is going to be around a few years. It looks like everyone is giving Gedcom the boot.


So the strategy would be to pick a format that is one of the best bets for what will replace Gedcom (either a Gedcom iteration like Gedcom XML, or some other thing). The proponents of these formats usually write a converter app, so we would still support old Gedcom by first running the converter over it and then running the real parser.
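A minimal Python sketch of that two-stage idea follows: a converter turns level-numbered Gedcom lines into a nested XML tree, and the "real" parser then works only on the XML. This is an illustration under stated assumptions. The element names simply reuse the Gedcom tags and the `<gedcom>` root is made up; it is not the Gedcom XML draft's schema, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE_GEDCOM = """\
0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 1 JAN 1850
0 TRLR
"""


def gedcom_to_xml(text):
    """Stage 1: convert legacy line-based Gedcom into a nested XML tree."""
    root = ET.Element("gedcom")   # hypothetical wrapper element
    stack = [root]                # stack[level] is the parent for that level
    for raw in text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level = int(parts[0])
        if parts[1].startswith("@") and len(parts) > 2:
            # Record line: level, @XREF@, tag, optional value
            xref = parts[1]
            rest = parts[2].split(" ", 1)
            tag = rest[0]
            value = rest[1] if len(rest) > 1 else ""
        else:
            xref = None
            tag = parts[1]
            value = parts[2] if len(parts) > 2 else ""
        if level + 1 > len(stack):
            continue              # malformed input: a level was skipped
        del stack[level + 1:]     # close any deeper elements
        node = ET.SubElement(stack[level], tag)
        if xref:
            node.set("id", xref.strip("@"))
        if value:
            node.text = value
        stack.append(node)
    return root


def names_from_xml(root):
    """Stage 2: the parser proper only ever sees XML."""
    return [el.text for el in root.iter("NAME")]


if __name__ == "__main__":
    tree = gedcom_to_xml(SAMPLE_GEDCOM)
    print(ET.tostring(tree, encoding="unicode"))
    print(names_from_xml(tree))
```

The point of the split is that only stage 1 has to change if a different successor format wins out.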


If anyone has an opinion about which of these new formats is best, feel free to chime in here. ~ Phlox 17:28, 4 October 2007 (UTC)

Question about Brian's Java Gedcom thing

(copied from Forum:How we encode our data)

You mentioned the GEDCOM bot - I hope that's Brian's Help:Loading Gedcoms program and I hope you can

  1. make it more usable (or the instructions clearer) for those of us who are not sure whether we could import or use all the necessary Java etc
  2. tweak it so that it produces pages that conform with your above ideas or any variations of them that might achieve consensus

Robin Patterson 11:21, 5 October 2007 (UTC)

As a matter of principle, I think all high-volume bots should use the Pywikipedia framework. My quick look at it suggests he wants to do managed conflict resolution, which is a good thing. With a more populated network of ancestors, we will want to have a tool so that folks can do uploads on their own, perhaps as part of a .JS wikimedia tool extension. Writing code for my personal purposes is one thing; writing it for end users requires about 100 times more effort. As a former pro programmer, I make no exaggeration here. Handling errors gracefully, thinking about UIs that make sense, dealing with the special cases: it is just a son of a gun of a task. Maybe if I get really crazy about genealogy I might do it, but for now, it is just a hobby and I can't invest that kind of time. ~ Phlox 23:53, 5 October 2007 (UTC)
You seem to have bypassed or leapfrogged Brian's program. I think I approve highly. So I can toss my 13,000-strong GEDCOM into a near-page that you will name or otherwise set up, and PhloxBot will create pages for everyone except that if there is a matching page it will merely add the info at the bottom and leave a note saying data should be checked and integrated? Robin Patterson 13:59, 8 October 2007 (UTC)
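The behaviour Robin describes (create a page for everyone, but append and flag when a matching page already exists) could be sketched in Python as below. This is only an assumption about how such a bot might behave, not PhloxBot's actual code. A plain dictionary stands in for the wiki so that no real bot-framework calls are implied, and the note wording and function name are hypothetical.

```python
# Hypothetical wording for the notice left on pages that already existed.
MERGE_NOTE = "Imported GEDCOM data below: please check and integrate."


def upload(pages, title, generated_text):
    """Create the page, or append to an existing one with a review note.

    pages is a dict of {title: wikitext} standing in for the wiki.
    Returns "created" or "appended" so the caller can log what happened.
    """
    if title not in pages:
        pages[title] = generated_text
        return "created"
    existing = pages[title].rstrip("\n")
    # Existing content is never overwritten; new data goes at the bottom.
    pages[title] = existing + "\n\n" + MERGE_NOTE + "\n" + generated_text
    return "appended"


if __name__ == "__main__":
    wiki = {"John Smith (1850)": "Hand-written biography."}
    print(upload(wiki, "Mary Jones (1852)", "{{Person|name=Mary Jones}}"))
    print(upload(wiki, "John Smith (1850)", "{{Person|name=John Smith}}"))
    print(wiki["John Smith (1850)"])
```

Deciding what counts as a "matching page" is the hard part, and is left open here; the sketch matches on exact title only.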