Familypedia
Revision as of 23:24, 1 June 2007

Forums: Index > Watercooler > Standardization of records


Opening questions

So, I have been pondering... A Genealogy Wiki is so cool! :D But how do we link everyone together so that once you connect to a family, you aren't entering new pages for all the folks that are already here from another person's family? For example: Let's say I have a Joe Smith, he was born in 1711 and died in 1760. I make a record for Joe and put in all the details. Later another researcher puts their tree here online. They, too, have Joe Smith, so they make a record called "Joe Smith (1711-1760)"; since that isn't exactly like my Joe Smith, a new record is created. Worse, their Joe Smith had three sons. My family is related through Kate Smith, Joe's daughter, but theirs is related through Joe Smith Jr, the son, so their record is Joe Smith I. How can we kind of "funnel" people into editing my Joe Smith instead of making a new Joe Smith? I've been thinking about how to make records work well (I have Stewarts in my family, a lot of them... and they love to name themselves "James"... so I have, literally, 20 James Stewarts... I'm going to have to do something to differentiate them). Is there a standard we can sort of encourage researchers to use so that all of the records mesh nicely? Personally, I've been using GivName FamName(bdate-ddate) as my naming convention; that pretty much gives a unique record for each person, and allows for a ton of "James Stewart"s (and it looked like other folks were doing that too, so I'm kinda following suit). Is this even a concern? Again, I'm really new to the Genealogy Wiki... so...erm... Thoughts? :D Aabh 21:37, 30 May 2007 (UTC)

Response from Bill

A couple of things to keep in mind---in particular, this wiki has about 6000 articles and 13000 pages. Compared to Wikipedia, this site is tiny. On the other hand, the site is growing. The significance of this is that so far running into duplications hasn't been much of a problem. There simply isn't that much overlap. If you create an article about your great grandmother, it's not likely that you will find it duplicated by someone else. On the other hand, it's only a matter of time before we start getting significant cross connections. In point of fact, there are at least two contributors on this site who have found that they DO share some common ancestors---albeit by marriage. Eventually they will end up putting in those ancestors, and at that point the problem you describe will arise.

How do you avoid that? Fairly easy in theory: you check the site to see if there's an existing article. To check, use the search function in the navigation pane. There's also a more complex search function that can be used, though the pointers to it are a bit obscure. First, check the Main Page. There's a link there to Main Page/Getting Started on the Wiki. One of the topics under that subarticle is using the search function.

Another mechanism that can catch duplications is based on the article name structure:
[First Name] [Middle Name] [Last Name] (YOB-YOD)
Theoretically, if you get the name right, any instance where there's an article with the same name will get picked up. Definitely not foolproof. While the inclusion of YOB-YOD in the name does not guarantee each person article will have a unique name (There could be two John Smiths born 1806 and died 1896---long odds, but possible), it's fairly unlikely that we would have any duplication simply because we have two different people with the same name and vita.
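As a rough illustration of the name structure above, here is a minimal Python sketch that builds and parses titles in the [First Name] [Middle Name] [Last Name] (YOB-YOD) form. The function names `make_title` and `parse_title` are made up for this example and are not part of the wiki software; the sketch also assumes four-digit years.

```python
import re

# Hypothetical helpers for illustration only; not part of the wiki software.
# Matches "<name> (YYYY-YYYY)" with exactly one space before the parenthesis.
TITLE_RE = re.compile(r"^(?P<name>.+?) \((?P<yob>\d{4})-(?P<yod>\d{4})\)$")


def make_title(first, middle, last, yob, yod):
    """Join the non-empty name parts with single spaces and append (YOB-YOD)."""
    parts = [p.strip() for p in (first, middle, last) if p and p.strip()]
    return f"{' '.join(parts)} ({yob}-{yod})"


def parse_title(title):
    """Return (name, yob, yod), or None if the title does not fit the format."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return match.group("name"), int(match.group("yob")), int(match.group("yod"))


print(make_title("John", "Quincy", "Smith", 1806, 1896))
print(parse_title("John Quincy Smith (1806-1896)"))
```

Two titles built this way collide only when the name and both years agree exactly, which is why the convention catches exact duplicates but nothing looser.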

You might get a case where you miss a previous article about someone because of slight variations in the article title. For example, you might want to write an article about John Quincy Smith, search for "John Quincy Smith (1806-1896)", and miss an existing article simply because the previous article (a) has a different DOB (John Quincy Smith (1807-1896)), (b) added an extra inadvertent space (John Quincy Smith 1806-_1896), or (c) used a middle initial (John Q. Smith (1806-1896)). You might spot such similar articles in a search, but the system wouldn't automatically warn you that an article about "John Quincy Smith (1806-1896)" already existed.

It would take only a modestly intelligent algorithm to spot articles about the same person but under slightly different names. It's not likely to happen anytime soon, though I'm sure that eventually something like that will exist. (The basic approach for that would be a routine that looked at similarities in the name, "John Smith" vs. "John Q Smith" vs. "John Q. Smith", etc., and then compared significant facts in the articles (DOBs etc.). Of course, it would have to recognize what was the DOB---realizing, for example, that 10/09/1806 is the same as "October 09, 1806", but different from "September 10, 1806".)
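The routine described above could look roughly like the following Python sketch. It is only an illustration under stated assumptions, not anything that exists on the site: the function names are invented, numeric dates are assumed to be in month/day/year order (as in the 10/09/1806 example), month names are assumed to be English, and a middle initial is treated as compatible with any middle name that starts with it.

```python
from datetime import datetime

# Assumed date formats: US-style month/day/year and "Month DD, YYYY".
DATE_FORMATS = ("%m/%d/%Y", "%B %d, %Y")


def parse_date(text):
    """Return a date for any recognized format, or None if none matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def name_tokens(name):
    """Lowercase the name and drop periods, so "Q." and "Q" compare equal."""
    return [t.strip(".").lower() for t in name.split() if t.strip(".")]


def names_compatible(a, b):
    """First and last names must match; middle names may be absent or initials."""
    ta, tb = name_tokens(a), name_tokens(b)
    if not ta or not tb or ta[0] != tb[0] or ta[-1] != tb[-1]:
        return False
    ma, mb = ta[1:-1], tb[1:-1]
    if not ma or not mb:
        return True  # a missing middle name cannot rule a match out
    return all(x.startswith(y) or y.startswith(x) for x, y in zip(ma, mb))


def possible_duplicate(name_a, dob_a, name_b, dob_b):
    """Flag two records as possible duplicates; a human still has to confirm."""
    if not names_compatible(name_a, name_b):
        return False
    da, db = parse_date(dob_a), parse_date(dob_b)
    if da is None or db is None:
        return True  # unknown or unreadable dates cannot rule a match out
    return da == db


print(possible_duplicate("John Q. Smith", "10/09/1806",
                         "John Quincy Smith", "October 09, 1806"))
```

The sketch deliberately errs toward flagging too much: it only produces candidates for someone to review, since two people can still share a name and a birth date.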

I should also add that yes, there is a naming convention used on this site. See Genealogy:Page names. Bill 00:30, 31 May 2007 (UTC)


Cool! Thanks! I have another question; it has taken me two days to get a whopping 6 pages made here... I'm getting faster as I work, but I've got literally thousands of pages to upload here to get my tree online... I might look up each and every individual to see if they already exist... but I'm not sure everyone would... It might be a little too much to do if you are trying to get an entire tree on here... which I think is what we want folks to do... I guess the answer is that we just need to police our work pretty heavily to make sure it's tight... Aabh 14:01, 1 June 2007 (UTC)
It's conceivable that you may be able to copy Brian Yap's method - see Help talk:Loading Gedcoms. Robin Patterson 14:12, 1 June 2007 (UTC)
Eventually, when I have the time, and have invested in the needed software upgrades, I will start working on a broader solution to this and related problems. It's a long-standing item on my "to do" list. I've been leery of the Yap approach, as I can't figure out how it works. I don't know, for example, if it will overwrite files that already exist under the same name. I've also found the explanation of how to use it a bit obscure. I probably haven't invested enough time in it. If you try it, let me know how it worked for you.
Yes, data entry on this site can be time consuming, particularly if you want a very specific layout. I streamlined it for myself by creating a specific template/input box that I access from my user page---it lays things out the way I want them, at least for initial purposes, and simplifies data input considerably. There's something similar in the "Create a page" of the navigation bar, but having to scroll down through all of the descriptive stuff is something of a drag. (I'd rather see the input boxes up at the top where they would be most directly usable. Then if you needed the caveats and explanations you could scroll down. Robin likes it the other way around because he's concerned that people won't read the caveats.) Bill 15:50, 1 June 2007 (UTC)
I can understand that; I think I side a little with Robin on that one (though I see the validity of both sides). I have templates already laid out on my pages, so that's covered. Really (and there is absolutely nothing you guys can do about this one :D) the most time consuming thing is that when I enter a person into the Wiki, I take the opportunity to go hunting for more information on that person. When I get out of the blood of Scottish kings, I'm sure things will go a lot faster (I just hit a person for whom there is only one badly mangled record on a single person's tree; with effectively no data, it only took a moment to enter). Back to the point: I have yet another question (you guys will get so tired of me), and this one is a Wiki programming question: If I don't have vital information, like the birth and death dates, can I enter the data as Joe Smith (1810s-1880s) to avoid the Joe Smiths that are in the other centuries? That's sort of how I started it... and, let's say we fix Joe there, and another researcher finds Joe's birthdate... can we fix Joe so that the dates are in? If we reroute Joe Smith (1810s-1880s) to Joe Smith (1812-1865), and another Joe Smith (not related) shows up with no birth date and no death date, but from the same general area, there's really nothing we can do, is there? We'll have to make a record called "Joe Smith (1810s-1880s)-II"; even though we have fixed the first Joe, we can't change his physical record's name. Am I correct in this? If we did, all the links to it would get lost, am I correct? I ask this because if so there is a lot more weight in making an original record... you can't just make "Joe Smith (sometime)" and then go back and fix it later. Aabh 23:24, 1 June 2007 (UTC)

Other responses

Bill has said most of it. Another way duplicates may be found is in the categories. Surname categories, birth and death year categories, for example. See User_talk:WMWillis#Category:Created_Using_Research_Template for an example of where it has already happened. In your example at the top, only the second researcher followed the recommended format (as you do in reality). If everyone follows the standard first time, there will be insignificant duplication. When a third person tries to create a page Joe Smith (1711-1760), the program will display the existing page of that name. If they are actually different people, whoever works that out is welcome to create a disambiguation page listing the more distinctive pages each has been given - Robin Patterson 12:03, 31 May 2007 (UTC)