Familypedia:Beyond webBases

(Moved here from Genealogy:Collaboration of the month, so it can get meaningful section headings better than on that general page (but the history is still there). Robin Patterson 12:31, 27 February 2007 (UTC) )

(This is a discussion page until definite ideas or procedures arise.)

Hey Robin, Glad to see the Genealogy Wiki. This is exactly what I have been trying to accomplish for about 10 yrs now.

In my past experience, I tried to have a database of all individuals I was researching. Through experience I found that each particular Surname needed its own database, which included variations of that Surname's spelling. This worked quite well, but I did have trouble when I included comments on each individual. This made the database quite large, and searches became very slow.

These comments are best made in a wiki application such as this one. Rather than recoding your current wiki application to include a database that only this application could use, what if you had an external database? This information could then be used by other applications as well, such as charting, timelines, mapping, etc.

I have started playing with this idea recently with the new options available at zoho.com and Google Spreadsheets. The database can be embedded into your wiki very easily, with links back to the wiki as well.

Some examples for thought:

The zoho example can be found at: http://creator.zoho.com/cobnet/crawford-webbase/ You should notice a link near the top left hand side of the page for "Embed this page into your site".

The Google Spreadsheet examples can be found at: http://spreadsheets.google.com/pub?key=pAPPDTKnzhFoC_qmk6yovTQ

If you have a chance to look at both examples you may find they could be a solution for your problem. Myself I like the zoho example better simply because it contains links in the page for Add, Edit, Search, etc. The Google example requires more coding and is less user friendly in my mind. Notice the "Resource" tab in the zoho example above. One could very easily add a link to the Wiki page of any particular individual there.

Also note the "Filter" option near the top left of the spreadsheet of the "Resources Tab". One could easily sort the list for any particular individual by INDI_ID, which would result in the link to the WIKI PAGE if one was made. This is not very user friendly; most folks will not know the INDI_ID of the person they are searching for. So I made yet another tab titled: "GenCard". Now I can see all information on an individual at once.

There is lots more work to be completed here before this is a completed option. Is there a way to embed any of these spreadsheets into this wiki? If it cannot be embedded there may be other options with the "More Actions" link?

For anyone wishing to test this idea further, choose the "Copy This Application" link near the top of the above Crawford example; the application can then be copied and a new Surname Database started. All comments, ideas, and suggestions are greatly appreciated. I have noticed that a visitor wishing to ADD, EDIT, etc. may need to be registered at the zoho site.

Mark


 * Thanks, Mark. I've encouraged a couple of the active members to look at your new work. Bill had a look. He's probably the expert on embedding things. I know the art of embedding and including templates and other things inside templates is well developed on Wikipedia and moving that way in some of the Wikia sites. I expect we can accommodate that sort of thing. Robin Patterson 11:32, 28 January 2007 (UTC)

Hi Mark

I did look at your material, and probably need to look at it more closely. Looking forward to a potential application of what you've done raises a question....perhaps I don't understand something, and I'd like to hear how you might apply your work on this wiki. But my question---wouldn't this require the end user to create a spreadsheet-style database of some sort, or at a minimum require that they insert their information into an existing spreadsheet on the site? Bill 14:17, 13 February 2007 (UTC)

Hi Bill,

Sorry it took me so long to reply. You are correct; however, adding info to the spreadsheet can be done right from the wiki itself. Please be patient with me as I try my best to make a demonstration of this. I don't have as much time as most folks to work with this, but I will make progress.

You might look more at the formatting of the data rather than where the data is contained. It is more important to come up with a standard format that can be used by many applications. I am not saying what I have currently is the "standard format", but I am trying my best to get as many folks as I can to input their thoughts on such a standard. The GEDCOM standard is pretty widespread; however, I feel this format is bloated, and a database can hold the same information in less space than GEDCOM uses.

I am glad to see that you and Robin are interested in this area of genealogy research. It is very difficult to find folks who are interested in the programming side of genealogy. Robin has been a great help to me in the past and is very thorough with advice.

I will try to get back with this site when I have more info to share. I don't want to hold anyone up on their work here. As I said previously, I would be more concerned with the database formatting of the individual data than where that data was contained. More specifically, I am hoping for access to this data from many applications rather than an Individual taking a whole lot of time to enter that data, then when a new application comes around, such as mapping, having to enter that same data again for the new application.

Hopefully you can download the data from our previous webBases into your new application without too much work. You can download this data in many different formats (RSS, XLS, etc.); simply choose the "More Actions" button from this link: http://creator.zoho.com/cobnet/cochrane-webbase/ to see all the ways to download this info. This should save a lot of effort in moving this data to other applications.

Until next time, good luck.

Mark


 * Good stuff, Mark. (By the way - please don't start lines with blanks; see http://genealogy.wikia.com/index.php?title=Genealogy:Collaboration_of_the_month&oldid=25112 .)
 * Keep it up, programmers! --Robin Patterson 06:10, 24 February 2007 (UTC)


 * Thanks Mark, appreciate the feedback. I sometimes think that the potential speed of the internet does us something of a disservice. I've got almost all of my father's genealogical correspondence from the 60's until his death in the 90's---thirty years' worth of letters---the old fashioned kind filled with "Sorry I didn't get back to you sooner, but we had a wedding to plan, then John was in the hospital for a month...." When I go through the correspondence, and realize how much time it took to get any replies, I'm amazed at what he was able to accomplish. I've been in correspondence with some of the people he wrote to, and consistently, even though they are now working the internet, it's that same delay between message and response---sometimes months go by! But the correspondence does keep flowing, and it's sort of fun to see what they now think, these folks who were corresponding with my father 30 years ago.


 * But I'm digressing---that was just a way of saying, I'm used to, and have an appreciation for, long gaps in conversations.


 * I'll look forward to your further thoughts on this. When I look at the data table you've adopted, I can see, of course, where you are going with this. I can easily see how we could craft a robot to go to another site and extract the needed information. Your list is reasonably comprehensive, and could be used in this way. The fact that it is in a set format would definitely facilitate the work of the robot. (Read: make its work possible.) I think the problem here is that it would be outside of the control span for Wikia, and there would be no guarantee that the database would always be available. To meet the needs of this wiki, I think a database like this would pretty much have to be housed somewhere within the Wikia community, preferably within this wiki.


 * However, what I'm thinking of is a bit different. Among other things, I'm trying to reduce the number of things a user has to know about. To that end I've experimented a bit with formatted tables for entry of Vita data such as yours. That includes both the Vita Box that appears on many of my pages, plus the child list table. I'm still experimenting with those. Because this requires HTML programming, the use of these tables currently exceeds the capabilities of most genealogists. Not that HTML programming is that hard, it's just that it's more than most are willing to undertake. So what I'm looking for eventually is to create versions of these tables that include embedded text or input boxes. When someone creates an article, the basic framework would appear in the article in the appropriate places. When they edit the article, the input boxes would be shown, and they could input whatever they might want. When the edit page closes, the display page would show the vita box etc., with the newly added information. This process would be repeated every time the page was edited (or perhaps, every time the section containing the tables was edited). This would avoid having to create a separate and independent database either within or outside of the Wiki. The only requirement here is that a specialized extension of the underlying MediaWiki software would be required to make this work. That, and the creation of other extensions that would transfer information from page to page---that's required in order to meet the objective of eliminating double entry of data on spouse, child, and parental pages.


 * This is going to take a while to implement. I've got some work things that I have to clean up prior to retirement, and that's going to sap my time for a while. But creating something like this is not exactly simple. There are a lot of ins and outs to setting up something like this. So I figure this is at least a 6 month project to get the basics down right. Bill 14:08, 24 February 2007 (UTC)

Food for thought on data formatting and tables,


 * Thanks for the tip, Robin, I wasn't aware I was starting a line that way.


 * On this topic: "Among other things, I'm trying to reduce the number of things a user has to know about. To that end I've experimented a bit with formatted tables for entry of Vita data such as yours. That includes both the Vita Box that appears on many of my pages, plus the child list table. I'm still experimenting with those."


 * Do you have a link to this Vita data? The child list can be solved easily, if you notice the Parents_ID field. Any children of these parents can be found by this field within the same table. The Individual is found by the INDI_ID field. I tried many different ways to come up with a value for this field and found that a number is the best way for me. This number can be derived many different ways as long as it is a unique value. The Parents_ID field has 3 parts: Part 1 is the parent's INDI_ID value, Part 2 is simply an M used as a separator, Part 3 is the number of the family in which this particular individual has children. Most folks would be a 1; however, sometimes there are 2 different families with children, so it could progress.


 * "Among other things, I'm trying to reduce the number of things a user has to know about." Using the above example to look up an individual's name would require only knowing his INDI_ID. Finding his children would use this same number with "M1" added to it to find the first family and adding "M2" to find the second family, etc.
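As a rough illustration of the Parents_ID scheme described above, here is a minimal Python sketch. Only the field names INDI_ID and Parents_ID come from the discussion; the sample rows, the FName values, and the helper names `make_parents_id` and `children_of` are made up for the example, and INDI_ID is assumed to be a plain number.

```python
def make_parents_id(indi_id: int, family_number: int = 1) -> str:
    """Build a Parents_ID: the parent's INDI_ID, the separator 'M', the family number."""
    return f"{indi_id}M{family_number}"


# Hypothetical rows; a child's Parents_ID points at the parent's INDI_ID and family.
individuals = [
    {"INDI_ID": 100, "FName": "John", "Parents_ID": ""},
    {"INDI_ID": 101, "FName": "Mary", "Parents_ID": "100M1"},
    {"INDI_ID": 102, "FName": "James", "Parents_ID": "100M1"},
    {"INDI_ID": 103, "FName": "Ann", "Parents_ID": "100M2"},
]


def children_of(rows, indi_id, family_number=1):
    """Return every row whose Parents_ID matches the given parent and family."""
    wanted = make_parents_id(indi_id, family_number)
    return [row for row in rows if row["Parents_ID"] == wanted]


# Children of individual 100 in the first and second families.
first_family = [row["FName"] for row in children_of(individuals, 100, 1)]
second_family = [row["FName"] for row in children_of(individuals, 100, 2)]
```

Knowing one number (100 here) is enough to reach both families, which is the point being made about reducing what a user has to know.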


 * Now for the following generations: At first I had my program find an individual, then that individual's Parents_ID, then look for that individual and his Parents_ID, and so forth until a complete family tree was found. I quickly found this took the program quite a while to create a tree using this method. So I created a new field titled "Tree_ID", which would allow any characters or numbers, pretty much at the researcher's choice as long as each Tree_ID was unique. Now with one pass the program could pull the entire tree and then loop through the tree to place individuals into their appropriate places. This was a great speed improvement on creating the family trees.
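The one-pass idea might look something like the Python sketch below. It assumes that every individual belonging to a tree carries the same Tree_ID value, which is one reading of the description; the sample rows, the "CRAW-A" style values, and the function name `build_tree` are invented for illustration and are not the original program.

```python
from collections import defaultdict

# Hypothetical rows; Tree_ID groups everyone who belongs to the same tree.
rows = [
    {"INDI_ID": 100, "FName": "John", "Parents_ID": "", "Tree_ID": "CRAW-A"},
    {"INDI_ID": 101, "FName": "Mary", "Parents_ID": "100M1", "Tree_ID": "CRAW-A"},
    {"INDI_ID": 102, "FName": "James", "Parents_ID": "100M1", "Tree_ID": "CRAW-A"},
    {"INDI_ID": 104, "FName": "Ellen", "Parents_ID": "102M1", "Tree_ID": "CRAW-A"},
    {"INDI_ID": 200, "FName": "Peter", "Parents_ID": "", "Tree_ID": "CRAW-B"},
]


def build_tree(all_rows, tree_id):
    """Pull a whole tree in one pass, then arrange its members in memory."""
    # Single pass over the table: keep only the members of this tree.
    members = [row for row in all_rows if row["Tree_ID"] == tree_id]

    # Group members by the INDI_ID part of their Parents_ID (text before 'M').
    # Members with no Parents_ID are grouped under None and treated as roots.
    children = defaultdict(list)
    for row in members:
        parent = row["Parents_ID"].split("M")[0] if row["Parents_ID"] else None
        children[parent].append(row)

    def subtree(row):
        kids = children[str(row["INDI_ID"])]
        return {"name": row["FName"], "children": [subtree(kid) for kid in kids]}

    return [subtree(root) for root in children[None]]


tree = build_tree(rows, "CRAW-A")
```

The table is scanned once; all the parent-to-child placement then happens on the rows already pulled, instead of one lookup per generation.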


 * Other event-related data for this particular individual could easily be found using the same INDI_ID plus searching the Event_Type field for whichever data you were looking for, such as: Birth, Death, Marriage. You could easily pull all info by just searching the INDI_ID field.
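A small Python sketch of that lookup follows. INDI_ID and Event_Type are the field names mentioned above; the Date and Place columns, the sample values, and the function name `events_for` are assumptions added only to make the example complete.

```python
# Hypothetical event rows; Date and Place are assumed columns.
events = [
    {"INDI_ID": 101, "Event_Type": "Birth", "Date": "1801", "Place": "Ayr, Scotland"},
    {"INDI_ID": 101, "Event_Type": "Marriage", "Date": "1825", "Place": "Ayr, Scotland"},
    {"INDI_ID": 101, "Event_Type": "Death", "Date": "1870", "Place": "Ohio, USA"},
    {"INDI_ID": 102, "Event_Type": "Birth", "Date": "1803", "Place": "Ayr, Scotland"},
]


def events_for(rows, indi_id, event_type=None):
    """Return one individual's events, optionally limited to one Event_Type."""
    return [
        row
        for row in rows
        if row["INDI_ID"] == indi_id
        and (event_type is None or row["Event_Type"] == event_type)
    ]


# Everything known about individual 101, then only the marriage.
all_events = events_for(events, 101)
marriages = events_for(events, 101, "Marriage")
```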


 * The Parents Table shows only the INDI_ID of Parent 1. This is because whoever Parent 1 is, whether it is the Mother or the Father, that individual should be in the current database. This eliminates having to store Parent 1's name information in 2 different tables and also makes editing that same name information very easy. The INDI_ID, FName, and Surname of Parent 2 are also collected here. The INDI_ID of Parent 2 would attempt to show the location of that Parent's info in another database, if one existed. This was not always the case, so we attempted to at least collect that Parent's First Name and Surname. Any other info could easily be placed into the Comments table.
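One way to picture a row of the Parents Table is the Python sketch below. The exact column layout is not given in the discussion, so keying each row by Parents_ID, the attribute names, and the sample people are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParentsRow:
    parents_id: str                 # assumed key, e.g. "100M1"
    parent1_indi_id: int            # Parent 1 lives in this database
    parent2_indi_id: Optional[int]  # Parent 2's ID in another database, if known
    parent2_fname: str
    parent2_surname: str


# Hypothetical data: this database holds only one Surname.
DATABASE_SURNAME = "Crawford"
first_names = {100: "John"}  # INDI_ID -> FName, standing in for the main table

row = ParentsRow("100M1", 100, None, "Jane", "Wallace")


def describe_parents(parents_row, names, surname):
    """Resolve Parent 1 by INDI_ID; read Parent 2's name from the row itself."""
    parent1 = f"{names[parents_row.parent1_indi_id]} {surname}"
    parent2 = f"{parents_row.parent2_fname} {parents_row.parent2_surname}"
    return parent1, parent2


couple = describe_parents(row, first_names, DATABASE_SURNAME)
```

Parent 1's name is stored in exactly one place, so correcting it in the main table is reflected everywhere the Parents Table refers to that INDI_ID.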


 * Lastly, if you plan on keeping all the information on ALL SURNAMES in one database, good luck. I started out this way and it very quickly got out of hand. One SURNAME alone was over 5 MEGS and a second wasn't too far from that. I am not saying it cannot be done, but it will be a large challenge. You can quickly shorten the space needed to store the information if you do not store the SURNAME; imagine 10,000 individuals all with the Surname of "Crawford". They all have the same name, so why store that? You may notice that the current link above for the Cochrane webBase lists a Variants tab. At first I chose to place all variant spellings of a Surname in one database. I am currently re-thinking this one and it probably should not be there. Each database should contain one and only one Surname. If this is to be redone, then the Parents Table above would have to expand Parent 1's field to include both FName and Surname so that parent could be located as Parent 2 is located. Sometimes a Father's Surname is not spelled the same as his children's Surname.


 * Hopefully this one INDI_ID would allow us to collect all the information we wanted about any individual in the database. Now, it does not necessarily have to be a number, but it does need to be unique for each individual.  There are many ways to accomplish this.  As I said earlier, this is just food for thought, I totally understand the taking time to implement part.  Good Luck on the Retirement Part.  More to come later, Mark


 * PS: I found the Vita Box, GREAT WORK HERE! You may not need me after all, lol. Is this data accessible to outside apps, such as mapping? Can one search the POB field of a table and have all names, dates, etc. returned for, say, a Country? This would allow a great mapping app to use your data, and whenever your data was updated, the map app would be updated as well.

Mark
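To make the POB question in the PS concrete, here is a hedged Python sketch of such a search. It assumes the place of birth is stored as comma-separated text ending in the country, which may not match how the Vita Box actually holds it; the FName and DOB columns, the sample rows, and the function name `born_in` are invented for the example.

```python
# Hypothetical rows; POB is assumed to end with the country name.
births = [
    {"INDI_ID": 101, "FName": "Mary", "DOB": "1801", "POB": "Ayr, Ayrshire, Scotland"},
    {"INDI_ID": 102, "FName": "James", "DOB": "1803", "POB": "Ayr, Ayrshire, Scotland"},
    {"INDI_ID": 103, "FName": "Ann", "DOB": "1810", "POB": "Belfast, Antrim, Ireland"},
]


def born_in(rows, country):
    """Return every row whose POB ends with the given country (case-insensitive)."""
    wanted = country.strip().lower()
    return [
        row
        for row in rows
        if row["POB"].split(",")[-1].strip().lower() == wanted
    ]


# Names and dates a mapping app could plot for one country.
scots = [(row["FName"], row["DOB"]) for row in born_in(births, "scotland")]
```

A mapping application that ran a query like this each time it loaded would pick up any updates to the underlying data automatically.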

(From here on, all is new; insertions may also be made above where appropriate. PLEASE REMEMBER TO SIGN AND DATE (" ~ ") Robin Patterson 12:31, 27 February 2007 (UTC) )