User blog:Phlox/What's in a placename? A rose is a rose is a rose?

To extrapolate from Gertrude Stein then, A London is a London is a London? Well, not quite.

Maybe Bill had it right- A rose by any other name smells just as sweet.

Alternatives to WP as an authority on place names look like a dead end. Some might be better in terms of uniformity, but worse in terms of data quality. Consider the Getty Online Thesaurus of Geographic Names. One would think that it would handle the 1700-year-old town of Zhujiajiao properly. Surprisingly, it doesn't. It claims to have vernacular display, but no Chinese is available. WP has it, along with a proper containment hierarchy, unlike Getty. Getty's type for this town is the unhelpful catch-all "inhabited place", and it omits the fact that the town is in the district of Qingpu. To get some perspective, Shanghai has a population of 20 million, a little smaller than Australia's, so this is like mentioning that there is a fishing village in Australia and then giving the national government as the next level up, never mind all the intermediate levels. It's not as if they don't know about Qingpu, because they do have an entry for it. Notwithstanding Qingpu's population of half a million, it too is given the catch-all designation "inhabited place". Not very helpful, but hey, it's uniform. On the other hand, WP more appropriately identifies Qingpu as a district of Shanghai, and identifies Zhujiajiao as a town in it.

Personally, I don't think we can wait long enough for Getty to bring its database up to a high enough quality for our needs, but perhaps we can learn something from their choices on hierarchy. Clearly their locality subdivisions are somewhat lacking. How do they handle higher regional designations? Let's turn back to Shanghai. What part of China is Shanghai located in? Getty doesn't tell us, because its "geographic" hierarchy is administrative unit geography, not physical geography. WP is not much better.

WP's Infobox Settlement has "subdivision type" and "subdivision name" parameters for the sets to which a settlement belongs. However, in WP's typical informal style, there are no specified types. These are really just labels for the infobox to place before the name of the containing entity, perhaps with an icon for it. We do know the order of the containment.
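To make the "we do know the order" point concrete, here is a minimal Python sketch of how a bot might read the numbered subdivision parameters into an ordered containment chain. The function name `containment_chain` is hypothetical, the infobox is assumed to be already parsed into a plain dict, and the Zhujiajiao values are illustrative rather than copied from the actual WP article.

```python
import re

def containment_chain(params):
    """Return [(label, name), ...] ordered from the largest container down.

    Assumes numbered pairs: subdivision_type/subdivision_name,
    subdivision_type1/subdivision_name1, and so on.
    """
    chain = []
    for key, name in params.items():
        match = re.fullmatch(r"subdivision_name(\d*)", key)
        if not match:
            continue
        suffix = match.group(1)
        index = int(suffix or 0)  # the unnumbered pair sorts first
        # The "type" is only a free-text label, so a missing one becomes "".
        label = params.get(f"subdivision_type{suffix}", "")
        chain.append((index, label, name))
    chain.sort()
    return [(label, name) for _, label, name in chain]

# Illustrative values only (an assumption, not real article data).
zhujiajiao = {
    "name": "Zhujiajiao",
    "subdivision_type2": "District",
    "subdivision_name2": "Qingpu",
    "subdivision_type": "Country",
    "subdivision_name": "China",
    "subdivision_type1": "Municipality",
    "subdivision_name1": "Shanghai",
}

chain = containment_chain(zhujiajiao)
```

The labels come back as whatever text the article editor typed, which is exactly the problem: the order is reliable, the types are not.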

Getty's structure
 * Extraterrestrial places
 * World
 * general region (Pacific Islands)
 * historical region (Roman empire)
 * association (Commonwealth)
 * group of nations/states/cities (Third World)
 * organization (NATO, United Nations)
 * Continent
 * Nation (China, UK)
 * country (England) (note: none of the Regions of England are mentioned as children)
 * former administrative division (Avon)
 * unitary authority (Blackpool)
 * county (Bedfordshire)
 * general region (Breckland)
 * Metropolitan area (Greater London)
 * borough (Bexley, Sutton)
 * Inhabited place (London)
 * borough! (Camden, Greenwich)
 * borough! (City of Westminster, City of London)
 * Neighborhood (City of London!)
 * Inhabited place (Bishopsgate)


 * province
 * autonomous region
 * fief
 * territory
 * former nation (Toba Wei Empire)
 * prefecture

OK- Getty went on a wild ride with London. Let's see how WP handles it. Disambiguation includes these other places:
 * Greater London (metropolis "larger administrative area")
 * London (city and capital)
 * City of London (small ancient city)
 * Wards of the City of London ([[wikipedia:Bishopsgate|Bishopsgate]])
 * County of London
 * Outer London (versus Inner London)
 * Central London (and similar regions of London)
 * London
 * London postal district

Given what has been seen in GEDCOMs, we should expect informal references, such as someone being employed at a candle factory in the "East End of London", and these could be supported with Wikipedia names.

OK, maybe we put a "Locations" field in the form, and tell folks to put in any additional places in small-to-large order as an augment or alternative to the formal hierarchy we have now. This field is fair game for bots to extract from and move to the formal fields like subdivision1, and possibly to future formal fields if we ever have need of them (neighborhoods, boroughs, metropolis, subdivision2). What this supports is search hits on "Births in Soho" even though we don't have a formal field for a borough of New York City. Ditto for defunct settlements and nations, and for unusual locations like where Neil Armstrong landed his exploratory craft. When people start yelling about needing a separate column in a table, that's when we run the bot to generate a new formal property like borough and hoist it out of the catch-all Locations property. Maybe it needs a different name than Locations or places. Hmmm. Any suggestions? Or am I the only one in this echo booth here? 07:01, 14 July 2009 (UTC)
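A rough Python sketch of the Locations idea, for anyone who wants something to poke at. Everything here is an assumption made for illustration: the comma-separated small-to-large format, the record layout, the helper names `parse_locations`, `hoist` and `mentions`, and the `known_types` lookup that a bot would have to get from somewhere.

```python
def parse_locations(text):
    """Split a free-form Locations value into places, smallest first."""
    return [part.strip() for part in text.split(",") if part.strip()]

def hoist(record, known_types, field):
    """Return a copy of record with the first place of type `field`
    moved out of the catch-all list into its own formal property."""
    result = dict(record)
    remaining = []
    for place in record.get("locations", []):
        if field not in result and known_types.get(place) == field:
            result[field] = place
        else:
            remaining.append(place)
    result["locations"] = remaining
    return result

def mentions(record, place):
    """True if the place appears in the catch-all list or in any formal field."""
    formal = [value for key, value in record.items() if key != "locations"]
    return place in record.get("locations", []) or place in formal

# Hypothetical record and type lookup.
record = {
    "person": "Example Person",
    "locations": parse_locations("Soho, Manhattan, New York City"),
}
known_types = {"Manhattan": "borough"}

hoisted = hoist(record, known_types, "borough")
```

The point of the design is that a search on Soho works both before and after the bot runs, so nothing is lost by starting with the catch-all field and only promoting a property once somebody actually needs the column.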