March 27th, 2014


Meta-Genealogy: What I wish the software did better

So I've been doing this genealogy thing for about 18 months now, and there are a few things that I wish worked differently. I'd love to hear from others who have suggestions about other tools that I should look into.

First: I've come to the conclusion that the common data model is backwards. We assert "facts" about people, and then we're supposed to back those up with sources. But I think we'd have better results if we started by cataloging and transcribing sources, and then asserted "facts" from them. That's probably too hard for most people to wrap their brains around, but it would solve two problems. One: it's too easy to enter a "fact" with the best of intentions of adding the source info later, and then never get back to that tedious step. Two, and more pressing: sometimes what you start with is a (virtual) pile of source documents, and your task is to organize them into a network of relationships. Managing a pile like that with the current tools is difficult at best.
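To make the source-first idea a bit more concrete, here's a minimal sketch of what I have in mind. Every name in it (Source, Assertion, Catalog, and their fields) is made up for illustration and doesn't come from any existing genealogy program. The point is only that an assertion can't be created without a catalogued source, and that the uncatalogued-into-the-tree pile is always visible.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """A catalogued document, transcribed before any conclusions are drawn."""
    source_id: str
    description: str
    transcription: str


@dataclass
class Assertion:
    """A "fact" that cannot exist without the source it was read from."""
    source: Source  # required, so there is no "add the source later"
    person: str
    claim: str


@dataclass
class Catalog:
    """Hypothetical container: sources first, assertions derived from them."""
    sources: dict = field(default_factory=dict)
    assertions: list = field(default_factory=list)

    def add_source(self, source):
        self.sources[source.source_id] = source
        return source

    def assert_from(self, source_id, person, claim):
        # Raises KeyError if the source has not been catalogued first.
        source = self.sources[source_id]
        assertion = Assertion(source, person, claim)
        self.assertions.append(assertion)
        return assertion

    def unassigned_sources(self):
        """Sources in the pile that no assertion has drawn on yet."""
        used = {a.source.source_id for a in self.assertions}
        return [s for s in self.sources.values() if s.source_id not in used]
```

With something like this, "what's still sitting in the pile?" is one call to unassigned_sources(), and an unsourced "fact" simply can't be entered.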

Second: I want a fast plain-text fuzzy search. "Hmm, this naturalization record has the address 62 Common St., that sounds familiar." Right now, I'm actually doing most of my work in Emacs. It doesn't represent everything nicely, but it's fast and it represents anything that I can type.
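Here's roughly the kind of search I mean, sketched with nothing but Python's standard difflib module. The function name fuzzy_search and the 0.8 threshold are my own guesses, and sliding a window over every line is a simple approach rather than a fast one, so take it as an illustration of the behavior and not of how a real tool should be built.

```python
from difflib import SequenceMatcher


def fuzzy_search(query, lines, threshold=0.8):
    """Return (score, line_number, line) for lines that roughly contain query."""
    query = query.lower()
    width = len(query)
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.lower()
        # Compare the query with every same-width window of the line
        # and keep the best similarity ratio (0.0 to 1.0).
        last_start = max(1, len(text) - width + 1)
        windows = [text[i:i + width] for i in range(last_start)]
        score = max(SequenceMatcher(None, query, w).ratio() for w in windows)
        if score >= threshold:
            hits.append((score, number, line))
    # Best matches first.
    return sorted(hits, reverse=True)
```

Run over my notes, a query like "62 common st" should turn up both the exact address and a near miss such as "62 Comon Street" typed in a hurry.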

Third: I want to have a "trust chain." That is, for every "fact" that I assert, I want to be able to assign it a confidence score: 100% means I'm certain it's true (I was at my own wedding and know my own name); 0% means I have no idea if it's true or not, but I encountered the assertion somewhere; -100% means I'm certain that the assertion is incorrect, but I'm including it to show that I've encountered it, evaluated it, and rejected it.

(For example, an otherwise wonderful tree of my mother's father's family that a cousin spent decades researching introduced a spurious twin sister three generations back by accidentally combining two records for different people. I've contacted the cousin and he agrees that it was a mistake -- but dozens of people have downstream copies of his tree. So I want to indicate on my tree that that individual doesn't exist, not just omit her, since an omission could be read as ambiguous.)

And then I want to be able to assign a "trust factor" to other researchers whose trees are loosely linked to mine. I want to get notified when they make changes -- and I want the trust factors multiplied. So my otherwise very reliable cousin would have a trust factor of 90%, while someone I barely know might only have a trust factor of 50%. Then if my reliable cousin has marked a "fact" as 50% reliable, while the unknown quantity marked a "fact" as 90% reliable, they'd both show up as 45% reliable in my summary of things to look at and consider importing. (Of course, when I import that "fact", I'd get to assign it my own level of trust.)
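The arithmetic is simple enough to sketch. The names here (IncomingFact, effective_score, review_queue) are invented for illustration, and I'm assuming scores are kept as whole percentages, with a researcher's trust factor running from 0 to 100 and a fact's score from -100 to 100. One side effect of plain multiplication, which I haven't thought through fully, is that a rejected "fact" from a trusted researcher stays strongly negative.

```python
from dataclasses import dataclass


@dataclass
class IncomingFact:
    researcher: str
    claim: str
    confidence: int  # the researcher's own score, -100..100


def effective_score(trust_factor, confidence):
    """Scale someone else's score (-100..100) by my trust in them (0..100)."""
    return trust_factor * confidence / 100


def review_queue(facts, trust_factors):
    """Pair each incoming fact with its effective score, highest first."""
    scored = [
        (effective_score(trust_factors[f.researcher], f.confidence), f)
        for f in facts
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Plugging in the numbers from above: effective_score(90, 50) and effective_score(50, 90) both come out to 45.0, so the two "facts" land side by side in the queue.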

Fourth: I want to be able to divide my tree into segments, and grant different people different permissions depending on what part of the family they belong to. I'm doing some great collaboration with my Werdesheim cousins, but they don't need full access to my Bissinger data. 'Nuff said.
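As a rough sketch of what segment-level permissions could look like: each collaborator gets a permission level per segment, and anything not explicitly granted is off limits. The level names ("none", "read", "edit"), the function can, and the collaborator label are all hypothetical; only the two segment names come from my own tree.

```python
# Ordered permission levels; a higher number includes the lower ones.
LEVELS = {"none": 0, "read": 1, "edit": 2}


def can(grants, user, segment, action):
    """True if user's grant on segment is at least the level action needs."""
    granted = grants.get(user, {}).get(segment, "none")
    return LEVELS[granted] >= LEVELS[action]


# Hypothetical grants: one collaborator, full access to one segment only.
grants = {
    "werdesheim_cousin": {"Werdesheim": "edit"},
}
```

Here can(grants, "werdesheim_cousin", "Werdesheim", "edit") is True, while the same cousin asking to read the Bissinger segment gets False because nothing was granted there.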

Fifth: I want a lot more flexibility in how I visualize my data. I want to be able to apply filters and templates of my own design in generating reports. I want a real API into my data so I can write scripts to extract things.