We need a better comparison tool.

I know that FamilySearch is reworking a lot of the code used with FamilyTree. We have seen some improvements, as well as a number of requests concerning the source linker, merges, and importing data into Family Tree (regardless of the source).

I am posting this as a separate thread because my recommended system tends to get lost in the other threads. Hopefully, this recommendation will be acknowledged by FamilySearch.

First, the new screen will take time to load. There is no getting around that because it entails showing every element of an existing record in the massive tree.

Second, the original record's data is pushed into a temporary storage area so that the original record is updated only when the final step is completed.

All elements of an existing record include, but are not limited to:
1. All facts with 1) conclusions, 2) reason statements if they exist and 3) all tagged sources*
2. All events, including couple relationship events, with their 1) conclusions, 2) reason statements if they exist and 3) all tagged sources*

* all tagged sources will reference the source section by number. If a source is tagged to a fact or event, the source number will show up and be hyperlinked to the specific source.

Place conclusions will include standardized places if they exist in the original record.

3. All notes
4. All discussions
5. The life sketch
6. Any (future enhancement) notice regarding the record
7. For Church member FamilySearch accounts, the ordinance data
8. Sources attached with reference numbers (see 1 & 2, above)
9. Memories attached with reference numbers
10. All family members and their relationship and any supporting sources by number

The screen can be used for hinted and/or found historical records. All of the indexed information from the record will appear opposite the applicable recorded data in a manner similar to the current source linker. My proposed screen is a replacement for the source linker.

The screen can be used for comparing one 1) possible duplicate, 2) Find Similar people, 3) merge by ID, or 4) new person entry.

The screen can be used with an imported GEDCOM tree.

Once the historical record or duplicate/imported person is loaded:

A discrepancy routine is run that will locate and rank 1) place discrepancies from minor to major (different address, different city, different county/political subdivision, different country), and in particular compare standardized places as well; and 2) date discrepancies from minor to major (different day, different month, number of years different, with minor being 1 year, major being 20 or more years, and catastrophic being 100 or more years).
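To make the ranking idea concrete, here is a rough sketch in Python of how such a discrepancy routine might grade dates and places. Everything in it is an assumption for illustration: the names `Severity`, `date_discrepancy`, `place_discrepancy` and `PLACE_LEVELS` are made up, places are assumed to be already split into levels, and treating anything under 20 years as "minor" is one reading of the thresholds above. It is not how FamilySearch does or would do it.

```python
from datetime import date
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    NONE = 0
    MINOR = 1
    MAJOR = 2
    CATASTROPHIC = 3


def date_discrepancy(existing: date, incoming: date) -> Severity:
    """Grade a date difference by the number of calendar years apart."""
    years_apart = abs(existing.year - incoming.year)
    if years_apart >= 100:
        return Severity.CATASTROPHIC
    if years_apart >= 20:
        return Severity.MAJOR
    if existing != incoming:
        # Assumption: a different day, month, or fewer than 20 years is minor.
        return Severity.MINOR
    return Severity.NONE


# Ordered from major to minor, so the first mismatch found is the worst one.
PLACE_LEVELS = ("country", "county", "city", "address")


def place_discrepancy(existing: dict, incoming: dict) -> Optional[str]:
    """Return the highest place level that differs, or None if none differ."""
    for level in PLACE_LEVELS:
        a = existing.get(level, "").strip().casefold()
        b = incoming.get(level, "").strip().casefold()
        # Assumption: a level missing on either side is not a discrepancy.
        if a and b and a != b:
            return level
    return None
```

The same comparison could be run on the standardized place as well as on the place as entered, with the worse of the two results shown to the user.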

The original existing record will be displayed on the left side (like the source linker and duplicate merge screen are today) and the source to be added or record to be merged will appear on the right side.

If a historical record is involved, the source, as it will be added to the source list, is shown on the right side.**

Options:

** If a new source is being attached, all portions of the left side can be edited. If an edit is performed, the current reason statement is displayed and can be edited; if it is not edited, the current "You did not change the reason statement" message is displayed. Changes made will be tagged to the source, regardless of where they appear in the person's record. This can include discussions as well as notes.

All elements of the new source can be edited including the date, title, description and so on. If a memory is being used as a source, that can be done as well, by referencing the memory number.

Transfer/edit dates: if a minor discrepancy is involved, request a change in the reason statement. If a major discrepancy is involved, a pop-up message is presented along with the noted discrepancy and an "Are you sure you want to make this change" question, and a change must be made to the reason statement, suggesting that the discrepancy is noted and why the original date was wrong.

Transfer/edit places:

1) The transferred/edited place requires standardization. If "none of the above" applies, the standards place is loaded in a separate tab/window and the place automatically populated and standards listed. The user has the option of changing the populated place so that a standard can be selected -or- request the place be added to the standards. The standards team will need to be involved with this process. Note: any change to the place is not reflected back to the new comparison screen.

2) If a minor discrepancy is involved, request a change in the reason statement. If a major discrepancy is involved, a pop-up message is presented along with the noted discrepancy and an "Are you sure you want to continue with this edit" and "Please enter/append the reason statement for your conclusion," suggesting that the discrepancy is noted and why the original place was wrong.

While changes are being made, discussions can be added or comments added to existing discussions, but changes to the life sketch cannot be made. Notes may be changed or deleted, but a reason statement must be added.

Completing the task:

I would like to see three options:

1. Save -- the original existing record is compared against the originally (before edits) existing data on the left side and if no changes have been made, the save records the changes, including the new source (if involved). Any merged record is noted and cannot be used a second time without restoring it. All changes are recorded in the change log, including edits.

2. Save edits, but do not change existing record. The user will see a list of these in the lists option. The comparison screen can be reopened from the list and any changes in the original existing record noted, including merge-deletions. If the original record is merge deleted, the surviving record is opened in the original's place and noted that it is the surviving record.

3. Cancel all changes. If a saved comparison (#2 in this list), then it is removed from the saved edits list.
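
For anyone wondering how option 1 (Save) could work with the temporary storage area, here is a minimal sketch in Python. It assumes the tree is just a dictionary of records keyed by person ID and that the whole record can be compared at once; `ComparisonSession`, `open_session`, `save` and the ID "P1" in the usage note are made-up names for illustration only, not anything FamilySearch provides.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ComparisonSession:
    """Hypothetical working copy of one tree record being edited."""
    person_id: str
    snapshot: dict                     # the record as it was when the screen loaded
    working: dict = field(init=False)  # the user's edits are made here

    def __post_init__(self):
        self.working = copy.deepcopy(self.snapshot)


def open_session(tree: dict, person_id: str) -> ComparisonSession:
    """Copy the record into temporary storage; the tree itself is untouched."""
    return ComparisonSession(person_id, copy.deepcopy(tree[person_id]))


def save(session: ComparisonSession, tree: dict, change_log: list) -> bool:
    """Write the edits only if nobody changed the record in the meantime."""
    if tree[session.person_id] != session.snapshot:
        return False  # record changed since loading; the user must review first
    tree[session.person_id] = copy.deepcopy(session.working)
    change_log.append((session.person_id, session.snapshot, session.working))
    return True
```

With a record stored as `tree["P1"]`, a user would call `open_session(tree, "P1")`, edit `session.working`, and then call `save`. Option 2 would amount to keeping the session object in the user's list without calling `save`, and option 3 to throwing it away.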

This should allow us to start getting a handle on what newbies (to Family Tree) can do (or us, for that matter). Otherwise, we are stuck with the ongoing problems that newbies can cause (and that included us when we first started using Family Tree). It would also give us a reason to stop and look at what we are doing if we did something that does not match up with the existing record.

Comments for additions or changes to my proposal are welcome.

It would be nice if FamilySearch acknowledges if this will be considered, is being considered, or is on a list of to do changes for the site.
  • By the way, I expect this to be opened in its own tab or window, not overwriting wherever the comparison screen was initiated.

  • One thing that needs to be included is a better tool for when an entered person triggers a selection of possible persons from the tree, that would also be triggered by the GEDCOM compare process.

    First, if a possible historical source, when used with the above proposed screen, does not find an equivalent person in the massive tree, it should be treated as a new person, triggering its own new-person routines. Again, the source does not matter.

    If a new person cannot locate an existing person and if the person is not living, then the screen would likely be similar to the Find Similar person screen. Selecting one of the displayed people would trigger its own comparison screen.

    If a new person is not located within the massive tree, and the person does not have a death/burial date/place, then the current rules apply with regard to automatically accepting the person as deceased. If living, then a notice that the person will be placed in the user's private space should be displayed.

    If a GEDCOM uploaded file is used, the tool does not act any differently. However, there would be a mandatory reason statement required that would reject any reference to GEDCOM / gedcom / GedCom, et al or Ancestry or a variation on any other site. Garbage would also be rejected.

    This kind of action with regards to mandatory reason statements would require some very good parsing routines to make sure that the user has not entered something that is meaningless and would need to handle all accepted languages that are presently used with FamilySearch FamilyTree.
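
As a very rough illustration of the kind of check described above, here is a small Python sketch that rejects reason statements naming GEDCOM or Ancestry, or that look like filler. It is English-only and deliberately crude; the function name `reason_problem`, the word-count threshold, and the repetition test are all assumptions for illustration. Handling every language FamilyTree supports would need far better parsing than this.

```python
import re
from typing import Optional

# Matches GEDCOM / gedcom / GedCom and Ancestry in any letter case.
# Caveat: this would also flag the ordinary word "ancestry".
BANNED = re.compile(r"\b(gedcom|ancestry)\b", re.IGNORECASE)


def reason_problem(text: str, min_words: int = 4) -> Optional[str]:
    """Return why a reason statement is rejected, or None if it passes."""
    cleaned = text.strip()
    if BANNED.search(cleaned):
        return "names a file format or site instead of evidence"
    # Runs of letters only, so digits and punctuation do not count as words.
    words = re.findall(r"[^\W\d_]+", cleaned)
    if len(words) < min_words:
        return "too short to explain the conclusion"
    # Crude garbage test: most of the words are repeats of each other.
    if len({w.casefold() for w in words}) * 2 < len(words):
        return "looks like repeated filler text"
    return None
```

A statement such as "Imported from my GEDCOM file" would be sent back to the user, while one that describes the actual evidence would pass.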

  • Granny was the genealogist in our family and I want what she did in the FamilySearch "Family Tree." Upload GEDCOM and click (one by one) for ALL to be included in the FamilySearch "Family Tree." Don't have time to COMPARE!, Granny knew what she was doing anyway.

  • On GEDCOM comparisons: if the record from the GEDCOM file does not have any sources, then the system will switch to a temporary screen with the fields filled with the data from the GEDCOM person, and the FamilySearch Source Tool will be triggered.

    A source must be added to the person before the person can be added to FamilySearch. Once the source is added, a new Find similar person search is run, now with the source and the updated person's record. If no match is found, then the user has the option of saving the person for later consideration (which will appear in a list), allowing the hinting system to search for possible hints and duplicates, or canceling the screen. Note: the person is not added to FamilyTree.

    To add the person to FamilyTree, all hints must be processed as they would apply to the new person. Once that is completed, a new Find similar people screen would open with the newly sourced person, and when none apply, a new comparison screen is opened with the potential person to be added on the left and the originally entered/GEDCOM person on the right.

    Keep in mind that it is entirely possible for a single session to produce a number of comparison screens. That is why I would want them to be stored in a user-oriented space on the FamilySearch servers, where it can be reopened.

  • I recognize that any approach to implementing the above elements will require considerable development and testing to make sure it can work as designed.

    But the result should go a long way to resolving a lot of the issues surrounding bad changes that occur now.

  • Tom, I am sorry, but I disagree with your whole concept of comparing, and a better comparison tool. I don't think the comparison tool is broken. The FamilySearch "Family Tree" already does a real good job of seeing if there is a person in the "Family Tree" that matches your input data. What is broken is how new information is added to the "Family Tree."

    GEDCOMing to add new data to the "Family Tree" is really really broken.

    Guy Lamoyne Black got it right when he stated, "I think it would be better to force users to manually input any information into familytree."
    https://getsatisfaction.com/familysea...
    • With respect to Guy's comment, I fully agree with his assessment, as long as we have the present system in place.

      But what I am proposing is a new system that has checks against conclusions that do not agree with an existing record. It can be multipurpose in that it can be used with record hints and sources, merges, and even GEDCOM uploads (which really are a form of merging, but so old and clunky that it should be disabled).

      Had the proposed system been in place, I would not have had to restore a record because the person was unaware of different parents. The proposed system will highlight that problem and would have gone a long way to make the user aware of the situation.

      This same check could be used to warn a user who is merging children with the same name in the same family when one child died before the next was born.

      However, any "fixes" to the present system will really be nothing more than patches on a badly worn tire, any of which could blow out at any moment.

      People have suggested checks like those I propose in my suggested comparison tool in the past. The problem is that the present system isn't set up to adequately provide those checks.

  • Disagree! The problem is GEDCOMing new databases, (with old information) into the FamilySearch "Family Tree." Allowing GEDCOMs to remain with the present system, or the system you propose, is to KEEP THE OLD SYSTEM! We already have the system in place of adding new data to the FamilySearch "Family Tree." The system we have now is almost perfect, just STOP the input of GEDCOMs and the system is perfect.

    NO COMPARISON TOOL NEEDED!
    • I knew and know that one uploads a GEDCOM to "Genealogies." I knew and know that the uploaded file can sit there forever, or be compared with the "Family Tree" data. (I have my updated file sitting.) I knew and know that I must compare my uploaded GEDCOM file in "Genealogies," with the existing data in the "Family Tree." I knew and know that because my GEDCOM "Genealogies" file was such a LARGE GEDCOM file, when I compared it years ago with the "Family Tree" data, I hurriedly said yes and no to everything and probably caused many many duplicates, and a lot of hard work for other "Family Tree" patrons. I knew and know that because I probably caused many many duplicates, I now enter my ancestors one at a time to the "Family Tree." And yes, David Newton, I knew and know that uploading a GEDCOM file to "Genealogies" is not the problem, it is the comparison process that is the problem. And also Tom Huber, I knew and know that Family Search does not promote any GEDCOM files and you misunderstood me. And I knew and know that I am not as knowledgeable and computer savvy as you-all are, and misspeak a lot, and probably should really watch how I am wording my sentences. And yes, if the above is not correct then yes, David Newton, I do not know and understand what I am being told.
    • Don, you kept saying that a new comparison tool is not needed for GEDCOM file uploads into the Genealogies section.

      Your reasoning does not compute at all. The current system, which is likely not to change (much), is the problem and there is no indication from anyone at FS that they will remove the existing/faulty compare function from that section of FamilySearch.

      That was one of the purposes of my proposal, to make a comparison tool that could be widely used, regardless of the source of the new material, whether GEDCOM, manually entered, a historical record to be attached (possibly via a hint), or a duplicate or "found similar record" from the massive tree.

      You are against the tool because of the GEDCOM problem. I get that, but that is what is behind my recommended system. I want to know what is needed that I missed, not a hefty "NO" from someone that does not understand its purpose.

      One thing is certain, unless the thinking behind the current comparison method with uploaded GEDCOM files changes, that is not going away, so I have included the GEDCOM file uploads in my tool, in the same manner as merging a different record from the massive tree.

      Now, if you can suggest something useful for what I am proposing, which means that you need to study it and my follow-on comments to good suggestions, then I will be very appreciative.

      The effort will be extensive and I expect it to take a lot of time, but eventually one comparison tool can serve all needs, including adding sources, merging existing tree persons, and also merging new adds from sources or manually entered.

      The key difference is that the comparison screen will flag every discrepancy from minor (with a notation) to major (in some manner) and the major differences may even prevent the desired operation from happening. More importantly, everything in the existing record will be displayed, giving a user a full view of the material as well.

      Where sources are involved, I envision the new source title to be displayed and the proposed source editable so that everything can be set up, including tagging (which should, by then, be applicable to all events and facts as well as family relationships), all on one screen.

      The end effect will be a single person at a time. Where a GEDCOM file upload is involved, the system will maintain the last record compared, but will not automatically display what is now displayed. All of that nonsense will be replaced by the new system, which will start the same way a new person, not attached to the family, is identified in the source linker, by displaying a number of likely candidates. If none are chosen, then the new system will not go to the comparison screen, but will instead go to another preliminary screen where more information, including sources, has to be added to the proposed new person.

      That process should effectively stop duplicates from any source from being added to the massive tree, as well as giving the user a full view of the existing record.

  • I am a little shocked to see that FamilySearch is encouraging the usage of GEDCOMs https://www.familysearch.org/mytrees/...

    Never noticed the "Add GEDCOM" on that page.

  • One would hope and expect that FS would at least explain on that page that the best place to put gedcom names is into Genealogies and NOT into the actual tree (and even explain the pollutant dangers of adding a gedcom to the tree).

    I believe adding Gedcoms to the actual tree should be discontinued immediately. I've had my share of hard work undoing the results of adding gedcoms.
    • Yes, if my employers made it into the (fixed) genealogies, then if I were to run a comparison between FSFT and my loaded genealogy, I suspect that the employers wouldn't be found because I'd only have skeleton details for them - their situation at one census, eg. That might lead me to loading them under the impression that they are new. While they might be, they might also be duplicates. Given that I know that they are skeletons, it would probably make more sense for me to exclude them from the comparison with FSFT right from the start. That might be easy to accomplish - there's no genealogical connection of any sort between my family and them - they're not even relatives of relatives of relatives... So I'd just need to select a root person and so all the root's relatives of relatives of relatives.... would get compared for loading. The floaters could be ignored.

      This is not really about the comparison user interface but it is the previous step that needs to be thought about at some point.
    • Yup and that is when an existing person is not found in the tree. Or a new person is being added manually.

      I suspect that my proposed system is going to play havoc with the API and the third party programs and sites that can enter "new" people into the tree.

  • They at FamilySearch who promote GEDCOMs, and refuse to see the damage they do to the "Family Tree," need to be visited once by someone adding by GEDCOM to their portion of the "Family Tree" or ancestors. It is not fun spending hours and hours of correction work. I can see why more and more patrons are asking for private trees at FamilySearch so they can still do LDS things and also keep their records corrected.

    Tom Huber, I am against your Feedback, "We need a better comparison tool" in that I don't think anything should be done to help with the promotion of GEDCOMs, and that is how I see your Feedback.
    • I am not talking about "Genealogies" or the Pedigree Resource File. I have, from day one of this Feedback, been talking about the FamilySearch "Family Tree."

      And the user does change things in the "Family Tree" by his or her usage of GEDCOMs
    • I could accept your "We need a better comparison tool," if it were not associated with GEDCOMs.

      Yes, FamilySearch does need a better comparison tool that happens, as it does now, automatically when one inputs ancestors into the "Family Tree" one at a time.

      FamilySearch is encouraging, on its "Submit Your Trees to Genealogies" - "Add GEDCOM" page, https://www.familysearch.org/mytrees/... the uploading of massive GEDCOMs. Patrons do not spend the time that is needed in the GEDCOM and "Family Tree" comparison process with the task of conversion of such large GEDCOMs. FamilySearch, please change this page to where it only uploads to "Genealogies" and stop the comparison process into the Family Tree from GEDCOMs. Data input to the "Family Tree" should be one ancestor at a time.

  • I won't pretend to comment on your whole idea, but I sometimes think about some features in what I think you call the discrepancy routine. Common sense to me, but for whatever reason they tend to be ignored by many users doing merge in FamilyTree.

    If things like the following are discovered by discrepancy routine, put up big warning saying something like "You should probably not proceed with this merge unless you understand these issues and this is just one step in a multi-step cleanup!" Maybe even add some fine-print type footnotes about how if this is being done just because FamilySearch offered a "hint" about merge (or that the discrepancy was recently created because of attaching a hinted source), FamilySearch hints are not always perfect (especially if mistakes have already been made for the person). Maybe even some simple language reminding people that we are a single collaborative tree and that multiple people with same name (or couples with same names) do occur and should not be combined unless they were known to be the same exact person (same parents, same spouse, same children, same approximate location, though there can remain valid arguments about some vital statistic such as birth date/location).

    One is to look for discrepancies within the existing record - if it already has problems such as bad miscombine back in nFS, don't make it worse (keep separate until the existing mistake gets cleaned up first).

    Another is to look at whether you end up with multiple sources of the same type where you would only expect one, in other words more than one census from the same year (but it is just a warning - a user who knows there were multiple enumerations can proceed).
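
A check like the census one could be as simple as counting sources by type and year across the two records being merged. The Python below is only a sketch of that idea; the source dictionaries with "type" and "year" keys and the function name `duplicate_source_warnings` are assumptions for illustration, not how FamilyTree actually stores sources.

```python
from collections import Counter


def duplicate_source_warnings(sources, single_per_year_types=("census",)):
    """List (type, year, count) for source types expected once per year."""
    counts = Counter(
        (s["type"], s["year"])
        for s in sources
        if s["type"] in single_per_year_types
    )
    # Only a warning: the user may know there were multiple enumerations.
    return sorted((t, y, n) for (t, y), n in counts.items() if n > 1)
```

The input would be the combined source lists of both people, so a result such as `("census", 1880, 2)` means the merged person would appear in two 1880 census records.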

    Another is to look for children or siblings born within 9 months of each other (but not on same date such as twins, or duplicates which would be merge candidates). If merge would create such situation for siblings then warn that it might be wrong, or if merge would create such situation for children then warn that it might be wrong.
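
The nine-month spacing check could be sketched like this in Python. The 270-day threshold, the `(name, birth_date)` tuples and the function name `close_birth_warnings` are assumptions for illustration; exact same-day births are skipped because those are twins or merge candidates, as noted above.

```python
from datetime import date
from itertools import combinations


def close_birth_warnings(children, min_gap_days=270):
    """Return (name, name, gap_in_days) for births implausibly close together.

    `children` is a list of (name, birth_date) tuples for one set of
    siblings, as the family would look AFTER the proposed merge.
    """
    warnings = []
    ordered = sorted(children, key=lambda child: child[1])
    for (name1, born1), (name2, born2) in combinations(ordered, 2):
        gap = (born2 - born1).days
        # A gap of 0 means twins or duplicates, which are handled elsewhere.
        if 0 < gap < min_gap_days:
            warnings.append((name1, name2, gap))
    return warnings
```

The same function could be run once on the person's siblings and once on the person's children, with any result shown as a warning rather than a hard stop.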

    Similarly look at locations for multiple siblings and multiple children. You have to do it from major to minor as some other thread discussed here recently, but what I mean is that if merge would end up with location of sequential births bouncing back and forth between two very different locations (for siblings of person or for their children), warn that the user might be attempting a merge that does not make sense.

    Do similar for birth/christening source locations - if merge would create sequential siblings or children having birth/christening sources which bounce back and forth from very different locations, warn that it might not be a good merge.

    These are just examples, I've probably forgotten more examples but the point is to add some impediments (as I think Ron calls them) toward users messing up in collaborative tree.

    If not obvious, too often I encounter a person in tree who has common sense impossibilities when you look at their family of siblings and/or children, and it is frustrating to me how easy it was for someone to create that situation vs how hard it is for me to untangle it (since I want to properly research the multiple people including attaching appropriate sources to each of them).
    • I agree completely, Doug. One of the things that should be added to the new comparison feature is that if a record has some impossible material, that must be addressed before proceeding with any merge or even with adding historical record sources.

      I appreciate the input on these types of things. It provides new information on how the new, better comparison tool should be put together to be truly useful.

      Thank you.

  • I consider this to be a universal comparison tool, complete with date, place, and relationship warnings. It is something that certainly could take into account a lot of the previously discussed (in their own threads) methods to prevent bad merges, to stop attaching hints that are obviously not of the person, and so on.

  • It should be noted that the existing merge comparison screen has now been updated to include reason statements that are entered for fact and event conclusions for both person records.

    However, that doesn't resolve some of the many issues with the source linker.

  • Yes, the new additions on the merge page are welcome and useful. They give us immediate knowledge that previously we could only obtain by opening the person pages of both people being compared (which I doubt many patrons ever do).

    As you say, an even more comprehensive comparison tool would be extremely useful and could help prevent many faulty merges.

  • Great conversation and much of this aligns with my thinking. I would like to see an analysis/comparison tool that works between two PIDs, between two artifacts, between an artifact and a PID. When I say artifact I mean some representation of a historical record (Index, Image, Memory, Source).

    Also analysis is not the same as use/incorporate the data of one to another. Those are two separate activities, though closely bound.

    Regarding GEDCOM, I don't make a distinction between GEDCOM data and third party data. You don't know where that 3rd party data came from (it may itself have been a GEDCOM). BTW: last month GEDCOM ingest had the lowest duplication rate of all the input products - it really depends on the users using the tool.

    Another complication is FS could create a great tool but third parties will be using their own compare and write/sync tools. But I like many of the ideas here regarding the info that needs to be seen during compare and then acted upon in the incorporate step.
    • Thanks, Joe.

      I think the new process (which is available only for newly uploaded GEDCOM files and not existing files that have had the compare process run before the change) is slowing down those who were wholesale adding to FamilyTree. The process is long and detailed, but still lacks some needed features.

      It doesn't help those files for which the compare process was run previous to the updated view process, but it certainly will impact those who newly add files.

      I consider any source that is not another PID to be like a historical source (an artifact of some sort), and so if a universal comparison tool is created (and the current merge compare screen is rapidly getting to be many times more complete) wherein artifacts can be compared and processed (if possible) then I think a lot of the problems with people making changes will simply "go away" much like the reduction of duplicates coming out of the GEDCOM process.
    • And yes, re GEDCOM file duplicates -- in my experience they universally came from an inexperienced user who added their data to FamilyTree.