21 March 2017

Wanted: Genetic Genealogy Analysis Tools Incorporating Family Tree Charts

Programmers who are interested in genetic genealogy have provided some great tools for DNA analysis. Testing companies provide some great tools, too. New tools are produced all of the time and I use and love most of them. I WANT MORE. I want some specific features. Few of these tools today correlate the detailed DNA data (Y-DNA STR mutations, atDNA shared segments with start and stop points on a chromosome) with the family tree in an easily understood way. That correlation is essential for the "genealogy" in genetic genealogy.

One of the to-do tasks that keeps getting shoved lower on my priority list is to provide genealogy software developers with a list of what we need to incorporate DNA data into our databases and to create useful output for analysis. Genetic genealogical research has matured to the point where this should become a priority.

If we all pool our ideas, we can come up with a good list to provide to developers so the output is what we want. We need a list of the data we want to store in the genealogy database as well as what type of output reports we need.

So what would your ideal genealogy database incorporate and provide as output for your genetic analysis? Not necessarily the raw DNA data for analysis, but the shared DNA data related to the other test-takers in your database.

Feel free to provide suggestions as comments to this blog post, as Facebook comments if you read this on Facebook, or contact me directly using the email addresses on my website http://debbiewayne.com/ (scroll to the bottom of any page to see contact info).

So what finally spurred me to make this a priority after all this time? (1) I investigated different tree creation tools a few months ago and discussed them on Facebook. I found that none of the tools produces exactly what I want for DNA analysis. (2) The McGuire Method of charting, which several of us saw last summer, was published.


Lauren McGuire recently wrote a guest post on Blaine T. Bettinger's The Genetic Genealogist blog, "GUEST POST: The McGuire Method – Simplified Visual DNA Comparisons." This describes the great chart she designed for correlating a family tree and shared autosomal DNA (atDNA) totals for analysis. The chart Lauren uses in the blog post displays total shared centimorgans (cM), percentage shared, and relationship of each person on the tree in an efficient and compact format. I like seeing all of these items at once as all are important during analysis.

I immediately loved Lauren's chart when I first saw it. She and I obviously think the same way about what we want to see when analyzing DNA information.

My own charts started out as printed trees created in an image editor, Microsoft Word SmartArt, RootsMagic genealogy software, Progeny Charting Companion, or, more recently, TreeDraw. Lauren and many others use Excel. Some use LucidChart and other online charting tools (find more info on these tools with a Google search). None of these tools provides an easy way to create a tree that includes only the DNA test-takers, much less incorporate the DNA data with the tree. Often my DNA data is handwritten at the bottom of the printed chart. If I want to make it look prettier, I spend a lot of time getting a Word table to line up under the family tree.

My own charts have evolved over the years. I started by creating an image of the tree and Y-DNA STR differences in an image editor:

Y-DNA and Tree Chart as Image, Debbie Parker Wayne

That evolved into a Word table that was easier to modify:


Y-DNA and Tree Chart as Table, Debbie Parker Wayne

Then into Word SmartArt which was better to show in a presentation:

Y-DNA and Tree Chart as Smart Art, Debbie Parker Wayne

For autosomal DNA triangulation I started with Word SmartArt and hand-written shared segment info:


atDNA and Tree Chart as Smart Art with Hand-written Notes, Debbie Parker Wayne

That evolved into simplified trees with a Word table showing shared segment info:


atDNA and Tree Chart as Smart Art with Shared Segment Table, Debbie Parker Wayne

I use a similar table when I am analyzing total shared DNA against the tree relationships.

What other formats have you found useful? What would make your DNA analysis process easier?

In my opinion, tree charts are most useful when each test-taker's lineage is shown in a column and each generation is contained in a row. The DNA data for a test-taker can be shown below in the same column as the lineage. The rows allow for easy calculation of relationships - which the software could do for us and include in the chart.
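As a rough illustration of the relationship calculation the software could do from the chart rows, here is a minimal Python sketch. It assumes each test-taker's row number is counted as generations below the most recent common ancestor (1 means a child of that ancestor). The function names and the wording of the labels are my own assumptions, not taken from any existing genealogy program, and half relationships and multiple lines of descent are ignored.

```python
def ordinal(n: int) -> str:
    """Return an English ordinal such as 1st, 2nd, 3rd, 11th."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def relationship(gen_a: int, gen_b: int) -> str:
    """Name the relationship between two descendants of one common ancestor.

    gen_a and gen_b are the number of generations (chart rows) each person
    sits below the common ancestor; 1 means a child of that ancestor.
    """
    if gen_a < 1 or gen_b < 1:
        raise ValueError("both people must be descendants of the common ancestor")

    near, far = sorted((gen_a, gen_b))
    removed = far - near  # difference in generations between the two columns

    if near == 1:
        # One person is a child of the common ancestor.
        if removed == 0:
            return "sibling"
        if removed == 1:
            return "aunt/uncle - niece/nephew"
        prefix = "great-" * (removed - 2) + "grand"
        return f"{prefix}aunt/uncle - {prefix}niece/nephew"

    # Both people are at least grandchildren: some degree of cousin.
    label = f"{ordinal(near - 1)} cousin"
    if removed:
        label += f" {removed}x removed"
    return label


if __name__ == "__main__":
    # Two test-takers in rows 3 and 4 below the shared ancestor.
    print(relationship(3, 4))
```

With every lineage in its own column, the two row numbers are all the function needs, which is why the row-per-generation layout makes the relationship easy to calculate.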

A chart including only the people in the DNA study is essential. I have been creating additional RootsMagic databases including only the DNA test-takers and their ancestors, but this takes a lot of time. The pared-down database is input to one of the charting programs, but I still sometimes have to remove spouse boxes when I am only interested in the men for a Y-DNA study, for example. Creating a chart from my full database and then deleting the people I do not want takes even longer.
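The paring down I do by hand could in principle be automated. The sketch below is only an illustration in Python, assuming the database can be reduced to a simple child-to-parents lookup; the `Parents` structure, the function name, and the `line` option are hypothetical and do not correspond to RootsMagic or any other program. It keeps the test-takers and their ancestors and, for a Y-DNA or mtDNA study, follows only the fathers or only the mothers so that spouse boxes never enter the chart.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# child -> (father, mother); None marks an unknown parent.
# This lookup is a hypothetical stand-in for a real genealogy database.
Parents = Dict[str, Tuple[Optional[str], Optional[str]]]


def study_members(parents: Parents,
                  test_takers: Iterable[str],
                  line: str = "all") -> Set[str]:
    """Return the test-takers plus the ancestors relevant to the study.

    line="all" follows both parents (autosomal study),
    line="y"   follows fathers only (Y-DNA study),
    line="mt"  follows mothers only (mtDNA study).
    """
    if line not in ("all", "y", "mt"):
        raise ValueError("line must be 'all', 'y', or 'mt'")

    keep: Set[str] = set()
    stack = list(test_takers)
    while stack:
        person = stack.pop()
        if person in keep:
            continue  # already visited through another line of descent
        keep.add(person)
        father, mother = parents.get(person, (None, None))
        if line in ("all", "y") and father:
            stack.append(father)
        if line in ("all", "mt") and mother:
            stack.append(mother)
    return keep


if __name__ == "__main__":
    # Made-up people for illustration only.
    tree: Parents = {
        "Tom": ("John", "Mary"),
        "John": ("William", "Ann"),
        "Sam": ("Robert", "Jane"),
        "Robert": ("William", "Ann"),
    }
    print(sorted(study_members(tree, ["Tom", "Sam"], line="y")))
```

This sketch does not check that a Y-DNA test-taker is male, and it does not trim ancestors above the most recent common ancestor; a real tool would probably want both.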

The DNA data to incorporate into our genealogy database varies for Y-DNA, autosomal and X-DNA, and mitochondrial DNA. Autosomal DNA analysis requires total shared DNA or shared segment information. Y-DNA analysis requires notation of differing Y-DNA STR and SNP markers. Mitochondrial DNA requires listing the locations that differ from a reference sequence and/or between test-takers. For Y-DNA and mtDNA we may want to include haplogroups. Even though we all know the admixture estimates vary depending on the reference population and algorithm used, we might want to record the estimates and which portions of which chromosomes match which reference populations.
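To make the list above more concrete for developers, here is one possible way to model these items, sketched in Python. Every class and field name is an assumption made for illustration; this is not a format used by any testing company or genealogy program, and the marker values in the usage example are invented. Multi-copy Y-STR markers and admixture estimates would need richer types than shown here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SharedSegment:
    chromosome: str   # "1" through "22", or "X"
    start: int        # start position on the chromosome
    end: int          # stop position on the chromosome
    cm: float         # segment length in centimorgans


@dataclass
class AutosomalMatch:
    person_a: str
    person_b: str
    company: str                      # where the comparison was run
    total_cm: Optional[float] = None  # reported total, if segments are unknown
    segments: List[SharedSegment] = field(default_factory=list)

    def summed_cm(self) -> float:
        """Total shared cM computed from the recorded segments."""
        return sum(s.cm for s in self.segments)


@dataclass
class YDnaResult:
    person: str
    haplogroup: Optional[str] = None
    str_markers: Dict[str, int] = field(default_factory=dict)  # marker -> repeat count
    snps: List[str] = field(default_factory=list)


@dataclass
class MtDnaResult:
    person: str
    haplogroup: Optional[str] = None
    reference: str = "rCRS"           # reference sequence the differences are listed against
    differences: List[str] = field(default_factory=list)


def str_differences(a: YDnaResult, b: YDnaResult) -> Dict[str, Tuple[int, int]]:
    """Return markers tested by both men whose values differ."""
    shared = a.str_markers.keys() & b.str_markers.keys()
    return {m: (a.str_markers[m], b.str_markers[m])
            for m in sorted(shared)
            if a.str_markers[m] != b.str_markers[m]}


if __name__ == "__main__":
    # Invented values for illustration only.
    tom = YDnaResult("Tom", str_markers={"DYS393": 13, "DYS390": 24, "DYS19": 14})
    sam = YDnaResult("Sam", str_markers={"DYS393": 13, "DYS390": 25})
    print(str_differences(tom, sam))
```

Keeping the shared data in records like these, linked to people already in the tree, is what would let a program print the DNA values in the same column as each test-taker's lineage.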

Send me your ideas and I will compile a list we can prioritize and provide to the genealogy software developers. This new list will not be specific to any testing company or software, but a list of data we want to track in our DNA analysis and provide in reports we use for our analysis and publications. There may be some overlap between the list I compile and the ISOGG Wiki wish lists for the testing companies:
https://isogg.org/wiki/FTDNA_wish_list,
https://isogg.org/wiki/23andMe_wish_list,
https://isogg.org/wiki/AncestryDNA_wish_list, and
https://isogg.org/wiki/MyHeritage_wish_list.

If you are on Facebook, these discussions relate to this issue, although you may not be able to see the posts depending on Facebook settings:
https://www.facebook.com/debbie.p.wayne/posts/10212404574018709
https://www.facebook.com/groups/DNADetectives/permalink/1389234674480979/





March 21: Added a cropped portion of the McGuire chart with permission of creator, Lauren McGuire.

March 22: Image added to illustrate reply below to nut4nature22 dated March, 2017 05:43:

RootsMagic Relationship Chart Sample, Debbie Parker Wayne



To cite this blog post: Debbie Parker Wayne, "Wanted: Genetic Genealogy Analysis Tools Incorporating Family Tree Charts," Deb's Delvings, 21 March 2017 (http://debsdelvings.blogspot.com/ : accessed [date]).

© 2017, Debbie Parker Wayne, Certified Genealogist®, All Rights Reserved

8 comments:

  1. I wanted to let you know I am listening, and very interested in this area of software development. I also saw the McGuire chart and was intrigued by the possibilities.

    One issue as a software developer is I am not an expert in all areas of genealogy, and look to the community to help identify the needs.

    I would love to hear more.

    Replies
    1. Glad to hear of your interest, Ed. I will definitely be sharing the list once I compile it. I know what things I would like to see, but we will get much better tools if we share ideas within the community as with Lauren sharing her charting method.

  2. Good post Debbie. Most of us are listening to anyone like you who's got good ideas.

    I'm not normally a fan of charts. They have their use when there's maybe only 10 or 20 people involved. But once you get more than that, diagrams become difficult for programs to display and become too complex for people to interpret.

    I am working on incorporating some important DNA goodies into the next version of Behold. We need to know: (1) How we're related to each other (every way, as we can be related more than once), (2a) The probability that we share autosomal DNA and (2b) The amount we are likely to share if we do, (3a & b) Same for X, (4a & b) Same for Y, (5a & b) Same for mt, (6) who are our ancestors Y-candidates, (7) who are our ancestors mt-candidates, and (8) which ancestors is each person a Y or mt candidate for?

    This array of features is not available yet in any genealogy software.

    And thanks for the links to the ISOGG Wiki wish lists. Don't know how I missed them before.

    Louis

    Replies
    1. Thanks for the input, Louis. I will add these items to the list. And thanks for writing Double Match Triangulator. I love using it and have not learned the full power of all of the features yet.

      For those of us who need pictures to see, the charts are indispensable, even if we cannot fit every potential DNA match onto one page. And you are right, even current software packages that allow the user to enter some of the mtDNA and Y-DNA marker values do not use it for any meaningful output. And I have not seen any software that does much more than print a list of a person's potential ancestors for a specific type of DNA (mtDNA, Y-DNA, X-DNA). I think that will change in the future with programmers like you who are interested in using genetic data for genealogy.

  3. Totally agree. I would like to see the functionality of a genealogy program combined with software for recording and reviewing DNA matches. An FTM meets FTDNA meets Gedmatch. So a genealogy program with the ability to import data about DNA matches that allows users to either review the matches and assign them to a person already in their tree or set up individual person pages for new matches. This would allow people to add information about matches in an organised fashion and information would not have to be re-keyed once a match is confirmed. As an FTDNA user the biggest drawback is that information has to either be manually keyed in every time you want to add someone to your tree or you need to upload a new tree, which means that you have to reassign all of your matches, which is a time-consuming process.

    Replies
    1. Having to reformat data and re-key the data as I move from tool to tool is one of my big complaints right now. In addition to ideas about what data we need to record and how we want the output to look we may need some data format standards to make transfers easier. Thanks for sharing your ideas.

  4. So glad you wrote this piece. Being spread out among multiple tools is a challenge. If we're going to dream, I'll dream big and wish that such a program would incorporate segment mapping with reports that can be pulled and filtered for multiple reports/charts. Also to include source & citations for research. I am intrigued by Evidentia's new Cousin Finder but wish after confirming you could also view an entire branch, much like the McGuire method. Thanks Debbie!

    Replies
    1. We may wind up with incremental wins, but if we have the "big dream" goal in mind the incremental improvements can be designed so it is easier to add enhancements. For example, RootsMagic now has a Relationship Chart (image added at the end of the blog post) that would work well if we could select more than two people and space the boxes closer together. We might want to mark one of the selected end-line people as the focus person so relationships to that person are listed in the other end-line boxes.

      Thanks for sharing your ideas. I think discussing this openly among the community will give the developers some good ideas.
