This is a continuation of my series on converting to RootsMagic version 5 (RM5) genealogy software. Earlier articles can be found at:
1. Software Conversion - Moving to RootsMagic
2. Before Converting to a New Genealogy Program - RootsMagic Conversion Notes
As part of my conversion to RM5 I decided to modify the GEDCOM file directly prior to import to get the cleanest possible import of my data. This should only be attempted by those who understand the GEDCOM spec and how to edit files in a pure text mode (not using a word processor that inserts invisible formatting characters). See the "Better GEDCOM Wiki" for more information on GEDCOM.
I've decided to describe the changes I made generically instead of including the actual code. First, the descriptions will be more readable. Second, I don't want to give enough information to allow novices to cause problems they then take to RootsMagic tech support. Anyone who understands GEDCOM will probably understand exactly what I did based on the descriptions below. If you try this and have problems, it’s your problem, not something the vendor should have to solve for you. Also, if I include all the details from the notes I made as I implemented the changes, this blog post would be many pages longer.
If I had implemented more of the custom features in my old genealogy program I would have needed to do a lot more manual work to get a clean import. Each person will have to look at their own usage and determine what additional GEDCOM changes might be needed. If you can modify a GEDCOM file directly you probably know enough about the customizations you used to figure that out.
This is the list of things I changed in the GEDCOM file before I got a fairly clean import:
- Moved place details from the place field to place details. The output GEDCOM file includes all place fields at one level. RM5 has a place details subfield for things like "Baylor Hospital" so the master place contains city, county, state, and country if you use it.
- Moved long text notes from the occupation details field. There seems to be a character limit on how much RM5 will import. Or something in my data was causing an issue on the events where I had a long set of sentences. RM5 uses the occupation detail field to hold the occupation name.
- Moved long text notes from the military details field. There seems to be a character limit on how much RM5 will import. Or something in my data was causing an issue on the events where I had a long set of sentences.
- Renamed my custom event tags to a standard one used by RM5, where a standard one will work. I kept custom tags I still use or that contain data I need to transfer inside of RM5 (my Research tags which may become ToDo items, Research Logs, and Research Notes in RM5).
- Changed the path to exhibits (media in RM5 lingo). As part of the conversion I decided to move my document images and photographs to a higher level folder in my document area. It shortens the full path name if I want to use links to these images from other programs.
- Removed references to customizations that don't transfer through GEDCOM such as [WO], [LINDEX], and a few others.
- Removed special characters my old program used for privacy that are not understood by RootsMagic. RootsMagic knows how to handle my sensitivity brackets {} so I can leave those. It does not know how to handle hyphens used as exclusion markers.
- Removed pseudo-people such as the "census" people I added years ago and no longer use.
- Modified the sources so I could get a clean, consistent first footnote or endnote. This will be usable until I convert the sources to formatted source types within RM5. RM5 imports sources as free-form sources. My old program exported them with the data mixed with the field names where I had customized the sources to match Evidence!1 and later Evidence Explained.2
- Modified repository references in source citations. I also took this opportunity to consistently refer to repositories with the same name. National Archives, NA-Washington, and NA all became NARA. I chose to use the abbreviation NARA (and TSLAC for Texas State Library and Archives Commission, dw. for dwelling, fam. for family, etc.) in the free-form sources as there will be a lot more "subsequent" footnotes than "first" footnotes. If I can remember to change the first abbreviation usage in a report to use the full name and show the abbreviation in parentheses, there will be fewer changes to make in output reports until I convert my sources within RM5. RM5 has a nice way of handling first and subsequent references and the abbreviations within.
- Used a Perl script (a simple programming language) to remove the reference number for each person so I did not have to do this manually. Years ago I used a modified Henry number, as defined in William Dollarhide's Managing a Genealogical Project.3 Once genealogy software was capable of printing an indented descendant list I really didn't need these numbers any longer, but they were still in my old database.
- Fixed a few spelling errors.
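For readers comfortable with scripting, the reference-number cleanup in the list above can be sketched in a few lines. This is a minimal Python sketch, not the Perl script mentioned in the list. It assumes the old program exported each reference number as a level-1 REFN line, which you should confirm in your own export. The function name drop_substructure and the sample record are hypothetical. It drops every line carrying the given tag, along with any subordinate lines beneath it.

```python
def drop_substructure(lines, tag):
    """Return GEDCOM lines with every `tag` line and its subordinate lines removed.

    `lines` are GEDCOM lines without line endings, e.g. from text.splitlines().
    """
    kept = []
    skip_level = None  # level of the structure currently being skipped, if any
    for line in lines:
        stripped = line.strip()
        if not stripped:
            kept.append(line)  # leave blank lines alone
            continue
        parts = stripped.split(" ", 2)
        level = int(parts[0])
        if skip_level is not None:
            if level > skip_level:
                continue  # still inside the structure being removed
            skip_level = None  # back at the same or a higher level
        # parts[1] is the tag for ordinary lines; record headers such as
        # "0 @I1@ INDI" carry an xref there and are never matched.
        if len(parts) > 1 and parts[1] == tag:
            skip_level = level
            continue
        kept.append(line)
    return kept


# Hypothetical sample record, assuming the reference number is stored in REFN.
sample = [
    "0 @I1@ INDI",
    "1 NAME John /Parker/",
    "1 REFN 1.2.3",
    "2 TYPE Henry",
    "1 SEX M",
    "0 TRLR",
]
cleaned = drop_substructure(sample, "REFN")
```

Work on a copy of the GEDCOM file, and read and write it with the character encoding named in its header so no characters are altered along the way.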
After importing the GEDCOM file I reviewed the import errors, fixed them, and then imported to a clean database. I repeated this until no import errors were reported. I then spent some time reviewing the data and reports in RM5. I exported a GEDCOM file from RM5 and was pleasantly surprised. Not only does it export the basic data, it exports many of the customizations. If you color code people in your RM5 database, even the color is exported. This means imports to another RootsMagic database will be cleaner. Other programs likely won't know how to import all of the information that RM5 exports. But maybe someday all genealogy programs will do this.
RM5 also exports all three types of printed sources—first notes, subsequent notes, and bibliography format sources. Because my sources were imported as free-form sources, all three formats contain identical text right now. I considered taking the time to edit the GEDCOM file output by RM5 to make my citations clean in all three forms. But I decided to wait until I convert from free-form to formatted sources instead of spending a lot of time on the free-form sources.
There was one major issue I had with the way my data printed after my last GEDCOM import. RootsMagic has a different philosophy about how event notes should display in reports. The event data such as date, place, and details print as specified in a sentence template. Then a citation reference is printed. Then any "notes" associated with that event are printed.
Basic writing guidelines say a citation reference should follow all of the information that comes from the cited source. But I now have a lot of information that is printed after the citation reference for the source of the information. At this time there doesn't seem to be a really good workaround for this. This may not be a big deal for those who aren't printing many reports or don't need the report to follow a particular set of style guidelines. But it is a big deal to me. Not big enough to keep me from using RM5 for now.
As the best workaround I could think of after asking experienced users, I made one last set of changes to the GEDCOM file output from RM5. After making this change I then imported it into a clean RM5 database. The change I made was to add the phrase "Additional information: " to every note attached to an event. This won't be pretty when I print a report. But it gives me a unique phrase to search for that marks the places where the text will need to be cleaned up in each report. If (hopefully when) the RootsMagic program is changed to print citation references at the end of a note, I can use the search and replace option inside of RootsMagic to remove this phrase from my notes. For now I can search and replace in my word processor so I know where I need to make changes before sharing a report.
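The note change described above could be scripted along these lines. This is a rough Python sketch under stated assumptions, not the method actually used: it assumes event notes appear as inline level-2 NOTE lines directly under a level-1 event tag, and the EVENT_TAGS set and the function name prefix_event_notes are hypothetical and would need to be extended to match the tags in your own file. Notes under non-event tags, person-level notes, and notes that are pointers to shared note records are left alone.

```python
PREFIX = "Additional information: "

# Assumption: a partial list of event tags; add any custom tags you use.
EVENT_TAGS = {"BIRT", "CHR", "DEAT", "BURI", "MARR", "DIV",
              "CENS", "OCCU", "RESI", "EVEN"}


def prefix_event_notes(lines, event_tags=EVENT_TAGS, prefix=PREFIX):
    """Return GEDCOM lines with `prefix` added to inline notes on events.

    `lines` are GEDCOM lines without line endings, e.g. from text.splitlines().
    """
    out = []
    parent_tag = None  # tag of the most recent level-1 line in this record
    for line in lines:
        parts = line.strip().split(" ", 2)
        if parts[0] == "0":
            parent_tag = None  # a new record starts
        elif parts[0] == "1" and len(parts) >= 2:
            parent_tag = parts[1]
        is_event_note = (
            len(parts) == 3
            and parts[0] == "2"
            and parts[1] == "NOTE"
            and parent_tag in event_tags
            and not parts[2].startswith("@")     # skip pointers to note records
            and not parts[2].startswith(prefix)  # do not prefix twice
        )
        if is_event_note:
            line = f"2 NOTE {prefix}{parts[2]}"
        out.append(line)
    return out


# Hypothetical sample: only the note under BIRT should change.
sample = [
    "0 @I1@ INDI",
    "1 NAME John /Parker/",
    "2 NOTE Name spelled Parker in later records.",
    "1 BIRT",
    "2 DATE 1 JAN 1900",
    "2 NOTE Born at home.",
    "3 CONT Delivered by a midwife.",
    "1 NOTE General note about John.",
]
marked = prefix_event_notes(sample)
```

Because the prefix lengthens the first line of each note, check that no line ends up longer than the limit your GEDCOM version allows. Running the function a second time leaves already-marked notes unchanged.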
So is my database clean and perfect now? Not by a long shot. But it wasn't perfect in my old program either. Data entry standards change over the decades. Not all of my sources of a similar type were entered the same way in my old program. At least now they are consistent. And I have a usable database I can add to and get output reports from—even in a word processor format in my 64-bit operating system.
One thing I did notice when I imported the RootsMagic GEDCOM back into RM5: DNA events don't seem to be imported, and DNA test result values don't seem to be exported via GEDCOM at this time. I had entered DNA results for one person so I could test some things as I played with RM5. A DNA event was included in the GEDCOM file created by RootsMagic, but the marker values were not exported, and even the included DNA event was not imported. The lesson learned here is to be sure you know what is not included in a GEDCOM export or import. Don't assume everything is included, even if a GEDCOM file includes a lot more data than some other vendors include.
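One way to spot this kind of loss is to tally the tags in the first export, import that file into a clean database, export again, and compare the two tallies. The Python sketch below is only an illustration of that idea. The function name tag_counts and the _CUSTOM tag in the sample are hypothetical placeholders, not tags RootsMagic is known to write, and a matching tally does not prove that the values under each tag survived.

```python
from collections import Counter


def tag_counts(lines):
    """Count how often each tag appears in a list of GEDCOM lines."""
    counts = Counter()
    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 2:
            continue  # blank or malformed line
        if parts[1].startswith("@") and len(parts) == 3:
            # Record header such as "0 @I1@ INDI": the tag follows the xref.
            tag = parts[2].split(" ", 1)[0]
        else:
            tag = parts[1]
        counts[tag] += 1
    return counts


# Hypothetical samples: the second export has lost the _CUSTOM line.
first_export = ["0 @I1@ INDI", "1 NAME A /B/", "1 _CUSTOM x", "0 TRLR"]
second_export = ["0 @I1@ INDI", "1 NAME A /B/", "0 TRLR"]

# Counter subtraction keeps only tags that appear fewer times the second time.
dropped = tag_counts(first_export) - tag_counts(second_export)
```

Any tag left in dropped is a place to look more closely before trusting the round trip.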
Good luck if you decide to convert to a different genealogy database. It's not an activity for sissies. If you want more information you can contact me using the e-mail address on my website or add a comment below.
1. Elizabeth Shown Mills, Evidence! Citation & Analysis for the Family Historian (Baltimore, Maryland: Genealogical Publishing Company, 1997).
2. Elizabeth Shown Mills, Evidence Explained: Citing History Sources from Artifacts to Cyberspace (Baltimore, Maryland: Genealogical Publishing Company, 2007). Evidence Explained expands on and updates concepts presented by Ms. Mills in Evidence! Citation & Analysis for the Family Historian. There is a newer edition of Evidence Explained than the one I have. See evidenceexplained.com for the current version and ordering information.
3. William Dollarhide, Managing a Genealogical Project: A Complete Manual for the Management and Organization of Genealogical Materials (Baltimore: Genealogical Publishing Co., rev. ed. 1998).
© 2012, Debbie Parker Wayne, CG, All Rights Reserved