ChatGPT and GEDCOM Files

Before I was a professor, I was a flight test engineer. My love of testing systems goes back to my early days working in a lab during college. My particular gift was always finding a way to “break” hardware or software through use. My desire to investigate the use of ChatGPT in genealogy has definitely coincided with my enjoyment of testing. In this blog post, I take a look at what ChatGPT knows about GEDCOMs, how it builds one, and how it can create a narrative when given an individual’s data formatted in a GEDCOM.

The technical jargon in this paragraph is available for those who want a slightly deeper understanding. In computer science, data can be grouped together into meaningful representations of things that live in the real world. A data structure is a way to group fields in a specific order so that a program can input data, manipulate it, and output it. The way that genealogical data is formatted and shared is the GEnealogical Data COMmunication (GEDCOM) standard.

GEDCOM (Genealogical Data Communication) is a file format used to exchange genealogical data between different genealogy software programs. It is a standard format for saving family tree data, and it allows users to transfer their family tree data from one program to another.

GEDCOM files are saved with the extension “.ged” and are made up of text-based data that includes information about individuals, families, and events such as births, marriages, and deaths. The data is organized in a hierarchical format, with each record containing information about a single individual or event.
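For readers who like to see the structure rather than read about it, here is a minimal Python sketch of how one line of a GEDCOM file breaks down into a level number, an optional cross-reference ID, a tag, and a value. The names `GedcomLine` and `parse_line` are my own illustrations, not part of any genealogy program, and the pattern covers only the simple line shapes shown here rather than every rule in the specification.

```python
import re
from typing import NamedTuple, Optional


class GedcomLine(NamedTuple):
    """One GEDCOM line split into its parts (illustrative structure)."""
    level: int            # depth in the hierarchy: 0 starts a new record
    xref: Optional[str]   # cross-reference ID such as @I1@, if present
    tag: str              # what the line describes, e.g. NAME, BIRT, DATE
    value: str            # the rest of the line (may be empty)


# level, optional @XREF@, tag, optional value
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?:\s(.*))?$")


def parse_line(line: str) -> GedcomLine:
    """Split a single GEDCOM line; raise ValueError if it does not fit."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"Not a GEDCOM line: {line!r}")
    level, xref, tag, value = match.groups()
    return GedcomLine(int(level), xref, tag, value or "")


sample = """\
0 @I1@ INDI
1 NAME James Charles /McMahon/
1 BIRT
2 DATE 10 OCT 1920
"""

# Indent each line by its level to make the hierarchy visible.
for raw in sample.splitlines():
    parsed = parse_line(raw)
    print("  " * parsed.level + f"{parsed.tag} {parsed.value}".rstrip())
```

The level numbers carry the hierarchy: the DATE line at level 2 belongs to the BIRT event at level 1, which in turn belongs to the individual at level 0.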

GEDCOM files can be used to create family trees, research family history, and share information with other genealogists. They are widely used by genealogy software programs and online genealogy databases. For example, you can export a GEDCOM from your family tree program or download a GEDCOM from

NOTE: DO NOT ENTER PRIVATE OR SENSITIVE DATA INTO ChatGPT. Your conversations are reviewed by OpenAI to verify that content complies with their policies and safety requirements, and they may be used for training purposes.

I asked ChatGPT what it knew about GEDCOMs with prompts: What is a GEDCOM file? What is the GEDCOM standard? What are the fields in the GEDCOM standard?

ChatGPT answered reasonably well, except that it confidently stated the latest version of GEDCOM being used was 5.5.1. This is understandable because ChatGPT’s training ended in 2021. (As of the writing of this blog post, the current version is 7.0. For more information see the FamilySearch wiki entry for GEDCOM.)

Knowing that ChatGPT was using GEDCOM 5.5.1 was not a problem for these experiments.

Creating a GEDCOM

I would not choose to build a GEDCOM in this manner, but I could see how entering a narrative about ancestors into the prompt and letting ChatGPT build the relationships from written language could be helpful. Beginning a family tree or adding a separate branch could be done by ChatGPT, then imported into a family tree program.

Investigating how effective ChatGPT was at creating a simple GEDCOM, I asked it to:

Create a GEDCOM file for James Charles McMahon, born 10 Oct 1920, father Joseph Francis McMahon, mother was Ella Small.

GEDCOM file from ChatGPT

ChatGPT extracted the information from my request and filled in the fields. I only asked for a simple GEDCOM file, and had been very specific in what details to include. ChatGPT did fine with this request. You can see the button to copy the code so that I could store it in a file with a .ged extension that would be usable by a family tree program that conformed to the GEDCOM specification. In fact, it even warned me:

ChatGPT warning

By the way, the clipboard next to the response lets a user copy the whole response and paste it into the document of their choice. When clicked, the clipboard turns into a checkmark momentarily, then returns to being a clipboard. The thumbs up and thumbs down allow a user to provide additional feedback. If the feedback is thumbs down, another version of the reply is generated, and the user has the opportunity to share whether the new response or the previous one is better, or whether they were the same. Giving feedback is always optional.

ChatGPT feedback

NOTE: This is a representation of an individual in a GEDCOM format and is not a file that can be directly imported into a family tree program. The header and footer information is not present; however, I could give ChatGPT that information and ask it to update the GEDCOM to include it.

I tried again with a new prompt that contained more details about the person’s life:

Create a GEDCOM file for James Charles McMahon, born 10 Oct 1920 in Brooklyn, Kings County, NY, father Joseph Francis McMahon, mother was Ella Small. James Charles McMahon died on 28 Nov 1987 in New York, New York, New York, US.

The response filed the additional data correctly into the GEDCOM:

Updated GEDCOM file from ChatGPT

Using the GEDCOM as input to a family tree program

I asked for the file in a couple of different ways, but ChatGPT gave me only the section of the file for an individual. RootsMagic had problems with importing this and creating a family tree, but after a little experimentation, I found that was because the file was missing the header and trailer information. This was quickly remedied by editing the file.
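The kind of edit described above can be sketched in a few lines of Python. This is an assumption about what a bare-bones GEDCOM 5.5.1 header and trailer look like, not a record of the exact lines added in this experiment or of what any particular program requires. The function name `wrap_with_header_and_trailer` and the source label `HAND_EDITED` are hypothetical, and a strictly conforming 5.5.1 file also expects a submitter record, which many programs tolerate being absent.

```python
def wrap_with_header_and_trailer(records: str, source_name: str = "HAND_EDITED") -> str:
    """Surround bare GEDCOM records with a minimal header and trailer.

    `source_name` is a made-up label for the program that produced the file.
    """
    header = [
        "0 HEAD",
        f"1 SOUR {source_name}",      # which system produced the file
        "1 GEDC",
        "2 VERS 5.5.1",               # the version ChatGPT said it was using
        "2 FORM LINEAGE-LINKED",
        "1 CHAR UTF-8",               # character encoding of the file
    ]
    # Drop blank lines and stray indentation picked up when copying from a chat.
    body = [line.strip() for line in records.splitlines() if line.strip()]
    return "\n".join(header + body + ["0 TRLR"]) + "\n"


individual_only = """
0 @I1@ INDI
1 NAME James Charles /McMahon/
1 SEX M
"""

print(wrap_with_header_and_trailer(individual_only))
```

Saving the printed text in a file with a .ged extension gives a family tree program a recognizable beginning (HEAD) and end (TRLR) around the individual's record.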

It was interesting how the placeholder text for the birth and date information for the individual’s mother and father was inserted into the GEDCOM to be interpreted by the program. Of course, this could be fixed later in the conversation by asking for an updated GEDCOM with this information. As the chat went on, I also gave ChatGPT their marriage information and asked it to update the GEDCOM.
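To make the relationships concrete, here is a hedged Python sketch of how the facts from my two prompts could be laid out as linked GEDCOM records. It is not ChatGPT's output, which appeared only in the screenshots. The helper name `individual` and the IDs @I1@, @I2@, @I3@ and @F1@ are illustrative choices, and the dates are written with uppercase month abbreviations as GEDCOM files usually show them. The parents have no birth or death lines because the prompts gave none.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, str]  # (date, place)


def individual(
    xref: str,
    name: str,
    sex: str,
    birth: Optional[Event] = None,
    death: Optional[Event] = None,
    child_of: Optional[str] = None,
    spouse_in: Optional[str] = None,
) -> List[str]:
    """Build the lines of one INDI record (illustrative helper)."""
    lines = [f"0 {xref} INDI", f"1 NAME {name}", f"1 SEX {sex}"]
    for tag, event in (("BIRT", birth), ("DEAT", death)):
        if event:
            date, place = event
            lines += [f"1 {tag}", f"2 DATE {date}", f"2 PLAC {place}"]
    if child_of:
        lines.append(f"1 FAMC {child_of}")   # family in which this person is a child
    if spouse_in:
        lines.append(f"1 FAMS {spouse_in}")  # family in which this person is a spouse
    return lines


records = (
    individual(
        "@I1@", "James Charles /McMahon/", "M",
        birth=("10 OCT 1920", "Brooklyn, Kings County, NY"),
        death=("28 NOV 1987", "New York, New York, New York, US"),
        child_of="@F1@",
    )
    + individual("@I2@", "Joseph Francis /McMahon/", "M", spouse_in="@F1@")
    + individual("@I3@", "Ella /Small/", "F", spouse_in="@F1@")
    # The FAM record ties the three individuals together.
    + ["0 @F1@ FAM", "1 HUSB @I2@", "1 WIFE @I3@", "1 CHIL @I1@"]
)

print("\n".join(records))
```

The family record is what lets a program draw the tree: each person points to the family, and the family points back to each person.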

Creating a narrative from a GEDCOM

For my next experiment, I copied the second GEDCOM that ChatGPT had generated and fed it back into the prompt, asking:

Write a narrative for James Charles McMahon given his GEDCOM information:

0 @I1@ INDI

1 NAME James Charles /McMahon/


[the rest of the file is not shown for brevity]

ChatGPT had learned details from our previous conversation, and inserted details about the individual learned from previous GEDCOMs. Starting the request in a new conversation brought its knowledge about the individual back to nothing, and the story included only the information from the prompt.

Of course, ChatGPT only uses what I told it. In reality, this individual was not an only child. Interestingly, after it writes that he grew up in a family of three, with himself and his parents, he was depicted as a beloved brother. This is due to large language models relying on their training to build the next part of their output.

Next, I checked if the format of the input mattered to ChatGPT, and made the GEDCOM data into one continuous stream, rather than distinct lines, in my prompt:

Write a narrative for James Charles McMahon given his GEDCOM information:

0 @I1@ INDI 1 NAME James Charles /McMahon/ 1 SEX M [the rest of the file is not shown for brevity]

ChatGPT did not need lines of the file to be formatted; it interpreted the data correctly then wrote a narrative. (This is also true when entering data from a table into the prompt.) Without information about the deaths of the individual’s parents, the model built text saying that they survived him and, in the same sentence, that they were deceased before his passing. ChatGPT can appear to lose its mind, so always proofread any output before using it.
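The one-line form used in that prompt can be produced by hand, or with a short Python sketch like the one below. The function name `to_single_stream` is my own, and the sample holds only the three lines shown in this post.

```python
def to_single_stream(gedcom_text: str) -> str:
    """Join the non-blank lines of a GEDCOM snippet into one continuous line."""
    return " ".join(line.strip() for line in gedcom_text.splitlines() if line.strip())


snippet = """\
0 @I1@ INDI

1 NAME James Charles /McMahon/

1 SEX M
"""

print(to_single_stream(snippet))
# prints: 0 @I1@ INDI 1 NAME James Charles /McMahon/ 1 SEX M
```

Going the other way is harder for a program, because a value such as a date can begin with a number that looks like a level, so keep the original multi-line file as the copy you import.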

Next, I carved out the lines for this individual from a GEDCOM that had been exported from a family tree program, complete with source citations embedded in the code. This text was used as input to ChatGPT, and I asked it again to write a narrative from the GEDCOM. ChatGPT was successful in capturing the details it knew. It also created some generalizations like: “Throughout his life, James was a beloved member of his family and community.” It also added context without being prompted: “Though we don’t have much information about his specific experiences, we can imagine that he lived through many significant moments in history, including World War II and the civil rights movement.”

The tales that ChatGPT weaves from a user’s input can be a combination of the technically accurate and the fanciful. The facts that are input can be woven into a smoother and grammatically correct output. Any additional text that ChatGPT generates or additional contextual content it adds does need to be verified. ChatGPT is a generative language model that creates sentences without judgement, and those facts are presented as correct. (Always check the details that ChatGPT adds, as it may “hallucinate”!)

ChatGPT generates text with an optimistic tone. The tales do all seem to end on a positive note, reminding me of appending “and a good time was had by all” to a story.

As with any tool, how we use the output matters. ChatGPT has the flexibility to regenerate a response to our prompt, and we have the ability to edit the text as we see fit. This tool could be helpful to a genealogist trying to get started on that family history they have been planning to write. ChatGPT can help someone get around a writer’s block by providing a starting place. It can also proofread what you generate. All you have to do is ask.

It was instructive to see how the narrative text that was put into the prompt was translated into lines in the GEDCOM file. I enjoyed peeking under the hood of the implementation that is at the heart of family tree programs.

Let me know how you do, and send along any questions.

ChatGPT May 3 Version was used for these experiments. Expect ChatGPT to change over time as the technology matures.

Please check out other posts about ChatGPT and Artificial Intelligence:

5 Ways to Use ChatGPT to Research an Ancestor

Getting Started with ChatGPT

Artificial Intelligence and Genealogy

5 Ways to Use ChatGPT to Research an Ancestor

You may have been wondering how ChatGPT can help with genealogical research. This is a first look at using ChatGPT for research about a specific ancestor. For simplicity, our conversation focused on where to find information, rather than on more complicated topics. ChatGPT held its own in our conversation, was a pleasant companion, and offered answers based on its training.

You can view our other posts about Artificial Intelligence, Artificial Intelligence and Genealogy and Getting Started with ChatGPT for when you are ready to try it!

NOTE: DO NOT ENTER PRIVATE OR SENSITIVE DATA INTO ChatGPT. Your conversations are reviewed by OpenAI to verify that content complies with their policies and safety requirements, and they may be used for training purposes.

1. Ask ChatGPT general questions. Unless your ancestor is notable or famous, your main benefit will come from looking for general information about individuals with similar origins, living conditions or professions.

In my ongoing research into using ChatGPT for genealogical research, I decided to focus on one ancestor who has been a brick wall.

I started by asking if ChatGPT knew my ancestor. I did not expect an answer for an ancestor who was not a public figure, but I thought I would ask.

I'm sorry, but I don't have any information

2. Forming questions/prompts is an important part of getting the most out of conversations with ChatGPT. Sometimes you need to rephrase or reformulate your approach to obtaining information.

I thought that ChatGPT might have more general information about immigrants from County Sligo, Ireland, so I asked a more general question:

What can you tell me about immigrants from Sligo, Ireland to the United States?

ChatGPT answered with general information about Sligo immigrants, sharing their reasons for emigrating, where they tended to settle in the United States, and which professions were popular among them.

3. ChatGPT cannot footnote its answers. It can give sources that were used to build its knowledge base.

I asked ChatGPT what sources it used for the answer about Sligo. Since it is a trained artificial intelligence, and not a lookup service, this is like asking a person on the street to cite the sources for the statements they make in conversation.

What sources did you use for the above answer?

In response, I reformulated my question:

What sources do you recommend for researching an ancestor from Sligo?

ChatGPT offered five suggestions. This was more successful, until it was not. The suggestions were solid, but the details behind them were sometimes general and may not be up-to-date. (Remember, ChatGPT knows nothing of the world since 2021 and is NOT connected to the Internet.) The description it offered for civil registration records did not include the fact that many can be found online for free. The census records advice was factual about when censuses were conducted, but did not relate that only fragments exist for censuses other than 1901 and 1911.

  1. Civil registration records
  2. Church records
  3. Census records
  4. Local history resources
  5. DNA testing

4. Given a list of facts, ChatGPT can write a smooth narrative.

Next, I entered a text version of a timeline for Timothy Gilroy’s life. The text was copied then pasted into the prompt.

Below is a timeline of events

ChatGPT interpreted the data correctly, and fed back a narrative incorporating the facts. This was how I gave ChatGPT the data for my next questions.

Thank you for timeline

5. ChatGPT can make good research suggestions. Treat ChatGPT’s research suggestions as hints. Be ready to investigate the leads it gives, keeping in mind that it may not know every aspect of every record set.

Remember that you are having a conversation with ChatGPT, and it remembers previous input in the same chat.

Next I asked: Can you suggest genealogical research ideas for Timothy Gilroy

ChatGPT pleasantly answered with a list of ideas. (Details of each item are omitted for brevity.)

  1. Research his family in Sligo County, Ireland
  2. Find records of his immigration
  3. Locate records of his military service
  4. Explore his occupation
  5. Investigate his naturalization record
  6. Look for church records
  7. Conduct DNA testing
These are just a few ideas

Since these were just “some” ideas, I asked: Do you have additional ideas?

ChatGPT was more than willing.

  1. Explore the neighborhood where he lived
  2. Investigate the history of Irish immigrants in Newport
  3. Search for newspaper articles
  4. Consult with local historical societies or genealogical societies
  5. Utilize online genealogical resources

ChatGPT is ready and willing to help us research our ancestors. We have to be clear in our conversations, and be ready to ask questions from different perspectives. Overall, the ideas ChatGPT offered were sound. Of course, they were general and did not have all the potential constraints. ChatGPT stressed learning about context in many of the research questions, which was good. Of course, be sure to use its suggestions as just that, and not definitive facts.

The above conversation was with ChatGPT Mar 23 Version. Free Research Preview.

Getting Started with ChatGPT

By now you have probably heard about OpenAI’s system, ChatGPT. You can use the Research Preview for free with an account. ChatGPT has a number of ways it can support the genealogical community, covered in Artificial Intelligence and Genealogy. It can also support your personal genealogical efforts, covered in other posts.

NOTE: DO NOT ENTER PRIVATE OR SENSITIVE DATA INTO ChatGPT. Your conversations are reviewed by OpenAI to verify that content complies with their policies and safety requirements, and they may be used for training purposes.

Once you establish an account, using ChatGPT is as easy as typing in your questions or requests, which become the “prompts” to which ChatGPT generates responses. Underneath the hood, ChatGPT uses prompt engineering as part of its natural language processing capabilities to get meaningful responses from its models. Knowledge databases, texts, and other sources, as well as an understanding of language, have been used to train its neural network. When it has not been trained about a specific topic, it uses relevant information from external sources. ChatGPT answered a few questions about this for me. ChatGPT told me it did not search the web as humans would. In fact, ChatGPT is not connected to the Internet, and it has limited knowledge of world events after 2021. In response to another question, ChatGPT answered that it did not need question marks for it to understand that I asked a question, but that using them might clarify the input.

You can use the research preview of ChatGPT for free. You own the output that is created. The output from a paid or free plan can be reprinted, sold and merchandised.

To get a free research account select “Sign up” and follow the steps.

Welcome to ChatGPT

To sign up for an account, you have to provide your email address and a phone number. The email address and phone number do have to be validated before your account is activated.


The “Send a message…” box at the bottom is where to type a question or issue a request.

At the end of the generated response, you can continue the conversation by asking another question. You also have the option to select “Regenerate response” to make ChatGPT process the request again and generate another response.

Regenerate response button

NOTE: If you choose to REGENERATE RESPONSE, the original one will be replaced. So, if you are looking to combine or compare responses, be sure to copy the original response.

Your conversations will appear on the left side of the screen in a laptop or desktop browser. You have the option to edit the automatically assigned label for the chat, or delete it. There is also an option to begin a “New chat.” NOTE: Conversations with the Free Research Preview are reviewed to improve systems and to verify that content complies with OpenAI’s policies and safety requirements. They may be used for training purposes. You can request to delete your conversations from a link in the FAQ.

New chat, chat label

Here is an example where I started out with a simple question in my message prompt: What is a GEDCOM file?

ChatGPT example

ChatGPT answered this prompt. While it was answering, there was an option to “Stop generating” the response. Note the “Regenerate response” button at the bottom of the reply.

Example conversation

In the image above, you can see the thumbs up and down buttons so that you can provide feedback about the answers.

The same prompts generated different responses, as evidenced by the regenerated responses. To see if the responses might be presented in a preplanned sequence, I asked a friend to enter the same prompt (different than the example given). The response she received certainly had similar elements, but the responses were definitely not the same. The responses were more different than rearranged words; the concepts were expressed in a different manner.

The conversations you have with ChatGPT can be saved through browser addons, but I found it far simpler to copy-and-paste into Word or Wordpad documents (for now).

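As for how long my input prompt could be, I asked ChatGPT directly about that. The answer it gave was 2048 tokens, which it said can be interpreted as characters. Strictly speaking, a token is a chunk of text that is often several characters long, so treat that figure as a rough guide. ChatGPT also wanted me to know that spaces and punctuation marks count.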

Input question

This technology really is impressive. I began by giving specific prompts, but before too long I found myself falling into a conversational pattern with ChatGPT. It seemed very natural. I could also ask for clarification about a previous answer, or change the intent of my question. I could lead the conversation in different directions based on the responses. According to the ChatGPT FAQ,

ChatGPT Mar 14 Version of the Free Research Preview was used for examples in this tutorial. Future releases may have slightly different interfaces and options.

Artificial Intelligence and Genealogy

By now, you have probably heard about ChatGPT. This blog post will discuss how Artificial Intelligence (AI) is used in Genealogy with the help of ChatGPT.

In other posts I will cover how to use ChatGPT and some other AI tools that can help you in the pursuit of genealogy.

Genealogy is the study of family history and ancestry, and it has become increasingly popular in recent years. With the advancement of technology, researchers have been able to access more information about their ancestors, making the process of genealogy more accessible and convenient. Artificial intelligence (AI) has played a significant role in making genealogy research more efficient and effective.

AI is a technology that uses algorithms to mimic the human brain’s decision-making process. When it comes to genealogy, AI can be used to sift through large amounts of data, uncover hidden connections, and provide insights that would have been difficult to find otherwise.

Here are some of the ways AI is being used in genealogy research:

  1. Record Linkage: Record linkage is a process that involves connecting different sources of data to create a comprehensive profile of an individual. AI algorithms can match and link various documents such as birth certificates, marriage licenses, and census data, making it easier to trace family lineage.
  2. Facial Recognition: Facial recognition technology can analyze photos and match them with other images in the database, creating a visual family tree. It can also be used to identify unknown ancestors in old family photos.
  3. DNA Analysis: AI can analyze DNA test results to find genetic matches and identify relationships between family members. It can help to identify distant cousins, uncover ethnic origins, and find long-lost family members.
  4. Translation: AI-powered translation tools can help researchers decipher and translate foreign language documents, which can be a valuable resource for uncovering family history in different parts of the world.
  5. Predictive Analysis: AI can analyze existing data to create predictive models of likely family connections. This can help researchers to identify family members they might not have known existed and to predict possible future discoveries.
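As a small illustration of the record linkage idea in item 1, here is a Python sketch that compares two records by birth year and by how similar the names look. It is a toy example using the standard library's `difflib`, not the method any genealogy service actually uses. The function names, the `birth_year` field, the 0.8 threshold, and the “McMahan” spelling variant are all assumptions made up for the example.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a score from 0.0 to 1.0 for how alike two names look."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def likely_same_person(rec_a: dict, rec_b: dict, threshold: float = 0.8) -> bool:
    """Flag two records as a possible match (illustrative rule only).

    A match needs the same birth year and a name similarity at or above
    the threshold; real linkage weighs many more fields than this.
    """
    same_year = rec_a["birth_year"] == rec_b["birth_year"]
    return same_year and name_similarity(rec_a["name"], rec_b["name"]) >= threshold


census_entry = {"name": "James McMahan", "birth_year": 1920}   # hypothetical misspelling
birth_record = {"name": "James McMahon", "birth_year": 1920}
other_person = {"name": "Ella Small", "birth_year": 1920}

print(likely_same_person(census_entry, birth_record))
print(likely_same_person(census_entry, other_person))
```

A flagged pair is only a lead to investigate; the records themselves still have to be examined before deciding they describe the same person.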

In conclusion, AI has revolutionized the field of genealogy by enabling researchers to access and analyze vast amounts of data quickly and accurately. By using AI-powered tools and techniques, genealogy researchers can unlock a wealth of information about their ancestors and uncover hidden connections that would have been impossible to find otherwise. As the technology continues to evolve, it is likely that genealogy research will become even more accessible and exciting.

Book Review: “Generation by Generation”

With a wealth of knowledge and experience in researching, lecturing, and teaching others, Drew Smith has now turned his efforts to create a book for those who are beginning their genealogical research in the United States. “Generation by Generation: A Modern Approach to the Basics of Genealogy” is a concise way for new genealogists to benefit from Mr. Smith’s wisdom as well as enjoy his warm and approachable manner. He makes good use of analogies and examples so that the content is manageable by even the most novice researcher.

Part I of the book lays a solid foundation of key knowledge and skills a reader needs to conduct successful genealogical research. In Part II, readers are guided while they actually research their own ancestors. The book lends itself to navigating through its sections in order, supporting the reader with both a table of contents and an index.

The topics covered in Part I are important to understand and practice for successful research outcomes. Given that understanding cousin relationships can be tricky, the book is specific with regard to those relationships. Topics from changing calendars to DNA are presented clearly and painlessly throughout. As I was reviewing this section of the book, I found that just as I would wonder, “will he tell beginners about…,” he did! The breadth of those examples ranged from genetic recombination to the ethics of DNA testing to the fact that the records of the Freedmen’s Bureau also include records of employees. The importance of introducing the genealogical research process and the Genealogical Proof Standard (GPS) to new genealogists cannot be overstated. The book conveys an appreciation of citing sources, while eliminating the fear of them.

A key part of researching using “Generation by Generation” is helping a new researcher travel back in time, organizing how they will research within each time frame of generations of ancestors. The book provides descriptions of which records are appropriate for each time frame. With Mr. Smith’s guidance, the researcher steps backwards through their US ancestors’ generations who lived during the time periods: 1950 to now, 1880-1950, 1850-1880, 1776-1850 and pre-1776 British America. Mr. Smith also supports readers as they start to tackle researching their ancestors back to their European or Canadian roots. These divisions are logical, and it would be straightforward to follow the book’s structure to approach personal research or formulate a syllabus for a class or study group.

Another feature is due to the printing process. The chapters that contain an odd number of pages include a blank page at the end. These blank pages are an ideal location to enter notes and record questions.

This is a book to both read and use. It is a way for a reader to bring Mr. Smith home and have him alongside while taking significant steps to research their family history. Using Part I to learn the main ideas and terminology, and pitfalls, prepares the reader to be ready to do their own research using Part II, and have a good foundation before advancing into more detailed research.

The book is available at and other booksellers.

Notes: A review copy of the book was provided by the publisher. Like many other genealogists, I am a fan of The Genealogy Guys podcast, and recognize both of its hosts for service to the genealogical community.

This blog post is copyright ©2023 by Margaret M. McMahon. All rights reserved. No part of this post may be reproduced in any manner whatsoever without written permission, except in the case of brief quotations in articles and reviews. All copyrights and trademarks mentioned herein are the possession of their respective owners and the author makes no claims of ownership by mention of the products that contain these marks.

Cultural Anthropology and Genealogy

Cultural Anthropology

Last semester I took a third course in anthropology. After taking courses in Archaeology and Biological Anthropology, the next for me to tackle was Cultural Anthropology. (Our local community college does not offer a course in the fourth area of anthropology, linguistic anthropology.) Due to the nature of the subject material, this class was the least rooted in hard science. Cultural Anthropology studies how a society organizes itself. This is done through its beliefs, and how people live, think, create and find meaning. It introduces the concept that cultures have an intrinsic logic in their practices.

A big part of this branch of anthropology is fieldwork. Anthropologists in the field study societies, collecting data to build ethnographies. This data is often qualitative. Originally fieldworkers studied societies as impartial and distant observers; later they shifted to coming off the veranda to be participant observers.

When we go beyond our ancestors’ birth and death dates to fill in the dashes with what they did between those two dates, we are doing something similar to the fieldwork done by anthropologists. We often wish that we could go back in time to come off the veranda to be participant observers, but lacking that option, we can use the older anthropologists’ method of building their work on others’ first-hand source material. In our pursuit, we can use published sources that were contemporary to their times to learn about their culture at their time. When we research and write about our ancestors, we are building an ethnography. We can interact with the artifacts that they and their contemporaries left behind, which is like the activities of archaeologists.

Even though we cannot be participant observers in our ancestor’s society during their time, sometimes we can participate with a society that is close to theirs. This can be done through participating in ethnic crafts, cooking, dancing, clothing, reading the books they read, learning stories they told and heard, and learning about or practicing their beliefs.