Surname Study and AI Part 1: The Approach

Blog banner - Blog Post Surname and AI 1

This blog post begins a series of posts exploring an ongoing surname study and my recent use of artificial intelligence (AI) in it. In this post, I will describe the history of getting to this point in my efforts.

Over the course of several years, I have been working on a surname study. My goal was to find out if and how families who lived in Rhode Island from 1850 to 1900 were connected. Chain migration to the United States from Ireland was entirely likely, and by connecting these family units I could potentially research collateral relatives to learn more about the family unit(s) back in Ireland.

Using what I had learned from researching my direct ancestors, these were the parameters:

  • Surname: Gilroy
  • Place: Rhode Island, US
  • Timeframe: 1850-1900

For this project, I collected both federal and state census data to use as the backbone of the research. Then I filled in the intermediate years using vital records. I faced some challenges when collecting the data. At that time, Rhode Island censuses and vital records were obtained by mailing requests to the incredibly helpful and knowledgeable staff at the Rhode Island State Archives. Copies of the records were available for modest fees, but you needed to supply details about the record you sought. (Contrast that with the ability to search for everyone with the same or similar name in a record set through a digital database.) This meant that some of the names came from index-only databases as placeholders until copies of the original records could be found. An index of vital records for the state was available on Ancestry, as was a composite of indexed city directories which formed an 1890 US Census substitute.

Another challenge was correlating dissimilar data. Just as every federal census asks different questions, so does every state census. Vital records change what data is recorded over time, too. The data found in city directories is also different from the other records, containing addresses and occupations but lacking explicit family connections.

My main product was an Excel spreadsheet with tabs for the data collected from each record type by year. I worked to reconcile the different data collected from similar record types. From that spreadsheet, I extracted family units, capturing them in PowerPoint to visually show how the family units changed over time. This gave me some insights but was labor intensive. I contemplated my next steps, knowing that analyses of ages, appearances of people with the same surnames in Rhode Island, and child naming patterns, as well as mapping the neighborhoods were among them.

Fast-forward to now, when more records are available online. For example, in addition to the vital record indexes, images of the RI vital record ledgers are now online. The Rhode Island state censuses are also online. And then there is AI to help with formatting, visualizing and analyzing data.  

Some challenges still exist. There are gaps in census coverage because the 1890 US Population Census and the 1895 Rhode Island Census are no longer available. The use of other record types will help to fill in the census gaps. A state-specific challenge is the fact that the 1885 Rhode Island Census is available as an alphabetized index of names, requiring family units to be connected using data in the “Family Number” column.
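Reconnecting family units from an alphabetized index can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in the study: the column names (`town`, `family_number`, `age`) and the sample rows are hypothetical, and it assumes that family numbers may repeat from town to town, so the town is made part of the grouping key.

```python
from collections import defaultdict

# Hypothetical sample rows, invented for illustration only.
# They are NOT real entries from the 1885 Rhode Island Census.
index_rows = [
    {"name": "Gilroy, Ann", "age": 34, "town": "Providence", "family_number": 112},
    {"name": "Gilroy, Bridget", "age": 60, "town": "Providence", "family_number": 87},
    {"name": "Gilroy, James", "age": 36, "town": "Providence", "family_number": 112},
    {"name": "Gilroy, Mary", "age": 8, "town": "Providence", "family_number": 112},
    {"name": "Gilroy, Patrick", "age": 62, "town": "Providence", "family_number": 87},
]


def group_by_family(rows):
    """Rebuild household groups from rows listed in alphabetical order."""
    families = defaultdict(list)
    for row in rows:
        # Assumption: a family number is only unique within a town.
        key = (row["town"], row["family_number"])
        families[key].append(row)
    # List the oldest members first so likely heads of household come first.
    for members in families.values():
        members.sort(key=lambda r: r["age"], reverse=True)
    return dict(families)


families = group_by_family(index_rows)
for town, number in sorted(families):
    names = "; ".join(m["name"] for m in families[(town, number)])
    print(f"{town}, family {number}: {names}")
```

Sorting by age is only a convenience for reading the groups; relationships within each group still have to be confirmed from other records.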

The state of AI is constantly changing, but I decided to investigate how AI could help with the collection and analysis of data.

I did try an analysis of the whole spreadsheet in ChatGPT, and I was able to create family groups and use them to discriminate between some people who had the same name. However, the data was not combined in an efficient manner, and rather than have one large spreadsheet, I decided it would be more understandable to break the data into more manageable pieces, based on the record types. The composite spreadsheet was broken down into different spreadsheets: (1) censuses, (2) births, marriages, and deaths and (3) city directories. I also decided to use AI to help with the data collection process, the analysis and different ways to visualize the data.
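The three-way split described above could be sketched as follows. This is an illustration under assumptions, not the actual workbook layout: the `record_type` column, its labels, and the sample rows are all hypothetical.

```python
from collections import defaultdict

# Hypothetical record-type labels mapped to the three spreadsheets.
SPREADSHEET_FOR = {
    "federal census": "censuses",
    "state census": "censuses",
    "birth": "births_marriages_deaths",
    "marriage": "births_marriages_deaths",
    "death": "births_marriages_deaths",
    "city directory": "city_directories",
}


def split_by_record_type(rows):
    """Sort rows from one composite sheet into per-record-type groups."""
    groups = defaultdict(list)
    for row in rows:
        label = row["record_type"].strip().lower()
        # Anything unrecognized is set aside for manual review, not dropped.
        groups[SPREADSHEET_FOR.get(label, "unclassified")].append(row)
    return dict(groups)


# Hypothetical sample rows, invented for illustration only.
composite = [
    {"record_type": "Federal Census", "year": 1880, "name": "Gilroy, James"},
    {"record_type": "State Census", "year": 1885, "name": "Gilroy, James"},
    {"record_type": "Marriage", "year": 1882, "name": "Gilroy, James"},
    {"record_type": "City Directory", "year": 1890, "name": "Gilroy, James"},
    {"record_type": "Probate", "year": 1899, "name": "Gilroy, James"},
]

pieces = split_by_record_type(composite)
for sheet, rows in pieces.items():
    print(sheet, len(rows))
```

Keeping an "unclassified" group makes it easy to spot rows whose record type was typed inconsistently before anything is handed to an AI tool.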

At the end of this step, I had a basic plan to redo the data collection and collect additional data that had become available online, and I had developed ideas on how AI could support this study. The next step will be to use only census data and have AI create the backbone of a timeline for the individuals and families.

AI: Meta Prompting

Blog Banner AI Meta Prompting

If you have attended one of my AI presentations, then you know how important it is to develop prompt engineering skills to get the most out of Large Language Models (LLMs). The good news is that we do not always have to create the perfect prompt on our own!   

There is a harsh term used in my field, GIGO, which stands for Garbage In, Garbage Out. When it comes to AIs, this applies to the fact that the LLM response (output) will only be as good as our prompts (input).  

A simple explanation of meta prompting is to have one Large Language Model (LLM) create a prompt for another one. Meta prompting is more involved than that because it builds a prompt with more specific instructions about the steps to take to realize the goal of the prompt. It is as if the LLM is translating what you want to do into LLM language!

The cinematic arts student at my home gave me some insights into his practical use of meta prompting. He was having an issue with an AI that generates video. It was not creating what he was describing, so he turned to ChatGPT to explain his vision and ask for a prompt to use for generating that image. ChatGPT dutifully responded with a prompt that did work with the AI video generator. The message is that when it comes to crafting prompts, we are not on our own.

While working to understand meta prompting, I thought of an example application to try before applying this skill to genealogy. I asked ChatGPT to create a prompt for me that I could use to have a research report generated for me about a topic. I also specified what and how I wanted to investigate the topic, as well as the fact that I wanted sources and in-text citations. Using the power of the AI to recognize patterns, I certainly wanted analysis to be part of generating the data in the report.

Prompt for a prompt to generate a world building prompt

A prompt was created, but ChatGPT had some specific questions that it included in its response about the type of citation I wanted and asked if there were other constraints, such as word count or including quotes. We had a conversation to refine the prompt, starting with a 308-word prompt and concluding with the final response which was a modular, reusable 1122-word prompt.

The prompt began with: “You are an expert in …

The prompt contained sections for FOCUS & SCOPE, RESEARCH & SOURCES, STRUCTURE OF THE REPORT, STYLE & LENGTH and FINAL OUTPUT

ChatGPT’s prompt also included some interesting anti-hallucination guidance: “If there are areas where evidence is limited (for instance, few direct author comments about a particular name), clearly indicate uncertainty and base comments on reasonable inference, not fabrication.”
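For readers who like to keep their requests reusable, the "prompt for a prompt" idea can be sketched as a small Python helper that assembles the request text. This is a hedged illustration only: the wording of the request, the function name `build_meta_prompt`, and the example topic are my assumptions, not the text used in the conversation described here. Only the five section names come from the prompt ChatGPT produced. The helper builds a string to paste into a chat window; it does not call any AI service.

```python
# Section names taken from the prompt described above.
SECTIONS = [
    "FOCUS & SCOPE",
    "RESEARCH & SOURCES",
    "STRUCTURE OF THE REPORT",
    "STYLE & LENGTH",
    "FINAL OUTPUT",
]


def build_meta_prompt(topic, goals, citation_style="in-text citations plus a source list"):
    """Assemble a request asking an LLM to write a prompt for a research report."""
    section_lines = "\n".join(f"- {name}" for name in SECTIONS)
    return (
        "Write a prompt that I can give to a large language model "
        "to produce a research report.\n"
        f"Topic: {topic}\n"
        f"What I want to investigate: {goals}\n"
        f"Citations: {citation_style}\n"
        "Organize the prompt into these sections:\n"
        f"{section_lines}\n"
        "Tell the model to state uncertainty where evidence is limited "
        "and not to fabricate details.\n"
        "Before writing the prompt, ask me any clarifying questions you need."
    )


# Hypothetical example topic and goals, for illustration only.
meta_prompt = build_meta_prompt(
    topic="Chain migration from Ireland to Rhode Island, 1850-1900",
    goals="patterns in settlement, occupations, and family connections",
)
print(meta_prompt)
```

The last line of the request invites the same kind of clarifying questions that shaped the refined prompt in the conversation above.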

I decided to use the prompt in ChatGPT, and opened a new chat. I pasted in the prompt, and it responded with a request for clarification:

ChatGPT asks for clarification

It offered me options, providing details, which are omitted for brevity:

  • Option A — Use only 100% verifiable, well-known, widely documented sources
  • Option B — Allow me to cite plausible but harder-to-verify sources
  • Option C — A blended approach

Then it asked me to respond with which option it should use:

ChatGPT asks for which option to use

After the clarification interaction, ChatGPT told me that its reply would be long:

ChatGPT advising me of a long reply

It waited for my response before it began to generate the report:

My response to generate the report

The report was reasonable and described patterns. ChatGPT offered me formats for downloading the report and other products based on the report, including an executive summary and PowerPoint presentations. If I want to dig deeper, this report is valuable to me as a starting place.

Of course, the caveats still remain about not using this for school reports (unless the assignment calls for the use of AI) and not submitting it to a client. There can be tell-tale signs of an AI-generated report, as I know from a high school science fair project done by that same cinematic arts student, and documentation out on the web.

So, will you try meta prompting? Let me know how you do.

NCGS Fall Conference 2025

Blog Post Banner NCGS Fall Conference 2025

Recently I had the pleasure of presenting at, and attending, the North Carolina Genealogical Society Fall Conference 2025. The Conference was very well planned and organized at a wonderful venue with great food. As much as I appreciate how virtual presentations let me reach many places far from where I am based, it was nice to be with a group of genealogists, learning and chatting.

At the Conference, I presented sessions about Military Research and Artificial Intelligence (AI). When speaking about military research, I always customize my presentation to include finding military records for the location of the audience. North Carolina has great resources, both in person and online!

NCGS Military Presentation - Cover

With a Ph.D. in Computer Science and Engineering, I am always reaching deep into the technology of AI to learn its inner workings, and to then share an understanding of how it works and how to use it. As a graduate school professor in cybersecurity, and having tested computer code used on military aircraft for years, I also have a perspective about what we should be concerned about and what can go wrong.

Ancestors, AI and Prompt Engineering NCGS - COVER

What was also fantastic about the Conference was that people could attend the lectures virtually. The NCGS members and technical staff streamed the presentations and recorded them for attendees to watch later. I knew everything was working when questions from online viewers came during the lectures and insightful questions via email were waiting when I returned to my hotel.

Even though my research in North Carolina is limited to a few months during WWII at Camp Davis, I did attend J. Mark Lowe’s presentation, “Creating North Carolina Local and Regional Locality Guides.” (Mark’s smile is even bigger in person!) The presentation definitely had information that I will carry forward to the places where I do research. I will never look at detailed maps the same way again.

I attended another terrific presentation about using DNA to solve maternal surnames by Kate Penney Howard. Jon Smith’s workshop about using AI for creating locality guides certainly shifted my mindset from the free-form text I have been using, and his tips about using Gemini in Chrome tabs were game changers. Thankfully the presentations were recorded so that I can enjoy Diane L. Richard’s presentation about Researching Your Ancestors as Kids. (Diane and I share an educational experience: Go RPI Engineers!)

The beginning-to-intermediate artificial intelligence presentation I gave on the first morning may have provided a warm-up for Steve Little’s intermediate artificial intelligence presentation. It is always interesting to see how other genealogists are using AI tools, and how its use is gaining acceptance. Promise to keep checking your output and stay sensitive to privacy concerns!

Thank you to everyone who planned and worked on making the 2025 North Carolina Genealogical Society Annual Conference such a great experience, to the audience members who shared their time with me, and all the other instructors and attendees for a rewarding and fun time!

Recent AI Developments

Blog post banner Recent AI Developments

Have you been following the latest in AI?

One thing I always guarantee during my presentations is that AI models will change! There have been changes to ChatGPT’s video generating model, Sora. As a result, I don’t see Sora anymore when I log in to my Plus account on ChatGPT. Now I have to log in separately to use Sora. Part of the change is that Sora 2 is now available! Pro users can use it now, but as a Plus user, it may be a while before I get a chance. You can read about the new video model at: https://openai.com/index/sora-2/

An AI ‘actor’ known as Tilly Norwood has been provoking Hollywood. She is a purely AI-generated character coming from Xicoia, the AI division of Particle6. You can watch her, and a cast of AI-generated characters in a sketch written by ChatGPT: AI Commissioner | Comedy Sketch | Particle6

When exploring the world of copyright and artificial intelligence, you may want to check out the U.S. Copyright Office’s 3-part Report on Copyright and Artificial Intelligence, which can be viewed and downloaded at https://www.copyright.gov/ai/. Purely AI-generated content is not protected by copyright. There has to be a human contribution.

AI copyright infringement lawsuits continue, with the latest one being Warner Bros. Discovery against Midjourney, an AI image generator. You can read about it at:  https://apnews.com/article/warner-bros-midjourney-ai-copyright-lawsuit-dc-studios-b87d80d7b4a4dfdcf0ee149d30830551 This article describes how this AI can output images that violate copyright.

Meanwhile, some lawsuits are drawing to a close. Although a judge stated that Anthropic’s training of an AI model using authors’ material was fair use, the problem was that it used pirated versions of the books for that training. Anthropic agreed to pay $1.5 billion to settle this copyright infringement lawsuit, but the court will need to approve this settlement. If approved, the authors of over 500,000 books will receive about $3,000 per book. You can read an NPR article about it: “Anthropic settles with authors in first-of-its-kind AI copyright infringement lawsuit” at https://www.npr.org/2025/09/05/nx-s1-5529404/anthropic-settlement-authors-copyright-ai.

NCGS 2025 Fall Conference

NCGS Fall Genealogical Society 2025 Fall Conference ad

Will I see you there?

I am excited to be invited to present in person and online!

On Friday, I will be presenting Ancestors, AI, and Prompt Engineering.

NCGS Fall Genealogical Society 2025 Fall Conference McMahon AI

On Saturday, I will be presenting a Crash Course in Researching Ancestors in the US Military.

NCGS Fall Genealogical Society 2025 Fall Conference McMahon Military Research

There are great speakers, and great talks, Friday and Saturday. There is also an optional Beginner Day on Thursday, featuring four lectures just for beginners!

NCGS Fall Genealogical Society 2025 Fall Conference Beginner Day Ad

Rev. Fr. Kennedy and AI

Blog Post banner Rev Fr Kennedy and AI

It has been a while since there has been a blog post. In that time, I have been working on my newest presentation, Mining Morning Reports for Genealogical Gold. You can read a review here: https://aweekofgenealogy.com/comments

In addition to getting ready for other presentations, I have also been experimenting with the NARA Catalog API to get an alternate way of searching the catalog.

I did spend some time with AI offerings in my research into the Rev. Fr. Thomas J. Kennedy.

First, I uploaded the sketch that I have of him from the newspaper to ChatGPT and prompted it to: Change this line drawing into a picture

A few liberties were taken by the built-in DALL·E image generation system when creating this image. In the sketch it does appear that he is probably wearing a cassock of the time, but the details of buttons and the notch in the collar are not evident in the sketch.

I may need to try this process again with a stricter prompt to rein in ChatGPT’s creative vision.

I looked up his eye color recorded in a Civil War roster and, in a follow-on prompt, asked: can the image be changed so that his eyes are more grey

The resulting image looked less like the sketch.

Since the Rev. Fr. Kennedy was dying at the time of the column in the Brooklyn Eagle, it finally occurred to me that there must have been a photo of him that was used as the basis of this sketch. I have not located one yet. This image also looks like that of a younger man. My focus has been on the data, but it seems I may need to be searching for the original photo of him. Does the original photo still exist? (Although the Archivist at the Diocesan Archives of the Roman Catholic Diocese of Brooklyn was very helpful, they did not have a deceased priest personnel file for him in their archives because he had died in Kentucky and not in Brooklyn.)

Using the original sketch, I did a Google Image search at https://images.google.com, adding the search terms: Kennedy Brooklyn

Naturally, our blog posts showed up, and data about the life of the Rev. Fr. Thomas J. Kennedy extracted from the blog posts appeared in the AI summary. Many of the photos that were returned in the results were of men religious of all different faiths.

The “Dive Deeper in AI Mode” button that appeared at the end of the AI Summary made me curious, so I clicked on it. Gemini let me know the number of sites it was searching, and informed me about two sites: our blog and the New York Times. There was an article from the NY Times dated Oct. 5, 1901: “Rev. T.J. Kennedy Said to be Dying.”

Our county library has a subscription to the ProQuest Historical Newspapers, which includes the New York Times, so I logged in and searched for the article using these search terms:

Rev. T.J. Kennedy Said to be Dying 1901

There were three results, two of which were ads from the 1970s.

The New York Times article was succinct and did not offer more information than the article in the Brooklyn Eagle. It was actually published several days after his death in Kentucky. It mentioned that he had retired about a year earlier and that ill health was the reason for his pension. He was in Kentucky, at a Trappist Monastery. He was well-known in Grand Army of the Republic (GAR) circles.

Of course I downloaded a pdf file with the article, a pdf file with the whole newspaper page, and a (brief) citation in Chicago style: “Rev. T.J. Kennedy Said to be Dying.” 1901., Oct 05 New York Times (1857-1922), 9. https://www.proquest.com/newspapers/rev-t-j-kennedy-said-be-dying/docview/96159883/se-2. (Further reproduction of New York Times articles is prohibited without permission.)

There is certainly more to do to fill in this ancestor’s story, but the use of the AI tools ChatGPT and Gemini inspired both my creativity and my next steps in the research.