
What Are The Odds? Photo Sleuthing by the Numbers

By Kurt Luther

One of the great strengths of the Civil War Photo Sleuth software we’re developing is that it makes it easy to find soldiers who look very much like the subject of your mystery photo. The software uses face recognition technology to automatically search our archive of more than 15,000 identified soldier photos and return the closest matches.

The power of this feature, however, can be misleading. When finding a soldier whose face resembles your unknown portrait becomes dramatically easier, the threat of confirmation bias looms even larger. With a few button clicks, you can find 100 similar-looking soldiers, of whom five or 10 may look like dead ringers. But now what?

Facial similarity is a key consideration in photo sleuthing, but it is rarely definitive. Face recognition is fundamentally hard for both humans and computers. According to a 2017 New York Times article about face recognition and law enforcement, “Numerous studies have shown that people will be wrong 10 to 30 percent of the time when asked to determine whether two photos of similar-looking strangers are the same person.” Even worse, the same article reported that greater experience does not meaningfully improve a person’s performance: seasoned detectives did no better than novices.

Computerized approaches to face recognition, such as the one used in our software, can prove faster and more systematic, but they have their limitations. Most crucially, they are designed to focus on facial features, especially the ratios between facial landmarks. This means that these systems tend to ignore important physical clues involving hair, ears, skin and other non-facial features.

The nature of our source material makes the task harder. The photographic techniques of the 19th century capture limited information about hair, skin and eye color. Profile views, vignettes and other period flourishes further obscure details. The subjects often appear physically homogeneous (young, white, male, of European ancestry), making them harder to tell apart. At the same time, an individual soldier’s face could change over the four years of the conflict, making photos of the same man look different. War exacted a physical toll on the human body. Soldiers modified their diet, appearance and grooming habits, out of choice or necessity.

We have spent long nights discussing these challenges and potential solutions. We don’t want to release a tool that encourages false identifications. False information spreads incredibly fast online, as we’ve seen on genealogy sites like Ancestry and Find-A-Grave. Once out there, it’s almost impossible to eliminate.

We have a variety of ideas in mind for promoting high-quality research on Civil War Photo Sleuth. One tool we’ve explored would provide statistical visualizations to help users weigh the likelihood of different theories and measure the strength of their evidence. Although identification is almost always a subjective process, consulting the probabilities can add a measure of scientific objectivity.

A more quantitative understanding of rarity can inform photo sleuthing in several interconnected ways. Rare elements streamline the identification process, getting us more quickly to the point where we have an answer, or know we won’t get one. Because Civil War uniforms and equipment tended to be highly standardized (with all the usual exceptions), we naturally focus on the soldier’s face as the primary source of uniqueness. Unfortunately, faces are not always as unique as we might hope. Personally, I hesitate to identify any Union private with standard gear and typical facial features, or any civilian portrait. In a nation with an 1860 population of 15 million males, of whom one in every five served in the conflict, the odds of a false positive remain just too high for me.
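To put those last figures in perspective, the arithmetic can be sketched in a few lines of Python. The numbers are the rough ones quoted in this column. Treating each of the archive’s 15,000 photos as a different soldier is a simplifying assumption, and the function and variable names are purely illustrative.

```python
# Rough figures quoted in this column.
MALES_1860 = 15_000_000      # approximate male population in 1860
SHARE_WHO_SERVED = 1 / 5     # about one in every five served
ARCHIVE_PHOTOS = 15_000      # identified photos in the reference archive


def estimated_servicemen(males: int, share: float) -> int:
    """Approximate number of men who served in the conflict."""
    return round(males * share)


def archive_coverage(photos: int, servicemen: int) -> float:
    """Rough share of servicemen pictured in the archive.

    Assumes every photo shows a different soldier (a simplification;
    repeat portraits of the same man would lower the real coverage).
    """
    return photos / servicemen


servicemen = estimated_servicemen(MALES_1860, SHARE_WHO_SERVED)
coverage = archive_coverage(ARCHIVE_PHOTOS, servicemen)
print(f"About {servicemen:,} men served; the archive pictures roughly {coverage:.1%} of them.")
```

Under these assumptions, the great majority of the men who served are absent from the reference archive, which is one reason a close facial match alone is weak evidence.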

On the other hand, rare elements (or, more often, a rare combination of common elements) can quickly narrow down the possibilities. A distinctive feature on its own, such as a Signal Corps badge or a Whitworth rifle, reveals almost at once which reference materials will be helpful and which can be safely ignored. A combination of common features, such as a regiment numeral, a shell jacket and sergeant’s chevrons, can lead us to an equally small pool of possibilities.

Either path leaves us at the mercy of the available reference imagery. Hopefully, it contains a photo with the face of the mystery soldier whose identity we seek. If not, we at least know what to look out for in the future.

A distinction should be made between contemporary rarity and rarity in modern archives. Some subjects, such as Union and Confederate generals, were relatively rare in practice (about 1,000 between both armies) but are well represented in today’s reference sources. Others, such as soldiers of the United States Colored Troops (USCT), unfortunately display the opposite pattern. While black soldiers represented a substantial portion of the Union war effort (about 180,000, or 10 percent of all enlistments), hardly any identified portraits survive. Comprehensive archives can offer at least two types of value with respect to rarity. In the best-case scenario, they contain the subject of your mystery photo. But almost as valuable, they may contain other soldiers who match your rare criteria, allowing you to apply the process of elimination more effectively.

Below are some examples of statistics on army branches, regiments, ranks and physical characteristics that can supplement facial similarity to narrow down an identification.


Some branches of service were much smaller than others. Overall, about 80 percent of Union soldiers served in infantry regiments, compared to 14 percent cavalry and 6 percent artillery. For Confederates, the proportions were similar: about 75 percent infantry versus 20 percent cavalry and 5 percent artillery. Specialized branches like engineers or sharpshooters were smaller still, with no more than a few thousand men each.

Branch proportions at the state level could be quite different from the above totals. In particular, proportions tended to be more balanced for states and territories that contributed smaller numbers of men. For example, Colorado furnished three regiments each of infantry and cavalry. A soldier from that territory was therefore about equally likely to serve in either.
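As a minimal sketch, the branch figures above can be treated as starting odds before any other evidence is considered. The national shares come straight from this column. The helper function is hypothetical and assumes every regiment was the same size, which is a simplification.

```python
# National branch shares quoted in this column (fractions of all soldiers).
UNION = {"infantry": 0.80, "cavalry": 0.14, "artillery": 0.06}
CONFEDERATE = {"infantry": 0.75, "cavalry": 0.20, "artillery": 0.05}


def shares_from_regiments(regiments: dict[str, int]) -> dict[str, float]:
    """Convert regiment counts into branch shares.

    Assumes all regiments were the same size (a simplification).
    """
    total = sum(regiments.values())
    return {branch: count / total for branch, count in regiments.items()}


# Colorado: three regiments each of infantry and cavalry.
colorado = shares_from_regiments({"infantry": 3, "cavalry": 3})

for branch, share in colorado.items():
    print(f"Colorado {branch}: {share:.0%} (Union overall: {UNION[branch]:.0%})")
```

The comparison shows how far a small territory’s starting odds can sit from the national picture.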


States also furnished different numbers of soldiers. The six states of New York, Pennsylvania, Ohio, Illinois, Indiana and Massachusetts each contributed six-figure numbers of men to the Union cause. New York alone contributed nearly 400,000, far outpacing the runner-up, Pennsylvania, with a considerable 265,000. In contrast, states like Minnesota, Rhode Island and Kansas made significantly smaller contributions of fewer than 20,000 men each. All else being equal, if you don’t know where your Union soldier hails from, New York is the single most likely answer. If you think he could come from Oregon, which raised a single infantry regiment, you’d better have good evidence.
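For a rough sense of scale, these state totals can be turned into shares of all Union enlistments. The sketch below is a back-of-the-envelope estimate only. The Union-wide total is not quoted directly in this column; it is inferred here from the earlier statement that about 180,000 black soldiers made up 10 percent of all enlistments. The names are illustrative, and enlistments are not the same thing as individual men.

```python
# Approximate Union figures quoted in this column.
ENLISTMENTS = {"New York": 400_000, "Pennsylvania": 265_000}
USCT_ENLISTMENTS = 180_000   # described as about 10 percent of all enlistments
USCT_ONE_IN = 10             # i.e. one enlistment in ten

# Inferred Union-wide total (an estimate derived from the two figures above).
implied_total = USCT_ENLISTMENTS * USCT_ONE_IN


def state_share(state: str) -> float:
    """Approximate share of all Union enlistments credited to a state."""
    return ENLISTMENTS[state] / implied_total


for state in ENLISTMENTS:
    print(f"{state}: roughly {state_share(state):.0%} of Union enlistments")
```

Under these assumptions, even the largest contributor accounts for well under half of all enlistments, so a state guess still needs supporting evidence.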

Regrettably, detailed figures for Confederate state enlistment numbers are hard to come by. But we can make some rough estimates by comparing the number of military units organized. On the high end, the states of Virginia, North Carolina, Alabama, Georgia and Tennessee each raised more than 60 infantry regiments. On the low end, Florida and Kentucky raised about 10 regiments each, and Maryland just two.

It’s also helpful to account for when a regiment was mustered in, and for how long. Regiments formed later in the war or for shorter terms of service had fewer soldiers passing in and out of the ranks, and a tighter window of time in which to get photographed.

A company letter on hat brass can help narrow down the pool by a factor of 10, as most infantry regiments formed no more than 10 companies. Union heavy artillery units often had 12 companies, so the presence of “L” or “M” company letters can help differentiate.


Rank can serve as a strong indicator of uniqueness. It’s much harder to convincingly identify a private than a colonel, simply because privates are about a thousand times more common than colonels. Even within a single regiment, any average-looking private will likely have a doppelganger or two.

Field officers, namely colonels, lieutenant colonels and majors, are among the rarest ranks, each with a ratio of about 1:1,000 in their regiment. Most regiments had one of each at a time. Some had fewer: smaller regiments, especially Confederate ones, often lacked a full colonel. And some had more: large heavy artillery regiments could have multiple majors.

Except in short-term or late-war regiments, most field officer ranks were held by several men over time because of promotions, casualties and resignations. For example, on the high end of the spectrum, six different men served as colonel of the 1st Michigan Infantry, a three-year regiment.

Other staff officers were equally rare. Each regiment had a surgeon, one or two assistant surgeons and a chaplain, whose distinct uniforms often make them easy to identify. Regiments also had one adjutant and one quartermaster, but they can be hard to visually distinguish from company officers.

A few non-commissioned officer ranks were also one in a thousand. Each regiment had a sergeant major, a quartermaster sergeant, a commissary sergeant and a hospital steward. All but the commissary sergeant had a unique rank insignia. Additionally, Confederate regiments had one principal musician, while Union regiments had two.

The ranks described above are relatively rare and easy to identify. The remaining ranks are at least an order of magnitude more common, and consequently harder to narrow down. A full-strength infantry regiment had 10 companies, each with its own captain, first lieutenant, one or two second lieutenants and a first sergeant, along with multiple sergeants and corporals, and dozens of privates. Given their exposure on the battlefield, these ranks also experienced significant turnover. Making a convincing identification for one of these ranks requires considerably more evidence.
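The “order of magnitude” gap can be made concrete with a small tally. This sketch counts only the ranks for which this column gives a number, at a single point in time in a full-strength infantry regiment. It ignores turnover, and the data layout is illustrative.

```python
COMPANIES = 10  # companies in a full-strength infantry regiment

# (low, high) number of men holding each rank at one time.
REGIMENT_WIDE = {
    "colonel": (1, 1),
    "lieutenant colonel": (1, 1),
    "major": (1, 1),
    "sergeant major": (1, 1),
}
PER_COMPANY = {
    "captain": (1, 1),
    "first lieutenant": (1, 1),
    "second lieutenant": (1, 2),
    "first sergeant": (1, 1),
}


def regiment_counts() -> dict[str, tuple[int, int]]:
    """Return the (low, high) headcount per rank for the whole regiment."""
    counts = dict(REGIMENT_WIDE)
    for rank, (low, high) in PER_COMPANY.items():
        counts[rank] = (low * COMPANIES, high * COMPANIES)
    return counts


counts = regiment_counts()
for rank, (low, high) in counts.items():
    size = str(low) if low == high else f"{low} to {high}"
    print(f"{rank}: {size} per regiment")
```

A captain’s portrait therefore starts with about ten candidates per regiment where a colonel’s starts with one, before turnover multiplies both.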

Physical characteristics

Facial hair can provide some clues. A tongue-in-cheek but purportedly accurate analysis of portraits of 123 Union and 95 Confederate commanders found that more than 90 percent had at least some facial hair. Among the 11 reported styles, including French Cut, Van Dyke and Muttonchops, the most popular was the Long Beard, evenly split between both armies and accounting for about 24 percent of all officers in the sample. In contrast, Fred Adolphus’s sobering analysis of field photographs of 15 deceased Confederate enlisted men at Fort Mahone found that all were either clean-shaven or wore close-cropped beards, with “no stylized mustaches, goatees, side whiskers or long beards.” Taken together, these studies support the intuition that, all else being equal, a soldier with elaborate facial hair is more likely an officer.

Height can also be a useful differentiator. According to William F. Fox’s analysis of more than one million Union enlistment records, the average height of a soldier was 5 feet, 8 ¼ inches. Only 3,613 men out of the million he analyzed stood taller than six feet, three inches. Unusually tall men can be identified by comparing their height with objects or comrades pictured alongside them, and then consulting enlistment records. For example, in a 2012 NPR article, Sam Small of The Horse Soldier identified a 14th Brooklyn soldier in a Library of Congress tintype by estimating his height from the rifle he stood beside.
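Fox’s height figures translate into simple odds. The sketch below treats his sample as exactly one million men and assumes a regiment of about 1,000, in line with the one-in-a-thousand ratios used for field officers. Both are round-number assumptions, and the names are illustrative.

```python
TALL_MEN = 3_613        # men taller than six feet, three inches
SAMPLE = 1_000_000      # treated as exactly one million for this estimate
REGIMENT_SIZE = 1_000   # assumed round figure for a regiment

tall_share = TALL_MEN / SAMPLE                    # fraction of very tall men
one_in = round(SAMPLE / TALL_MEN)                 # "one in N" odds
expected_per_regiment = tall_share * REGIMENT_SIZE

print(f"Very tall men: {tall_share:.2%} of the sample, or about 1 in {one_in}")
print(f"Expected in a regiment of {REGIMENT_SIZE:,}: about {expected_per_regiment:.1f}")
```

On these assumptions, a regiment would contain only a handful of such men, so a clearly towering subject shrinks the pool about as sharply as a rare rank does.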


The stats presented in this column offer a sample of the kind of information we hope to provide to all users of Civil War Photo Sleuth in the form of interactive visualizations. Experienced photo sleuths may have already internalized many of them. With practice, one can learn to evaluate at a glance, before conducting any research, whether a convincing identification is likely or even possible, and prioritize accordingly.

Ultimately, the software is just a tool, and identification remains a human process. Likewise, no magic number or confidence level exists for what is universally considered an airtight identification. Aside from finding an exact copy identified in a reputable archive, there will always be elements of subjectivity and intuition. A better understanding of rarity and proportion, however, can help us justify our assumptions to others, and guide our investigation down the most promising path.

We encourage you to pick up the torch to continue this investigation and, as always, submit other photo mysteries to be investigated as well as summaries of your best success stories to MI via email. Please also check out our Facebook page, Civil War Photo Sleuth, to continue the discussion online.

Kurt Luther is an assistant professor of computer science and, by courtesy, history at Virginia Tech. He writes and speaks about ways that technology can support historical research, education and preservation.

© Military Images Magazine. The contents of this page may not be reproduced in whole or part without the written consent of the publisher. Views expressed by the authors do not necessarily represent those of Military Images or Military Images, LLC.
