be a researcher. get millions of images that you don't personally have to look at, from a national public census registry or something to that effect, which shouldn't(?) be too difficult. have them analyzed by an image recognition system that takes depth, contrast and all that kind of shit into account, so the different facial features get assigned geometric values (distances, proportions, symmetry, etc.) in relation to one another. you now have millions of geometric representations of faces. split them on easy-to-recognize things like skin color or size first, then keep going all the way down to an individual shape being too dissimilar from set A but similar to set X by geometric relational analysis. you're gonna end up with a fuckload of tl;dr factors that you will probably never analyze one by one, but you can use factor analysis to take those divisions and put them back together into higher-order factors. the highest-order factors will be the ones that are the most internally consistent(?) and still few enough to count. those are the different human races. analyzing all of these factors alongside each person's genome would likely give a lot of insight into the human genome. we'd have humans properly subclassed instead of arbitrarily, and we'd have acquired a lot of in-depth research on genetics.
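for the "put them back together into higher-order factors" step, here's a rough sketch of the general idea, not an actual face pipeline. everything in it is an assumption for illustration: the data is synthetic (five made-up measurements driven by two hidden variables), the function names are hypothetical, and it uses plain power iteration on the correlation matrix to pull out the leading principal axis, which is a simpler stand-in for full factor analysis rather than the real thing.

```python
import math
import random


def synthetic_measurements(n=500, seed=0):
    """Made-up data: 3 measurements driven by one hidden variable, 2 by another."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent_a = rng.gauss(0, 1)
        latent_b = rng.gauss(0, 1)
        rows.append([
            latent_a + 0.3 * rng.gauss(0, 1),
            latent_a + 0.3 * rng.gauss(0, 1),
            latent_a + 0.3 * rng.gauss(0, 1),
            latent_b + 0.3 * rng.gauss(0, 1),
            latent_b + 0.3 * rng.gauss(0, 1),
        ])
    return rows


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(p)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / n for b in range(p)]
            for a in range(p)]


def leading_factor(corr, iterations=200):
    """Largest eigenvalue of `corr` and the matching loadings, by power iteration."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    # Loadings: eigenvector scaled by the square root of its eigenvalue.
    loadings = [x * math.sqrt(eigenvalue) for x in v]
    return eigenvalue, loadings


if __name__ == "__main__":
    rows = synthetic_measurements()
    eigenvalue, loadings = leading_factor(correlation_matrix(rows))
    print("eigenvalue:", round(eigenvalue, 2))
    print("loadings:", [round(x, 2) for x in loadings])
```

in this toy setup the first three measurements should load heavily on the leading factor and the last two should barely load at all, which is the "many low-level variables collapse into a few higher-order ones" behavior. a real analysis would use a proper factor analysis routine with rotation and some criterion for how many factors to keep.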