"Without those 32 'no data' entries, the results don't really help with any understanding of the market in any way."
Why? There is no reason to believe, for example, that the users of a specific camera removed EXIF prior to submission any more than users of other cameras.
"What would be more interesting as a data set is the 72+K pool of users."
Obviously not. Why would anyone want to know what camera was used to take an unremarkable photograph?
Users of a specific camera need not be systematically more likely to have removed the EXIF for the missing entries to matter. There were, e.g., 11 users of the Canon 5D III, and 11 users of the D5 and D810. If only 5 of the 39 images with no EXIF were taken with a 5D III, it would clearly be the number one camera. And is it so implausible that the judges might be biased toward images of (say) civil conflict, that (say) Leicas are best suited to that context, and that the photographers did systematically remove those EXIFs to avoid endangering the people portrayed?
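To make the arithmetic above concrete, here is a minimal sketch. It assumes the counts quoted in the post are comparable units and reads "11 users of D5 and D810" as 11 each, which is an assumption; the function names are made up for illustration.

```python
# Counts as quoted in the post. Reading "11 users of D5 and D810" as
# 11 each is an assumption made for this sketch.
known = {"Canon 5D III": 11, "Nikon D5": 11, "Nikon D810": 11}
NO_EXIF = 39  # images with no EXIF, as quoted in the post


def ranking_if(camera, hidden, known_counts):
    """Ranking after attributing `hidden` no-EXIF images to `camera`."""
    counts = dict(known_counts)
    counts[camera] += hidden
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)


def min_hidden_to_lead(camera, known_counts):
    """Smallest number of no-EXIF images that puts `camera` alone in first place."""
    best_other = max(v for k, v in known_counts.items() if k != camera)
    return max(0, best_other - known_counts[camera] + 1)


if __name__ == "__main__":
    print(ranking_if("Canon 5D III", 5, known))
    print(min_hidden_to_lead("Canon 5D III", known), "of", NO_EXIF)
```

With a three-way tie at the top, a single unattributed image is enough to change which camera comes first, so the missing entries matter even without any systematic bias.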
Plus, each of the four genre categories had two sub-categories: "singles", and "stories" with up to 10 images. It is illegitimate to treat the multiple images in a "story" as independent and add up the total number of "prize-winning images" for each camera. The correct metric is cameras used by prize-winning photographers. (So, yes, the images with no EXIF could be from three or four cameras, in which case they would not have a large impact on the results.)
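The difference between the two metrics can be sketched as follows. The entries are entirely hypothetical (photographers "A" to "D", "Camera X" and "Camera Y") and only illustrate how one 10-image story can dominate a per-image tally.

```python
from collections import Counter

# Hypothetical entries: (photographer, camera, number of winning images).
entries = [
    ("A", "Camera X", 10),  # one "stories" entry with ten images
    ("B", "Camera Y", 1),   # "singles" entries
    ("C", "Camera Y", 1),
    ("D", "Camera Y", 1),
]


def count_by_image(rows):
    """Tally every winning image separately (the metric criticised above)."""
    tally = Counter()
    for _, camera, n_images in rows:
        tally[camera] += n_images
    return tally


def count_by_photographer(rows):
    """Tally each photographer/camera pair once, however many images it covers."""
    pairs = {(photographer, camera) for photographer, camera, _ in rows}
    return Counter(camera for _, camera in pairs)


if __name__ == "__main__":
    print(count_by_image(entries))
    print(count_by_photographer(entries))
```

Counted per image, Camera X leads 10 to 3; counted per photographer, Camera Y leads 3 to 1.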
I am sure the judges would be pleased by your vote of confidence, but equating prize-winning with "remarkable" and not prize-winning with "unremarkable" is unjustified. Even if you knew what criteria were used to judge, why would you care? It isn't the cameras used for the best images you care about, it is the cameras used for the images that are best in regard to things cameras make a difference to. If there were a category for "Best Photo Needing Fast AF" I might be interested in which cameras were used, but there isn't. If you look at the collection (https://www.worldpressphoto.org/collection/photo/2018) you can see that the success of some of the images may have depended on the camera - e.g., the Photo of the Year, where faster and more accurate AF might have made some difference, although positioning - by luck or choice - was more important. I have no idea which camera was used for that image, but it is of no interest to know what camera was used unless the excellence of the picture is due to something the camera contributed. For many of the winning images, there is no question that any camera would have done as well, so the camera used is irrelevant.
This is not to say that data from the whole pool would be useful: I don't see why I would care what cameras any number of photojournalists find best suited to their work, any more than I care what motor vehicles taxi drivers find best suited to theirs.