You have voting records containing a free-text city field. The same city may appear in many forms (e.g., "NYC", "New York", "New York City"), and you must aggregate votes by canonical city reliably.
Design an approach to cluster or normalize city-name variants into canonical entities so votes aggregate correctly.
Describe:
Assume you can use authoritative gazetteers (e.g., national census/OSM/GeoNames) that list canonical city IDs, names, alternative names, and geographies (state/county/country), and that some contextual fields (e.g., state, ZIP) may be present in the voting data.
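One possible starting point, offered as a minimal sketch rather than a reference answer: normalize each free-text value, look it up in an alias index built from the gazetteer, fall back to approximate string matching for typos, and use the state field to disambiguate when it is present. The gazetteer rows, city IDs, the 0.85 similarity cutoff, and the function names `normalize`, `build_index`, `resolve`, and `aggregate` are all hypothetical and chosen for illustration. A real gazetteer such as GeoNames would supply the IDs and alternative names.

```python
import difflib
import re
import unicodedata
from collections import Counter

# Hypothetical gazetteer rows: (city_id, canonical name, state, alternative names).
GAZETTEER = [
    ("nyc-ny", "New York City", "NY", ["NYC", "New York", "Big Apple"]),
    ("spr-il", "Springfield", "IL", []),
    ("spr-ma", "Springfield", "MA", []),
]


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def build_index(gazetteer):
    """Map every normalized name variant to the set of (city_id, state) it may mean."""
    index = {}
    for city_id, name, state, alt_names in gazetteer:
        for variant in [name, *alt_names]:
            index.setdefault(normalize(variant), set()).add((city_id, state))
    return index


def resolve(city, state, index, cutoff=0.85):
    """Return a canonical city_id, or None if the value is unknown or ambiguous."""
    key = normalize(city)
    if key not in index:
        # Fuzzy fallback for typos; the cutoff is an assumed, untuned value.
        close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
        if not close:
            return None
        key = close[0]
    candidates = index[key]
    if state:
        wanted = state.strip().upper()
        candidates = {c for c in candidates if c[1] == wanted}
    if len(candidates) == 1:
        return next(iter(candidates))[0]
    return None  # zero or several candidates: leave for review, do not guess


def aggregate(records, index):
    """Sum votes per canonical city; unresolved rows go to their own bucket."""
    totals = Counter()
    for city, state, votes in records:
        totals[resolve(city, state, index) or "UNRESOLVED"] += votes
    return totals


INDEX = build_index(GAZETTEER)
RECORDS = [
    ("NYC", "NY", 3),
    ("New York", "NY", 2),
    ("new york city", None, 1),
    ("Springfield", None, 4),  # ambiguous without a state
]
TOTALS = aggregate(RECORDS, INDEX)
```

In this sketch, ambiguous or unmatched values are routed to an `UNRESOLVED` bucket instead of being forced onto the nearest city, so that a wrong merge cannot silently distort the totals. A fuller design would likely add ZIP-based lookup, blocking to limit fuzzy comparisons, and a manual review queue.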