37
When modeling addresses in an application, naively followed the standard form when representing in the BD (i.e. separate tables for country, state, city, neighborhood etc. - joins everywhere), and in the user interface (html) I put a separate field for each part of that address (text box with autocomplete or combo box). At first I thought it was good that way, and I didn’t think much about it.
However, after doing a question in the UX.SE (relative to our standard of using "X Street" instead of "X Street") the answers led me to question whether it was worthwhile in practice to require such a level of detail in the representation. In addition to the additional complexity when searching/updating (as exemplified in that "pyramid"), do not know how the performance of the system will be when it contains a high number of addresses.
I’d like to know who you already have experience dealing with a large number of addresses, what practices would be recommended: leave everything normalized, use an open text field, condense some tables into one (eg.: cidade_estado_pais
) and separate others, etc. taking into account that:
- Few users will enter with many addresses, of other people (if it were every user entering with their own address once and that’s it, it wouldn’t justify investing in usability).
- If a part of the address already exists in the database (e.g., a previously registered street) the autocomplete can be used to speed up data entry; this would be more difficult if the address were an open field.
- Some data are easier to find and pre-popular (e.g., complete list of Brazilian cities), others are more difficult or more expensive - may be better than the user entering them on demand (but still allowing autocomplete).
- If a field is opened, it is more subject to duplication (e.g., "Av Foo", "Av. Foo", "Avenida Foo", "A. Foo", "Foo"); but duplication is not necessarily a problem...
- It is more difficult to aggregate in a denormalized field (e.g., if I want statistics by state, but Agrupei
cidade_estado_pais
in one field, I’ll have trouble).
Note: I know that an "optimal" solution varies from case to case (depending on the application), but I am not simply looking for opinions, in which every answer is valid - I would like to hear from those who have experience dealing with systems that contain a large number of addresses, for example to avoid problems that cannot be anticipated when their base is small, only appear after it has already grown in scale.
– mgibsonbr
I don’t understand what you’re asking. It’s about data modeling (database) or usability (user interface)??
– talles
@The question is about modeling, although the different representation decisions (normalize, not normalize) end up having a great impact on usability. In other words, I found it important to quote usability considerations here to help guide an appropriate response.
– mgibsonbr
https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/
– Maniero