For AEM architects and developers, the concepts of data modeling should be second nature. One example is the difference between a hierarchy of data and references to data. A news feed demonstrates this cleanly. There are many ways to organize the posts that go into a news feed, but one simple method is to make the posts children of the feed. In a content hierarchy, this could take the form of a news page with news post pages beneath it.
Another simple approach is to use references, where the news feed itself holds references to the news posts and uses them to find and display the posts. One benefit is that the location of the posts no longer matters. In AEM, such a reference could be a content path or a tag; in a relational database, it could be a foreign key.
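To make the contrast concrete, here is a minimal sketch of both approaches over a toy in-memory repository. It is plain Python rather than AEM code, and every path, the `newsPaths` property name, and the helper functions are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    path: str
    title: str
    properties: dict = field(default_factory=dict)


# A toy content repository keyed by path (all paths are made up).
REPO = {p.path: p for p in [
    Page("/content/site/news", "News"),
    Page("/content/site/news/launch", "Product launch"),
    Page("/content/site/news/award", "Award won"),
    Page("/content/site/blog/interview", "Founder interview"),
    # The home page holds explicit references to posts (assumed property name).
    Page("/content/site/home", "Home", {"newsPaths": [
        "/content/site/news/award",
        "/content/site/blog/interview",
    ]}),
]}


def children(parent_path):
    """Return the direct child pages of parent_path, ordered by path."""
    prefix = parent_path.rstrip("/") + "/"
    return [page for path, page in sorted(REPO.items())
            if path.startswith(prefix) and "/" not in path[len(prefix):]]


def feed_by_hierarchy(feed_path):
    """Hierarchy: the feed lists whatever pages live directly beneath it."""
    return [page.title for page in children(feed_path)]


def feed_by_reference(feed_path):
    """Reference: the feed lists the pages its own data points to."""
    refs = REPO[feed_path].properties.get("newsPaths", [])
    return [REPO[ref].title for ref in refs if ref in REPO]


print(feed_by_hierarchy("/content/site/news"))
print(feed_by_reference("/content/site/home"))
```

In the hierarchy version, the posts are chosen by where they live; in the reference version, a post under `/content/site/blog` can appear in the feed because its location is irrelevant.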
Another example of hierarchy is a navigation that displays its child and grandchild pages in a convenient way. Another example of a reference is a simple link to a page. You might have a call-to-action component that is configured to link to another page, in which case the hierarchy of content does not matter, only the existence of the reference. In either case, these are straightforward approaches to content organization. There are, of course, more involved approaches that cover a wider array of business requirements by using a search framework such as JCR queries or Solr.
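As a rough illustration of hierarchy-driven navigation, the sketch below builds a two-level menu (children and grandchildren) purely from page paths. The function name `build_navigation`, the `depth` parameter, and the sample paths are assumptions for this example, not part of any AEM API.

```python
def build_navigation(paths, root, depth=2):
    """Return nested (path, children) tuples for pages up to `depth` levels below root."""
    if depth == 0:
        return []
    prefix = root.rstrip("/") + "/"
    # Direct children are paths under the prefix with no further slash.
    direct = sorted(p for p in paths
                    if p.startswith(prefix) and "/" not in p[len(prefix):])
    return [(p, build_navigation(paths, p, depth - 1)) for p in direct]


# Hypothetical site structure.
PATHS = [
    "/content/site",
    "/content/site/products",
    "/content/site/products/widgets",
    "/content/site/products/widgets/blue",
    "/content/site/about",
    "/content/site/about/team",
]

nav = build_navigation(PATHS, "/content/site")
for path, grandchildren in nav:
    print(path)
    for grandchild_path, _ in grandchildren:
        print("  " + grandchild_path)
```

Nothing in the content says which pages belong in the menu; the rule (two levels below the root) lives entirely in the code, so the great-grandchild page is left out without any authoring decision.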
Using references makes the relationship between pages and components apparent in the data itself. However, because the relationship is tied to the data, the behavior is harder to update later, since the content may also need to change. Using hierarchy makes the relationship harder to see, as it is defined in the business logic. All of this should, of course, be in the minds of developers as they translate business requirements into data models and business logic. But to what degree should these concepts be in the minds of authors?
Many factors weigh into the answer. Two of them are the experience and skill level of the authors, and the degree to which product owners and subject matter experts want fine-tuned control over the behavior of the system. Regardless of these circumstantial factors, I would argue that at the very least a conceptual model of these ideas should be explained and taught to content authors. Very often, developers only want to explain the output of the system and not the data organization that makes it possible. Content authors often follow suit and do not develop their own conceptual model of the data they are interacting with.
Content creators and product creators both need this conceptual model because together they create the data that the world sees. This data is the real product, the true gem that is being mined. The tools that create and display the data are transient; they will come and go. When the process is done properly, the data it produces is the most valuable thing, yet it is often the most neglected. New features are exciting for both developers and users, while purposeful and controlled data creation is difficult. Features can easily be enhanced or replaced; the data is in some sense irreplaceable.
Developers should be concerned with data integrity with respect to the limits they place on the creation of content. Content creators should be concerned with data integrity as well, with respect to how they use the tools they have been given. Empowering them for this long-term task is essential to getting the most out of any content management system. They should understand how the content they see on their pages arrives there. They should also understand that their interactions with the content management system affect not only what they see in the place where they make an update, but also the content and functionality it drives elsewhere. Giving content creators the ability to understand how local changes affect global functionality should be at the forefront for anyone designing or implementing a content management solution.
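One way to picture how a local change reaches other parts of a site is to ask, for a given page, which pages reference it and which parent lists it by hierarchy. The sketch below does that over a toy content map; the paths, the property names `ctaLink` and `related`, and the function names are all hypothetical, and a real system would answer the question with its own query tools.

```python
# Toy content map: path -> properties (all names are illustrative only).
CONTENT = {
    "/content/site/home": {"ctaLink": "/content/site/news/launch"},
    "/content/site/news": {},
    "/content/site/news/launch": {},
    "/content/site/about": {
        "related": ["/content/site/news/launch", "/content/site/home"],
    },
}


def referenced_by(target):
    """Return the pages whose properties point at target, ordered by path."""
    hits = []
    for path, props in sorted(CONTENT.items()):
        for value in props.values():
            values = value if isinstance(value, list) else [value]
            if target in values:
                hits.append(path)
                break  # one hit per page is enough
    return hits


def listed_under(target):
    """Return the parent page that would list target by hierarchy, if it exists."""
    parent = target.rsplit("/", 1)[0]
    return parent if parent in CONTENT else None


def impact_of_change(target):
    """Summarize where else a move or deletion of target would be felt."""
    return {
        "referenced_by": referenced_by(target),
        "listed_under": listed_under(target),
    }


print(impact_of_change("/content/site/news/launch"))
```

For an author, the takeaway is that moving or deleting the launch page here would touch three other places: two pages that reference it and the news page that lists it as a child.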