intrusive collections

Each of my dbcdb web pages corresponds to an instance of a class called Entity. And each entity has a key, which is the number used in the web page. The class Collection represents the collection of all the entities.

Currently, each entity knows its key. As part of the change that I’m about to make, I’ll have to have each entity know the collection that it’s part of, too: right now, each series, for example, keeps a list of the volumes that are part of it, but I want that information to be found by querying the collection instead. After a bit of thought, I decided that the right way to handle this is to just stick the collection in the key (which is, fortunately, already a class instead of a plain int), rather than pass the collection in as another constructor argument.
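To make the idea concrete, here is a minimal sketch of a key that carries its collection, written in C++ on the assumption that the author_->pageLink() notation used later reflects the language. Only Entity, Collection, series, and volumes come from the post; Key, addSeries, addVolume, volumesIn, and volumes() are hypothetical names, and the real dbcdb classes surely look different.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

class Collection;
class Volume;

// Hypothetical key class: the page number plus a pointer back to the
// collection that issued it, so no extra constructor argument is needed.
class Key {
public:
    Key(int number, const Collection *collection)
        : number_(number), collection_(collection) {
        assert(collection != nullptr);
    }
    int number() const { return number_; }
    const Collection &collection() const { return *collection_; }
private:
    int number_;
    const Collection *collection_;
};

class Entity {
public:
    explicit Entity(Key key) : key_(key) {}
    virtual ~Entity() = default;
    const Key &key() const { return key_; }
private:
    const Key key_;  // fixed for the entity's lifetime
};

class Series : public Entity {
public:
    using Entity::Entity;
    // No stored list of volumes: the answer comes from the collection.
    std::vector<const Volume *> volumes() const;
};

class Volume : public Entity {
public:
    Volume(Key key, const Series *series) : Entity(key), series_(series) {}
    const Series *series() const { return series_; }
private:
    const Series *series_;
};

class Collection {
public:
    Series *addSeries() { return store(std::make_unique<Series>(nextKey())); }
    Volume *addVolume(const Series *series) {
        return store(std::make_unique<Volume>(nextKey(), series));
    }
    // Scan every entity and keep the volumes that point at this series.
    std::vector<const Volume *> volumesIn(const Series &series) const {
        std::vector<const Volume *> result;
        for (const auto &entry : entities_) {
            auto *volume = dynamic_cast<const Volume *>(entry.second.get());
            if (volume != nullptr && volume->series() == &series) {
                result.push_back(volume);
            }
        }
        return result;
    }
private:
    template <typename T>
    T *store(std::unique_ptr<T> entity) {
        T *raw = entity.get();
        entities_[raw->key().number()] = std::move(entity);
        return raw;
    }
    Key nextKey() { return Key(next_++, this); }
    int next_ = 1;
    std::map<int, std::unique_ptr<Entity>> entities_;
};

inline std::vector<const Volume *> Series::volumes() const {
    return key().collection().volumesIn(*this);
}
```

The series reaches the collection only through its key, which is the point of the design: the entity's constructor signature stays the same, and the back-reference lives in one small class.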

Last night, though, I realized that my thoughts on this matter are somewhat inconsistent: in the past, I’ve held a strong conviction that members of a collection shouldn’t store any sort of indexing information, shouldn’t know that they’re part of a collection at all. So what are these keys doing there, and why am I making matters worse by having the members be able to get at the entire collection?

My reasons for that in the past were that it’s a violation of a proper separation of concerns, that it leads to potential inconsistency, and that it makes collections inappropriately non-generic. Which are all good reasons – for example, I wouldn’t claim that it makes much sense for elements of an array to have to know their indices. (Nor would anybody else, but when it comes to lists, there’s less of a consensus.)

The collection in question doesn’t claim to be generic, so the third reason isn’t relevant. The second reason is, I think, not at all likely to be a problem in practice in this case, though I suppose I’ll make some member variables final just for the sake of documentation. The first reason is potentially a show-stopper, though. So: why did I make the key accessible from the entity in the first place?

It’s really only used for one reason: each entity has a method called pageLink that prints a representation of an HTML link to that entity. And the href refers to the key, so that method has to have access to the key somehow. But it could get it from an argument, perhaps. After all, I was considering passing in the collection when generating the list of volumes contained in a series, instead of having the series extract the collection from a key; maybe this whole argument is a sign that I should go down that route instead.

For the pageLink case, though, that won’t work: an entity never calls pageLink on itself; it only calls pageLink on other entities that it knows about. For example, a book has a data member storing its author, and it calls author_->pageLink() when generating the Author: line on its HTML page. But the book doesn’t know what the author’s key is; it just knows about the author entity. So it has to either ask the author for its key, store the author’s key instead of (or in addition to) the author entity, or ask the collection what the author’s key is.
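For comparison, the last of those options (asking the collection) could be sketched roughly as follows. This is an illustration in C++ under my own assumptions, not the dbcdb code: add, keyOf, and find are hypothetical names, and keys are plain ints here for brevity.

```cpp
#include <map>

class Entity {
public:
    virtual ~Entity() = default;
};

// Hypothetical collection that owns the entity-to-key back mapping,
// so entities themselves store no indexing information.
class Collection {
public:
    // Register an entity and hand back the key assigned to it.
    int add(const Entity *entity) {
        int key = next_++;
        byKey_[key] = entity;
        keyOf_[entity] = key;
        return key;
    }
    // Back mapping: throws std::out_of_range for an unknown entity.
    int keyOf(const Entity *entity) const { return keyOf_.at(entity); }
    // Forward mapping: throws std::out_of_range for an unknown key.
    const Entity *find(int key) const { return byKey_.at(key); }
private:
    int next_ = 1;
    std::map<int, const Entity *> byKey_;
    std::map<const Entity *, int> keyOf_;
};
```

The cost is visible in the sketch: a second map to keep in sync with the first, and every caller that wants a link needs the collection in hand.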

The second suggestion just sounds weird to me, valorizing the keys over the entities. The third suggestion might be sensible in some contexts, but in this case I can’t see any concrete benefit to it. So I’ll stick with the first suggestion.
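A rough sketch of what the first suggestion amounts to, again in C++ and again with assumptions: the key is a plain int to keep things short, the href is taken to be just the key number, and name_, Author, Book, and authorLine are hypothetical. Only pageLink and author_ are named in the post.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Entity {
public:
    Entity(int key, std::string name) : key_(key), name_(std::move(name)) {}
    virtual ~Entity() = default;
    int key() const { return key_; }
    // The link is built from the entity's own key, so callers only
    // need a pointer to the entity, not its key.
    std::string pageLink() const {
        return "<a href=\"" + std::to_string(key_) + "\">" + name_ + "</a>";
    }
private:
    const int key_;
    const std::string name_;
};

class Author : public Entity {
public:
    using Entity::Entity;
};

class Book : public Entity {
public:
    Book(int key, std::string title, const Author *author)
        : Entity(key, std::move(title)), author_(author) {
        assert(author != nullptr);
    }
    // The book never looks at the author's key; it just asks for the link.
    std::string authorLine() const { return "Author: " + author_->pageLink(); }
private:
    const Author *author_;
};
```

With this shape, the book holds the author entity and nothing else, and the key stays an implementation detail of how an entity renders its own link.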

So: what’s the lesson? One, if you have to do a back mapping from elements of the collection to indices, then storing that back mapping in the element might make sense in a special-case collection. Two, don’t be dogmatic. Three, don’t feel too guilty about being dogmatic: strongly held instincts that can be justified and whose justifications can be compared against particular situations are okay.

Published 3/29/2006 & Filed in Programming

malvasia bianca