For better or worse, I tend to think a lot about words when designing my system. A lot. I would say that close to 30% of my time is spent thinking about what to call something, what to name this table, what to name this field, this variable, or this application. What do you call this relationship, how do you describe how x interacts with y. There are several reasons for this.
It can make a big difference in how you model your objects in oop for starters. When you talk about has-a and is-a relationships, when you talk about cross-reference tables, and those sort of things, words really matter. It makes a huge difference when you use composition instead of inheritance, especially in the long term, so you want to put that thought in ahead of time.
It also makes a big difference in maintainability of code, both by you and the people who may come after you. By using the right words in your comments, documentation, and field names, it makes their lives much easier.
But sometimes, its really hard to think about the right words. And sometimes it doesn't matter nearly as much, but you still find it importaint to get it right. Here is just such a situation, one that so far I can't quite figure out, and was hoping someone else with at larger vocabulary will come along and help with.
There is two questions here, one about how to model a database structure in a table, and one on words, what to call something.
First the database structure question. Lets say you have two objects, that are on the same level, siblings for lack of a better term. You want to say that one is related to the other, and want to create a table to manage those relationships.
To use a real-world example, lets take an issue in a bug tracking database. You have issue A and issue B. They are similar, they both have to do with the same project, and they both affect the same area of the project. They are two seperate bugs, distinct, but they are related somehow. It is not a parent-child relationship, they are just "related". And lets say that all you really want to be able to do is reference one when you look at the other.
So we create a table like so:
tblIssueRelationships:
======================
issueRelationshipID
issueIDA
issueIDB
okay, so now you can write a record that looks like this to the table to tie the two together:
issueRelationshipID issueIDA issueIDB
1 1000 1001
great, so now, when I pull up issueIDA, I can easily include a call to a query like this:
SELECT
issueIDB relatedIssueID
FROM tblUssueRelationships
WHERE
issueIDA = 1000
and I will get 1001 as a relationship. Awesome. Except wait. What if I pull up issueIDB? If i run that same query, I wont get any records back... because 1001 is not in the issueIDA column.
Well what can we do? As far as I can see, I have two options, and each have thier disadvantages.
First, we could do a union query, first the query we have above, and then the converse of it:
SELECT
issueIDB relatedIssueID
FROM tblUssueRelationships
WHERE
issueIDA = 1001
UNION
SELECT
issueIDA relatedIssueID
FROM tblUssueRelationships
WHERE
issueIDB = 1001
Okay, now in both situations, we will get what we are after. But thats ugly... we have added seemingly "extra" logic to the query, which means we will sacrifice speed.
Our other option is every time we add a record to the table, we add its converse as well.
So we would instead have added two seperate records, like this:
issueRelationshipID issueIDA issueIDB
1 1000 1001
2 1001 1000
Remember, in this situation, A is related to B the same exact way that B is related to A.
Now, we only need to run the original query (from both angles) to get the disired effects:
SELECT
issueIDB relatedIssueID
FROM tblUssueRelationships
WHERE
issueIDA = 1001
Would also work if you ran issueIDA = 1000. Here, we are speeding up the query (over doing the union) but at the sacrifice of adding seemingly unnecessary records to the database. Every time you add a record you will be adding two instead. So you will be taking a speed hit on insert, update or delete, but you will be saving speed on your select (and in this situation, you are going to be selecting these records far more often). So what is the best option? Are there any other options here? [aside: for the record (no pun intended!), we are going to go with the second option, speed is going to be a major concern in this application later down the road, but we really want to know if you have a better idea!]
Now for the second question, about words. Lets take a different scenerio, this time, instead of issues, you have projects. You want to relate two projects together, except this time, they aren't siblings. Lets say you have project A and project B. Project A is a prerequisite of project B. So thats the relationship from project B's prospective, A is a prerequisite. But what do you call the relationship from A's prospective? B is not a postrequisite, A doesn't care if B ever happens or not. But they do have a relationship. If you were on a page listing out details about project A, and you wanted to show project B as a relationship, what would you call their relationship?
An example may be helpful. Lets talk about the database layout again. We would have a table that looked something like this:
tblProjectRelationships
=======================
projectRelationshipID
projectIDA
projectIDB
relationship
so, in this case, we are going to have a record that looks like:
projectRelationshipID projectIDA projectIDB relationship
1 'B' 'A' 'prerequisite'
2 'A' 'B' '____________'
okay, so 'B' has a prerequisite of 'A'. Now, what would the relationship between 'A' and 'B' be?
Alright, so maybe I am being a little anal-retentive about this, and I admit, I probably am. The fact of the matter here is, it doesn't matter to a hill of beans what I call the relationship between A and B, its only going to matter in my query. But eventually I or someone else is going to come along and have to look at this code.
So, any ideas? These little "word" problems are usually fun, but this one has me stumped!
Note: My email server still isn't working for whatever reason (mostly because im too busy thinking about words than working on it!) but I will be checking this entry frequently for a while to check for responses. You could also feel free to email me directly if you think you have an answer (ryanguill [ at ] gmail [dot] com). If I get some good answers I will write a follow up post in a couple days.
Posted on Mon. March 12, 2007 by Ryan Guill
#