Currently, I’m performing SQL queries that look something like this:
select id from books where author_id in
(select * from
((select 106 id) union all
(select id from compound_authors where author_id = 106))
as extended_authors);
I have a hard time believing that the select * from
bit is really necessary, but I can’t find a way to get rid of it without introducing a syntax error. The lesson, I suppose, is that I should learn SQL syntax, but does anybody know the right way to write this?
Post Revisions:
There are no revisions for this post.
That definitely looks funky, but it might be easier if you tried telling us what you are trying to do… Exactly what is the table layout and what are you searching for?
4/8/2006 @ 9:59 pm
I have an author that I’m interested in. (In the example above, it’s author 106). And I want to find all the books that were written either individually or jointly by that author. So I’m looking for rows in the table ‘books’; that table has an ‘author_id’ column, and I want to find rows whose author_id is either the specific author_id that I have in hand or the id associated to a compound author, one of whose members is the specific author I have in hand.
So I wanted to do ‘select id from books where author_id in ???’, where ‘???’ is a table with a single column (called ‘id’); one row in the table should be my specific author_id, and the other rows should be compound authors associated to that person.
Which expresses itself naturally as a union. My problem is that I don’t understand why I have to stick a ‘select * from’ in front of that union – why isn’t the union alone enough?
4/9/2006 @ 3:22 pm
It sounds like you have a 1-to-many relationship between books and authors, in which case I would simply not have an author field in the book. Sounds like you already have an author table, simply have that be all your author information.
That would make your searches simple inner joins, like:
… FROM books INNER JOIN authors ON ( books.id = authors.book_id );
Being a postgres user I would define that as a view, like:
CREATE VIEW bookauthor AS SELECT books.name as name, authors.id AS author_id FROM books INNER JOIN authors ON ( books.id = authors.book_id );
and be able to write stuff like:
SELECT DISTINCT name FROM bookauthor WHERE (author_id = 106);
but I don’t think MySQL has gotten around to supporting views, so you will probably have to do slightly more typing :-)
You can look for books where 106 is the sole author by doing something like:
SELECT name, COUNT( name ) AS count WHERE ((author_id = 106) and (count = 1 ));
I didn’t test any of this, so there are probably a ton of errors of various types, but I think the general approach should be good.
4/9/2006 @ 7:38 pm
Because there’s no book_id field in the author table – one author can be associated to many books, so the book-author relation is stored in the book table. (As this example shows, one book can also be associated to many authors, but not all authors are created equal in that case – the “joint author” takes priority.)
That view thing sounds great, though! And it looks like MySQL does support it now (maybe it’s new in 5.0, I don’t know). Wow. Looks like I have some refactoring in my future…
4/9/2006 @ 7:49 pm
ok, but the many to many problem is very similar. Create a book table, like before:
CREATE TABLE book (
id INTEGER,
title TEXT,
…
);
Create an author table, something like:
CREATE TABLE author (
id INTEGER,
name TEXT,
…
);
Then create an association table:
CREATE TABLE book_author_association (
book_id INTEGER REFERENCES( book.id ),
author_id INTEGER REFERENCES( author.id )
);
The stuff in my previous comment works for this as well. If you want to capture the author you can do another INNER JOIN to include the author fields you want.
4/9/2006 @ 9:21 pm
Why is that better than having the author a column in the book table? I think maybe I’m not expressing myself very well – while a book sometimes gets mapped to many authors, it always has one primary author: if a book is written by A and B, then I want to have HTML pages for the book, for “A and B”, for A, and for B. And I want the book’s HTML page to only point to the HTML page for “A and B” (though I can get from that latter page to the pages for A and for B), but I want the book to show up on the pages for “A and B”, A, and B.
Here’s an example, in case that helps.
4/9/2006 @ 10:06 pm
For a number of reasons.
First, it simplifies your system. No need to write stuff like what you were suggesting originally. Best reason I can think of.
Second, it is actually more effecient. One search is better than two.
Third, it is normalized. Normalized is good.
Fourth, it is more flexible. Want to assign different types of writers? Extend the association table with an involvement field. Want to find all books where someone is the primary author? Where someone was a researcher? It is suddenly easy to add to your search.
But, it is your system. People do things differently. Your original suggestion does not seem to be legal in postgres and I am not sure I completely understand what it is doing. I would try something like:
(SELECT id FROM books WHERE ( author=106)) UNION (SELECT id FROM compound_authors WHERE (author_id=106));
Which seems a lot simpler. You can hack your way around having 106 two places, but that is another matter.
4/9/2006 @ 11:13 pm