Synonyms combined with a tag table in a many-to-many relational database

I'm using a song database system to learn PHP with MySQL on a many-to-many relationship database. The three tables are organized as follows:

Songs       Link        Tags
=======     =====       =======
Sid          Sid        Tid
Songname     Tid        Tagname

Now I was thinking: what if there are tag synonyms? I am calculating a matching percentage for each song based on the tags entered. I don't want to enter 20 tags for one song just to cover all possibilities, because that ruins the matching percentage: if 4 tags are entered, the result is a 4/20 match (20%), whereas it should perhaps have been 4/8 (50%).

Is there a smart way to create a table for synonyms so that the tags are 'grouped'? The entered tags would first be checked against this table, and the resulting 'head tag(s)' would then be matched against the link table, giving the song name and a matching percentage of tags entered / total number of 'head tags' × 100%.

On top of that, how would you write a query so that, when a new song is entered, new tags are added to the group they belong to and existing ones are skipped, while the head tag is also linked to the new song?

I'm not sure if this idea is anything realistic, but I wonder how other systems would do this to still make sure you get an accurate match while also taking synonyms into account.

Answers


You could add a third column to table Tags:

Tags
--------------
Tid
Tagname
HeadTagId

So every Tag would refer to a HeadTag, and HeadTags would refer to themselves.
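As a rough illustration of this layout, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database instead of PHP and MySQL, purely so it can be run as-is; the sample rows and the helper name head_tag_name are assumptions made for the example, not part of the original schema.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tags (
        Tid       INTEGER PRIMARY KEY,
        Tagname   TEXT NOT NULL UNIQUE,
        HeadTagId INTEGER NOT NULL REFERENCES Tags(Tid)
    );
    -- Head tags point at themselves; synonyms point at their head tag.
    INSERT INTO Tags (Tid, Tagname, HeadTagId) VALUES
        (1, 'Classical',       1),
        (2, 'Classical music', 1),
        (3, 'Rock',            3);
""")

def head_tag_name(conn, tagname):
    """Return the name of the head tag for tagname, or None if unknown."""
    row = conn.execute(
        """SELECT head.Tagname
           FROM Tags AS t
           JOIN Tags AS head ON head.Tid = t.HeadTagId
           WHERE t.Tagname = ?""",
        (tagname,),
    ).fetchone()
    return row[0] if row else None

print(head_tag_name(conn, "Classical music"))
```

The self-join is what resolves any entered tag, head or synonym, to its group.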

Regarding queries, I imagine two scenarios. Either

- tags are predefined: when a new song is entered, you just add one row to Songs and as many to Link as the tags it has. No need to touch table Tags.

or

- new tags can be defined by users: when a new song is entered, check whether its tags already exist. If not, the user should be able to enter new tags and either define them as head tags or choose an existing 'head tag' for them (i.e. a tag referenced in the third column of Tags).
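The second scenario could look roughly like the sketch below, again using Python's sqlite3 as a stand-in for PHP and MySQL. The function names make_db, get_or_create_tag and add_song, and the convention of passing each tag with an optional head tag name, are hypothetical choices for this example. Existing tags are skipped, new tags are added under the chosen head tag, and the song is linked to the tags that were entered (the head tag is resolved later through HeadTagId).

```python
import sqlite3

def make_db():
    """Create the three tables in an in-memory database (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Songs (Sid INTEGER PRIMARY KEY, Songname TEXT NOT NULL);
        CREATE TABLE Tags  (Tid INTEGER PRIMARY KEY, Tagname TEXT NOT NULL UNIQUE,
                            HeadTagId INTEGER);
        CREATE TABLE Link  (Sid INTEGER NOT NULL, Tid INTEGER NOT NULL,
                            PRIMARY KEY (Sid, Tid));
    """)
    return conn

def get_or_create_tag(conn, tagname, head_name=None):
    """Return the Tid of tagname, inserting it only if it does not exist.

    head_name=None makes a new tag its own head tag.
    """
    row = conn.execute("SELECT Tid FROM Tags WHERE Tagname = ?", (tagname,)).fetchone()
    if row:
        return row[0]  # existing tag: nothing to insert
    tid = conn.execute("INSERT INTO Tags (Tagname) VALUES (?)", (tagname,)).lastrowid
    head_id = tid
    if head_name is not None and head_name != tagname:
        head_tid = get_or_create_tag(conn, head_name)
        # If the chosen tag is itself a synonym, follow it to its head tag.
        head_id = conn.execute(
            "SELECT HeadTagId FROM Tags WHERE Tid = ?", (head_tid,)
        ).fetchone()[0]
    conn.execute("UPDATE Tags SET HeadTagId = ? WHERE Tid = ?", (head_id, tid))
    return tid

def add_song(conn, songname, tags):
    """tags maps each tag name to its head tag name (None for a head tag)."""
    sid = conn.execute("INSERT INTO Songs (Songname) VALUES (?)", (songname,)).lastrowid
    for tagname, head_name in tags.items():
        tid = get_or_create_tag(conn, tagname, head_name)
        conn.execute("INSERT OR IGNORE INTO Link (Sid, Tid) VALUES (?, ?)", (sid, tid))
    conn.commit()
    return sid

conn = make_db()
add_song(conn, "Song A", {"Classical": None, "Classical music": "Classical"})
```

In the first scenario (predefined tags), only the INSERT into Songs and the INSERTs into Link would be needed.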

In any case, to retrieve head tags for a given song, you would need something similar to:

SELECT DISTINCT Tags.HeadTagId FROM Tags
JOIN Link ON Tags.Tid=Link.Tid
WHERE Link.Sid=12345

and if you want the name of the head tag directly (instead of the Id), qualify Tagname with the parent alias, since both aliases expose that column:

SELECT DISTINCT TagParents.Tagname FROM Tags AS TagParents
JOIN Tags AS TagChildren ON TagParents.Tid=TagChildren.HeadTagId
JOIN Link ON Link.Tid=TagChildren.Tid
WHERE Link.Sid=12345
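Building on these queries, the matching percentage from the question could be computed along the following lines. This is only a sketch under assumptions: it uses Python's sqlite3 rather than PHP and MySQL, the sample data and the function name match_percentage are made up, and it interprets the percentage as (distinct head tags of the song hit by the entered tags) / (distinct head tags of the song) × 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Songs (Sid INTEGER PRIMARY KEY, Songname TEXT);
    CREATE TABLE Tags  (Tid INTEGER PRIMARY KEY, Tagname TEXT UNIQUE, HeadTagId INTEGER);
    CREATE TABLE Link  (Sid INTEGER, Tid INTEGER);
    INSERT INTO Songs VALUES (1, 'Song A');
    INSERT INTO Tags VALUES
        (1, 'Classical', 1), (2, 'Classical music', 1),
        (3, 'Piano', 3), (4, 'Calm', 4), (5, 'Relaxing', 4);
    -- Song A carries four tags, but only three distinct head tags (1, 3, 4).
    INSERT INTO Link VALUES (1, 1), (1, 2), (1, 3), (1, 4);
""")

def match_percentage(conn, sid, entered):
    """Percentage of the song's head tags covered by the entered tag names."""
    if not entered:
        return 0.0
    placeholders = ",".join("?" * len(entered))
    # Resolve every entered tag (head or synonym) to its head tag id.
    entered_heads = {
        r[0] for r in conn.execute(
            f"SELECT DISTINCT HeadTagId FROM Tags WHERE Tagname IN ({placeholders})",
            list(entered),
        )
    }
    # Distinct head tags linked to the song.
    song_heads = {
        r[0] for r in conn.execute(
            """SELECT DISTINCT Tags.HeadTagId FROM Tags
               JOIN Link ON Tags.Tid = Link.Tid
               WHERE Link.Sid = ?""",
            (sid,),
        )
    }
    if not song_heads:
        return 0.0
    return 100.0 * len(entered_heads & song_heads) / len(song_heads)

print(round(match_percentage(conn, 1, ["Classical music", "Relaxing"]), 2))
```

Because synonyms collapse into their head tag on both sides, entering extra synonyms for a song no longer inflates the denominator.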

Edit: You don't need a many-to-many relation in this case, because each tag must either declare that it is a main tag or declare that it is a synonym and point to the main tag. That is accomplished by the third column. To me it makes sense to include it in the same table, as it is a necessary attribute of each tag, and each tag will only have one such attribute.

So main tags refer to themselves, that is:

Tid: 1
Tagname: 'Classical'
HeadTagId: 1

And synonyms refer to main tags:

Tid: 2
Tagname: 'Classical music'
HeadTagId: 1

The only issue with this approach is that you don't have a straightforward list of main tags, but you could generate it with a simple:

SELECT DISTINCT HeadTagId FROM Tags

Sure. Part of your answer is to just go a little further and add a synonym table:

tag_synonym
-----------
Tid
Tid2

Then you can query to see whether any of the synonyms are linked, as well as the original tag.
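One possible reading of this suggestion is sketched below, using Python's sqlite3 as a stand-in for PHP and MySQL. The sample data and the function name songs_for_tag are assumptions, and the sketch treats a synonym pair as symmetric (it looks in both the Tid and Tid2 columns). It does not follow chains of synonyms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tags (Tid INTEGER PRIMARY KEY, Tagname TEXT UNIQUE);
    CREATE TABLE Link (Sid INTEGER, Tid INTEGER);
    CREATE TABLE tag_synonym (Tid INTEGER, Tid2 INTEGER);
    INSERT INTO Tags VALUES (1, 'Classical'), (2, 'Classical music'), (3, 'Rock');
    INSERT INTO Link VALUES (10, 1), (11, 2), (12, 3);
    INSERT INTO tag_synonym VALUES (1, 2);  -- 'Classical' ~ 'Classical music'
""")

def songs_for_tag(conn, tagname):
    """Return ids of songs linked to the tag itself or to a direct synonym."""
    rows = conn.execute(
        """SELECT DISTINCT Link.Sid
           FROM Link
           WHERE Link.Tid IN (
               SELECT t.Tid FROM Tags AS t WHERE t.Tagname = :name
               UNION
               SELECT s.Tid2 FROM tag_synonym AS s
               JOIN Tags AS t ON t.Tid = s.Tid WHERE t.Tagname = :name
               UNION
               SELECT s.Tid FROM tag_synonym AS s
               JOIN Tags AS t ON t.Tid = s.Tid2 WHERE t.Tagname = :name
           )
           ORDER BY Link.Sid""",
        {"name": tagname},
    ).fetchall()
    return [r[0] for r in rows]

print(songs_for_tag(conn, "Classical music"))
```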

