Logic Behind StackOverflow Question IDs
First of all, I should say that I think that this question does not belong on Meta.
StackOverflow URLs are inspirational to me: the way they are designed is very clever and they work really well. But I can't seem to understand the logic behind the IDs. Here are some examples:
If you change the ID to ID++ (2121721), one could think it would fetch the next question. Wrong: instead you're redirected to another question whose ID seems to be totally unrelated (2121212):
Why does this happen? Is there some checksum algorithm on the IDs? Here is a sample trace:
2121720 -> exists
2121721 -> 2121212
2121213 -> exists
2121214 -> 2121155
2121156 -> exists
2121157 -> 2120884
2120885 -> exists
2120886 -> 2115014
2115015 -> 2114896 (what happened here?)
2114897 -> 2114799
2114800 -> 2114670 (what happened here?)
2114671 -> exists
2114672 -> 2110215
2110216 -> exists
2110217 -> 2106982
2106983 -> 2106955 (what happened here?)
The IDs seem to decrease. Can someone shed some light on this strange redirect behavior?
It looks to me like the IDs that redirect are actually IDs of answers, so when you put in 2121721, that ID actually belongs to an answer to the question with the ID 2121212. Notice how it redirects to one of the answers, not just to the question. The redirect target always has a lower ID because the question was created before its answers, so it was assigned its ID earlier.
What this tells me is that the IDs are at least somewhat global. It may or may not mean everything is stored in the same table - though I wouldn't think so.
Stack Overflow uses one table to hold both questions and answers. The ID is the primary key in the table called Posts. The ID is incremented each time a question or answer is posted.
So, yes, the ID and ID+1 for two posts would likely point to two entirely different questions.
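To make the shared counter concrete, here is a minimal sketch in Python. It is only a toy model of the behavior described above, not Stack Overflow's actual code: the `PostStore` class, its method names, and the starting ID of 100 are all made up for illustration.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass
class Post:
    id: int
    parent_id: Optional[int]  # None for a question, the question's ID for an answer


class PostStore:
    """Hypothetical stand-in for a single Posts table with one ID sequence."""

    def __init__(self, first_id: int = 1) -> None:
        self._next_id = count(first_id)  # one counter shared by questions and answers
        self.posts: Dict[int, Post] = {}

    def add_question(self) -> Post:
        post = Post(next(self._next_id), None)
        self.posts[post.id] = post
        return post

    def add_answer(self, question_id: int) -> Post:
        parent = self.posts.get(question_id)
        if parent is None or parent.parent_id is not None:
            raise ValueError(f"{question_id} is not a question")
        post = Post(next(self._next_id), question_id)
        self.posts[post.id] = post
        return post

    def resolve(self, post_id: int) -> Optional[int]:
        """Return the question a URL with this ID would land on, or None."""
        post = self.posts.get(post_id)
        if post is None:
            return None
        # A question resolves to itself; an answer resolves to its parent question.
        return post.id if post.parent_id is None else post.parent_id


store = PostStore(first_id=100)
q1 = store.add_question()      # gets ID 100
q2 = store.add_question()      # gets ID 101
a1 = store.add_answer(q1.id)   # gets ID 102, answers question 100
q3 = store.add_question()      # gets ID 103
a2 = store.add_answer(q1.id)   # gets ID 104, answers question 100

for post_id in range(100, 106):
    print(post_id, "->", store.resolve(post_id))
```

In this model, IDs 102 and 104 both resolve to question 100, which mirrors the trace in the question: an ID that "redirects" is an answer, and its target is an older post with a smaller ID.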
You can read about the database schema in "Understanding the Stack Overflow Database Schema" or "Database schema documentation for the public data dump and SEDE". [1]
In StackOverflow, questions and answers are both considered posts. If a record has a null ParentId field, then it's a question. Otherwise, it's an answer, and to find the matching question, join the ParentId field up to Posts.Id.
- Id - primary key, identity field from the original StackOverflow database.
- Title - the title of the question. Answer titles will be null.
- OwnerUserId - joins back to Users.Id. If OwnerUserId = -1, that's the community user, meaning it's a wiki question or answer.
- AcceptedAnswerId - for questions, this points to the Posts.Id of the officially accepted answer. This isn't necessarily the highest-voted answer, but the one the questioner accepted.
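The ParentId join described above can be sketched with Python's built-in `sqlite3` module. This is a rough illustration under assumptions: the table contains only the columns mentioned in this answer plus ParentId, the rows are invented sample data, and `question_for` is a hypothetical helper, not part of any Stack Overflow tooling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Posts (
        Id               INTEGER PRIMARY KEY,
        ParentId         INTEGER,   -- NULL for questions
        Title            TEXT,      -- NULL for answers
        OwnerUserId      INTEGER,
        AcceptedAnswerId INTEGER
    )
    """
)

# Invented sample rows: two questions and two answers sharing one ID space.
conn.executemany(
    "INSERT INTO Posts (Id, ParentId, Title, OwnerUserId, AcceptedAnswerId) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (10, None, "Example question A", 1, 12),
        (11, None, "Example question B", 2, None),
        (12, 10, None, 3, None),    # answer to question 10
        (13, 11, None, -1, None),   # community wiki answer to question 11
    ],
)


def question_for(post_id):
    """Return (question Id, question Title) for any post ID, or None if absent."""
    return conn.execute(
        """
        SELECT COALESCE(q.Id, p.Id), COALESCE(q.Title, p.Title)
        FROM Posts AS p
        LEFT JOIN Posts AS q ON p.ParentId = q.Id
        WHERE p.Id = ?
        """,
        (post_id,),
    ).fetchone()


print(question_for(12))  # an answer: resolves to its parent question
print(question_for(10))  # a question: resolves to itself
print(question_for(99))  # no such post
```

The LEFT JOIN keeps questions in the result even though their ParentId is NULL, and COALESCE falls back to the post's own Id and Title in that case.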
[1] Technically the schema is for the Stack Overflow Creative Commons Data Dump, which is not a 100% mirror of the live database but gives you a reasonable subset of the data.