[ $davids.sh ] — david shekunts blog


# [ $davids.sh ] · message #217

🧠 Bagnetka v0.2 🧠

Let's test our intuition about databases

UPD: 3 of the 4 phases of the bagnetka were solved by @zakharoffam, and the last one by @gennadiixd

#bugnetka #db

  • @ [ $davids.sh ] · # 1157

    Rules are as in the game "Danetki" (yes/no riddles): I give the premise, and you have to guess what happened. Along the way you can ask me questions, to which I can answer "yes", "no", "irrelevant", or "golang".

    This bug is in 2 stages:

    (1) There is a table:

    CREATE TABLE controller_data (
        id SERIAL,
        data JSONB,
        controller_id INT, -- this is a foreign key
        created_at TIMESTAMP WITH TIME ZONE
    )

    The logic for working with it is as follows: if the controller sends new data, we delete the old data (DELETE) and insert the new data (INSERT). Don't ask why, let's assume it's a given.

    There are tens of thousands of controllers.

    After some time, data started only being deleted, but new data stopped being added. What happened and how would you fix it?

    (2) I will tell you the second phase of this bug after you find the answer to the first.
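The write path described in the premise, sketched as SQL (illustrative only; the exact application code is not shown in the thread, and `$1`/`$2` stand for the controller id and its payload):

```sql
-- On each new payload from a controller: drop the old rows, insert the new one.
BEGIN;
DELETE FROM controller_data WHERE controller_id = $1;
INSERT INTO controller_data (data, controller_id, created_at)
VALUES ($2, $1, now());
COMMIT;
```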

  • @ Kirill Arutyunov · # 1158

    Is this related to the created_at field and time zones?

  • @ [ $davids.sh ] · # 1160

    No

  • @ Anton Zakharov · # 1161

    SERIAL overflowed.

  • @ [ $davids.sh ] · # 1162

    Spot on)

    Now, how would you treat this?
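For readers following along: SERIAL is an INT column backed by an implicit sequence, and INT tops out at 2,147,483,647. A minimal reproduction sketch, assuming PostgreSQL's default sequence name for this table:

```sql
-- Fast-forward the implicit sequence to the edge of the INT range:
SELECT setval('controller_data_id_seq', 2147483646);

INSERT INTO controller_data (data, controller_id) VALUES ('{}', 1); -- id = 2147483647, still fits
INSERT INTO controller_data (data, controller_id) VALUES ('{}', 1); -- fails once the range is exhausted
-- (exact error depends on the PostgreSQL version: "integer out of range"
--  or "reached maximum value of sequence")
```

DELETEs don't touch the sequence, so the delete half of the cycle kept working: old data kept disappearing while nothing new arrived.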

  • @ Anton Zakharov · # 1163

    sequence?

  • @ [ $davids.sh ] · # 1164

    I don't understand

  • @ Anton Zakharov · # 1165

    Either change to bigserial, or completely abandon id, controller_id is sufficient.
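Both options from the message above can be sketched as one-off migrations (assuming default Postgres object names; note the BIGINT change rewrites the table under an exclusive lock):

```sql
-- Option 1: widen the column, and (on PostgreSQL 10+) the backing sequence
ALTER TABLE controller_data ALTER COLUMN id TYPE BIGINT;
ALTER SEQUENCE controller_data_id_seq AS BIGINT;

-- Option 2: drop id entirely; with the delete-then-insert pattern,
-- controller_id alone identifies a controller's current rows
ALTER TABLE controller_data DROP COLUMN id;
```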

  • @ [ $davids.sh ] · # 1166

    Great options)

    Now phase 2: some developer did something (not what you wrote), and new data started appearing again, but only for a short period of time

    What did they do?

  • @ Anton Zakharov · # 1167

    If a developer resets the SERIAL counter, they should have their hands ripped off. :)

  • @ [ $davids.sh ] · # 1168

    Hahahahaha, and you're right again!

    And now for the last and trickiest part: when I discovered this, the developer offered a fix that restored the system to working order without changing the code and bought us plenty of time for a proper fix.

    What did he suggest?
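The short-lived "fix" confirmed above amounts to one statement (a sketch; assumes the default sequence name and, as the rest of the thread implies, a unique index on id in the real schema):

```sql
ALTER SEQUENCE controller_data_id_seq RESTART WITH 1;
-- Inserts work again... until nextval() catches up with an id that
-- survived deletion and the unique index rejects the new row.
```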

  • @ Anton Zakharov · # 1169

    Well, as an option, you can replace SERIAL with BIGSERIAL – there will be plenty of time to fix the code.

  • @ [ $davids.sh ] · # 1170

    In that case, the numeric field turns into a string in the Node.js code, and that could lead to errors (though unlikely), so we abandoned this idea.

    Another fix is even trickier)

  • @ Anton Zakharov · # 1171

    With Node.js, yes, it's simpler with that.

  • @ Gennadii IT-K Khotovytskyi · # 1172

    Something like id BIGINT GENERATED ALWAYS AS IDENTITY? And/or something with negative numbers?

  • @ unknownfix · # 1173

    Truncate the table from time to time 💅

  • @ [ $davids.sh ] · # 1174

    Nah)

  • @ [ $davids.sh ] · # 1175

    Radical, but no)

  • @ Gennadii IT-K Khotovytskyi · # 1176

    I would say: generate some kind of UUID in the id field. But if there's any type-sensitive logic in the code, that won't work =\ Another idea that comes to mind: a composite ID, controller_id plus (concatenated with) a counter that increments separately for each controller. That could be implemented via a stored procedure without changing the code, but it turns out to be quite complicated.

  • @ [ $davids.sh ] · # 1177

    Yes, UUID won't work.

    A composite PRIMARY KEY (id, controller_id) is a good idea, but it still has a chance of error because such an id + controller_id might already exist.

  • @ [ $davids.sh ] · # 1178

    Hint: the essence of "danetki" (yes/no riddles) is that with N questions, to which I can answer "yes", "no", "irrelevant", or "golang", even a person who knows absolutely nothing about IT can arrive at the correct solution.

    Therefore, in this game, asking questions is more important than offering answer options.

  • @ Gennadii IT-K Khotovytskyi · # 1179

    I meant not just +, but literal string concatenation: controller_id 93, id increment 14, resulting in 9314. But your answer suggested that's the wrong direction.

  • @ [ $davids.sh ] · # 1180

    Yes, the probability of error is still too high)

  • @ Gennadii IT-K Khotovytskyi · # 1181

    Try to somehow reuse the IDs of deleted records?)

  • @ [ $davids.sh ] · # 1182

    Nah)

  • @ Gennadii IT-K Khotovytskyi · # 1183

    Okay, let's go with questions) Is this related to changing the data type of the id field?

  • @ [ $davids.sh ] · # 1184

    Unrelated)

  • @ Gennadii IT-K Khotovytskyi · # 1185

    Does the timestamp play a role in the proposed solution?

  • @ [ $davids.sh ] · # 1186

    No

  • @ Gennadii IT-K Khotovytskyi · # 1187

    And this data must be available at all times, meaning simply dumping everything from controller_data somewhere periodically is not an option?

  • @ [ $davids.sh ] · # 1188

    No

  • @ Gennadii IT-K Khotovytskyi · # 1189

    Does this method involve any logic like additional scripts or stored procedures?

  • @ [ $davids.sh ] · # 1190

    No

  • @ [ $davids.sh ] · # 1191

    We did something once that bought us plenty of guaranteed time for a proper fix, and the system immediately started working again afterwards.

  • @ [ $davids.sh ] · # 1192

    I suggest putting into words what the final situation is and what the mistake turned out to be. That can help put the picture together and show where to go next.

  • @ Gennadii IT-K Khotovytskyi · # 1193

    So, to summarize, here's what I have. Problem: a large number of inserts overflowed the SERIAL id. A temporary fix was proposed which:

    • does not force code changes
    • does not change the id field type
    • all data always remains in the table
    • does not reuse ids of deleted records
    • does not include logic in the form of additional scripts or stored procedures
    • is in no way related to timestamps

  • @ Gennadii IT-K Khotovytskyi · # 1194

    Was this something done on the database side? Meaning, the solution isn't in the application code?

  • @ [ $davids.sh ] · # 1195

    You missed the point: after the overflow, the developer reset the serial sequence to 1

  • @ [ $davids.sh ] · # 1196

    And regarding "reuse," one could say yes, but not in the way you've phrased it.

  • @ [ $davids.sh ] · # 1197

    Correct, even more so: it's 1 SQL query

  • @ Gennadii IT-K Khotovytskyi · # 1198

    Is it really just by mistake in the records, to reset?))

  • @ [ $davids.sh ] · # 1199

    No)

  • @ [ $davids.sh ] · # 1200

    The solution is temporary, but it will work

  • @ Gennadii IT-K Khotovytskyi · # 1203

    So, as I understand it, after resetting the counter, it didn't work for long because it quite quickly encountered an existing ID. This means we first need to "defragment" this sequence of IDs, renumbering existing records starting from one, and then reset the sequence to a number equal to the last ID?

  • @ [ $davids.sh ] · # 1204

    Great!

    We actually did the opposite: we took all the records and updated their IDs, moving them to the end of the range, which freed up writes from the beginning.

    But your way is better.

  • @ [ $davids.sh ] · # 1205

    There is also a bonus round: write the defragmentation, shifting IDs to the beginning or the end, in 1 SQL query.
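One possible take on the bonus: the renumbering itself is a single UPDATE, and the setval afterwards just re-arms the sequence. A sketch, assuming the default sequence name; if id carries a non-deferrable unique constraint, row-update order can matter and the constraint may need to be made deferrable first:

```sql
-- Defragment toward the beginning: renumber rows 1..N in id order.
WITH renumbered AS (
    SELECT id, row_number() OVER (ORDER BY id) AS new_id
    FROM controller_data
)
UPDATE controller_data c
SET id = r.new_id
FROM renumbered r
WHERE c.id = r.id AND c.id <> r.new_id;

-- Then point the sequence just past the highest id in use:
SELECT setval('controller_data_id_seq',
              (SELECT coalesce(max(id), 1) FROM controller_data));
```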