[ $davids.sh ] — david shekunts blog


# [ $davids.sh ] · message #217

🧠 Bagnetka v0.2 🧠

Let's test our intuition about databases

UPD: 3 of the 4 phases of the bagnetka were solved by @zakharoffam, and the last one by @gennadiixd

#bugnetka #db

  • @ [ $davids.sh ] · # 1157

    Rules are as in the game "Danetki" (yes/no riddles): I give the premise, and you have to guess what happened. Along the way you can ask me questions, to which I can answer "yes", "no", "irrelevant", or "golang".

    This bug is in 2 stages:

    (1) There is a table:

    CREATE TABLE controller_data (
        id SERIAL,
        data JSONB,
        controller_id INT, -- this is a foreign key
        created_at TIMESTAMP WITH TIME ZONE
    )

    The logic for working with it is as follows: if the controller sends new data, we delete the old data (DELETE) and insert the new data (INSERT). Don't ask why, let's assume it's a given.

    There are tens of thousands of controllers.

    After some time, data started only being deleted, but new data stopped being added. What happened and how would you fix it?

    (2) I will tell you the second phase of this bug after you find the answer to the first.
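The write path described in the premise, sketched as SQL (illustrative only; the exact application code is not shown in the thread, and `$1`/`$2` stand for the controller id and its payload):

```sql
-- On each new payload from a controller: drop the old rows, insert the new one.
BEGIN;
DELETE FROM controller_data WHERE controller_id = $1;
INSERT INTO controller_data (data, controller_id, created_at)
VALUES ($2, $1, now());
COMMIT;
```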

  • @ Kirill Arutyunov · # 1158

    Is this related to the created_at field and time zones?

  • @ [ $davids.sh ] · # 1160

    No

  • @ Anton Zakharov · # 1161

    SERIAL overflowed.

  • @ [ $davids.sh ] · # 1162

    Spot on)

    Now, how would you treat this?
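For readers following along: SERIAL is an INT column backed by an implicit sequence, and INT tops out at 2,147,483,647. A minimal reproduction sketch, assuming PostgreSQL's default sequence name for this table:

```sql
-- Fast-forward the implicit sequence to the edge of the INT range:
SELECT setval('controller_data_id_seq', 2147483646);

INSERT INTO controller_data (data, controller_id) VALUES ('{}', 1); -- id = 2147483647, still fits
INSERT INTO controller_data (data, controller_id) VALUES ('{}', 1); -- fails once the range is exhausted
-- (exact error depends on the PostgreSQL version: "integer out of range"
--  or "reached maximum value of sequence")
```

DELETEs don't touch the sequence, so the delete half of the cycle kept working: old data kept disappearing while nothing new arrived.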

  • @ Anton Zakharov · # 1163

    sequence?

  • @ [ $davids.sh ] · # 1164

    I don't understand

  • @ Anton Zakharov · # 1165

    Either change to bigserial, or completely abandon id, controller_id is sufficient.
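Both options from the message above can be sketched as one-off migrations (assuming default Postgres object names; note the BIGINT change rewrites the table under an exclusive lock):

```sql
-- Option 1: widen the column, and (on PostgreSQL 10+) the backing sequence
ALTER TABLE controller_data ALTER COLUMN id TYPE BIGINT;
ALTER SEQUENCE controller_data_id_seq AS BIGINT;

-- Option 2: drop id entirely; with the delete-then-insert pattern,
-- controller_id alone identifies a controller's current rows
ALTER TABLE controller_data DROP COLUMN id;
```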

  • @ [ $davids.sh ] · # 1166

    Great options)

    Now phase 2: some developer did something (not what you wrote), and new data started appearing again, but only for a short period of time

    What did they do?

  • @ Anton Zakharov · # 1167

    If a developer resets the SERIAL counter, they should have their hands ripped off. :)

  • @ [ $davids.sh ] · # 1168

    Hahahahaha, and you're right again!

    And now for the last and trickiest part: when I discovered this, the developer offered a fix that restored the system to working order without changing the code and bought us plenty of time for a proper fix.

    What did he suggest?
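The short-lived "fix" confirmed above amounts to one statement (a sketch; assumes the default sequence name and, as the rest of the thread implies, a unique index on id in the real schema):

```sql
ALTER SEQUENCE controller_data_id_seq RESTART WITH 1;
-- Inserts work again... until nextval() catches up with an id that
-- survived deletion and the unique index rejects the new row.
```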

  • @ Anton Zakharov · # 1169

    Well, as an option, you can replace SERIAL with BIGSERIAL – there will be plenty of time to fix the code.

  • @ [ $davids.sh ] · # 1170

    In that case, the numeric field turns into a string in the Node.js code, and that could lead to errors (though unlikely), so we abandoned this idea.

    Another fix is even trickier)

  • @ Anton Zakharov · # 1171

    With Node.js, yes, it's simpler with that.

  • @ Gennadii IT-K Khotovytskyi · # 1172

    Something like id BIGINT GENERATED ALWAYS AS IDENTITY? And/or something with negative numbers?

  • @ unknownfix · # 1173

    Truncate the table from time to time 💅

  • @ [ $davids.sh ] · # 1174

    Nah)

  • @ [ $davids.sh ] · # 1175

    Radical, but no)

  • @ Gennadii IT-K Khotovytskyi · # 1176

    I would say: generate some kind of UUID in the id field. But if there's any type-sensitive logic in the code, that won't work =\ Another idea that comes to mind: a composite ID, controller_id plus (concatenated with) a counter that increments separately for each controller. That could be implemented via a stored procedure without changing the code, but it turns out to be quite complicated.

  • @ [ $davids.sh ] · # 1177

    Yes, UUID won't work.

    A composite PRIMARY KEY (id, controller_id) is a good idea, but it still has a chance of error because such an id + controller_id might already exist.

  • @ [ $davids.sh ] · # 1178

    Hint: the essence of "danetki" (yes/no riddles) is that with N questions, to which I can answer "yes", "no", "irrelevant", or "golang", even a person who knows absolutely nothing about IT can arrive at the correct solution.

    Therefore, in this game, asking questions is more important than offering answer options.

  • @ Gennadii IT-K Khotovytskyi · # 1179

    I meant not just +, but literal string concatenation: controller_id 93, id increment 14, resulting in 9314. But your answer suggested that's the wrong direction.

  • @ [ $davids.sh ] · # 1180

    Yes, the probability of error is still too high)

  • @ Gennadii IT-K Khotovytskyi · # 1181

    Try to somehow reuse the IDs of deleted records?)

  • @ [ $davids.sh ] · # 1182

    Nah)

  • @ Gennadii IT-K Khotovytskyi · # 1183

    Okay, let's go with questions) Is this related to changing the data type of the id field?

  • @ [ $davids.sh ] · # 1184

    Unrelated)

  • @ Gennadii IT-K Khotovytskyi · # 1185

    Does the timestamp play a role in the proposed solution?

  • @ [ $davids.sh ] · # 1186

    No

  • @ Gennadii IT-K Khotovytskyi · # 1187

    And this data must be available at all times, meaning simply dumping everything from controller_data somewhere periodically is not an option?

  • @ [ $davids.sh ] · # 1188

    No

  • @ Gennadii IT-K Khotovytskyi · # 1189

    Does this method involve any logic like additional scripts or stored procedures?

  • @ [ $davids.sh ] · # 1190

    No

  • @ [ $davids.sh ] · # 1191

    We did something once that bought us plenty of guaranteed time for a proper fix, and the system immediately started working again afterwards.

  • @ [ $davids.sh ] · # 1192

    I suggest putting into words what the final situation is and what the mistake turned out to be. That can help put the picture together and show where to go next.

  • @ Gennadii IT-K Khotovytskyi · # 1193

    So, to summarize, here's what I have. Problem: a large number of inserts overflowed the SERIAL id. A temporary fix was proposed which:

    • does not force code changes
    • does not change the id field type
    • all data always remains in the table
    • does not reuse ids of deleted records
    • does not include logic in the form of additional scripts or stored procedures
    • is in no way related to timestamps

  • @ Gennadii IT-K Khotovytskyi · # 1194

    Was this something done on the database side? Meaning, the solution isn't in the application code?

  • @ [ $davids.sh ] · # 1195

    You missed the point: after the overflow, the developer reset the serial sequence to 1

  • @ [ $davids.sh ] · # 1196

    And regarding "reuse," one could say yes, but not in the way you've phrased it.

  • @ [ $davids.sh ] · # 1197

    Correct, even more so: it's 1 SQL query

  • @ Gennadii IT-K Khotovytskyi · # 1198

    Is it really just by mistake in the records, to reset?))

  • @ [ $davids.sh ] · # 1199

    No)

  • @ [ $davids.sh ] · # 1200

    The solution is temporary, but it will work

  • @ Gennadii IT-K Khotovytskyi · # 1203

    So, as I understand it, after resetting the counter, it didn't work for long because it quite quickly encountered an existing ID. This means we first need to "defragment" this sequence of IDs, renumbering existing records starting from one, and then reset the sequence to a number equal to the last ID?

  • @ [ $davids.sh ] · # 1204

    Great!

    We actually did the opposite: we took all the records and updated their IDs, moving them to the end of the range, which freed up writes from the beginning.

    But your way is better.

  • @ [ $davids.sh ] · # 1205

    There is also a bonus round: write the defragmentation, shifting IDs to the beginning or the end, in 1 SQL query.
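One possible take on the bonus: the renumbering itself is a single UPDATE, and the setval afterwards just re-arms the sequence. A sketch, assuming the default sequence name; if id carries a non-deferrable unique constraint, row-update order can matter and the constraint may need to be made deferrable first:

```sql
-- Defragment toward the beginning: renumber rows 1..N in id order.
WITH renumbered AS (
    SELECT id, row_number() OVER (ORDER BY id) AS new_id
    FROM controller_data
)
UPDATE controller_data c
SET id = r.new_id
FROM renumbered r
WHERE c.id = r.id AND c.id <> r.new_id;

-- Then point the sequence just past the highest id in use:
SELECT setval('controller_data_id_seq',
              (SELECT coalesce(max(id), 1) FROM controller_data));
```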