P2P offline-first databases: a loop or a dead end?
I like this strange creature, but there are nuances
#db #p2p #distributed
# [ $davids.sh ] · message #229
@ [ $davids.sh ] · # 1312
To put it as simply as possible: each user / application can have its own local DB node and interact only with it, and this node then synchronizes with all other nodes in the background.
Examples: excellent list here
Advantages
- Maximum response speed to user / application actions
- Ease of developing collaborative systems (more on this below)
- Ability to operate without the internet
Disadvantages
- Limited set of available data structures
- The synchronization algorithm is either simple but unreliable, or highly reliable but so complex that it leans heavily on "God save us" (GSUs)
- Very complex debugging
Features
- Concurrency is resolved by using CRDTs (simplest explanation; if you find a simpler one, share it in the comments)
- Locks are implemented through data "reservations" on one of the nodes, with subsequent transfer of ownership to other nodes
- Proper configuration of which data a node can view / change / store is very important
- A "central" node that owns all data can exist
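To make the CRDT point a bit more concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is not taken from any of the libraries mentioned in this thread; the class name `GCounter` and the node names are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter: one slot per node, merged with element-wise max."""
    node_id: str
    counts: dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A node only ever writes to its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative and idempotent,
        # so replicas converge regardless of sync order or repeated syncs.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Hypothetical nodes: two devices editing while disconnected.
phone = GCounter("phone")
laptop = GCounter("laptop")
phone.increment(2)
laptop.increment(3)

phone.merge(laptop)
laptop.merge(phone)
laptop.merge(phone)  # a repeated sync changes nothing
```

After the merges both replicas hold the same per-node slots, so neither concurrent increment is lost. Real-world CRDTs for text or JSON are far more involved, which is where the debugging pain comes from.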
Conditions for Use
- When conflicts cannot occur, i.e., there is no competition, for example because ownership is distributed at the architectural level: the database is part of an application in which a warehouse worker walks around scanning packages and entering data. At any given moment only one person (the worker) interacts with a given piece of data, while the central storage holds all data from all workers.
- When clients can resolve conflicts themselves, i.e., "collaboration" rather than "competition". For example, Figma or Miro: even if two people drag the same card simultaneously and one of them gets it wrong, they can fix it themselves once they see the error.
- And, of course, distributed systems (i.e., with no central storage at all), but here again you need either to distribute ownership strictly, or to rely on collaboration, or to use majority validation.
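The first condition (ownership distributed at the architectural level) can be sketched roughly like this. It is an illustration under assumptions, not a real sync protocol: `WorkerNode`, `CentralStore`, the worker ids and the package ids are all hypothetical. The idea is that every key carries its owner, so two workers can never write to the same key.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerNode:
    """Local store of one worker; every key is prefixed with the owner's id."""
    worker_id: str
    records: dict[tuple[str, str], dict] = field(default_factory=dict)

    def scan(self, package_id: str, data: dict) -> None:
        self.records[(self.worker_id, package_id)] = data


@dataclass
class CentralStore:
    """Central storage holding the records of all workers."""
    records: dict[tuple[str, str], dict] = field(default_factory=dict)

    def pull_from(self, node: WorkerNode) -> int:
        """Copy a worker's records; skip keys the worker does not own."""
        accepted = 0
        for (owner, package_id), data in node.records.items():
            if owner != node.worker_id:
                continue  # ownership violation: ignore instead of merging
            self.records[(owner, package_id)] = data
            accepted += 1
        return accepted


# Hypothetical workers scanning packages while offline.
anna = WorkerNode("anna")
boris = WorkerNode("boris")
anna.scan("pkg-1", {"weight_kg": 12})
anna.scan("pkg-2", {"weight_kg": 7})
boris.scan("pkg-3", {"weight_kg": 30})

central = CentralStore()
central.pull_from(anna)
central.pull_from(boris)
repeated = central.pull_from(anna)  # pulling again just overwrites the same keys
```

Because the key space is partitioned by owner, the central store never has to choose between two versions of the same record, which is exactly why this case is the comfortable one.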
Summary
Such databases are needed ONLY in specific cases (low or no concurrency), when you are sure they will bring more benefits (offline operation, speed, distribution) than drawbacks (complex development and debugging). Such cases are at most 10% of the total.
@ [ $davids.sh ] · # 1313
Want more distributed hardcore? Here you go
@ Vassiliy ITK Kuzenkov · # 1314
I really enjoyed the explanation of CRDTs here https://jakelazaroff.com/words/an-interactive-intro-to-crdts/
@ YURII VLADIMIROVICH · # 1315
How did you run into this question? (What motivated you to dig into it?)
@ [ $davids.sh ] · # 1316
When I worked at MegaFon, analysts would occasionally come to us and ask us to sketch out an architecture for roughly the same use case: "There's a factory/warehouse, and it has employees who need to record some data on Android phones they're issued. All of this happens in remote areas with practically no internet or intentionally without internet. Come up with an architecture." And all the diagrams always boiled down to a database of this kind (e.g., Cordova + PouchDB with synchronization to a local factory server running CouchDB).
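Very roughly, the shape of such an architecture looks like the sketch below. This is NOT the PouchDB or CouchDB API (real replication there is based on document revisions); `PhoneNode`, `FactoryServer` and the `online` flag are hypothetical stand-ins that only show the "write locally, flush when the server is reachable" idea.

```python
from dataclasses import dataclass, field


@dataclass
class FactoryServer:
    """Stand-in for the on-site server; `online` simulates connectivity."""
    online: bool = False
    docs: dict[str, dict] = field(default_factory=dict)

    def push(self, doc_id: str, doc: dict) -> None:
        if not self.online:
            raise ConnectionError("factory server unreachable")
        self.docs[doc_id] = doc


@dataclass
class PhoneNode:
    """Local node on a phone: a document store plus a queue of unsent ids."""
    docs: dict[str, dict] = field(default_factory=dict)
    outbox: list[str] = field(default_factory=list)

    def save(self, doc_id: str, doc: dict) -> None:
        # Writes always succeed locally; syncing is a separate concern.
        self.docs[doc_id] = doc
        if doc_id not in self.outbox:
            self.outbox.append(doc_id)

    def sync(self, server: FactoryServer) -> int:
        """Try to flush the outbox; return how many docs were sent."""
        sent = 0
        while self.outbox:
            doc_id = self.outbox[0]
            try:
                server.push(doc_id, self.docs[doc_id])
            except ConnectionError:
                break  # keep the rest queued for the next attempt
            self.outbox.pop(0)
            sent += 1
        return sent


server = FactoryServer()
phone = PhoneNode()
phone.save("scan-1", {"weight_kg": 12})
phone.save("scan-2", {"weight_kg": 7})
sent_offline = phone.sync(server)  # no connectivity: nothing sent, nothing lost
server.online = True
sent_online = phone.sync(server)   # connectivity is back: the queue is flushed
```

The employee keeps working against the local store the whole time; the only thing connectivity changes is when the queue gets drained.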
I also tried building a simulator for the board game Unmatched and used yjs for online collaboration.
@ YURII VLADIMIROVICH · # 1317
If you have any additional materials on these cases, please share them (for reading on implementation).
@ Vassiliy ITK Kuzenkov · # 1318
Oh, and did it work with Unmatched?
@ Artur G · # 1319
I think the scope of application is much wider.
Especially as new regulations emerge, it will be easier not to process data on your own side unless absolutely necessary. And let the user be responsible for security in their own browser, and so on.
Don't forget about Git. This is exactly it! CRDT structures can be used like Git.
I've been wanting to try building some application with automerge. I just haven't gotten around to it.
@ [ $davids.sh ] · # 1320
I would start by reading use cases for CouchDB and especially Couchbase (because it's paid).
In my opinion, these are the most famous production solutions of this type.
And from reading their cases, you can then find more modern alternatives.
@ [ $davids.sh ] · # 1321
I put together rendering with React + Canvas, wrote my own ECS engine, and implemented deck spawning mechanics, cards, shuffling, and a health bar: basically the necessary minimum, except for depth changes. And at the end, I integrated yjs.
And then I stopped :)
I want to come back later and start writing articles along the way to systematize what I've achieved and learned.
@ [ $davids.sh ] · # 1322
There are definitely many more cases than what I described.
But given the problematic nature of debugging and the very high complexity of handling concurrency (which most systems need), it's unlikely there will be a mass-market solution for such systems.
Git falls under my second point, where people resolve conflicts themselves.
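To show what "people resolve conflicts themselves" means mechanically, here is a simplified sketch of a Git-style three-way merge over flat dictionaries. It is not how Git itself is implemented; the function name `three_way_merge` and the sample card are hypothetical, and stored values are assumed never to be `None`.

```python
def three_way_merge(base: dict, ours: dict, theirs: dict) -> tuple[dict, dict]:
    """Merge two edits of `base`; return (merged, conflicts)."""
    merged: dict = {}
    conflicts: dict = {}
    for key in base.keys() | ours.keys() | theirs.keys():
        # None stands for "key absent", so stored values are assumed non-None.
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            result = o  # both sides agree (or both deleted the key)
        elif o == b:
            result = t  # only theirs changed
        elif t == b:
            result = o  # only ours changed
        else:
            conflicts[key] = (o, t)  # both changed differently: ask a human
            continue
        if result is not None:
            merged[key] = result
    return merged, conflicts


# Hypothetical card edited by two people starting from the same version.
base = {"x": 10, "y": 20, "title": "Card"}
ours = {"x": 15, "y": 20, "title": "Card A"}
theirs = {"x": 10, "y": 25, "title": "Card B"}
merged, conflicts = three_way_merge(base, ours, theirs)
```

Independent changes (`x` and `y`) merge automatically, while the field both people edited (`title`) is handed back as a conflict for a person to decide.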