This project is an effort towards creating decentralized, privacy-preserving, asynchronous peer-to-peer collaboration protocols that give us ownership and control over our data. These protocols enable us to publish and subscribe to content and to collaborate with others without censorship or opaque algorithmic bias. They also let us disseminate and discover relevant content using decentralized collaborative filtering techniques, while allowing offline search of all subscribed and discovered content.
We aim to shift the paradigm from centralized services providing limited access to locked-up data silos to open, decentralized protocols allowing full access to data stores that facilitate collaboration, offline search and backup.
This is realized through the research and development of peer-to-peer network protocols and their implementation as composable libraries and lightweight unikernels.
We design networks & systems that empower & respect users, and ensure sustainability of hardware, software, and human resources.
- Minimize hardware resources and software dependencies to reduce complexity and trusted computing base of systems while increasing their security, robustness, and scalability.
- Design software as composable and reusable libraries, sans I/O. Implementing network protocols without assumptions on I/O, transport, or wire format enables reusability of these components. For more details, see Writing I/O-Free (Sans-I/O) Protocol Implementations.
- Self-organization, self-optimization, self-repair of networks and systems.
- Data ownership
- Users should have full access to, and control over, their own data and should be able to determine with whom they share it.
- Protocols should respect user privacy and limit the information shared about users to the bare minimum required for the protocols to function.
- End-to-end security
- Only the intended recipients should be able to read any piece of content stored or transmitted in the network. Intermediaries may facilitate communication only by storing & forwarding encrypted data.
- Offline first
- Reading, editing, and searching previously accessed content should be possible offline.
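The sans-I/O principle listed above can be illustrated with a minimal sketch. The `LineProtocol` class, its newline-delimited framing, and its method names are hypothetical and chosen for illustration only; they are not taken from the project's libraries. The protocol object only consumes and produces bytes, so the caller is free to connect it to a socket, a file, or a test harness.

```python
from typing import List


class LineProtocol:
    """Hypothetical sans-I/O parser for newline-delimited messages.

    It never touches sockets or files: callers feed it bytes and
    collect parsed messages and bytes to send.
    """

    def __init__(self) -> None:
        self._buffer = b""      # bytes received but not yet parsed
        self._outgoing = b""    # bytes waiting for the caller to transmit

    def receive_data(self, data: bytes) -> List[str]:
        """Feed raw bytes from any transport; return complete messages."""
        self._buffer += data
        # Everything before the last newline is complete; the rest is kept.
        *lines, self._buffer = self._buffer.split(b"\n")
        return [line.decode("utf-8") for line in lines]

    def send_message(self, message: str) -> None:
        """Queue a message; the caller decides how to transmit it."""
        self._outgoing += message.encode("utf-8") + b"\n"

    def data_to_send(self) -> bytes:
        """Hand queued bytes to the caller and clear the queue."""
        data, self._outgoing = self._outgoing, b""
        return data
```

Because no transport is assumed, the same object could be driven by a TCP socket in one deployment and by an in-memory buffer in a unit test.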
Unikernels are specialized, single-address-space machine images constructed by library operating systems. They can be run as virtual machines on a hypervisor, or as sandboxed processes.
Their design aligns well with our design principles, and they are well-suited for constructing self-organized P2P systems.
A two-tier P2P architecture combines a stable core network with intermittently connected edge networks.
- Core network
- Stable, always-on nodes in data centres and homes interact with each other using P2P protocols, and also act as proxies for end-user devices.
- Edge networks
- End-user devices with limited resources interact with each other directly on the local network, and with remote nodes indirectly via proxies in the core network.
A proxy acts on behalf of a user in the P2P core network. It collects subscriptions and facilitates publishing, search, and discovery of content. One may use multiple proxies for redundancy and to keep identities separate.
Setting up a proxy is a matter of running a unikernel with access to network and storage. One may set up their own, or obtain access to one from a community or commercial provider.
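The role of a proxy described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's design: the `Proxy` class, its method names, and the in-memory inbox are all hypothetical. The sketch only shows a proxy recording a user's subscriptions and buffering opaque (already encrypted) payloads until the user's device reconnects.

```python
from typing import List, Set, Tuple


class Proxy:
    """Hypothetical proxy that buffers subscribed content for one user."""

    def __init__(self) -> None:
        self.subscriptions: Set[str] = set()
        self._inbox: List[Tuple[str, bytes]] = []

    def subscribe(self, topic: str) -> None:
        """Record a topic the user wants to follow."""
        self.subscriptions.add(topic)

    def on_publish(self, topic: str, ciphertext: bytes) -> None:
        """Called when the core network delivers content.

        The payload is treated as opaque bytes; the proxy does not decrypt it.
        """
        if topic in self.subscriptions:
            self._inbox.append((topic, ciphertext))

    def sync(self) -> List[Tuple[str, bytes]]:
        """Called when the user's device comes online; drains the buffer."""
        pending, self._inbox = self._inbox, []
        return pending
```

A real proxy would persist the buffer to storage and authenticate the device before syncing; both are left out here to keep the sketch short.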
The most important building blocks of our decentralized collaboration protocol are mergeable data structures stored in replicated data repositories (see the paper Mergeable persistent data structures). A publish-subscribe protocol takes care of replication of subscribed content, while conflict-free replicated data types (CRDTs) enable conflict-free merges on these data structures. Applications render content from these data repositories and perform operations on them.
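To make the idea of a conflict-free merge concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is a generic textbook example, not one of the project's data structures, and the `GCounter` name and replica identifiers are assumptions made for illustration. Each replica only increments its own slot, and merging takes the element-wise maximum, so replicas converge regardless of the order in which updates arrive.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one non-decreasing count per replica."""

    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, replica_id: str, amount: int = 1) -> None:
        """Increase the slot owned by the given replica."""
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def value(self) -> int:
        """Total across all replicas."""
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        """Element-wise maximum: commutative, associative, idempotent."""
        keys = self.counts.keys() | other.counts.keys()
        return GCounter(
            {k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys}
        )
```

Two replicas that each apply local increments and then exchange state in either direction end up with the same merged counter, which is the property that lets replication proceed without coordination.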
Search & discovery
The following modules are a curated list of protocols, which, when combined, result in a protocol suite that facilitates P2P collaboration according to our vision and design principles.
P2P gossip-based protocols
- P2P topic-based pub/sub (paper, code)
Privacy enhancements: instead of transmitting node profiles with full subscription sets in the clear, randomized Bloom filters using BLIP are employed together with a Bloom filter-based Private Set Intersection (BFPSI) protocol.
- P2P hybrid dissemination (paper, code)
- P2P clustering & topology management (paper, code)
- Random Peer Sampling (paper, code)
To ensure uniformity of peer sampling, the Uniform Random Peer Sampler (URPS) is used together with CYCLON.
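The randomized Bloom filters mentioned in the privacy enhancements above can be sketched as follows. This is a simplified illustration rather than the BLIP algorithm as specified in its paper: the SHA-256 based hashing, the filter size, and the function names are assumptions, and the flip probability is passed in as a plain argument instead of being derived from a privacy parameter. The sketch shows a subscription set being encoded as a bit vector, after which each bit is flipped independently at random before the vector is shared.

```python
import hashlib
import random
from typing import Iterable, List


def bloom_filter(items: Iterable[str], num_bits: int = 64, num_hashes: int = 3) -> List[int]:
    """Encode a set of strings as a Bloom filter bit vector."""
    bits = [0] * num_bits
    for item in items:
        for i in range(num_hashes):
            # Derive the i-th hash by prefixing the item with the hash index.
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            bits[int.from_bytes(digest[:8], "big") % num_bits] = 1
    return bits


def randomize(bits: List[int], flip_probability: float, rng: random.Random) -> List[int]:
    """Flip each bit independently with the given probability."""
    return [bit ^ 1 if rng.random() < flip_probability else bit for bit in bits]


# Hypothetical subscription set; only the noisy filter would leave the node.
subscriptions = ["ocaml", "p2p", "privacy"]
noisy_profile = randomize(bloom_filter(subscriptions), 0.1, random.Random())
```

A higher flip probability hides more about the underlying set but also makes similarity estimates between two noisy filters less accurate, so the value is a trade-off that a real deployment would need to choose carefully.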
P2P data structures
- Uniform Random Peer Sampler (paper, code)
- Non-interactive differentially-private similarity computation on Bloom filters (paper, slides, code)
- Private Set Intersection based on Bloom filters (paper, code)
- Encoding layer for the Noise Protocol Framework (spec, code)