P2Pcollab

protocols for peer-to-peer collaboration

About

P2Pcollab is a collaborative effort towards creating decentralized, privacy-preserving, asynchronous peer-to-peer collaboration protocols and tools that allow us ownership and control over our data, and enable us to collaborate with others and publish, subscribe and discover content without censorship and opaque algorithmic bias.

We aim to shift the paradigm from centralized services providing limited access to locked-up data silos to open, decentralized protocols with replicated data stores, pushing data to edge networks where we can locally access, search, discover, and collaborate on data, even offline.

We realize this through the research and development of peer-to-peer network protocols and data models, and their implementation as composable and reusable libraries, as well as user interfaces that display and interact with the data.

Design principles

We design networks & systems that empower & respect users, and ensure sustainability of hardware, software, and human resources.

The following principles guide us to achieve this.

Self-*
Self-organization, self-optimization, and self-repair of networks and systems.
Resiliency
Resilient networks & systems that do not assume an always-online global network, and can recover from network partitions and system failures.
Minimalism
Minimize software dependencies and hardware resources to reduce complexity and trusted computing base of systems while increasing their security, robustness, and scalability.
Composability
Design systems as composable and reusable components.
Data ownership
Users should have full access to and control over their own data and should be able to confidentially share it with selected recipients.
Privacy
Protocols should respect user privacy and minimize the amount of information shared about users to the bare minimum that is required for them to function.
End-to-end security
Only the intended recipients should be able to read any piece of information stored or transmitted in the network. Intermediaries may facilitate communication only by storing & forwarding encrypted data.
Offline first
Reading, editing, and searching previously accessed content should be possible locally, even offline.

Design overview

UPSYCLE: Ubiquitous Publish-Subscribe Infrastructure for Collaboration on Edge Networks

Topic-based publish-subscribe is a messaging pattern where publishers publish content to topics, or channels, and subscribers subscribe to those topics to receive the published messages. This messaging pattern is the basis of various group communication and database replication schemes.
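To make the pattern concrete, here is a minimal in-process sketch of topic-based publish-subscribe. It is not UPSYCLE itself: it uses a single central object for brevity, whereas UPSYCLE is relay-free and peer-to-peer. The names TopicBroker, subscribe, and publish are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Callback = Callable[[str, bytes], None]


class TopicBroker:
    """Hypothetical in-process stand-in that only illustrates the pattern."""

    def __init__(self) -> None:
        # Maps each topic name to the callbacks subscribed to it.
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: bytes) -> int:
        """Deliver the message to every subscriber of the topic.

        Returns the number of subscribers that received it.
        """
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)


received: List[Tuple[str, bytes]] = []
broker = TopicBroker()
broker.subscribe("news", lambda topic, msg: received.append((topic, msg)))
delivered_news = broker.publish("news", b"hello")
delivered_other = broker.publish("other", b"nobody is subscribed")
```

Publishers and subscribers only share a topic name and never address each other directly, which is what makes the pattern a natural fit for group communication.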

The current internet stack is based on a centralized, host-based, one-to-one communication pattern, while most of our communications involve interaction within groups. IP multicast has been proposed as a group communication primitive, but it is not generally available to end-users due to security and scalability issues.

UPSYCLE is a P2P, relay-free publish-subscribe protocol suite that provides a scalable, public key-addressed, end-to-end secure group communication primitive which keeps user location & subscriptions private, and has the potential for widespread adoption.

It enables direct communication of mobile nodes on edge networks and also offers asynchronous communication with remote nodes on the internet via store-and-forward proxies that form a core P2P network and keep encrypted messages for the mobile nodes while they are offline.

A proxy acts on behalf of a user in the P2P core network. It collects subscriptions and facilitates publishing, search, and discovery of content. A user may use multiple proxies for redundancy or for separate identities. Proxies can be run by users, communities, or commercial providers.
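The store-and-forward role of a proxy can be sketched as a per-recipient mailbox that holds opaque ciphertexts until the mobile node comes back online. This is a simplified illustration under our own assumptions: the class and method names are hypothetical, recipients are identified by plain strings standing in for public keys, and encryption is assumed to have happened before the message reaches the proxy.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque, List


class StoreAndForwardProxy:
    """Hypothetical mailbox that keeps encrypted messages for offline nodes."""

    def __init__(self) -> None:
        # Recipient identifier (stand-in for a public key) -> queued ciphertexts.
        self._mailboxes: DefaultDict[str, Deque[bytes]] = defaultdict(deque)

    def store(self, recipient: str, ciphertext: bytes) -> None:
        # The proxy never inspects the payload; it only queues it.
        self._mailboxes[recipient].append(ciphertext)

    def fetch(self, recipient: str) -> List[bytes]:
        """Hand over all queued messages in arrival order and clear the mailbox."""
        return list(self._mailboxes.pop(recipient, deque()))


proxy = StoreAndForwardProxy()
proxy.store("alice-key", b"ciphertext-1")
proxy.store("alice-key", b"ciphertext-2")
first_fetch = proxy.fetch("alice-key")   # alice comes online
second_fetch = proxy.fetch("alice-key")  # nothing new since the last fetch
```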

The UPSYCLE protocol suite consists of peer sampling, subscription-based clustering, and reliable causal broadcast protocols.
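As an illustration of the causal ordering part, the sketch below uses the textbook vector-clock delivery rule: a message is delivered only when it is the next one from its sender and everything it causally depends on has been delivered. This is an assumption for the sake of the example and not necessarily how UPSYCLE implements causal broadcast. The reliability part (retransmission, duplicate handling) is left out, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Clock = Dict[str, int]


class CausalReceiver:
    """Buffers incoming messages until their causal dependencies are delivered."""

    def __init__(self) -> None:
        self.delivered: Clock = {}   # per sender: number of messages delivered
        self.pending: List[Tuple[str, Clock, str]] = []
        self.log: List[str] = []     # payloads in delivery order

    def _deliverable(self, sender: str, clock: Clock) -> bool:
        # Must be the next message from this sender...
        if clock.get(sender, 0) != self.delivered.get(sender, 0) + 1:
            return False
        # ...and must not depend on undelivered messages from other nodes.
        return all(
            count <= self.delivered.get(node, 0)
            for node, count in clock.items()
            if node != sender
        )

    def receive(self, sender: str, clock: Clock, payload: str) -> None:
        self.pending.append((sender, clock, payload))
        progress = True
        while progress:  # keep draining the buffer while something gets delivered
            progress = False
            for item in list(self.pending):
                item_sender, item_clock, item_payload = item
                if self._deliverable(item_sender, item_clock):
                    self.pending.remove(item)
                    self.delivered[item_sender] = item_clock[item_sender]
                    self.log.append(item_payload)
                    progress = True


receiver = CausalReceiver()
# Bob's answer arrives first, although it causally depends on Alice's question.
receiver.receive("bob", {"alice": 1, "bob": 1}, "answer")
log_after_first = list(receiver.log)
receiver.receive("alice", {"alice": 1}, "question")
```

The answer stays buffered until the question arrives, so every replica sees the two messages in the same causal order regardless of network delays.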

DROMEDAR: Distributed Replicated Mergeable Data Repositories

Conflict-free replicated data types (CRDT) enable asynchronous, conflict-free collaboration on shared data structures, and make eventual consistency possible among a set of replicas.
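A small example may help show what conflict-free merging means. The sketch below is a state-based observed-remove set (OR-Set), a well-known CRDT, chosen here only for illustration; it is not taken from the project's code. Each add is tagged with a unique identifier, and a remove only cancels the adds it has observed, so a concurrent add survives the merge.

```python
import uuid
from typing import Hashable, Set, Tuple

Tagged = Tuple[Hashable, str]


class ORSet:
    """State-based observed-remove set: merging is a union of both tag sets."""

    def __init__(self) -> None:
        self.adds: Set[Tagged] = set()
        self.removes: Set[Tagged] = set()

    def add(self, element: Hashable) -> None:
        # A fresh tag distinguishes this add from every other add.
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element: Hashable) -> None:
        # Only the add tags seen by this replica are cancelled.
        self.removes |= {(e, tag) for (e, tag) in self.adds if e == element}

    def merge(self, other: "ORSet") -> None:
        # Set union is commutative, associative, and idempotent.
        self.adds |= other.adds
        self.removes |= other.removes

    def value(self) -> Set[Hashable]:
        return {element for (element, _tag) in self.adds - self.removes}


a, b = ORSet(), ORSet()
a.add("draft")
b.merge(a)
b.remove("draft")   # b removes the add it has observed
a.add("draft")      # a concurrently adds the element again
a.add("notes")
a.merge(b)
b.merge(a)
```

After exchanging state in either order, both replicas hold the same value without any coordination, which is the eventual consistency property described above.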

DROMEDAR relies on UPSYCLE to disseminate authenticated and encrypted CRDT operations. These operations form a tamper-proof log, are stored in mergeable data repositories, and are replicated to authorized subscribers, with access control on the allowed operations.

This enables decentralized collaboration on data repositories without relying on a centralized server for coordination, and allows resource-constrained mobile and IoT devices on edge networks to participate in the network.

Authorization and access control

Authorization is based on public-key cryptography, where the repository owners can grant access rights to members based on their public keys. Each operation is signed and encrypted by its author and broadcast to all replicas subscribed to the repository. Before a replica can merge an operation, it needs to verify that the operation's causal dependencies have already been merged and that the author is allowed to perform the operation according to the CRDT access control rules (CRXS) defined by the repository owners.
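The two merge preconditions described above can be sketched as follows. This is a simplified illustration under stated assumptions: the rule format (a mapping from author to allowed operation kinds) is hypothetical and not the actual CRXS format, author strings stand in for public keys, and signature verification and decryption are assumed to have happened before can_merge is called.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class Operation:
    op_id: str
    author: str                        # stand-in for the author's public key
    kind: str                          # e.g. "add" or "remove"
    deps: FrozenSet[str] = frozenset() # ids of causally preceding operations


class Replica:
    def __init__(self, rules: Dict[str, Set[str]]) -> None:
        # Hypothetical rule format: author -> operation kinds they may perform.
        self.rules = rules
        self.merged: Set[str] = set()

    def can_merge(self, op: Operation) -> bool:
        deps_ready = op.deps <= self.merged
        authorized = op.kind in self.rules.get(op.author, set())
        return deps_ready and authorized

    def merge(self, op: Operation) -> bool:
        """Merge the operation if allowed; report whether it was merged."""
        if not self.can_merge(op):
            return False
        self.merged.add(op.op_id)
        return True


replica = Replica({"owner-key": {"add", "remove"}, "member-key": {"add"}})
op1 = Operation("op1", "owner-key", "add")
op2 = Operation("op2", "member-key", "add", frozenset({"op1"}))
op3 = Operation("op3", "member-key", "remove", frozenset({"op1"}))

results = [
    replica.merge(op2),  # rejected for now: op1 has not been merged yet
    replica.merge(op1),
    replica.merge(op2),  # accepted once its dependency is present
    replica.merge(op3),  # rejected: this member may not remove
]
```

In a real replica, an operation that fails only the dependency check would be buffered and retried rather than dropped, while one that fails the authorization check would be rejected.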

Immutable objects

Next to mutable objects, DROMEDAR also allows storage of immutable objects using a content hash-addressed object store (CHAOS) that stores encrypted chunked objects in the repository. CHAOS objects are referenced from CRDTs, which allows access control on them using the same CRXS mechanism as for mutable objects.
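The idea of a chunked, content hash-addressed store can be sketched in a few lines. The example below is an assumption-laden illustration, not the CHAOS format: it uses SHA-256 and a tiny fixed chunk size, the names are hypothetical, and the encryption step is omitted, so chunks are stored in plaintext here.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # deliberately tiny so the example produces several chunks


class ChunkStore:
    """Stores chunks under the hex SHA-256 digest of their content."""

    def __init__(self) -> None:
        self._chunks: Dict[str, bytes] = {}

    def __len__(self) -> int:
        return len(self._chunks)

    def put(self, data: bytes) -> List[str]:
        """Split data into chunks, store them, and return their references."""
        refs = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self._chunks[digest] = chunk  # identical chunks are stored once
            refs.append(digest)
        return refs

    def get(self, refs: List[str]) -> bytes:
        """Reassemble an object, verifying each chunk against its reference."""
        parts = []
        for digest in refs:
            chunk = self._chunks[digest]
            if hashlib.sha256(chunk).hexdigest() != digest:
                raise ValueError("chunk does not match its content hash")
            parts.append(chunk)
        return b"".join(parts)


store = ChunkStore()
data = b"abcdabcdxy"
refs = store.put(data)
restored = store.get(refs)
```

Because a reference is derived from the content, an object cannot be altered without changing its reference, and repeated chunks are deduplicated automatically.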

Data models

With CRDTs stored in DROMEDAR repositories, we can model data structures necessary for DROMEDAR itself and other components of the system, e.g.:

  • membership & access control for DROMEDAR
  • subscriptions for UPSYCLE
  • user profiles for COMRADES
  • Linked Data models for DEGUSTER

DEGUSTER: Declarative Graph User Interfaces

Graphs can be modelled as CRDT sets, which enables the use of Linked Data models for collaboration and publishing.
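As a rough sketch of this idea, a graph can be represented as a set of (subject, predicate, object) triples, and two replicas of the graph can be merged by set union. The example assumes the simplest possible CRDT, a grow-only set without removals, and the class name and sample data are hypothetical.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleGraph:
    """Grow-only set of triples; merging two replicas is a set union."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.add((subject, predicate, obj))

    def merge(self, other: "TripleGraph") -> None:
        self.triples |= other.triples

    def objects(self, subject: str, predicate: str) -> Set[str]:
        """All objects linked from the subject via the given predicate."""
        return {o for (s, p, o) in self.triples if s == subject and p == predicate}


# Two users edit their own replica of the same graph while disconnected.
alice_replica, bob_replica = TripleGraph(), TripleGraph()
alice_replica.add("ex:alice", "foaf:knows", "ex:bob")
bob_replica.add("ex:alice", "foaf:knows", "ex:carol")
bob_replica.add("ex:post1", "dc:title", "Hello")

alice_replica.merge(bob_replica)
bob_replica.merge(alice_replica)
known = alice_replica.objects("ex:alice", "foaf:knows")
```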

User interfaces display and interact with data from DROMEDAR repositories and modify them via CRDT operations.

DEGUSTER aims to empower users to define user interfaces declaratively that can display and modify graph data stored in mergeable repositories.

COMRADES: Collaborative Content & Community Recommendation, Discovery & Search

Gossip-based collaborative filtering protocols enable recommendation, discovery, and search of relevant pieces of content (e.g. news items and blog posts) and communities (data repositories with content to subscribe to), based on users' obfuscated interest profiles.
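One building block of such protocols is ranking peers by how similar their interest profiles are to our own, so that each node keeps the most similar peers it has heard about through gossip. The sketch below assumes profiles are plain sets of topic labels compared with Jaccard similarity. Both the similarity measure and the function names are illustrative assumptions, and profile obfuscation and the gossip exchange itself are omitted.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_peers(own: Set[str], candidates: Dict[str, Set[str]], k: int) -> List[str]:
    """Return the k candidates most similar to our own profile.

    Ties are broken by peer name so that the result is deterministic.
    """
    ranked = sorted(
        candidates,
        key=lambda peer: (-jaccard(own, candidates[peer]), peer),
    )
    return ranked[:k]


own_profile = {"p2p", "crdt", "privacy"}
candidates = {
    "peer-a": {"p2p", "crdt"},
    "peer-b": {"cooking"},
    "peer-c": {"privacy", "p2p", "crdt", "unikernels"},
}
neighbours = closest_peers(own_profile, candidates, k=2)
```

Repeating this selection every time new candidates arrive gradually clusters nodes with common interests, which corresponds to the implicit networks mentioned below.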

This allows discovery of personally relevant information in both implicit networks formed around common interests and explicit networks based on community membership.

SHRUTHI: Self-Hosted Robust Unikernel Testing & Hosting Infrastructure

Unikernels are specialized, single-address-space machine images constructed by library operating systems that can be run as lightweight virtual machines or as sandboxed processes. They are efficient, robust, and minimalist, and thus well-suited for running self-organized, self-managing, and self-healing systems.

SHRUTHI is an effort to realize a unikernel-based testing & hosting infrastructure that can orchestrate large-scale tests for the evaluation of P2P protocols, and allow users and hosting providers to host P2P nodes that provide storage and proxy services.

Further reading

Source code & documentation
Software modules developed so far.
Cover image
L-system rules for the cover image.

Community & Contact

Funding

The project is supported by NLnet and the NGI0 Discovery fund of the EU's Next Generation Internet initiative.