ChiFS ("Chives") is a system for distributed file sharing, search and discovery on top of the Tor anonymity network.
- Easy sharing of complete directories
- Decentralized file browsing and search interface
- Curated: People share what they want, using their own file hierarchy; Hubs can filter out content they don't want.
- Resilient: Being able to download a file from multiple sources
At this point, ChiFS is not much more than an assorted collection of notes, ideas, and early alpha- and beta-level implementations. It is not very useful in its current form.
I'm going to re-use a lot of existing technologies here to make this project easy to implement and easy to use. This architecture is probably going to sound stupidly simple; that's the intention.
The ChiFS network would consist of the following entities:
- Share: This is an onion service offering files and metadata over HTTP.
- Hub: An onion service indexing multiple Shares and providing a friendly browse/search web UI and API.
A Share would create and manage an index of its own files and expose this metadata as a file through its HTTP service. A Share then registers itself with one or more Hubs. The Hubs fetch this metadata and update their internal indices. Users looking for files would use a Hub for discovery, and then download the files directly from the Share(s) that have it.
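As a rough illustration of this flow, the sketch below models a Hub as an in-memory table that is refreshed whenever a Share's metadata is fetched, and that answers searches with the Shares to download from. The `Hub` class, its method names and the `.onion` placeholder addresses are all hypothetical; nothing here reflects the actual Hub API, and the network fetch itself is left out.

```python
class Hub:
    """Toy in-memory Hub: remembers which Share offers which paths."""

    def __init__(self):
        # Share address -> list of file paths taken from that Share's metadata.
        self.shares = {}

    def update_share(self, address, paths):
        """Replace everything known about one Share with freshly fetched metadata."""
        self.shares[address] = list(paths)

    def search(self, term):
        """Return sorted (share address, path) pairs whose path contains the term."""
        term = term.lower()
        return sorted(
            (address, path)
            for address, paths in self.shares.items()
            for path in paths
            if term in path.lower()
        )


# Two Shares register; the Hub indexes the metadata it fetched from each.
hub = Hub()
hub.update_share("share-a.onion", ["docs/manual.pdf", "music/song.ogg"])
hub.update_share("share-b.onion", ["mirror/manual.pdf"])

# A user searches the Hub, then downloads directly from the listed Shares.
results = hub.search("manual")
```

Replacing a Share's entries wholesale on every update keeps the sketch simple; a real Hub would more likely apply incremental changes.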
Since Hubs are managed by people, they can curate what is being indexed and what isn't - thus filtering out malicious and illegal content according to policies set by the operator.
Share metadata should include a Merkle-tree-based hash for each shared file, so that a client can securely fail over to another source when the same file is available from multiple Shares.
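To make the idea of a Merkle-tree hash concrete, here is a minimal sketch that computes a root hash over fixed-size chunks of a file's contents. The choice of SHA-256, the 1024-byte chunk size and the `0x00`/`0x01` prefixes are assumptions made for illustration only; the actual hash function and tree layout are for the protocol specification to define.

```python
import hashlib

CHUNK_SIZE = 1024  # assumed leaf size in bytes; not taken from any ChiFS spec


def merkle_root(data: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return the Merkle root of data, hashing fixed-size chunks as leaves."""
    # Split into chunks; an empty input still gets one (empty) leaf.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Prefix leaves with 0x00 and inner nodes with 0x01 so that a leaf hash
    # can never be mistaken for an inner-node hash.
    level = [hashlib.sha256(b"\x00" + chunk).digest() for chunk in chunks]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = level[i] + level[i + 1]
                next_level.append(hashlib.sha256(b"\x01" + pair).digest())
            else:
                # Unpaired node: promote it unchanged to the next level.
                next_level.append(level[i])
        level = next_level
    return level[0]
```

Two Shares offering byte-identical files would publish the same root, which is what lets a client recognize them as interchangeable sources. Keeping the intermediate levels around would additionally allow individual chunks to be verified.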
To make all this work, the following sub-projects will be needed:
Document the architecture and protocols used to make all the other ChiFS components work together. This includes:
- Description of the metadata format
- API specification and requirements for Share implementations
- API specification for Hub implementations
The Protocol section on this website holds all protocol specifications and other documentation that may help in writing implementations. These specifications are tracked, and can be discussed, at the /chifs/chifs git repository.
A simple tool to create and manage a Share. This tool can index a directory, create/update the necessary metadata and register with Hubs. A Share can run in two modes:
- Server: A special ChiFS web server that shares a given directory. The web server would automatically manage and update the metadata.
- CLI: A tool that can create the necessary metadata from the command line or as a cron job. A regular web server (thttpd/Apache/nginx) can be used to publish the files.
An early implementation of a Share management tool can be found at chifs/chifs-share. It's written in Rust and supports both CLI and Server mode.
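The indexing step of the CLI mode could look roughly like the sketch below, which walks a directory and collects one metadata entry per file. The entry fields (`path`, `size`, `sha256`), the use of JSON-friendly dictionaries and the plain SHA-256 digest are placeholders chosen for this example; they are not the metadata format used by chifs-share or defined by the protocol.

```python
import hashlib
import json
import os


def file_digest(path, block_size=65536):
    """Hash a file in blocks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_index(root):
    """Walk root and return a list of {path, size, sha256} entries."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a deterministic order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            entries.append({
                # Paths are stored relative to the shared root, with "/" separators.
                "path": os.path.relpath(full, root).replace(os.sep, "/"),
                "size": os.path.getsize(full),
                "sha256": file_digest(full),
            })
    return entries
```

The result can be serialized with `json.dumps(build_index("/path/to/shared/dir"), indent=2)` and written to a file that the regular web server publishes alongside the shared directory.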
A Hub implementation may get complex, but the complexity is by no means insurmountable. Some challenges include:
- Dealing with information on hundreds of millions of files while still providing a fast querying interface
- Active monitoring of known Shares for reliability and file updates
- Having proper moderation tools to implement anti-spam, anti-abuse and custom filtering rules
Experiments to get to a usable Hub implementation are being done at chifs/chifs-hub.
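The custom filtering rules listed among the challenges above could, in their simplest form, be a policy object that the Hub consults before indexing an entry. The sketch below is only one possible shape: `FilterPolicy`, its fields and the example rules are invented for illustration and do not describe chifs-hub.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FilterPolicy:
    """Operator-defined rules deciding what a Hub is willing to index."""

    blocked_shares: set = field(default_factory=set)       # Share addresses to ignore
    blocked_patterns: list = field(default_factory=list)   # compiled regexes on paths

    def allows(self, share, path):
        """Return True if this file from this Share may be indexed."""
        if share in self.blocked_shares:
            return False
        return not any(p.search(path) for p in self.blocked_patterns)


# Hypothetical policy: ignore one Share entirely and skip Windows executables.
policy = FilterPolicy(
    blocked_shares={"spammer.onion"},
    blocked_patterns=[re.compile(r"\.exe$", re.IGNORECASE)],
)
```

Path patterns and Share blocklists are the two most obvious rule types; rules keyed on file hashes would fit the same interface.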
Since a Hub exposes a web interface to discover files and since all the downloads happen over HTTP, a web browser is sufficient to download from the ChiFS network. But a standard web browser is not a very good download manager:
- No automatic resumption(?) when downloading from a slow and unreliable Share.
- No way to download a single file from multiple Shares (neither in parallel nor as fallback).
- No integrity checks.
Special ChiFS client implementations will be needed in order to have reliable downloading from multiple Shares. These could come in various forms:
- A browser plugin(?)
- A CLI tool to download a specific file or directory
- A GUI download manager
A client could also provide an alternative interface to the search and discovery functionality of Hubs, so there could be an integrated ChiFS GUI download tool that does not require the use of a web browser.
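The fallback and integrity checking that a dedicated client adds over a plain browser can be sketched as follows. This is a simplified illustration under several assumptions: `download_with_failover` and its parameters are hypothetical names, the whole file is checked against a single SHA-256 digest rather than a Merkle tree, and the actual transfer is delegated to a caller-supplied `fetch` function, since a real client would have to fetch over Tor.

```python
import hashlib


def download_with_failover(sources, expected_sha256, fetch):
    """Try each source in order; return (url, body) for the first valid copy.

    fetch is a callable that takes a URL and returns the file's bytes,
    raising OSError when the Share cannot be reached.
    """
    errors = []
    for url in sources:
        try:
            body = fetch(url)
        except OSError as exc:
            # Unreachable or unreliable Share: note it and move on.
            errors.append((url, f"fetch failed: {exc}"))
            continue
        if hashlib.sha256(body).hexdigest() != expected_sha256:
            # Corrupted or tampered copy: reject it and try the next Share.
            errors.append((url, "hash mismatch"))
            continue
        return url, body
    raise RuntimeError(f"no usable source: {errors}")
```

The list of sources and the expected digest would both come from a Hub. With per-chunk hashes from a Merkle tree, a client could go further and reject a bad source after the first corrupted chunk instead of after the whole download.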
Comparison with other projects
DC++: This is my main source of inspiration. The DC network has "hubs", which are essentially curated lists of users who share files. Hubs facilitate presence notifications and file search. Users themselves offer their shared files to other users in the network. An essential part of the DC network is that all shared files are hashed, which allows for fast discovery of multiple users sharing the same file, and thus for faster and more resilient file downloads. The use of Merkle-tree hashing lets clients verify smaller chunks of downloaded files, so that corrupted sources can be detected and handled early on. DC also offers chat functionality, but that is out of the scope of ChiFS. DC does not protect the privacy of its users.
"Hidden Wikis" are one approach in Tor to aid discoverability. These are okay for finding web services, but do not really provide a good platform for browsing, finding and publishing files.
There are also file upload services and dropbox-like projects for Tor. These are, at the moment, centralized and isolated islands of files. ChiFS could serve as an index on top of such services.
Tor Search Engines: Ahmia, Torch, Not Evil. These are close to what I wish to accomplish, but they're not distributed, do not support easy browsing of files, and lack file hashes to make it easy to find alternative sources for the same content. Adding a new site to the search index is a manual process; I hope to automate this with tooling. These search engines do offer full-text search and index dynamic web pages, both of which are out of the scope of ChiFS.