28 Aug 2003 @ 19:34, by Roger Eaton
Underlying a voice of humanity there must be a globally distributed database -- this article is about the overall operation and structure of that database. A separate article will deal with how we build the voice of humanity on top of the database.
Philosophy
Keep it easy. Python. Home-made simplified peer-to-peer and security logic. No Java. No UML.
Basics
Hubs, items, users, categories, ratings and linkages with rules are the basics. Hubs are single InterMix instances, normally one per PC, each carrying many categories of item and servicing one or more users. Items are XML documents with both category meta-data and built-in meta-data, plus either item contents or a pointer to item contents. Users apply ratings to items. Users, hubs and categories are tracked as items themselves in the database. Meta-data is indexed along with ratings so that items may be retrieved efficiently in rating order using meta-data criteria within a category. Hub owners and other trusted users manually create long-term linkages between categories on a single hub or between pairs of hubs, specifying what items and ratings should be transferred, and how often. For each linkage, items may be exchanged in both directions, while ratings are always transferred in one direction only, thus creating a hierarchy in which the hub/category that receives the ratings is above the hub/category that sends them for that particular linkage.
A hub/category that receives ratings will tend to be a collector of items in that category, and so will have many feeder categories and many more items in the category than the hub/category that sends the ratings. The lower hub/category will want to download only a few of the most highly rated new items from the higher. The effect will be that highly rated items will rise in the hierarchy and be distributed back down to larger and larger domains.
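To make the basics concrete, here is a minimal Python sketch of a hub holding items, accepting ratings, and returning items in rating order by meta-data criteria within a category. The names Item, Hub, rate and query are hypothetical, the item is a plain Python object rather than an XML document, and the query scans linearly where the real database would use an index.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    item_id: str
    category: str
    meta: dict                 # category and built-in meta-data
    content: str = ""          # item contents, or a pointer to them
    ratings: dict = field(default_factory=dict)  # user_id -> rating

    def mean_rating(self):
        """Average of all ratings; unrated items count as 0.0."""
        if not self.ratings:
            return 0.0
        return sum(self.ratings.values()) / len(self.ratings)


class Hub:
    """A single hub carrying items in many categories (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.items = {}        # item_id -> Item

    def add(self, item):
        self.items[item.item_id] = item

    def rate(self, user_id, item_id, rating):
        # A later rating by the same user replaces the earlier one.
        self.items[item_id].ratings[user_id] = rating

    def query(self, category, **criteria):
        """Items in a category matching the criteria, best rated first."""
        matches = [
            it for it in self.items.values()
            if it.category == category
            and all(it.meta.get(k) == v for k, v in criteria.items())
        ]
        return sorted(matches, key=lambda it: it.mean_rating(), reverse=True)
```

A call such as hub.query("essays", lang="en") would then return the English essays with the highest average rating first.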
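One exchange over a linkage might look like the following sketch. It assumes the simplest possible rules: everything on the lower side is sent up, and the lower side pulls only the top few items it does not yet hold. The function name sync_linkage, the dict layout and the top_n parameter are hypothetical, and a real linkage would carry its own transfer rules and schedule.

```python
def sync_linkage(lower, upper, top_n=3):
    """Run one exchange between a lower and an upper hub/category.

    lower, upper: dicts mapping item_id -> {"ratings": {user_id: rating}}
    Returns the ids of the items pulled down to the lower side.
    """
    # Items and ratings travel upward: the upper side collects them.
    for item_id, item in lower.items():
        target = upper.setdefault(item_id, {"ratings": {}})
        target["ratings"].update(item["ratings"])

    def mean(item):
        ratings = item["ratings"]
        return sum(ratings.values()) / len(ratings) if ratings else 0.0

    # Only the best-rated items the lower side lacks travel downward.
    new_items = [(i, it) for i, it in upper.items() if i not in lower]
    new_items.sort(key=lambda pair: mean(pair[1]), reverse=True)

    pulled = []
    for item_id, _item in new_items[:top_n]:
        # Ratings flow one way only, so the item arrives without them.
        lower[item_id] = {"ratings": {}}
        pulled.append(item_id)
    return pulled
```

Repeating this exchange at every level is what lets highly rated items rise and then spread back down.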
It is important to realize that there is no requirement that the hierarchies all reach to a single global summit. There are bound to be semi-private enclaves that do not connect to the global level. Some categories of items will be too specialized to be wanted at global levels. Other categories of items will be proprietary and therefore not eligible for upload except within a particular sub-network.
To set the process in motion, InterMix software will have a number of hubs seeded in the database already at install time. A facility will be provided so that each hub that distributes the software can easily pre-load its own list of favorite hubs.
Privacy and Security
The purpose of security is 1) to prevent mass deception through manipulation of ratings and 2) to prevent mass theft of user meta-data, such as age, sex or email address. Rating and user data need to be encrypted on disk, and all hub-to-hub transmission needs to be secure.
Sections of the voice of humanity (voh) network may decide to use different security methods, or none at all. Security needs to be modular so that with the proper plug-in, any two hubs can connect.
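A modular arrangement could be as simple as the sketch below: each security method is a plug-in with a name, and two hubs settle on the first method both support. All class and function names here are hypothetical, and XorDemo is a toy placeholder that provides no real security; it only stands in for a genuine encryption plug-in.

```python
class SecurityPlugin:
    """Interface every security plug-in is assumed to implement."""
    name = "base"

    def seal(self, payload: bytes) -> bytes:
        raise NotImplementedError

    def unseal(self, payload: bytes) -> bytes:
        raise NotImplementedError


class NoSecurity(SecurityPlugin):
    """For sections of the network that choose no security at all."""
    name = "none"

    def seal(self, payload):
        return payload

    def unseal(self, payload):
        return payload


class XorDemo(SecurityPlugin):
    """Toy cipher, NOT secure; a stand-in for a real plug-in."""
    name = "xor-demo"

    def __init__(self, key: bytes):
        self.key = key

    def _xor(self, data):
        return bytes(b ^ self.key[i % len(self.key)]
                     for i, b in enumerate(data))

    def seal(self, payload):
        return self._xor(payload)

    def unseal(self, payload):
        return self._xor(payload)


def negotiate(ours, theirs):
    """First method in our preference order that the peer also supports."""
    for name in ours:
        if name in theirs:
            return name
    return None
```

With this shape, connecting two hubs that use different methods only requires installing a plug-in both sides share.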
There is a usable Python encryption package, PyCrypto, possibly accessed through ezPyCrypto. Rather than implementing OpenSSL, which looks difficult, it should be possible to use the "How SSL Works" document at [link] to dummy up something for starters. Later, a more professional security layer can be implemented if needed.
Each InterMix version/operating system combination will have a known configuration, which it must prove as part of the handshake between hubs. There is nothing to stop a clever programmer from passing off the real thing's proof of authenticity as that of a hacked version in order to get the needed password, but this safeguard will reduce the incidence of amateur hacking and make it difficult or impossible for a concerted stealth campaign to replace large numbers of hub engines with bogus versions without anyone being the wiser.
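One way the configuration proof could work is a challenge-response, sketched below with the standard hashlib, hmac and secrets modules. This is an assumption, not a settled design: the KNOWN_CONFIGS table, the digest of a made-up build string and the function names are all hypothetical. As noted above, a scheme like this does not stop a determined programmer from relaying a genuine proof.

```python
import hashlib
import hmac
import secrets

# Hypothetical table: (version, operating system) -> configuration digest.
KNOWN_CONFIGS = {
    ("1.0", "linux"): hashlib.sha256(b"intermix-1.0-linux-build").digest(),
}


def make_challenge():
    """Fresh random challenge, so an old proof cannot simply be replayed."""
    return secrets.token_bytes(16)


def prove(config_digest, challenge):
    """What the connecting hub sends back: a keyed hash of the challenge."""
    return hmac.new(config_digest, challenge, hashlib.sha256).digest()


def verify(version, os_name, challenge, proof):
    """Check the proof against the known configuration for that build."""
    expected = KNOWN_CONFIGS.get((version, os_name))
    if expected is None:
        return False
    return hmac.compare_digest(prove(expected, challenge), proof)
```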
Ratings will be accompanied by a level of confidence, depending on whether the rater was identified by simple trust, by email and password or by certificate. Furthermore, ratings will be sent up the hierarchy by multiple paths for extra security against interception, and a sample of ratings received by each higher level will be rechecked with the originating hub, so no one hub can distort the ratings systematically as they pass through. Finally, for important items and categories, an entirely separate system of verification akin to exit polling for elections will be needed -- to be added later.
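The confidence levels and the sample recheck can be sketched as follows. The numeric ordering of the levels, the tuple layout and the lookup_at_origin callback are assumptions made for illustration; in practice the lookup would be a network request to the originating hub.

```python
import random
from enum import IntEnum


class Confidence(IntEnum):
    """How the rater was identified, weakest to strongest."""
    SIMPLE_TRUST = 1
    EMAIL_PASSWORD = 2
    CERTIFICATE = 3


def recheck_sample(received, lookup_at_origin, sample_size, rng=random):
    """Return sampled ratings that disagree with their originating hub.

    received: list of (origin_hub, user_id, item_id, rating, confidence)
    lookup_at_origin: callable(origin_hub, user_id, item_id) -> rating,
        or None if the originating hub has no such rating
    """
    sample = rng.sample(received, min(sample_size, len(received)))
    return [r for r in sample
            if lookup_at_origin(r[0], r[1], r[2]) != r[3]]
```

Any rating returned by recheck_sample was altered or invented somewhere along the path, which points to the intermediate hubs that handled it.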