I know I am pushing the deadline here; some things came up this weekend that I really wish hadn't.
Anyway, I was hoping for some last input on my proposal before I submit it. What follows are the Abstract and Detailed portions of the application (not the entire application).
Abstract
========
I propose creating a series of plugins for Coppermine that would add support
for distributed storage. This would increase the capacity of each Coppermine
installation and reduce resource use on individual servers. This can be useful in
a number of cases, such as shared and VPS hosting where resources are limited and,
under load, the server might be slow to serve up both the webpage and the images.
This system would have several methods of implementing distributed storage,
selectable in basic to advanced configurations, all derived from existing
distributed methods.
1) Separate server(s), best suited to Amazon S3. If more than one, a server is picked at random.
2) A round-robin service, each server going in turns.
3) A load service, similar to round-robin, but factors in the server's load.
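To make the three methods concrete, here is a rough sketch of how each might pick a server. It is written in Python purely for illustration (Coppermine itself is PHP), the server names and load figures are made up, and "lowest reported load wins" is only my assumption of how the load service could work.

```python
import itertools
import random


def pick_random(servers):
    """Method 1 with more than one server: pick any server at random."""
    return random.choice(servers)


def make_round_robin(servers):
    """Method 2: return a picker that hands out servers in turn."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)


def pick_least_loaded(servers, load_of):
    """Method 3 (assumed policy): pick the server reporting the lowest load."""
    return min(servers, key=load_of)


servers = ["img1", "img2", "img3"]                # hypothetical server names
loads = {"img1": 0.9, "img2": 0.2, "img3": 0.5}   # made-up load figures

next_server = make_round_robin(servers)
print([next_server() for _ in range(4)])       # ['img1', 'img2', 'img3', 'img1']
print(pick_least_loaded(servers, loads.get))   # img2
```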
Each server would be configurable to host only thumbnails, only pictures, or a combination.
There would be, essentially, two separate server clusters, thumbnail and pictures,
with combination belonging to both.
In addition, servers could be set up in several different ways: Unique, where a
server holds the only copy of its files (this precludes round-robin and the load
service); Mirrored, where each server in a server group replicates the others; and
Distributed+Mirrored, similar to RAID-5, where some servers have unique files
in their group, and another group replicates them.
Finally, there would be support for rule-based plugins. These would be plugins that
have specific mirroring rules, such as a certain user's photos always being placed
on a specific server. They would be implemented as a hook which passes the plugin
information about the picture. The plugin returns a specific server, or nothing
if the rule does not apply, in which case the system uses the normal settings.
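The rule hook described above could look roughly like this. Again this is only a Python sketch, not Coppermine code: the function names, the picture fields, and the "alice" rule are all hypothetical.

```python
def owner_rule(picture):
    """Hypothetical rule: one user's photos always go to one server."""
    if picture.get("owner") == "alice":
        return "img-alice"
    return None  # rule does not apply


def choose_server(picture, rules, default_picker):
    """Ask each rule in turn; fall back to the normal settings if none applies."""
    for rule in rules:
        server = rule(picture)
        if server is not None:
            return server
    return default_picker(picture)


def default_picker(picture):
    """Stand-in for the normal distribution mechanism."""
    return "img1"


print(choose_server({"owner": "alice"}, [owner_rule], default_picker))  # img-alice
print(choose_server({"owner": "bob"}, [owner_rule], default_picker))    # img1
```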
Detailed
========
To distinguish a concept, I will refer to image server types (Amazon S3, FTP, etc)
that are part of the distributed system as 'providers'. The actual server entities
will still be referred to as 'servers'.
The primary plugin would be the one responsible for the entire distributed system.
It would handle the necessary hooks, such as 'picture_url', to integrate itself into
Coppermine. It would handle management of the different servers and the distribution
mechanism.
Primary functions:
1) Provide a mechanism for providers to register themselves.
2) Offer a clean interface for adding new servers. This would be deferred to the
responsible plugin (as configuration requirements differ), but would be presented
through this plugin's configuration, to keep one consistent place for adding servers.
3) Allow the user to push files from one server to another.
4) Allow the user to delete servers.
5) Store deferred operations. These are actions that could not be completed on a
remote server for some reason (disconnection, no write access, etc). The plugin
would try again later. In the meantime, the picture(s) would be hosted locally.
6) A non-trivial error system which would notify the admin as needed. For example,
no write access would be reported immediately, whereas a disconnection would be
reported only after a period of time.
7) Utilize the storage/rule plugin's interface to add/delete files.
8) Handle the distribution mechanisms and corresponding URL modifications.
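Functions 5 and 6 are probably the least obvious, so here is a minimal sketch of how a deferred-operation queue and the two kinds of error reporting might fit together. It is Python for illustration only; the class names, the grace period, and the split into "permanent" and "temporary" errors are my assumptions, not a settled design.

```python
class PermanentError(Exception):
    """E.g. no write access: reported to the admin immediately."""


class TemporaryError(Exception):
    """E.g. disconnection: reported only if it persists."""


class DeferredQueue:
    def __init__(self, notify, grace_seconds=3600):
        self.notify = notify                # callable that alerts the admin
        self.grace_seconds = grace_seconds  # assumed delay before reporting
        self.pending = []                   # (first_failed_at, operation) pairs

    def attempt(self, operation, now, first_failed_at=None):
        """Run operation(); on failure, queue it to be tried again later."""
        started = now if first_failed_at is None else first_failed_at
        try:
            operation()
            return True
        except PermanentError as err:
            self.notify(f"immediate: {err}")
        except TemporaryError as err:
            if now - started >= self.grace_seconds:
                self.notify(f"still failing: {err}")
        # Until a retry succeeds, the picture would keep being served locally.
        self.pending.append((started, operation))
        return False

    def retry_all(self, now):
        """Retry everything queued so far, keeping each original failure time."""
        queued, self.pending = self.pending, []
        for first_failed_at, operation in queued:
            self.attempt(operation, now, first_failed_at)
```

A real version would also need to persist the queue (most likely in the database) and avoid sending the same notice on every retry; both are left out here.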
The setup for handling servers would be three clusters: Thumbnail, Picture, and Both,
with their function corresponding to their name. Inside each cluster are three
sub-groups: Mirrored, Distributed, and Mirrored/Distributed. Mirrored/Distributed
has multiple sub-groups of its own, which indicate where the distributed portion
ends, with each sub-group mirroring the others. Finally, there would be a category
made up of servers that are not used by the normal distribution mechanisms and
rely solely on the rule-based plugins to assign images to them.
The storage plugins are responsible for connecting to, putting, and removing
files from their storage provider. When installed they register themselves with
the primary plugin. They are responsible for providing the interface to add a new
server, though their actual configuration pages should not have this functionality.
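As a rough idea of the contract between the primary plugin and a storage plugin, consider the sketch below. It is in Python for illustration, every name in it is hypothetical, and the in-memory provider is just a stand-in for a real one so that the example runs on its own.

```python
from abc import ABC, abstractmethod


class StorageProvider(ABC):
    """What the primary plugin might expect from every storage plugin."""

    name = ""  # e.g. "s3", "ftp", "managed_http"

    @abstractmethod
    def put(self, server, path, data):
        """Store data under path on the given server."""

    @abstractmethod
    def remove(self, server, path):
        """Delete path from the given server."""


class ProviderRegistry:
    """Kept by the primary plugin; storage plugins register on install."""

    def __init__(self):
        self._providers = {}

    def register(self, provider):
        if provider.name in self._providers:
            raise ValueError(f"provider {provider.name!r} already registered")
        self._providers[provider.name] = provider

    def get(self, name):
        return self._providers[name]


class InMemoryProvider(StorageProvider):
    """Stand-in provider that keeps files in a dict instead of a remote host."""

    name = "memory"

    def __init__(self):
        self.files = {}

    def put(self, server, path, data):
        self.files[(server, path)] = data

    def remove(self, server, path):
        self.files.pop((server, path), None)


registry = ProviderRegistry()
registry.register(InMemoryProvider())
provider = registry.get("memory")
provider.put("img1", "albums/1/thumb_cat.jpg", b"fake image bytes")
```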
The storage plugins would be, as provided by this project, Amazon S3, FTP, and
Managed HTTP. Managed HTTP would be a PHP script installed on the remote server,
set up with a unique security key which would be required to communicate with it,
and would allow the primary plugin to add/delete files on the remote machine.
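I have not settled on how the security key would be checked. One option, sketched here in Python for illustration, is to sign each request with an HMAC of the action, path, and file contents, so the key itself never travels with the request. The message layout and function names are assumptions on my part.

```python
import hashlib
import hmac


def sign_request(secret_key, action, path, body=b""):
    """Signature the primary plugin would send along with a request."""
    message = action.encode() + b"\n" + path.encode() + b"\n" + body
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify_request(secret_key, action, path, body, signature):
    """Check the remote script would run before adding or deleting a file."""
    expected = sign_request(secret_key, action, path, body)
    return hmac.compare_digest(expected, signature)


key = b"example-key-not-for-real-use"   # placeholder key
sig = sign_request(key, "add", "albums/1/cat.jpg", b"fake image bytes")
print(verify_request(key, "add", "albums/1/cat.jpg", b"fake image bytes", sig))  # True
print(verify_request(key, "delete", "albums/1/cat.jpg", b"", sig))               # False
```

This does not stop a captured request from being replayed; a timestamp or nonce in the signed message would be needed for that.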
The rule plugins are responsible for handling special cases, or possibly even the
actual distribution mechanisms if the design allows for it. Essentially they
receive a notification each time a picture is added and are allowed to manipulate
its destination server group or set a single server if desired.
Finally, many users do not need this advanced configuration, so a simple
interface will be created and presented by default. It would basically
be a list of servers that could be added to or deleted from, with a radio control
for selecting Separate, Round-Robin, or Load, each with an explanation. All the
servers would perform in mirrored mode, for simplicity, though it might be
better to allow users to select Distributed-Mirrored as well, and present two lists
of mirrored server groups.
What I am looking for especially are bad ideas, ideas I am too terse on, etc. Thank you.