storage api
122
blog/_posts/2013-03-01-a-generic-storage-interface.md
Normal file
@@ -0,0 +1,122 @@
---
title: A Generic Storage Interface
layout: post
tags: asset php storage
description: Abstracting file storage, whether it's local or cloud.
---

Websites often have many different assets and files for their various areas: content management systems, photo
galleries, e-commerce product photos, etc. As a site grows, so do its storage demands and backup requirements, and it
typically becomes necessary to distribute those files across multiple servers or services.

One method for managing disparate file systems is to use custom PHP [stream wrappers][4] and configurable paths, but
some extensions don't yet support custom wrappers for file access. An alternative that I've been using is an object- and
service-oriented approach that keeps my application code independent of the storage configuration.


### Interface

At the core of my design is the asset storage interface, which looks something like:

{% highlight php %}
<?php interface StorageEngineInterface {

    // store a file and return a token that can be used to retrieve it
    public function store(SplFileInfo $file);

    // retrieve a locally-accessible SplFileInfo based on the token
    public function retrieve($token);

    // remove data from storage based on the token
    public function purge($token);

}
{% endhighlight %}

The storage engine is responsible for generating a reusable token that can be used for later retrieval. Generally, I
simply have it generate a UUID as the token; however, tokens could carry storage-specific meaning.
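
To make the interface concrete, here is a minimal sketch of what a local engine could look like. It is an illustration under stated assumptions, not necessarily how the engines mentioned in this post are implemented: the constructor argument, the `pathFor` helper, and the use of `random_bytes` (PHP 7+) to produce a random 128-bit hex token in place of a real UUID are all choices made for the example.

```php
<?php

interface StorageEngineInterface
{
    public function store(SplFileInfo $file);
    public function retrieve($token);
    public function purge($token);
}

// Hypothetical local engine: every asset is a file named after its token.
final class LocalStorageEngine implements StorageEngineInterface
{
    private $directory;

    public function __construct($directory)
    {
        if (!is_dir($directory) && !mkdir($directory, 0777, true)) {
            throw new RuntimeException("Cannot create $directory");
        }
        $this->directory = rtrim($directory, '/');
    }

    public function store(SplFileInfo $file)
    {
        // Assumption: a random 128-bit hex string stands in for a UUID.
        $token = bin2hex(random_bytes(16));
        if (!copy($file->getPathname(), $this->pathFor($token))) {
            throw new RuntimeException('Unable to store ' . $file->getPathname());
        }
        return $token;
    }

    public function retrieve($token)
    {
        $path = $this->pathFor($token);
        if (!is_file($path)) {
            throw new RuntimeException("Unknown token $token");
        }
        return new SplFileInfo($path);
    }

    public function purge($token)
    {
        $path = $this->pathFor($token);
        if (is_file($path)) {
            unlink($path);
        }
    }

    private function pathFor($token)
    {
        // Accept only tokens this engine could have generated, so a
        // crafted token can never point outside the storage directory.
        if (!is_string($token) || !preg_match('/^[0-9a-f]{32}$/', $token)) {
            throw new InvalidArgumentException('Malformed token');
        }
        return $this->directory . '/' . $token;
    }
}
```

Validating the token before building a path is a deliberate choice in this sketch: tokens usually end up in database rows and URLs, so treating them as untrusted input keeps the engine from reading or deleting arbitrary files.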


### Sample Storage Engines

I've used several base implementations:

 * `LocalStorageEngine` - the simplest storage using a local/NFS filesystem
 * `AWSS3StorageEngine` - using [AWS S3][1] for storage
 * `SftpStorageEngine` - using PHP's [ssh2][2] module to access files on servers via SFTP
 * `AtlassianConfluenceStorageEngine` - managing documents within [Confluence][3] wikis

Remote services like AWS S3 and SFTP add network latency to every operation, which can cause significant performance
issues. To help with that, I use a `CachedStorageEngine` implementation. It accepts two `StorageEngineInterface`
arguments: one as the upstream engine and one as the local cache. For example:

{% highlight php %}
<?php

new CachedStorageEngine(
    new AWSS3StorageEngine(new Aws\S3\S3Client(...), 'bucket.example.com', 'my-prefix'),
    new LocalStorageEngine('/tmp/s3-bucket.example.com-cache')
);
{% endhighlight %}

And since `CachedStorageEngine` is just another implementation of `StorageEngineInterface`, it can be used
interchangeably within the application, with performance being the only difference.
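
One way such a caching decorator could be written is sketched below. This is a guess at the shape, not the actual `CachedStorageEngine`. Because `store()` lets each engine pick its own token, the sketch keeps a map from upstream tokens to cache tokens. Holding that map in memory (the `$cacheTokens` property, a name invented here) is an assumption that keeps the example short; a real implementation would more likely persist it or use a cache engine that accepts caller-chosen tokens.

```php
<?php

interface StorageEngineInterface
{
    public function store(SplFileInfo $file);
    public function retrieve($token);
    public function purge($token);
}

// Hypothetical read-through cache in front of a slower upstream engine.
final class CachedStorageEngine implements StorageEngineInterface
{
    private $upstream;
    private $cache;

    // Assumption: upstream token => cache token, kept only for this process.
    private $cacheTokens = array();

    public function __construct(StorageEngineInterface $upstream, StorageEngineInterface $cache)
    {
        $this->upstream = $upstream;
        $this->cache = $cache;
    }

    public function store(SplFileInfo $file)
    {
        // The upstream token is the one handed back to the application.
        $token = $this->upstream->store($file);
        $this->cacheTokens[$token] = $this->cache->store($file);
        return $token;
    }

    public function retrieve($token)
    {
        if (!isset($this->cacheTokens[$token])) {
            // Cache miss: fetch from upstream once and keep a local copy.
            $this->cacheTokens[$token] = $this->cache->store(
                $this->upstream->retrieve($token)
            );
        }
        return $this->cache->retrieve($this->cacheTokens[$token]);
    }

    public function purge($token)
    {
        if (isset($this->cacheTokens[$token])) {
            $this->cache->purge($this->cacheTokens[$token]);
            unset($this->cacheTokens[$token]);
        }
        $this->upstream->purge($token);
    }
}
```

With this shape, repeated `retrieve` calls for the same token reach the upstream engine only once per process; every later call is served from the local cache engine.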


### Application Usage

Using dependency injection, each of the storage backends becomes an independent service, configured depending on the
application requirements. The application then has no storage-specific calls like `copy`, `file_get_contents`, `fopen`,
etc., and the code looks something like:

{% highlight php %}
<?php

// storage service for photos
$storage = $dic->get('photo_storage');

// save a new photo
$photo = new PhotoRecord();
$photo->setAssetToken(
    $storage->store($request->files->get('upload'))
);

// use the photo
$image = (new Imagine\Gd\Imagine())->open(
    $storage->retrieve($photo->getAssetToken())
);

// delete the photo
$storage->purge($photo->getAssetToken());
$photo->delete();
{% endhighlight %}

Since `retrieve` will always return a [`SplFileInfo`][5] instance, it can be referenced and handled like a local file
(as demonstrated by the `open` call in the example).


### Complicating Things

The asset storage interface itself is fairly primitive, but it allows for some more complex configurations:

 * dependency injection makes it easy to switch storage engines, since application code doesn't need to change
 * complex storage rules can be combined with meaningful tokens to, for example, store very large files on different
   disks and use a token prefix to identify that class of file
 * a fallback storage class can go through a chain of storages, searching until it's able to store or retrieve a token
 * operations can be deferred internally via a queue manager (e.g. instead of storing files immediately to S3 and
   waiting for the upload, write them locally and create a job to upload them in the background)
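
As an illustration of the fallback idea from the list above, a chain engine might look roughly like the sketch below. The class name `FallbackStorageEngine` and its constructor are hypothetical, and the sketch assumes that engines signal failure by throwing an exception, which the interface itself does not specify.

```php
<?php

interface StorageEngineInterface
{
    public function store(SplFileInfo $file);
    public function retrieve($token);
    public function purge($token);
}

// Hypothetical chain: tries each engine in order until one succeeds.
final class FallbackStorageEngine implements StorageEngineInterface
{
    private $engines;

    public function __construct(array $engines)
    {
        if (!$engines) {
            throw new InvalidArgumentException('At least one engine is required');
        }
        $this->engines = $engines;
    }

    public function store(SplFileInfo $file)
    {
        $last = null;
        foreach ($this->engines as $engine) {
            try {
                return $engine->store($file);
            } catch (Exception $e) {
                $last = $e; // remember the failure and try the next engine
            }
        }
        throw new RuntimeException('No storage engine accepted the file', 0, $last);
    }

    public function retrieve($token)
    {
        $last = null;
        foreach ($this->engines as $engine) {
            try {
                return $engine->retrieve($token);
            } catch (Exception $e) {
                $last = $e;
            }
        }
        throw new RuntimeException("No storage engine holds token $token", 0, $last);
    }

    public function purge($token)
    {
        // The owning engine is unknown, so ask every engine to drop the token.
        foreach ($this->engines as $engine) {
            try {
                $engine->purge($token);
            } catch (Exception $e) {
                // Assumption: a failure here means this engine never held the token.
            }
        }
    }
}
```

Because the chain is itself a `StorageEngineInterface`, it can be registered as a service in place of any single engine. A token-prefix scheme would be a natural refinement, letting `retrieve` go straight to the engine that issued the token instead of probing each one.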


### Summary

Abstracting storage logic out of my application code makes my life much easier, both as a developer and as a systems
administrator, when managing where files are located and relocating them as necessary.


 [1]: http://aws.amazon.com/s3/
 [2]: http://www.php.net/manual/en/book.ssh2.php
 [3]: http://atlassian.com/software/confluence/overview/team-collaboration-software
 [4]: http://www.php.net/manual/en/class.streamwrapper.php
 [5]: http://us.php.net/manual/en/class.splfileinfo.php