I don't have much going on at www.swwomm.com; just a few static mockups and other documents (this blog is hosted by blogger.com). So I finally got around to transferring it from a paid host (lylix; no complaints other than that they have the gall to charge money for their services) to a free one: Google's App Engine (GAE).
I'm not sure why there isn't already a cookbook for transferring a static site to GAE, since it's pretty easy and painless — so here's my recipe.
Gotchas
But first, note that your lunch is not completely free — Google will 503 your ass if you exceed its (extremely generous) daily quotas for bandwidth and processing power.
Also, there are a few other GAE gotchas which apply to static content:
- Can't host a "naked" domain (i.e. example.com instead of www.example.com).
- Limit of 3000 files per app.
- No automatic directory listings.
- No automatic directory redirect (i.e. redirect from /foo to /foo/).
- No custom 404 page.
You can get around #3 and #4, however (as I describe below).
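If you're not sure how close you are to the file limit, a few lines of python will tell you. This is just a rough sketch: the function names are mine, and it assumes the limit simply counts every file under the directory you point it at (the project directory name is whatever you end up using).

```python
import os

MAX_FILES = 3000  # per-app file limit quoted in the list above


def count_files(root):
    """Count regular files under root, recursing into sub-directories."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def check_limit(root, limit=MAX_FILES):
    """Return (count, ok); ok is True when count is within the limit."""
    count = count_files(root)
    return count, count <= limit


if __name__ == "__main__":
    # "static" is the web-root name used later in this post
    count, ok = check_limit("static")
    print("%d files (%s)" % (count, "ok" if ok else "over the limit"))
```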
Setup
The first thing you need to do is get the GAE SDK. I'm using the version for python on linux.
With the SDK installed, you can start developing right away on your local dev machine. To deploy an app, of course, you need to sign up for a GAE account. They make you verify your account with an SMS message, so have your cell phone ready.
Hello World
Building a static app is really simple:
- Create a directory for your project (e.g. myapp).
- Create a sub-directory named static (or really, whatever you want to name it) inside the project directory. This will be the web root.
- Copy your static files into static.
- Add a boilerplate app.yml to the project directory.
This is the boilerplate app.yml:
application: myapp
version: 1
runtime: python
api_version: 1
default_expiration: "1d"

handlers:
# show index.html for directories
- url: (.*/)
  static_files: static\1index.html
  upload: static(.*)index.html

# all other files
- url: /
  static_dir: static
The first line (application: myapp) specifies the name of your app. This name doesn't matter for development on your local machine; but when you use the GAE dashboard to create a new app, you'll be prompted for a name. The name must be globally unique on GAE (i.e. it has to be a name no other GAE user has claimed for his or her app), and GAE uses it in the default url for your application (e.g. http://myapp.appspot.com/). Once you've created the name and set up the app in the GAE dashboard, go back and change it here in app.yml.
The default_expiration setting is the default HTTP cache lifetime for your static files; "1d" = one day, "4h" = four hours, etc. You can configure a separate expiration time for each url handler, but "1d" is probably good for most static content.
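To get a feel for what those strings mean in seconds, here's a small sketch. It assumes the format is space-separated number-plus-unit pairs with d, h, m and s as units; only "1d" and "4h" appear in this post, so treat the rest as my assumption, and the function name is made up.

```python
import re

# seconds per unit; "m" and "s" are assumed, only "d" and "h" appear in the post
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def expiration_seconds(value):
    """Convert a string such as "1d" or "4h 30m" into a number of seconds."""
    total = 0
    for part in value.split():
        match = re.fullmatch(r"(\d+)([dhms])", part)
        if match is None:
            raise ValueError("unrecognized expiration part: %r" % part)
        total += int(match.group(1)) * UNIT_SECONDS[match.group(2)]
    return total


print(expiration_seconds("1d"))  # 86400
print(expiration_seconds("4h"))  # 14400
```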
The first url in the handlers section captures all urls which end with a trailing slash. These are directories; with static content you usually either want to display the index.html file of the requested directory, or, if there's no index.html, just the bare directory listing. Unfortunately GAE doesn't support listing static files, so this handler always just tries to display the index.html file in the requested directory.
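If the \1 in static_files looks cryptic: it's a regex back-reference to the group captured by the url pattern. Here's a sketch of that mapping in plain python. It assumes the url pattern has to match the whole request path, and index_file_for is just a name I made up for illustration; it is not part of the GAE SDK.

```python
import re

URL_PATTERN = r"(.*/)"                  # the "url" value from app.yml
FILE_TEMPLATE = r"static\1index.html"   # the "static_files" value from app.yml


def index_file_for(url_path):
    """Return the file the first handler would serve, or None if it doesn't match."""
    match = re.fullmatch(URL_PATTERN, url_path)
    if match is None:
        return None
    # substitute the captured directory path for \1
    return match.expand(FILE_TEMPLATE)


print(index_file_for("/"))          # static/index.html
print(index_file_for("/foo/bar/"))  # static/foo/bar/index.html
print(index_file_for("/foo"))       # None
```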
The second url in the handlers section captures all other urls, and simply serves the requested static file.
Test It Out
At this point, you've already built a fully-functioning GAE app. You can test it out by running the test appserver (where $APPENGINE_HOME is the path to the GAE SDK on your local box, and $MYAPP_HOME is the path to your project directory):
$ $APPENGINE_HOME/dev_appserver.py $MYAPP_HOME
This boots up a test GAE appserver on port 8080. Enter http://localhost:8080/ into your browser address bar; it should load up the index.html from the root of your static directory.
Directory Woeage
But as I pointed out above, if you navigate to a sub-directory and omit the trailing slash (e.g. http://localhost:8080/foo), you won't see the index.html for that sub-directory; you'll just get a blank page (and Google's generic 404 page when deployed to GAE).
This we can fix, however, by implementing a trivial RequestHandler in python. Create a file called directories.py (or whatever the heck you want to call it) in your project directory, and dump this into it:
import cgi
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class RedirectToDirectory(webapp.RequestHandler):
    def get(self):
        self.redirect(self.request.path + "/", permanent=True)

application = webapp.WSGIApplication([('/.*', RedirectToDirectory)], debug=True)

def main():
    run_wsgi_app(application)

if __name__ == "__main__":
    main()
This creates a RequestHandler called RedirectToDirectory; this class simply appends a trailing slash to the current url and redirects. The other non-boilerplate line is just below, where RedirectToDirectory is registered to be used for all urls (/.*) handled by this directories.py script.
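If you want to poke at the redirect logic without the GAE SDK around, the same idea fits in a bare WSGI function. This is only a sketch, not the webapp API used above: redirect_to_directory is a hypothetical name, and carrying the query string over is my own addition (the handler above only uses the path).

```python
def redirect_to_directory(environ, start_response):
    """WSGI app: answer every request with a 301 to the same path plus "/"."""
    location = environ.get("PATH_INFO", "") + "/"
    query = environ.get("QUERY_STRING", "")
    if query:
        # assumption: keep the query string on the redirected url
        location += "?" + query
    headers = [("Location", location), ("Content-Length", "0")]
    start_response("301 Moved Permanently", headers)
    return [b""]
```

You can serve it locally with wsgiref.simple_server.make_server from the standard library, or just call it with a fake environ dict to see which Location header comes back.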
Next, drop a url entry for the directories.py script into the handlers section of your app.yml (and yes, the order of the url entries is important):
application: myapp
version: 1
runtime: python
api_version: 1
default_expiration: "1d"

handlers:
# show index.html for directories
- url: (.*/)
  static_files: static\1index.html
  upload: static(.*)index.html

# redirect to directories (/foo to /foo/)
- url: .*/[^.]+
  script: directories.py

# all other files
- url: /
  static_dir: static
This url entry will capture all the urls which don't end in a trailing slash and don't have a file extension. It will handle requests for these urls by sending them to your directories.py script, which in turn will redirect them back to your app, but with a trailing slash this time.
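To see why the order matters, here's a toy model of the three handlers in plain python. It's a sketch under a couple of assumptions: that each url pattern must match the whole path, that the first matching entry wins, and that the static_dir entry for / behaves like the prefix pattern /.* (the handler labels are made up).

```python
import re

# (label, pattern) pairs in the same order as the handlers in app.yml
HANDLERS = [
    ("index", re.compile(r"(.*/)")),         # show index.html for directories
    ("redirect", re.compile(r".*/[^.]+")),   # directories.py
    ("static", re.compile(r"/.*")),          # static_dir, modeled as a prefix match
]


def route(path):
    """Return the label of the first handler whose pattern matches the whole path."""
    for label, pattern in HANDLERS:
        if pattern.fullmatch(path):
            return label
    return None


for path in ("/", "/foo/", "/foo", "/foo/bar.html", "/README"):
    print(path, "->", route(path))
```

In this model a static file without an extension (like /README) lands on the redirect handler rather than the static one, so it's probably worth giving every static file an extension.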
Directory Listings
But if your directory doesn't have an index.html file, you're still SOL: unlike a normal webserver, you can't configure GAE to just display the directory listing. And you can't just implement a handler to do this, because GAE apps don't have access to their static filesystem.
One possible workaround to this would be to store all your files as entries in the Big Table DB, and then serve them (and the directory listings) dynamically. This would require writing a bunch of code, however, when all you really want is just to serve some freaking static files already.
So I compromised and wrote a simple perl script which automatically creates static index.html files for a configurable list of static directories. To make it work, create a directories sub-directory in your project, and add to it four files:
make.pl
#!/usr/bin/perl -w

open LIST, "directories/list.txt" or die $!;
while (<LIST>) {
    s/\n//; # strip newline
    $title = $dir = $_;
    $title =~ s!.*/!!; # strip path from directory name

    # open directory index.html for writing
    open INDEX, ">$dir/index.html" or die $!;

    # dump header template into index.html
    # replacing %title% with directory name
    open HEAD, "directories/head.html" or die $!;
    while (<HEAD>) {
        s/%title%/$title/g;
        print INDEX;
    }
    close HEAD;

    # dump directory listing into index.html
    # (GNU ls: force "YYYY-MM-DD HH:MM" dates so every file line
    # splits into exactly the 8 fields parsed below)
    open DIR, "ls -lh --time-style=long-iso $dir |" or die $!;
    while (<DIR>) {
        s/\n//; # strip newline

        # parse fields listed for each file
        @fields = split / +/, $_, 8;

        # skip lines that aren't file listings
        # and also skip this index.html
        next if ($#fields < 7 || $fields[7] eq 'index.html');

        # print a table row for this file
        print INDEX
            '<tr><td class="name"><a href="' . $fields[7] . '">' . $fields[7] .
            '</a></td><td class="size">' . format_size($fields[4], $fields[0]) .
            '</td><td class="modified">' . $fields[5] . ' ' . $fields[6] .
            '</td></tr>' . "\n";
    }
    close DIR;
    close INDEX;

    # dump footer into index.html
    `cat directories/foot.html >> $dir/index.html`;
}
close LIST;

# format file size a little nicer than ls
sub format_size {
    my($size, $perm) = @_;

    # skip for subdirectories
    return '-' if $perm =~ /^d/;

    # add bytes abbr
    $size .= 'B' if $size =~ /\d$/;
    return $size;
}
This is the perl script. When you create it, make sure you make it executable:
$ chmod +x directories/make.pl
When you run it (after you've created the other three files), make sure you run it from your project directory:
$ cd myapp
$ directories/make.pl
list.txt
static/foo
static/foo/bar
static/baz
This is the list of directories for which to auto-generate index.html files. Please note that the make.pl script will overwrite the existing index.html files in these directories. So make sure that you list only directories which don't have a custom index.html. (Plus this is another good reason to be using version control on your project.)
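One way to avoid clobbering a hand-written index.html is to build list.txt from the directories that don't have one yet. Here's a rough sketch (dirs_without_index is a made-up name). It's only meaningful before the first run of make.pl, since after that every listed directory has a generated index.html.

```python
import os


def dirs_without_index(root):
    """Return the directories under root (root included) that lack an index.html."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "index.html" not in filenames:
            # use forward slashes, matching the paths in list.txt
            missing.append(dirpath.replace(os.sep, "/"))
    return sorted(missing)


if __name__ == "__main__":
    # run from the project directory; prints candidate lines for list.txt
    for path in dirs_without_index("static"):
        print(path)
```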
head.html
<html>
<head>
<title>%title% - My App</title>
</head>
<body>
<h1>%title%</h1>
<table>
<thead>
<tr><th>Name</th><th>Size</th><th>Modified</th></tr>
</thead>
<tbody>
This is the first part of the template for the auto-generated index.html files. The perl script will replace the instances of %title% in this file with the directory name.
foot.html
</tbody>
</table>
</body>
</html>
This is the second part of the template.
So add to directories/list.txt the paths of the directories you want to have listings, customize directories/head.html and directories/foot.html to your liking, and run directories/make.pl. This will create your directory-listing index.html files.
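If you'd rather not depend on perl and the output format of ls, the same idea can be sketched in python with os.scandir. This is not a drop-in replacement for make.pl, just an illustration: the function names are mine, the size formatting only approximates ls -lh, and it takes the head and foot templates as strings instead of reading the files itself.

```python
import os
import time
from html import escape
from urllib.parse import quote


def format_size(num_bytes):
    """Render a byte count roughly the way ls -lh does (e.g. 512B, 1.5K)."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G"):
        if size < 1024 or unit == "G":
            break
        size /= 1024
    return "%d%s" % (size, unit) if unit == "B" else "%.1f%s" % (size, unit)


def listing_rows(directory):
    """Build one table row per entry in directory, skipping index.html itself."""
    rows = []
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if entry.name == "index.html":
            continue
        info = entry.stat()
        size = "-" if entry.is_dir() else format_size(info.st_size)
        modified = time.strftime("%Y-%m-%d %H:%M", time.localtime(info.st_mtime))
        rows.append(
            '<tr><td class="name"><a href="%s">%s</a></td>'
            '<td class="size">%s</td><td class="modified">%s</td></tr>'
            % (escape(quote(entry.name)), escape(entry.name), size, modified)
        )
    return rows


def write_index(directory, head, foot):
    """Write directory/index.html from the head template, the rows, and the foot."""
    title = os.path.basename(os.path.normpath(directory))
    page = head.replace("%title%", escape(title))
    page += "\n".join(listing_rows(directory)) + "\n" + foot
    with open(os.path.join(directory, "index.html"), "w") as out:
        out.write(page)
```

Like make.pl, write_index overwrites any existing index.html in the directory, so the same warning applies.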
Deploy
And now you're ready to deploy. Make sure you've created the app in the GAE dashboard and updated your app.yml with its appspot name; then upload it:
$ $APPENGINE_HOME/appcfg.py update $MYAPP_HOME
You'll have to enter your GAE credentials, and wait for a minute or two while your app boots up on GAE; when your app is deployed and ready to use, the script will let you know. If you screw up your credentials, delete your ~/.appcfg_cookies file (which caches them), and try again.
If you've also signed up for Google Apps, you can use your (separate) Google Apps dashboard to configure a sub-domain of a domain you own to point to your deployed app (e.g. http://www.example.com/). Otherwise, you have to access it via its sub-domain of the appspot domain (i.e. http://myapp.appspot.com/).
Bon appétit!