Composr Tutorial: Installing on Google App Engine

Written by Chris Graham (ocProducts)
This functionality is currently not working. It works on the development server, but a GAE crash has prevented us making it work live. Google were investigating the issue at the time of writing. Many of the improvements made for this functionality are of general advantage on other cloud hosts (e.g. the task queue improves general scalability). We still love App Engine, so we can put the functionality through to a new testing cycle based on customer demand.



Composr can run on Google App Engine (GAE). GAE is a cloud hosting service designed for high scalability. Unlike almost all other cloud services, it does not require significant preparation and ongoing maintenance in order to scale (you do not need to set up Virtual Machine images, or manually start/stop server instances).

Most people don't realise how much work the normal "cloud" services are to properly utilise, as the marketing can be really very misleading, and successful customers are development teams rather than webmasters.

The easy Composr scalability with App Engine is possible because it is a locked-down environment. Essentially the Composr developers have done up-front work to ensure Composr meets certain restraints that allow Google to automatically copy things across different servers without it being noticeable.

All the above said, hosting a large scale cloud deployment is still not simple. Some restraints will affect you, and you will need to be comfortable working on the command line, and staging your changes on a local development machine. We'll save you having to have a programming/IT team to get Composr running, or the need to fork Composr, but it still requires some level of expertise.
However, if you're at the level of needing this kind of scaling rather than typical hosting or a dedicated server (or virtual machine, such as on Amazon Cloud via Bitnami) then you should be successful enough to invest in some expert set up.

This tutorial explains how the Composr GAE support works.


Limitations

The following are known limitations with GAE:
  • Individual non-bundled Composr addons, or third party frameworks, may or may not work. The Composr development tools can scan for most issues but generally most programmers will not be developing with App Engine's constraints in mind. Most professional developers will be happy to bring non-bundled third-party addons up to scratch. If possible/ideally we might try and contribute improvements back into "upstream" to the actual project, but sometimes we may feel it better to just fork our own version of the project.
  • The comments above extend to other applications you may wish to integrate. You need to be comfortable of their compatibility, or that you will get them updated, or that you will run them on another server.
  • LDAP will not work (LDAP would typically be used on a local corporate network anyway).
  • There is a 32MB upload limit, which means uploading large files is not realistic. This could be worked around, with investment.
  • FFMPEG will not run. You would need to use a third-party transcoding service, or syndicate to YouTube (note: that addon is not currently tested on GAE). Most users, however, will be fine with Composr's support for integrating videos that were separately uploaded to Youtube.
  • Outgoing web requests cannot exceed 10MB, which limits some external transcoding and syndication options.
  • There is a 32MB response size limit. This should not generally be a problem for most users, but worth noting.
  • You will need to maintain a local development site, and won't be able to do things such as themeing or addon management directly on the live site.
  • You won't be able to have the full set of Composr addons installed, due to size constraints. This should generally not be a problem, and we provide advice on it in this tutorial.
  • While we are making sure Composr runs well on GAE (i.e. stably, without special patches), initial set up, and some maintenance tasks, are inherently very complex. If you are not a programmer, you will need one to perform some things for you.
  • You may not use a third-party forum with Composr: you must use Conversr. You must also not be using a shared forum database.
  • If you have unusually large amounts of data (as a rough example, over 50,000 catalogue entries), it is possible some actions could time-out, such as deleting a catalogue with those entries in. Such high-volume use-cases may need custom solutions for individual sites (requiring custom programming), such as shifting certain actions into background maintenance tasks. This kind of problem would affect any hosting, but particularly GAE has hard-limits on how much a web request can do (which is a good thing).
  • The feature to e-mail in support tickets is not supported, although it could be added via an alternative mechanism.

The following Composr bundled addons will not work:
  • (All the addons mentioned for uninstallation in the Setup Wizard, further down in this tutorial, under "Setup Wizard: Step 4")

Costs

GAE is charged based on usage. Google provide documents that explain what is chargeable, and at what price.

It is hard to compare GAE costs to those of shared hosting. Typically a shared host will assume you won't max out your resource allocation fully (and if you tried, they'd ban you, often regardless of their advertised terms). Therefore you would need to compare averaged typical resource usage on shared hosting, to the costs on GAE.
Anecdotally, we have heard the costs are similar, but we have not tested this. It will of course vary greatly on a case-by-case basis.

In a sense it is a mute comparison, because shared hosting cannot scale and hence you have no real choice other than to use a cloud solution (where pricing is also complex) or your own/rented hardware.

From a cost-origin point of view, shared hosting pricing benefits from its relative simplicity (no high R&D costs, relatively unskilled support personnel) and the relatively low usage the average customer has (hence subsidisation is happening) – while cloud hosting benefits from its low-service nature and economies of scale. Both the shared hosting and cloud hosting markets are extremely competitive.

You should obviously keep track of your costs. And, obviously the Composr developers cannot put any guarantee on how much things might cost. Be advised: even if there is a bug that uses run-away resources, the Composr developers cannot be held liable for those costs. Google have options to put a ceiling on resource usage, blocking requests that go above this.

Basic set up

You will need to have a local development machine, in addition to your live application. Your initial site setup will be done locally, and then you will deploy it live. This "development to live" process will be your core workflow.

You will need to install the GAE PHP SDK on your local development machine. You must do this before Composr installation.

You will also need to install MySQL, as this doesn't come with the SDK. MySQL is equivalent to Google Cloud SQL (used by the live application).

Composr installation

(the exact tool/process may depend on what platform you are using – i.e. Windows/Linux/Mac. These instructions have been written on a Mac. If you have Windows you may struggle with the command line a bit. Cygwin is great.)

Image

The application launcher, after the application has been created.

I have created mine under my local Apache server root (where I already have a git checkout), but that is only for convenience.

The application launcher, after the application has been created.<br /><br />I have created mine under my local Apache server root (where I already have a git checkout), but that is only for convenience.

(Click to enlarge)

  1. Extract the Composr manual installation package to a directory on your development machine (not the quick installer!).
  2. Start up the GAE launcher.
  3. Add Composr in the launcher: simply add the directory you extracted Composr to, and it should automatically configure it from that.
  4. Select your application row in the list of applications, and click "Run".
  5. In your web browser, go to http://localhost:8080/install.php (your first application will be running on port 8080 by default, which also means you can also have Apache installed on the machine if you so wish.)
    Ignore any "There is no PHP GD extension on the server" error. The PHP SDK (at the time of writing does not have GD), but the live environment does.
  6. The installer will ask for various GAE-specific details. It will provide links that take you to where you can set these up. That said, things aren't always straight-forward, so you must read the "Live setup instructions / notes" below to tell you what you actually need to configure.
    A top-level view of what you'll be setting up:
    • App Engine application ID – The identifier for the live GAE application; Google will automatically also use this as the Google Cloud project name
    • Google Cloud Storage bucket name – The identifier for the live Google Cloud Storage (this is basically the live filesystem, which is stored separately to the application itself)
    • Database hostname – This includes the identifier for the Google Cloud SQL database's Google Cloud Platform project and the identifier for the Google Cloud SQL instance
    • Database name – The name of the Google Cloud SQL database, which you must create manually
    The default settings assume all the different identifier's are the same, but they don't have to be. These are separate services that are linked together for your live application to run. That said, this tutorial assumes you are using the same identifiers because it simplifies things considerably.

Live setup instructions / notes

Image

Creating a Google App Engine application.

Creating a Google App Engine application.

(Click to enlarge)

Image

Enabling necessary services.

Enabling necessary services.

(Click to enlarge)

When adding your GAE application, ignore any settings relating to authentication for users. Composr uses its own authentication, not Google authentication settings. That said, an administrator of the GAE app will automatically be logged in as the first admin on the Composr site (unless you choose to explicitly log in to a different user from the login screen).

You want your datastore to be set for "High Replication". This is the default.

You cannot create Google Cloud Storage buckets, or a Google Cloud SQL instance, without signing up for billing first. They are not available on the free tier. You sign up for billing under the Google Cloud Console. Note that billing must also be separately set up for Google App Engine, which is not managed from the Google Cloud Console (to be honest, Google's tools here are a bit of a jumble).

Before you may set up Google Cloud Storage or Google Cloud SQL, you must specifically enable the services for them.
At the time of writing, the beta version of the Google Cloud Console (the "new" one) does not work properly. You cannot create a Google Cloud Storage bucket in it because there are no service management settings in the user interface. If you try to create the bucket you'll get an odd error about the project being disabled. Therefore make sure you do not enable the beta version.

To enable Google Cloud Storage and Google Cloud SQL, go to 'Services' in the left-hand menu in the Google Cloud Console (under your project) and select 'On' for the Google Cloud Storage and Google Cloud SQL services.

Once enabled, options to configure the services will be shown on the left-hand menu.

Image

Setting Google Cloud Storage permissions (2).

Setting Google Cloud Storage permissions (2).

(Click to enlarge)

Image

Setting Google Cloud Storage permissions (1).

Setting Google Cloud Storage permissions (1).

(Click to enlarge)

Image

Enabling Google Cloud Storage (2).

Enabling Google Cloud Storage (2).

(Click to enlarge)

Image

Enabling Google Cloud Storage (1).

Enabling Google Cloud Storage (1).

(Click to enlarge)

When you go to Google Cloud Storage, you will see a link to go to "online browser". From there you can set up a bucket.

Annoyingly it takes you to the beta version of Google Cloud Console, which you'll need to exit out of when you're done (did I mention it is a jumble?).

You need to set permissions to public access, as URLs will be served out of the storage. Do this by ticking (checking) the bucket and clicking the "Bucket permissions" button that will appear.

Next, we need to grant GAE permissions on the bucket:
  • Click on Permissions on the left hand menu
  • Click "Add member"
  • Use the Email <application>@appspot.gserviceaccount.com (obviously substitute <application> for the real name of the application)
  • Click Add

Image

Creating a Google Cloud SQL database, within the instance.

Creating a Google Cloud SQL database, within the instance.

(Click to enlarge)

Image

Creating a Google Cloud SQL instance.

Creating a Google Cloud SQL instance.

(Click to enlarge)

Next set up your Google Cloud SQL instance. You should leave the "Replication mode" as "Synchronous", for stability reasons. You have to authorise your GAE application when you add the Google Cloud SQL instance. The instance does not itself contain a database, so when added click the 'SQL Prompt' tab and enter an SQL command:

Code (SQL)

CREATE DATABASE <application>;
 
(where <application> is the name of your application, or a different database name if you prefer)

Finishing off installation

Finish off installation as instructed. The remaining steps are very straight-forward, and the same as any normal Composr installation. You'll be asked to delete install.php when you're done. This is necessary.

The installer will have copied a custom php.ini and various *.yaml files into your project directory. The app.yaml file will have been customised. A _config.php file will have been built that contains configuration for both the development and live sites.

Your MySQL database will have been populated, and your local storage directory (data_custom/modules/google_appengine) will be starting to fill up with cache files and custom data.

Structural configuration

Continue through to the Composr Setup Wizard. Go through it as normal, but you will need to make some particular choices.

Setup Wizard: Step 4

The following addons are not supported/useful on GAE and should be uninstalled:
  • Installer (serves no purpose on any Composr site, once install has finished)
  • Code Editor (not supported)
  • Config Editor (not supported)
  • Backup (not supported/relevant, use the GAE database backup tools)
  • Uninstaller (not supported)
  • LDAP (not supported)
  • Themewizard (themeing on a live site is ill-advised, especially on the scale of the theme-wizard)
  • Setupwizard (not supported)
  • Stats (not supported due to the resource-usage-pattern, use Google Analytics)
  • Import (not supported)

Note that the purpose of recommending uninstallation is actually primarily because App Engine has some limits. We push up against these limits and hence need to get rid of redundant code to stay under them.

Even after removing these addons, you will still need to clear a bit more out to make sure you have fewer than 1000 files in the themes/default/templates directory.
Remove one or two of the following addons (your choice which):
  • Calendar
  • Chat
  • eCommerce
  • Shopping
  • Galleries
  • Pointstore
Also remove four of the following:
  • Authors
  • Banners
  • Downloads
  • Newsletter
  • Polls
  • Quizzes
  • Tickets
  • wiki
(Of course, you may remove more if you wish, or other addons)

If you find it annoying that you cannot have every Composr addon in simultaneous use when using GAE, you might consider that good sites are streamlined; very few (if any) websites should contain every Composr feature ;) .

Staging configuration

You should only do certain tasks on your local development machine:
  • Themeing
  • Addon management (installing/uninstalling)
  • Debranding
  • Adding/renaming/deleting zones
These tasks need doing locally because you cannot manipulate the filesystem on the live application. Once it is live, the main-structure/code of the website is expected to not change outside a deployment. It's a good habit to maintain generally for a large-scale site, as it enforces discipline.

You may also wish to consider doing the following locally, for reasons of performance:
  • Edit Comcode pages
  • Language editing
(remote Composr custom files will go through Cloud Storage, which may have increased latency than if stored directly in the deployed application)

Things like menu editing, or adding content, would primarily be done on the live site directly. With investment we would/will continue to improve Composr's staging tools (see the Repository tutorial), but at this time we cannot necessarily confirm they will work with GAE or fit into a smooth staging to live publishing process. Our primary aim has been getting the challenge of basic support working so that we can then work on smoothing over daily usage as we find useful.
You could consider regularly copying the live Cloud Database back to your staging site and testing new content locally, before repeating your actions manually onto the live site. This isn't as bad as it sounds for most kinds of content, as the actual clicks are minimal compared to the time actually producing quality content.

Deployment

We will assume that you have already set up your live application/storage/database, as those details were put into the installer. This section covers how to copy your application and data, from local to live, in the following order:
  • Copying your storage
  • Copying your initial database
  • Copying your application

In given code samples replace <application> with the name of your application. We assume you chose the same name for all your identifiers.

Storage

You may want to copy your initial custom files to live. If you have no relevant custom files (which is possible), there is no need to. To copy your files you will need to either:
  1. use gsutil with the files in data_custom/modules/google_appengine (that is your custom filesystem for Cloud Storage, which Composr has kept separated from your application files).
  2. Alternatively, you can include your custom files as a part of the application. You do this by merging the data_custom/modules/google_appengine folder back down into your main Composr folder. Note though that this does not work for the uploads directory, as uploads are always considered custom and Composr will not search the application directory when locating them.

If you are going to use gsutil, this is the process…
  1. Download and install gsutil
  2. …make sure you have followed the "How to Set Up Credentials to Access Protected Data" part of the above instructions
  3. Open a command prompt in your project directory
  4. Type cd data_custom/modules/google_appengine
  5. Type gsutil cp -R * gs://test-cms

Now consider your plans for world domination while your files upload (or, read ahead ;) …).

Database

(in the below instructions obviously substitute <application> for the real name of the application)

To copy your initial database to live, you need to follow a rather complex procedure of exporting, copying to live Google Cloud Storage, then importing from live Google Cloud Storage…
  1. Download and install gsutil (if you didn't already do it for "Storage", above)
  2. …make sure you have followed the "How to Set Up Credentials to Access Protected Data" part of the above instructions
  3. Open a command prompt in your project directory
  4. Take an SQL dump of your local MySQL database. Typically you will do this with:

Code (Bash)

mysqldump --databases <application> -uroot > dump.sql
Note that I found I had to use "–databases" to force a USE <application>; command, which the import seems to need.
  1. Run this command:

Code (Bash)

gsutil cp dump.sql gs://<application>
  1. Sign into the Google Cloud Console, enter your project, and go to the Cloud SQL section
  2. Enter the instance into which to import data
  3. Click the Import button at the top
  4. In the Import Data dialog box, enter details as follows:
  • gs://<application>/dump.sql
  • <application>
10) Click OK to start the import

If you're not sure it worked (because it doesn't say!), check the logs table for errors. The SHOW TABLES; SQL command is also helpful.

Application

To upload your live application:
  1. Open a command prompt in your project directory
  2. Type appcfg.py --oauth2 update .
  3. Go through the authorisation process when prompted

Upload may take around 30 minutes.
Note that you should ignore any "Could not guess mimetype for …" errors.

To view the live project, go to http://<application>.appspot.com in your web browser.

(obviously substitute <application> for the real name of the application)

Developers

Developers may be interested in GAE's Push-to-deploy system.

Notes and considerations

Note that the CRON bridge scheduler will have been automatically set up.

An appropriate PHP configuration has also been provided for you (no .htaccess file setup needed).

Late structural changes

If you forget to do something before you deploy, or if new development has happened, you can always redeploy later.

Any redeployment

Fortunately the appcfg.py command is quite efficient, so redeployment does not take long.
Note that Composr caches in Google Cloud Storage may be out-dated. When you redeploy you should go to http://liveurl/upgrader.php and empty the caches.

Changes to the addon set

If you are going to uninstall/install an addon, you should generally go through this process:
  • Close the live site temporarily
  • Export the live Cloud Database and import it into your development machine
  • Perform the action on your development machine
  • Do a code deploy
  • Copy the Cloud database back to live
  • Re-open the live site
This process is admittedly cumbersome and time-consuming, but not wholly unreasonable: if you are working at a multi-server scale this kind of thing should be a part of your development process anyway.

That said, down-time is bad. It may not strictly be always necessary. You can uninstall modules and blocks on the live site, then deploy the code after doing a full uninstall locally. Generally this will work, but it requires a programmers insight in terms of knowing exactly what uninstall code is being triggered. You'd still have to close the live site for a time, to stop the modules/blocks automatically reinstalling.

The process of automatic module/block installation/upgrading is actually key to the deployment process. You should not generally have to think about manually making Cloud Database changes because you can put migration code into blocks/modules. This is essentially the same thing as the Composr developers do between Composr versions (at no point [generally] does anyone need to manually go into their database to change it).

Composr major upgrades

Doing a full Composr upgrade is a little bit more thorny than being a developer and deploying some code enhancements.

Note that we would generally not expect non-programmers to be doing this kind of thing. Therefore please consult outside help if you are not one.

You could do an upgrade two ways…

Fully local upgrade (simplest, longest)

Go through this process:
  1. Close the live site temporarily
  2. Export the live Cloud Database and import it into your development machine
  3. Perform the upgrade on your development machine
  4. Do a code deploy
  5. Copy the Cloud database back to live
  6. Re-open the live site

Split upgrade

Go through this process:
  1. Perform the upgrade on your development machine
  2. Close the live site temporarily
  3. Do a code deploy
  4. Go into the upgrader on live, and perform all the steps except the file transfer one (as you have just done that when you redeployed)
  5. Re-open the live site

Configuration considerations

Domain name

Initially you will be set up on an appspot.com subdomain. You may use your own domain. Composr is set up to automatically detect your base URL from the access URL, so you do not need to worry about any configuration within Composr. Make sure though that you stop using the appspot.com URL fully after switching to your custom domain name.

If using SSL with your own custom domain, Composr's default configuration in _config.php will assume that you want to serve uploaded data via that domain. To do that you need to setup storage.<domainname> as per Google's documentation on setting a CNAME record.
Of course you would also need to set up your own SSL certificate.

Error pages

To enable a custom page for 404 errors, in app.yaml uncomment (i.e. remove the #s):

Code (YAML)

#- url: .*
#  script: 404.htm
 

Code (YAML)

- url: .*
  script
: 404.htm
 
And, then create 404.htm by saving out the result of /index.php?page=404

This rule works because it's the last rule, run after no dynamic page has been found.

You could create a 404.php if you need it to be dynamic, but it's probably not a good idea. It is possible you could get a lot of 404 hits from old broken URLs, or bots – a static page is much faster to execute (i.e. less costly to you).

Notes to programmers / architects

The GAE API installs a version of PHP, but it does not have most of the limitations of the live version of PHP. Therefore be careful that you apply all the coding standards. In particular, ensure any custom files are written to using Composr's get_custom_file_base function (which will be mapped to Google Cloud Storage live and data_custom/modules/google_appengine in development).

Be aware that filesystem latency may be considerably higher on GAE, as Cloud Storage may not be in the same machine as the actual running application. One hopes that the machine will be at least in the same data centre, with a fast non-congested link, but be mindful that there are reports of ~10ms latency. The same issue applies to Google Cloud SQL. Try and keep the number of filesystem and database queries you do down.
It is also advisable to minimise the number of filesystem and database writes you make, as these have to replicate across each copy of your filesystem/database.

Be advised that Google cannot provide automated "sharding" of data. This means that if you have a huge volume of data (many 100s of GB), you may come across limits. For smaller quantities of data, you may still find that scaling isn't as fast as one would like, while the data is cloned out. The moral of this isn't that there is a problem hosting regular sites on GAE, but that if you are successful in creating a huge new web property, don't expect data to automatically scale without having a proper development team to rearchitect your data model into shards. At a huge scale you will probably also want to move away from Google Cloud SQL. All this is beyond the scope of Composr as a product.

Concepts

Google App Engine
Google's auto-scaling application-hosting service
Application Instance
Your Composr-based application running on GAE
Google Cloud Storage
Google's cloud filesystem service (built on top of their no-SQL infrastructure)
Bucket
A configured cloud filesystem on Google Cloud Storage
Google Cloud SQL
Google's cloud version of MySQL, automatically set up to replicate across multiple machines, for redundancy/performance
Google Cloud SQL instance
A particular instance on Google Cloud SQL; within this instance you have one or more databases
GAE
An abbreviation of Google App Engine
Google Cloud Console
The tool for managing Google Cloud services, apart from GAE itself (which has its own separate management site)

See also


Feedback

Please rate this tutorial:

Have a suggestion? Report an issue on the tracker.