Monday, January 25, 2010

Features and pitfalls of Google App Engine - My perspective

I first migrated Pet Store (Sun's web application), which uses JDBC calls, to Google App Engine with schemaless datastore entities, and then redeveloped it as a GWT application for the Google App Engine platform. Here are the features and pitfalls from that perspective.

Features
1. The components of Google App Engine (the sandbox environment, the datastore, and the services) scale independently of each other.
2. All requests to Google App Engine follow the same request-response model. Web requests, task queues, cron jobs, XMPP messages, and the bulk loader all go through Google's front end.
3. Server-less infrastructure: there are no servers for the application team to provision or maintain.
4. Supports the Java Servlet specification (including filters, sessions (persisted by Google), session listeners, and internationalization using the browser-dependent locale), as well as the JSP, JDO, and JPA specifications.
5. Internationalization is not a native feature, but it can be added by using a web application framework with internationalization features.
6. Integrates easily with Google infrastructure for authentication and authorization. In addition, the sandboxing behavior of the App Engine runtime lets multiple applications run on the same server without one application affecting another.
7. Logging jars and configurations are part of the web development SDK, so no time is spent building a logger wrapper. The App Engine runtime also provides logs of all web requests for a given application in the Administration Console, which helps the administrator track DoS attacks, bandwidth use, etc.
8. Configuration is separated by component: the web application, indexes, task queues, cron jobs, etc. each have their own configuration file and can be configured and handled separately by the team.
9. Application caching (Memcache) support is provided as part of the SDK, and it is easy to program putting objects into and retrieving them from the distributed cache.
10. Multiple versions of an application can be tested at the same time, which promotes parallel development of different iterations of the same application.
11. Static content is easily separated from application content by declaring it in appengine-web.xml; Google App Engine then automatically uploads the content to the web server and the app server respectively.
12. Google App Engine replicates the data in multiple locations, so the application support team does not have to handle backing up and archiving the data for redundancy.
13. Google itself manages the resources used by the application, and usage reports can be accessed via the Administration Console.
14. Google Accounts integration - App Engine integrates with Google Accounts, the user account system used by Google applications such as Google Mail, Google Docs, and Google Calendar.
15. Supports cron for batch processing jobs (though invoked with request parameters) and task queues for asynchronous processing on behalf of a web request (outside the context of that web request).
16. The runtime environment also limits the amount of clock time, CPU use, and memory a single request can take. App Engine keeps these limits flexible and applies them to applications that use up more resources, to protect shared resources from runaway applications. The response time of an application can also determine the number of requests it can handle dynamically.
17. Some open-source applications are being written to synchronize data between the Google App Engine datastore and a relational database. For example, AppRocket is an open-source replication engine that synchronizes the Google App Engine datastore and a MySQL database.
18. App Engine includes a tool for uploading and downloading data via the remote API. The tool can create new datastore entities using data from a comma-separated values (CSV) file, and can also create CSV files with data from the app's datastore.
19. Google App Engine provides two different interfaces, high-level and low-level, for various services such as the datastore and memcache. High-level APIs make the application portable, but low-level APIs expose more features of the Google platform.
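
To illustrate the caching point (feature 9), here is a minimal sketch of the cache-aside pattern that a distributed cache is typically used for. It is not App Engine code: a ConcurrentHashMap stands in for memcache, and the class name, the loader function, and the read counter are all hypothetical names introduced for the example. On App Engine the map lookups would be replaced by calls to the SDK's memcache service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Cache-aside lookup: try the cache first, fall back to the slower store. */
final class CacheAsideSketch {
    // Assumption: a local map standing in for the distributed cache.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Assumption: a function standing in for a (slower) datastore read.
    private final Function<String, String> datastoreLoader;
    // Counts how often the slower store was actually consulted.
    private final AtomicInteger datastoreReads = new AtomicInteger();

    CacheAsideSketch(Function<String, String> datastoreLoader) {
        this.datastoreLoader = datastoreLoader;
    }

    /** Returns the cached value if present, otherwise loads and caches it. */
    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        datastoreReads.incrementAndGet();
        String loaded = datastoreLoader.apply(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    /** Drops a cached entry, for example after the entity was updated. */
    void invalidate(String key) {
        cache.remove(key);
    }

    int datastoreReads() {
        return datastoreReads.get();
    }
}
```

Repeated reads of the same key reach the slower store only once until the entry is invalidated, which is the main reason to put a cache in front of datastore reads.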

Pitfalls
1. The many quotas imposed by Google App Engine (bandwidth, datastore request/response size, etc.) push the design towards a quota-driven architecture.
2. Bad entity designs can lead to index explosion, or to using a lot of bandwidth and time when retrieving, updating, creating, or deleting entities in the datastore.
3. Writing to the local file system through java.io, JNI, and multi-threading are not currently supported in the Java runtime.
4. The Java runtime has limited support for SQL because the Google datastore is schemaless. For example, a query cannot use inequality filters on more than one property, and stored procedures and triggers are not supported.
5. When using Google infrastructure for authentication and authorization, only two roles are supported right now: general and admin.
6. The URL Fetch service doesn't support secure HTTPS communication, so App Engine is not a good fit when the application needs to communicate securely with other sites or web services.
7. When the application creates new entities or updates existing ones using the datastore API, the call returns with success or failure only after creating or updating the entities and updating every corresponding index. This makes queries very fast at the expense of entity updates, and hence affects the write performance of applications.
8. With Secure Data Connector, web applications can integrate with on-premise resources, but only those available on the intranet.
9. Applications can use only content management systems built on top of App Engine, since most CMSs typically use SQL databases.
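
The query restriction in pitfall 4 can be made concrete with a small sketch. The class below is not part of the App Engine SDK; its name, the Op enum, and the Filter type are hypothetical and exist only to model the single rule stated above: inequality filters may appear on at most one property, while equality filters are unrestricted. Other datastore query rules are not modelled.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Models one rule only: inequality filters on at most one property. */
final class InequalityFilterCheck {
    // Hypothetical operator set for the sketch.
    enum Op { EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL }

    /** A single filter: a property name plus a comparison operator. */
    static final class Filter {
        final String property;
        final Op op;

        Filter(String property, Op op) {
            this.property = property;
            this.op = op;
        }
    }

    /** Returns true when all inequality filters target the same property. */
    static boolean isAllowed(List<Filter> filters) {
        Set<String> inequalityProperties = new HashSet<>();
        for (Filter filter : filters) {
            if (filter.op != Op.EQUAL) {
                inequalityProperties.add(filter.property);
            }
        }
        return inequalityProperties.size() <= 1;
    }
}
```

Under this rule a range on one property (price greater than one value and less than another) passes, optionally combined with equality filters on other properties, whereas a range on price together with a range on weight is rejected and has to be restructured, for example by filtering the second property in application code.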


Recommendations
1. With more insight into the Google Datastore's index concept, I would recommend this architecture for applications that need fast access to data, with query performance unaffected by how much data is in the system or how it is distributed across multiple servers.
2.GAE can serve traditional website content too (such as documents and images), but the environment is especially designed for real-time dynamic web applications.
3. Selecting Google App Engine as a platform is mainly driven by no capital cost, a pay-per-use model for resources used beyond the free quota, scalability, manageability, and no on-premise server infrastructure. It is limited by its sandbox capabilities and quotas, by supporting only web applications, and by the Java runtime still being extended with additional features.
4. It is an ideal fit for small web applications with low traffic that want to deploy on Google infrastructure.
