How Google App Engine Datastore Works

(The following is an excerpt from "Beginning Java Google App Engine," published by Apress.)

Designing highly scalable, data-intensive applications can be tricky. If you've ever used hardware or software load balancing, you know that your users can be interacting with any one of a dozen or so web and database servers. A user's request may not be serviced from the same server that handled his previous request. These servers could be spread out in different data centers or perhaps in different countries, requiring you to implement processes to keep your data safe, secure, and synchronized. The hardware and software required to scale your application can also be complex and expensive, and may even dictate that you outsource or hire dedicated resources.

With App Engine, Google takes care of everything for you. The App Engine datastore provides distribution, replication, and load-balancing services behind the scenes, freeing you up to focus on implementing your business logic. App Engine's datastore is powered mainly by two Google services: Bigtable and Google File System (GFS).

Bigtable is a highly distributed and scalable service for storing and managing structured data. It was designed to scale to an extremely large size with petabytes of data across thousands of clustered commodity servers. It is the same service that Google uses for over 60 of its own projects including web indexing, Google Finance, and Google Earth.

The datastore also uses GFS to store data and log files. GFS is a scalable, fault-tolerant file system designed for large, distributed, data-intensive applications such as Gmail and YouTube. Originally developed to store crawling data and search indexes, GFS is now widely used to store user-generated content for numerous Google products.

Bigtable stores data as entities with properties, organized by application-defined kinds such as customers, sales orders, or products. Entities of the same kind are not required to have the same properties or the same value types for the same properties. A query retrieves entities of a single kind and can apply filters and sort orders on both keys and property values. Bigtable also pre-indexes all queries, which results in impressive performance even with very large data sets. The service also supports transactional updates on a single entity or on application-defined groups of entities.
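
To make the data model concrete, here is a minimal in-memory sketch in plain Java. It is not the App Engine API: the KindQuerySketch and SchemalessEntity names, the sample data, and the query method are all hypothetical. It only illustrates that entities of one kind may carry different properties, and that a query selects a single kind, filters on one property value, and sorts on another.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical toy model; none of these types come from the App Engine SDK.
class KindQuerySketch {

    // A schemaless entity: a kind plus an open-ended bag of properties.
    static class SchemalessEntity {
        final String kind;
        final Map<String, Object> properties = new HashMap<>();

        SchemalessEntity(String kind) {
            this.kind = kind;
        }

        // Fluent setter so the sample data reads compactly.
        SchemalessEntity set(String name, Object value) {
            properties.put(name, value);
            return this;
        }
    }

    // Select entities of one kind whose filterProp equals filterValue,
    // ordered by a string-valued sortProp. In this toy, entities without
    // a string sortProp are simply left out of the result.
    static List<SchemalessEntity> query(List<SchemalessEntity> store, String kind,
            String filterProp, Object filterValue, String sortProp) {
        return store.stream()
                .filter(e -> e.kind.equals(kind))
                .filter(e -> filterValue.equals(e.properties.get(filterProp)))
                .filter(e -> e.properties.get(sortProp) instanceof String)
                .sorted(Comparator.comparing(e -> (String) e.properties.get(sortProp)))
                .collect(Collectors.toList());
    }

    // Build a small store and return the matching customer names, comma-separated.
    static String demo() {
        List<SchemalessEntity> store = new ArrayList<>();
        store.add(new SchemalessEntity("Customer").set("name", "Bob")
                .set("city", "Austin").set("loyaltyPoints", 120L)); // extra property
        store.add(new SchemalessEntity("Customer").set("name", "Alice").set("city", "Austin"));
        store.add(new SchemalessEntity("Customer").set("name", "Carol").set("city", "Boston"));
        store.add(new SchemalessEntity("Product").set("name", "Anvil").set("city", "Austin"));

        return query(store, "Customer", "city", "Austin", "name").stream()
                .map(e -> (String) e.properties.get("name"))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Alice,Bob"
    }
}
```

The "Product" entity shares a property value with the customers but is never returned, because the query is scoped to one kind.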

The first thing you'll notice about Bigtable is that it is not a relational database. Bigtable utilizes a non-relational object model to store entities, allowing you to create simple, fast, and scalable applications. Google isn't alone in offering this type of architecture. Amazon's SimpleDB and many open-source datastores (for example, CouchDB and Hypertable) use this same approach, which requires no schema while providing auto-indexing of data and simple APIs for storage and access.

You can interact with Bigtable using either a standard API or a low-level API. With the standard API, either a Java Data Objects (JDO) or Java Persistence API (JPA) implementation, you can ensure that your applications are portable to other hosting providers and database technologies if you decide to jump ship. This makes a good argument for App Engine as it helps you avoid vendor lock-in. If you are certain that your application will always run on App Engine, you can utilize the low-level API as it exposes the full capabilities of Bigtable. Both APIs achieve roughly the same results in terms of ability and performance, so it comes down to personal preference. Do you like working with low-level database functionality, or do you prefer abstracting this layer so that your experience is applicable across multiple datastore implementations?

The datastore provides full CRUD (create, read, update, and delete) access to entities in Bigtable and allows you to query against the datastore using a standard SQL-like query language called JDOQL. The syntax is enough like SQL to lull you into a sense of familiarity, but there are some differences when dealing with JDO-enhanced objects. One notable difference is the lack of support for joins, which relational databases provide. However, this is understandable since the datastore is non-relational.
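
Without joins, related data is typically combined in application code: read one kind, then fetch the related entity by its key. The following is a rough sketch of that idea using plain Java maps as stand-ins for two kinds. The ManualJoinSketch class, its fields, and the sample data are hypothetical and do not use JDO, JDOQL, or the datastore API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for two kinds addressed by key; not the datastore API.
class ManualJoinSketch {
    // Customer ID -> customer name.
    static final Map<Long, String> CUSTOMER_NAMES = new LinkedHashMap<>();
    // Order ID -> ID of the customer who placed the order.
    static final Map<Long, Long> ORDER_TO_CUSTOMER = new LinkedHashMap<>();

    static {
        CUSTOMER_NAMES.put(1L, "Alice");
        CUSTOMER_NAMES.put(2L, "Bob");
        ORDER_TO_CUSTOMER.put(100L, 2L);
        ORDER_TO_CUSTOMER.put(101L, 1L);
        ORDER_TO_CUSTOMER.put(102L, 2L);
    }

    // "Join" in application code: walk the orders, then look up each
    // order's customer by key instead of asking the store to join them.
    static List<String> ordersWithCustomerNames() {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<Long, Long> order : ORDER_TO_CUSTOMER.entrySet()) {
            String customer = CUSTOMER_NAMES.get(order.getValue());
            rows.add(order.getKey() + ":" + customer);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(ordersWithCustomerNames()); // prints [100:Bob, 101:Alice, 102:Bob]
    }
}
```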

Working with Entities

The fundamental unit of data in the datastore is an “entity,” which consists of an immutable identifier and zero or more properties. Once again, entities are schemaless and this allows for some interesting possibilities. Since entities are not required to have the same properties or types, your application must enforce adherence to your data model, whatever that may be at the time. A property can have one or more values, embedded classes, child objects, and even values of mixed types. Entities are very flexible and are not defined by a database schema as in a relational database. At any point during the application life cycle you can add or remove entity properties. Newly created and fetched entities will utilize this new schema. Your application’s logic must be able to handle these changes.
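
As a small illustration of application logic that tolerates such changes, the sketch below reads a property that older entities may not have and falls back to a default. It uses a plain Java map as a stand-in for an entity's properties; the EvolvingPropertiesSketch class, the "tier" property, and the "standard" default are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example; a plain map stands in for an entity's properties.
class EvolvingPropertiesSketch {

    // Read the "tier" property, which was added after some entities were
    // already stored. A missing value, or one of an unexpected type,
    // falls back to a default instead of failing.
    static String readTier(Map<String, Object> entityProperties) {
        Object tier = entityProperties.get("tier");
        return (tier instanceof String) ? (String) tier : "standard";
    }

    static String demo() {
        Map<String, Object> oldCustomer = new HashMap<>();
        oldCustomer.put("name", "Alice"); // stored before "tier" existed

        Map<String, Object> newCustomer = new HashMap<>();
        newCustomer.put("name", "Bob");
        newCustomer.put("tier", "gold");

        return readTier(oldCustomer) + "," + readTier(newCustomer);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "standard,gold"
    }
}
```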

App Engine uses the Java Persistence API (JPA) and Java Data Objects (JDO) interfaces for modeling and persisting entities. These APIs, rather than the low-level API, ensure application portability. For your application, you’ll use JDO since the Eclipse plug-in generates your JDO configuration files. Of course, JPA is supported, but it requires some additional setup and configuration steps. If you are familiar with Hibernate or other object-relational mapping (ORM) solutions, JDO should be fairly easy to grok as these solutions share many features.

App Engine's JDO implementation is provided by the DataNucleus Access Platform, an open-source implementation of JDO 2.3. Again, the JDO specification is database-agnostic and defines high-level interfaces for annotating simple POJOs, persisting and querying objects, and utilizing transactions. Applications implementing JDO can query for entities by property values or they can fetch a specific entity from the datastore using its key. Queries can return zero or more entities and sort them by property values, if desired.

Classes and Fields

JDO uses annotations on POJOs to describe how these objects are persisted to the datastore and how to recreate them when they are, in turn, fetched from the datastore. The kind of entity is defined by the simple name of the class while each class member specified as persistent represents a property of the entity. The data class is required to have a field dedicated to storing the primary key of its corresponding entity.

Each entity has a key that is unique to Bigtable. Keys consist of the application ID, the entity ID, and the kind of entity. Some keys may also contain information pertaining to the entity group. Your application can generate keys for your entities, or you can allow Bigtable to automatically assign numeric IDs for you. In most cases it is easier to let Bigtable assign your keys so you don't have to write code to ensure that your keys are unique across all objects of the same kind plus entity group parent (if being used).

There are four types of primary key fields:

1. Long: An ID that is automatically generated by Bigtable when the instance is saved.

2. Unencoded String: An ID or "key name" that your application provides to the instance prior to being saved.

3. Key: A value that includes the key of any entity-group parent that is being used and an application-generated string ID or a system-generated numeric ID.

4. Key as Encoded String: Essentially, an encoded key to ensure portability and still allow your application to take advantage of Bigtable's entity groups.

If you want to implement your own key system, you simply use the createKey static method of the KeyFactory class. You pass the method the kind and either an application-assigned string or a system-assigned number, and the method returns the appropriate Key instance.
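
The sketch below models the key structure described in this section in plain Java so that it runs without the App Engine SDK. It does not call the SDK's KeyFactory.createKey; the ToyKey class, its create methods, and the "my-app" application ID are hypothetical stand-ins. It shows a key built from an application ID, an optional entity-group parent, a kind, and either a string name or a numeric ID, and shows that the same kind and ID under different parents yield different keys.

```java
import java.util.Objects;

// Hypothetical stand-in for a datastore key; not the SDK's Key class.
class ToyKey {
    final String appId;
    final ToyKey parent; // null when there is no entity-group parent
    final String kind;
    final String name;   // application-assigned string, or null
    final long id;       // numeric ID; 0 when a name is used instead

    private ToyKey(String appId, ToyKey parent, String kind, String name, long id) {
        this.appId = appId;
        this.parent = parent;
        this.kind = kind;
        this.name = name;
        this.id = id;
    }

    // Key identified by an application-assigned string.
    static ToyKey create(String appId, ToyKey parent, String kind, String name) {
        return new ToyKey(appId, parent, kind, Objects.requireNonNull(name), 0L);
    }

    // Key identified by a numeric ID.
    static ToyKey create(String appId, ToyKey parent, String kind, long id) {
        return new ToyKey(appId, parent, kind, null, id);
    }

    // Two keys are equal only if every part, including the parent path, matches.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof ToyKey)) {
            return false;
        }
        ToyKey k = (ToyKey) other;
        return appId.equals(k.appId) && kind.equals(k.kind) && id == k.id
                && Objects.equals(name, k.name) && Objects.equals(parent, k.parent);
    }

    @Override
    public int hashCode() {
        return Objects.hash(appId, parent, kind, name, id);
    }

    // Render the path from the root ancestor down to this key.
    @Override
    public String toString() {
        String self = kind + "(" + (name != null ? "\"" + name + "\"" : String.valueOf(id)) + ")";
        return parent == null ? self : parent + "/" + self;
    }

    public static void main(String[] args) {
        ToyKey customer = create("my-app", null, "Customer", "alice");
        ToyKey childOrder = create("my-app", customer, "Order", 42L);
        ToyKey rootOrder = create("my-app", null, "Order", 42L);
        System.out.println(childOrder);                   // prints Customer("alice")/Order(42)
        System.out.println(childOrder.equals(rootOrder)); // prints false
    }
}
```

Because the parent is part of the key, uniqueness only has to hold among entities of the same kind that share the same entity-group parent.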

Comments

Abstracting away the database seems like a fine idea. Except in real-world applications there are always performance limits and bottlenecks. How quickly do these show up in App Engine Datastore, and how easy is it to work around them? What is the role of a good DBA on this platform?
