Monday, December 9, 2013

The Digital Enterprise - Part 3 - Leveraging Social Media

Continuing from our discussion in Part 2 about using social media data in enterprise applications, let us first look at the big picture of how social media analytics can be used within an enterprise.

We talked earlier about leveraging the fact that end users are increasingly using social media accounts like Facebook to log in to company websites. Such a login can provide a trigger for asking the user to share his/her social media profile details. These details can then be stored and used for offline analytics, to infer more about the end user. Getting to know the end user better in this fashion can then be translated into targeted ad content as well as personalization for the user, to suit his/her current and future needs.

Refer to the architecture diagram below, which I will walk through step by step:

  1. The end user / customer accesses a company's (say, an insurance company's) website or mobile app.
  2. The user is prompted to log in to the company website using a social account such as a Facebook account.
  3. After a successful 3-legged OAuth 2 authentication, the company web app receives the login success callback.
  4. Inside the login success callback (say, for Facebook), a quick asynchronous AJAX call, i.e. a POST, is made to the Social Media Analytics Package (SMAP), which hosts our analytics application.
  5. The entire user profile data (in this case the Facebook user profile) is persisted in a document-oriented database like MongoDB.
  6. Similarly, user profile data from Twitter, LinkedIn, Google+, etc. can also be stored in the same MongoDB. Periodically, batches can be run which merge such aggregated social data based on keys such as name, DOB, email addresses, etc. Over a period of time, the insurance company should have aggregated data for several customers and prospects who have willingly registered on its website.
  7. Now the same Social Media Analytics Package (SMAP) application can act as a provider for clients which query for additional information about prospects and customers. Such queries can be issued and responded to over a REST interface. Thin-client browsers as well as mobile apps can access user profile information over REST interfaces.
  8. The SMAP REST interfaces can also be used by existing CRM services within the enterprise, which can combine this social media user data with in-house CRM data to produce consolidated profiles of users and prospects.
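
The periodic merge batch from step 6 can be sketched in a few lines. This is only a minimal illustration, assuming that profiles are plain key/value maps and that the email address alone is used as the merge key; the class ProfileMerger and its method are hypothetical names and not part of any SMAP code. A real batch would read from and write to MongoDB and would probably combine several keys (name, DOB, email).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; it works on in-memory maps only.
class ProfileMerger {

    // Merges profiles that share the same email address (compared
    // case-insensitively). For each field, the first non-null value seen wins.
    static List<Map<String, Object>> mergeByEmail(List<Map<String, Object>> profiles) {
        Map<String, Map<String, Object>> byEmail = new LinkedHashMap<>();
        for (Map<String, Object> profile : profiles) {
            Object email = profile.get("email");
            if (email == null) {
                continue; // profiles without an email are skipped in this sketch
            }
            String key = email.toString().trim().toLowerCase();
            Map<String, Object> merged = byEmail.computeIfAbsent(key, k -> new LinkedHashMap<>());
            for (Map.Entry<String, Object> field : profile.entrySet()) {
                if (field.getValue() != null) {
                    merged.putIfAbsent(field.getKey(), field.getValue());
                }
            }
        }
        return new ArrayList<>(byEmail.values());
    }

    public static void main(String[] args) {
        Map<String, Object> facebook = new LinkedHashMap<>();
        facebook.put("email", "Jane@Example.com");
        facebook.put("name", "Jane Doe");
        facebook.put("hometown", "Thane");

        Map<String, Object> twitter = new LinkedHashMap<>();
        twitter.put("email", "jane@example.com");
        twitter.put("twitterHandle", "@jane");

        // Both profiles collapse into one consolidated record.
        System.out.println(mergeByEmail(Arrays.asList(facebook, twitter)));
    }
}
```

Letting the first value win keeps the sketch short; a production batch would need an explicit rule for conflicting values, for example preferring the most recently fetched profile.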

The above architecture and high-level design can be used to effectively leverage social-media-based user profile information to augment an enterprise's own CRM databases.

On a side note, the same SMAP and MongoDB database can also be used to store website-level analytics data from Google Analytics, Facebook Insights, etc. Such data can be aggregated across long time periods to analyse a variety of user trends and habits.


Friday, December 6, 2013

Simplifying REST with MongoDB

REST and MongoDB seem to be a match made in heaven. JSON is the popular format for transferring data across REST interfaces over the omnipresent and easy-to-use HTTP. BSON, the data storage format of MongoDB, is a binary representation that extends JSON with additional data types, and is capable of accepting JSON input.

The natural question, then, is: how can I use JSON over REST to store data in, and retrieve data from, MongoDB directly?

MongoDB's mongod, when started with the --rest option, provides a bare-minimum REST interface for querying MongoDB.

More information can be found under the heading Simple REST Interface.

But for REST-based insert, update, upsert and delete, you need to front MongoDB with one of the following third-party servers:

REST Interfaces
DrowsyDromedary is a REST layer for MongoDB based on Ruby.
MongoDB Rest is an alpha REST interface to MongoDB that uses the MongoDB Node Native driver.
Mongodb Java REST server based on Jetty.
Kule is a customizable REST interface for MongoDB based on Python.

HTTP Interfaces
Sleepy Mongoose is a full-featured HTTP interface for MongoDB based on Python.

Often, putting one of these REST-to-MongoDB translators up front means introducing another moving part into your architecture and having to contend with issues like cross-origin access. There is also usually a need to customize the REST-to-MongoDB translation. Hence many prefer to write their own simple REST controllers, which delegate calls to MongoDB.

Well, so did I, and I gradually realized that the code can be simplified quite a bit.

Assuming you have conventional Java-based REST controllers set up in your web application, using say Jersey, here is the controller class you would have to write:


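// JAX-RS annotations (such as the resource path) are not shown in this listing.
public class VehicleController extends MongoBaseController {
    @Override
    public String getEntityName() {
        return "vehicles";
    }

    @Override
    public String getKey() {
        return "vehicleId";
    }
}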

By writing just this simple code, you get the following CRUD REST APIs, which store into your MongoDB database:

At URL http://localhost:10080/<My Web App>/restful/vehicles
HTTP PUT for inserting vehicles in MongoDB
HTTP POST for updating and upserting vehicles in MongoDB
HTTP DELETE for deleting vehicles in MongoDB

The critical functionality, as you might have guessed, lies nicely encapsulated in the base class MongoBaseController, which can be common to any number of entities that we want to CRUD from REST to MongoDB:


import com.ghag.mongodb.MongoCrudService;
import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.util.JSON;

// JAX-RS annotations that bind these methods to HTTP verbs are not shown.
public abstract class MongoBaseController {
    // Name of the MongoDB collection that stores this entity.
    abstract public String getEntityName();
    // Name of the field that uniquely identifies an entity.
    abstract public String getKey();

    // Parses the JSON payload and inserts it as a new document.
    public String insert(String entity) {
        DBCollection coll = MongoCrudService.getConnection().getCollection(getEntityName());
        BasicDBObject row = (BasicDBObject) JSON.parse(entity);
        coll.insert(row);
        System.out.println("inserted entity " + row);
        return "SUCCESS_INSERT";
    }

    // Replaces the document that has the same key, or inserts it if none exists.
    public String upsert(String entity) {
        DBObject inputRow = (DBObject) JSON.parse(entity);
        DBCollection coll = MongoCrudService.getConnection().getCollection(getEntityName());
        DBObject src = coll.findOne(new BasicDBObject(getKey(), inputRow.get(getKey())));
        if (src == null) {
            src = inputRow;
        }
        coll.update(src, inputRow, true, false); // upsert = true, multi = false
        System.out.println("upserted entity " + inputRow);
        return "SUCCESS_UPDATE";
    }

    // Removes the document whose key matches the key in the JSON payload.
    public String delete(String entity) {
        DBCollection coll = MongoCrudService.getConnection().getCollection(getEntityName());
        DBObject inputRow = (DBObject) JSON.parse(entity);
        DBObject src = coll.findOne(new BasicDBObject(getKey(), inputRow.get(getKey())));
        if (src != null) {
            coll.remove(src);
        }
        System.out.println("removed entity " + src);
        return "SUCCESS_DELETE";
    }
}

So now, if you want to persist another entity, say Book, via REST into MongoDB, all you need to write is a small class similar to:

public class BookController extends MongoBaseController {
    @Override
    public String getEntityName() {
        return "books";
    }

    @Override
    public String getKey() {
        return "bookId";
    }
}

That's it!

The new controller itself is free from any details of the structure of the Book entity, like its fields, data types etc.
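
To see why the controller needs no knowledge of the entity's fields, here is a rough, self-contained sketch of the same pattern with the database swapped for an in-memory map, so that it runs without MongoDB or Jersey. The class names (InMemoryBaseController, InMemoryBookController, SchemaFreeDemo) and the find and count helpers are assumptions made for this illustration only; they are not part of the code above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for MongoBaseController: a map plays the role of the collection.
abstract class InMemoryBaseController {
    private final Map<Object, Map<String, Object>> store = new LinkedHashMap<>();

    abstract String getEntityName();
    abstract String getKey();

    // Inserts the entity, or replaces the one stored under the same key value.
    String upsert(Map<String, Object> entity) {
        store.put(entity.get(getKey()), new LinkedHashMap<>(entity));
        return "SUCCESS_UPDATE";
    }

    // Removes the entity whose key value matches the given entity's key value.
    String delete(Map<String, Object> entity) {
        store.remove(entity.get(getKey()));
        return "SUCCESS_DELETE";
    }

    Map<String, Object> find(Object keyValue) {
        return store.get(keyValue);
    }

    int count() {
        return store.size();
    }
}

// The concrete controller only names the collection and the key field.
class InMemoryBookController extends InMemoryBaseController {
    @Override
    String getEntityName() {
        return "books";
    }

    @Override
    String getKey() {
        return "bookId";
    }
}

class SchemaFreeDemo {
    public static void main(String[] args) {
        InMemoryBookController books = new InMemoryBookController();

        Map<String, Object> book = new LinkedHashMap<>();
        book.put("bookId", "B-1");
        book.put("title", "The Little MongoDB Book");
        books.upsert(book);

        // A later version adds a field the controller has never heard of.
        book.put("pages", 66);
        books.upsert(book);

        System.out.println(books.count() + " book(s): " + books.find("B-1"));
    }
}
```

The only thing the base class ever reads from an entity is the value of its key field; everything else passes through untouched, which is what keeps the subclasses so small.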

That is truly combining the power of REST and MongoDB.

Mucho Eleganto! Don't you think?


Tuesday, December 3, 2013

Getting started with Mongo DB

Once you are suitably convinced that you do need a document-oriented database for persisting your data, you can get started with MongoDB, the very popular document-oriented database of our times.

While MongoDB is the most similar to an RDBMS among the NoSQL databases, there are a few surprising things you should know about it.

To recount a few:

Objects and arrays of objects can be used interchangeably in MongoDB insert syntax

Hence the following is quite valid:

db.mytablecoll.insert({name: 'Ganesh Ghag', dob: new Date(1973, 6, 18, 18, 18), loves: ['grape', 'watermelon'], hometown: 'Thane', gender: 'm'});

Update without Set

In its simplest form, update takes 2 arguments: the selector (where) to use and the document to update with.
db.mytablecoll.update({name: 'Ganesh Ghag'}, {hometown: 'Thane West'})
will replace the entire object (aka row) with {hometown: 'Thane West'}.

To selectively update only one attribute of the row, you need to use $set:
db.mytablecoll.update({name: 'Ganesh Ghag'}, {$set: {hometown: 'Thane West'}})
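
The difference between the two update styles is easy to model outside the database. The sketch below is a toy in-memory model in Java, assuming documents are plain maps; the class UpdateSemantics and its methods are made up for this illustration and do not call MongoDB or its driver.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy in-memory model of the two update styles; it does not talk to MongoDB.
class UpdateSemantics {

    // Update without $set: the stored document is replaced; only _id survives.
    static Map<String, Object> replace(Map<String, Object> stored, Map<String, Object> replacement) {
        Map<String, Object> result = new LinkedHashMap<>();
        if (stored.containsKey("_id")) {
            result.put("_id", stored.get("_id"));
        }
        result.putAll(replacement);
        return result;
    }

    // Update with $set: only the listed fields change; all others are kept.
    static Map<String, Object> set(Map<String, Object> stored, Map<String, Object> fields) {
        Map<String, Object> result = new LinkedHashMap<>(stored);
        result.putAll(fields);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("_id", 1);
        stored.put("name", "Ganesh Ghag");
        stored.put("hometown", "Thane");

        Map<String, Object> change = new LinkedHashMap<>();
        change.put("hometown", "Thane West");

        System.out.println("without $set: " + replace(stored, change));
        System.out.println("with $set:    " + set(stored, change));
    }
}
```

In the first case the name field is gone after the update; in the second it is still there.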

Multiple Updates surprise

The final surprise update has to offer is that, by default, it will update only a single document.
So the following will only update the first row fulfilling the update criteria:
db.mytablecoll.update({name: 'Ganesh Ghag'}, {$set: {hometown: 'Thane West'}})

To get upsert functionality, you need to pass a third parameter, "upsert", as true, as shown below:
db.mytablecoll.update({name: 'Ganesh Ghag'}, {$set: {hometown: 'Thane West'}}, true)

And to ensure that all matching rows in the collection are updated, you need to set a fourth parameter, "multiple", to true:
db.mytablecoll.update({name: 'Ganesh Ghag'}, {$set: {hometown: 'Thane West'}}, true, true)
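
The effect of the two boolean parameters can again be mimicked with a small in-memory model. This is only a sketch under simplifying assumptions: the collection is a list of maps, the selector matches on plain field equality, and the update is always a $set. The class UpdateFlags is a hypothetical name for this illustration, not a driver API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of update(selector, {$set: fields}, upsert, multi) on a list of maps.
class UpdateFlags {

    // Returns the number of documents changed, counting a newly upserted one.
    static int update(List<Map<String, Object>> collection, Map<String, Object> selector,
                      Map<String, Object> fields, boolean upsert, boolean multi) {
        int touched = 0;
        for (Map<String, Object> doc : collection) {
            // A document matches when it contains every selector field with an equal value.
            if (doc.entrySet().containsAll(selector.entrySet())) {
                doc.putAll(fields);
                touched++;
                if (!multi) {
                    return touched; // default behaviour: stop after the first match
                }
            }
        }
        if (touched == 0 && upsert) {
            // Nothing matched: create a document from the selector plus the new fields.
            Map<String, Object> created = new LinkedHashMap<>(selector);
            created.putAll(fields);
            collection.add(created);
            touched = 1;
        }
        return touched;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> collection = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("name", "Ganesh Ghag");
            doc.put("hometown", "Thane");
            collection.add(doc);
        }
        Map<String, Object> selector = new LinkedHashMap<>();
        selector.put("name", "Ganesh Ghag");
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("hometown", "Thane West");

        System.out.println("multi=false changed " + update(collection, selector, fields, false, false));
        System.out.println("multi=true changed " + update(collection, selector, fields, false, true));
    }
}
```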

There is no "join" syntax in MongoDB

All many-to-one and many-to-many relations are modelled using arrays of objects or embedded documents.

Transactions in mongo

MongoDB does not support transactions in the classical sense, but it has atomic update operators like $set and $inc, the findAndModify command, and manual transaction modelling based on two-phase commit.

Support for geo-spatial queries

MongoDB has direct support for geospatial indexes. This allows you to store x and y coordinates within documents and then find documents that are $near a set of coordinates or $within a box or circle.
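
To get a feel for what these two operators compute, here is a rough in-memory sketch, assuming flat x/y coordinates and straight-line distance. The class GeoSketch and its methods are hypothetical names for this illustration; the real queries run inside MongoDB against a geospatial index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of $near and $within (box) on flat coordinates; no database involved.
class GeoSketch {

    // A document reduced to a name and an (x, y) location.
    static final class Place {
        final String name;
        final double x;
        final double y;

        Place(String name, double x, double y) {
            this.name = name;
            this.x = x;
            this.y = y;
        }
    }

    // Models $near: all places ordered by distance from (x, y), nearest first.
    static List<Place> near(List<Place> places, double x, double y) {
        List<Place> sorted = new ArrayList<>(places);
        sorted.sort(Comparator.comparingDouble((Place p) -> Math.hypot(p.x - x, p.y - y)));
        return sorted;
    }

    // Models $within with a box: places inside the rectangle, borders included.
    static List<Place> withinBox(List<Place> places, double minX, double minY,
                                 double maxX, double maxY) {
        List<Place> inside = new ArrayList<>();
        for (Place p : places) {
            if (p.x >= minX && p.x <= maxX && p.y >= minY && p.y <= maxY) {
                inside.add(p);
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        List<Place> places = new ArrayList<>();
        places.add(new Place("A", 0, 0));
        places.add(new Place("B", 5, 5));
        places.add(new Place("C", 1, 1));

        for (Place p : near(places, 2, 2)) {
            System.out.println("near (2,2): " + p.name);
        }
        for (Place p : withinBox(places, 0, 0, 2, 2)) {
            System.out.println("within box: " + p.name);
        }
    }
}
```

The sketch scans every place on each call; the point of a geospatial index is that the database can answer the same questions without such a full scan.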

If the above MongoDB surprises are enough to keep you awake at night, wondering, please visit The Little MongoDB Book for an introduction to MongoDB.