SQL Server vs MongoDB vs MySQL vs Hadoop and HBase NoSQL
Microsoft SQL Server is one of the mainstream databases used in most operational systems built using Microsoft technology stack. One of the biggest shortcoming is the inability to support horizontal scaling / sharding. So the next logical choices that are most nearest to SQL Server would be MySQL.
In case you are looking for horizontal scaling / sharding, that would mean that you are gearing up to deal with Big Data. MongoDB is the arguably the first logical step in NoSQL world, in case if someone is considering to experiment with NoSQL to handle BigData. At the stage, one is faced with the requirement to compare all these databases.
Many forms of analysis depends upon data from third-party, and platforms like Windows Azure Marketplace are based on the same principle.
Social Analytics is widely used to forecast the impact on the business and extract insights to counter the same. The interesting question here is, what is the data source that can be used to calculate / derive sentiments of customers related to the respective business ? A majority of this data would come from social / professional / collaboration forums. Examples of such sources are Facebook, YouTube, Twitter, LinkedIn, PInterest, IMDb, Blogs etc. Anyone would agree that the analytics derived from unstructured data created by the public interaction on social media can be expected to be much more close to precision than even any data mining algorithm. But the big question here is, the amount of data – very very very big data. On a daily basis, there are 400 million Tweets, 2.7 billion Facebook Likes, and 2 billion YouTube views. Even these figures might have been outdated today.
Say an organization is influenced by Sharepoint 2013 enhancements related to social media collaboration, and intends to add an ability to derive sentiment analysis in their client offering. Let’s say that as a starting source, Twitter is selected as the source of data, and all the public tweets for a particular product would be analyzed and the results would be stored for future use.
The first challenge is that according to a study, Twitter generates approximately 1 billion tweets in less than 3 days. So how to deal with processing such a huge amount of unstructured data and just consider the kind of infrastructure required to handle this processing. To proceed with the case study, let’s say that we live in the age of cloud and we just signed up on AWS and have beefed up a fat Amazon EMR that uses Hadoop and HBase NoSQL database.
Microsoft Certified Solutions Associate (MCSA)