- Cloudera Distribution of Apache Hadoop (CDH): Cloudera was the first commercial Hadoop startup. It offers a core open-source distribution along with a number of frameworks, including Cloudera Search, Impala, Cloudera Navigator, and Cloudera Manager.
- Pivotal HD: includes a number of Pivotal software products such as HAWQ (SQL engine), GemFire XD (analytics), Big Data Extensions, and the USS storage abstraction. Pivotal supports building one physical platform that hosts multiple virtual clusters, as well as PaaS using Hadoop and RabbitMQ.
- IBM InfoSphere BigInsights: includes visualization and exploration, advanced analytics, security, and administration. No other vendor gives you the flexibility of working on a bare-metal machine, but that comes at the price of scalability: a bare-metal machine can’t be scaled up or down on the fly. IBM’s other products, BigQuality, BigIntegrate, and IBM InfoSphere Big Match, can be integrated seamlessly for mature enterprise operations.
- Amazon Elastic MapReduce (EMR): comes with EMRFS, which allows EMR to connect to S3 and use it as a storage layer. Because S3 is the market leader in object storage and many enterprises already use it for their Big Data storage, EMR is an obvious choice.
But AWS EMR works with AWS data stores only, and I really doubt it can be integrated with other storage options.
- Azure HDInsight: uses the HDP (Hortonworks Data Platform) distribution, which is designed for the Azure cloud. Enterprise architects can use C#, Java, and .NET to create, configure, monitor, and submit Hadoop jobs.
- Google Cloud Dataproc: has built-in integration with Google Cloud services such as BigQuery and Bigtable. Unlike other vendors, Google bills you by the minute.
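
To get a feel for why the billing increment matters, here is a rough sketch comparing per-minute and per-hour billing for the same job. The runtime and hourly rate are made-up numbers for illustration, not any vendor's actual prices, and the `billed_cost` helper is hypothetical. It also ignores minimum charges that a real provider may apply.

```python
import math


def billed_cost(runtime_minutes: float, hourly_rate: float, increment_minutes: int) -> float:
    """Cost of a job when usage is rounded up to whole billing increments."""
    # Round the runtime up to the next full billing increment.
    increments = math.ceil(runtime_minutes / increment_minutes)
    billed_minutes = increments * increment_minutes
    # Convert the hourly rate to a per-minute rate and charge for billed minutes.
    return billed_minutes * hourly_rate / 60


runtime = 70   # hypothetical job length, in minutes
rate = 1.20    # hypothetical cluster price per hour

per_minute = billed_cost(runtime, rate, increment_minutes=1)
per_hour = billed_cost(runtime, rate, increment_minutes=60)

print(f"per-minute billing: {per_minute:.2f}")
print(f"per-hour billing:   {per_hour:.2f}")
```

Under these assumptions, the 70-minute job is charged for 70 minutes with per-minute billing but for two full hours with per-hour billing, so short or bursty jobs benefit most from the finer increment.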