From the Blogosphere
Ceph Day Amsterdam 2012 (Object and Cloud Storage)
Ceph is used for deploying object storage, cloud storage and managed services
By: Greg Schulz
Nov. 27, 2012 11:30 AM
Recently while I was in Europe presenting some sessions at conferences and doing some seminars, I was invited by Ed Saipetch (@edsai) of Inktank.com to attend the first Ceph Day in Amsterdam.
As luck or fate would turn out, I was in Nijkerk which is about an hour train ride from Amsterdam central station plus a free day in my schedule. After a morning train ride and nice walk from Amsterdam Central I arrived at the Tobacco Theatre (a former tobacco trading venue) where Ceph Day was underway, and in time for lunch of Krokettens sandwich.
Let's take a quick step back and address for those not familiar what is Ceph (Cephalanthera) and why it was worth spending a day to attend this event. Ceph is an open source distributed object scale out (e.g. cluster or grid) software platform running on industry standard hardware.
Ceph is used for deploying object storage, cloud storage and managed services, general purpose storage for research, commercial, scientific, high performance computing (HPC) or high productivity computing (commercial) along with backup or data protection and archiving destinations. Other software similar in functionality or capabilities to Ceph include OpenStack Swift, Basho Riak CS, Cleversafe, Scality and Caringo among others. There are also the tin wrapped software (e.g. appliances or pre-packaged) solutions such as Dell DX (Caringo), DataDirect Networks (DDN) WOS, EMC ATMOS and Centera, Amplidata and HDS HCP among others. From a service standpoint, these solutions can be used to build services similar Amazon S3 and Glacier, Rackspace Cloud files and Cloud Block, DreamHost DreamObject and HP Cloud storage among others.
At the heart of Ceph is RADOS a distributed object store that consists of peer nodes functioning as object storage devices (OSD). Data can be accessed via REST (Amazon S3 like) APIs, Libraries, CEPHFS and gateway with information being spread across nodes and OSDs using a CRUSH based algorithm (note Sage Weil is one of the authors of CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data). Ceph is scalable in terms of performance, availability and capacity by adding extra nodes with hard disk drives (HDD) or solid state devices (SSDs). One of the presentations pertained to DreamHost that was an early adopter of Ceph to make their DreamObjects (cloud storage) offering.
In addition to storage nodes, there are also an odd number of monitor nodes to coordinate and manage the Ceph cluster along with optional gateways for file access. In the above figure (via DreamHost), load balancers sit in front of gateways that interact with the storage nodes. The storage node in this example is a physical server with 12 x 3TB HDDs each configured as a OSD.
In the DreamHost example above, there are 90 storage nodes plus 3 management nodes, the total raw storage capacity (no RAID) is about 3PB (12 x 3TB = 36TB x 90 = 3.24PB). Instead of using RAID or mirroring, each objects data is replicated or copied to three (e.g. N=3) different OSDs (on separate nodes), where N is adjustable for a given level of data protection, for a usable storage capacity of about 1PB.
Note that for more usable capacity and lower availability, N could be set lower, or a larger value of N would give more durability or data protection at higher storage capacity overhead cost. In addition to using JBOD configurations with replication, Ceph can also be configured with a combination of RAID and replication providing more flexibility for larger environments to balance performance, availability, capacity and economics.
One of the benefits of Ceph is the flexibility to configure it how you want or need for different applications. This can be in a cost-effective hardware light configuration using JBOD or internal HDDs in small form factor generally available servers, or high density servers and storage enclosures with optional RAID adapters along with SSD. This flexibility is different from some cloud and object storage systems or software tools which take a stance of not using or avoiding RAID vs. providing options and flexibility to configure and use the technology how you see fit.
Here are some links to presentations from Ceph Day:
While at Ceph day, I was able to spend a few minutes with Sage Weil Ceph creator and founder of inktank.com to record a pod cast (listen here) about what Ceph is, where and when to use it, along with other related topics. Also while at the event I had a chance to sit down with Curtis (aka Mr. Backup) Preston where we did a simulcast video and pod cast. The simulcast involved Curtis recording this video with me as a guest discussing Ceph, cloud and object storage, backup, data protection and related themes while I recorded this pod cast.
One of the interesting things I heard, or actually did not hear while at the Ceph Day event that I tend to hear at related conferences such as SNW is a focus on where and how to use, configure and deploy Ceph along with various configuration options, replication or copy modes as opposed to going off on erasure codes or other tangents. In other words, instead of focusing on the data protection protocol and algorithms, or what is wrong with the competition or other architectures, the Ceph Day focused was removing cloud and object storage objections and enablement.
Thanks again to Sage Weil for taking time out of his busy schedule to record a pod cast talking about Ceph, as well 42on.com and inktank for hosting, and the invitation to attend the first Ceph Day in Amsterdam.
Ok, nuff said.
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO All Rights Reserved
Cloud Expo Breaking News
Top Stories for Cloud Expo 2012 East
In this Big Data Power Panel at the 10th International Cloud Expo, moderated by Cloud Expo Conference Chair Jeremy Geelan, Govind Rangasamy, Director of Product Management at Eucalyptus Systems; Kevin Brown; CEO of Coraid, Inc.; Christos Tryfonas, CTO and Co-Founder of Cetas; and Max Riggsbee, CMO and VP of Products for WhipTail, discussed such topics as: Big Data has existed since the early days of computing; why, then, do you think there is such an industry buzz around it right now? How is Big Data impacting storage and networking architecture in data centers? How about the intersection of Big Data Analytics and Cloud Computing - how big a sector is that and why? What's the difference between Big Data and Fast Data? ... (more)
Best Recent Articles on Cloud Computing & Big Data Topics
The Arlington, Virginia-based National Science Foundation has just released its "Report on Support for Cloud Computing" - in response to the America Competes Reauthorization Act of 2010, Section 524. It is an absolute must-read for all concerned with current and future research projects in Cloud Computing.
"The volume of data we're generating now from machines pales in comparison to the volume of data we'll soon generate from our own bodies," says data security expert Dave Asprey. Writing in a Trend Micro blog, Asprey - who is one of the leaders in the emerging Quantified Self movement - explains his vision of a world in which personal biometrical data is shared via the cloud.
Cloud computing has caught the attention of business leaders around the world in every industry because of its enormous transformative potential. Visionary companies know that the value of the cloud is far greater than the current focus solely on technology and operating costs: when combined with a collaborative approach to designing processes, cloud computing will change how we do business.
Want to make sense of the hottest new concept in Enterprise IT? Want to understand in just hours what experts have spent many hundreds of days deciphering? Cloud computing is a technology that has rapidly evolving peppered with a lot of hype along the way. Customers find it hard to navigate through this and make sense of what aspects of this technology will give them real business benefit. Cloud Computing Bootcamp, led by our 2013 Bootcamp Instructor Larry Carvalho, is a great way to get a practical understanding of this technology. We offer multiple days of actionable insight into what vendor offerings are currently available and help you comprehend their strategy. The ever-popular Bootcamp, which is now held regularly around the world, is being held in conjunction with the 12th Cloud Expo, June 10-13, 2013, at the Javits Center, New York, NY.
Did you know that ninety percent of the data in the world has been created in the last two years? Every day, we create 2.5 quintillion (or 2.518) bytes of data, according to IBM. As corporations across all industries globally are struggling with how to retain, aggregate and analyze this mounting volume of what the industry refers to as Big Data, it also provides a unique opportunity for innovative startups that recognize the business prospects Big Data presents. Big Data is not just unlocking new information but new sources of economic and business value. Interactivity is driving Big Data, with people and machines both consuming and creating it. Digital companies focused on becoming good at aggregating and analyzing the data created by the end users of their product, who then provide their customers with solid insights taken from that data are at a distinct competitive advantage over others in the marketplace.
Industry-specific clouds are those PaaS, IaaS, and PaaS services that are tailored for a specific vertical, such as transportation, retail, finance, and health care. IDC sees a $65 billion market in these industry solutions for 2013, rising to $100 billion in 2016. The value of industry-specific clouds is that businesses within a vertical can connect to applications, processes, and databases that are pre-defined for that vertical within a public or private cloud. They can extend processes and databases into the business domain, versus defining the data and processes within a generic cloud-based platform. So, are industry specific clouds right for your business? What options are out there? How do you figure out the ROI?
SYS-CON Events announced today that Rackspace Hosting, the open cloud company, has been named "Platinum Plus Sponsor" of SYS-CON's 12th International Cloud Expo, which will take place on June 10-13, 2013, at the Javits Center in New York City, New York. Rackspace® Hosting (NYSE: RAX) is the open cloud company, delivering open technologies and powering more than 205,000 customers worldwide. Rackspace provides its renowned Fanatical Support® across a broad portfolio of IT products, including Public Cloud, Private Cloud, Hybrid Hosting and Dedicated Hosting. Rackspace has been recognized by Bloomberg BusinessWeek as a Top 100 Performing Technology Company, is featured on Fortune's list of 100 Best Companies to Work For and is included on the Dow Jones Sustainability Index. Rackspace was positioned in the Leaders Quadrant by Gartner Inc. in the "2011 Magic Quadrant for Managed Hosting." Rackspace is headquartered in San Antonio with offices and data centers around the world.
10th International Cloud Expo, held on June 11-14, 2012 at the Javits Center in New York City, featured four content-packed days with a rich array of sessions about the business and technical value of cloud computing led by exceptional speakers from every sector of the cloud computing ecosystem. The Cloud Expo series is the fastest-growing Enterprise IT event in the past 10 years, devoted to every aspect of delivering massively scalable enterprise IT as a service. We invite you to enjoy our photo album of the show - we'll be adding new images all week.
Ulitzer.com announced "the World's 30 most influential Cloud bloggers," who collectively generated more than 24 million Ulitzer page views. Ulitzer's annual "most influential Cloud bloggers" list was announced at Cloud Expo, which drew more delegates than all other Cloud-related events put together worldwide. "The world's 50 most influential Cloud bloggers 2010" list will be announced at the Cloud Expo 2010 East, which will take place April 19-21, 2010, at the Jacob Javitz Convention Center, in New York City, with more than 5,000 expected to attend.
Cloud computing is becoming one of the next industry buzz words. It joins the ranks of terms including: grid computing, utility computing, virtualization, clustering, etc. Cloud computing overlaps some of the concepts of distributed, grid and utility computing, however it does have its own meaning if contextually used correctly. The conceptual overlap is partly due to technology changes, usages and implementations over the years. Trends in usage of the terms from Google searches shows Cloud Computing is a relatively new term introduced in the past year. There has also been a decline in general interest of Grid, Utility and Distributed computing. Likely they will be around in usage for quit a while to come. But Cloud computing has become the new buzz word driven largely by marketing and service offerings from big corporate players like Google, IBM and Amazon.
SYS-CON Events announced today that Dell Inc. has been named "Silver Sponsor" of SYS-CON's 12th International Cloud Expo, which will take place on June 10-13, 2013, at the Javits Center in New York City, New York. For more than 28 years, Dell has empowered countries, communities, customers and people everywhere to use technology to realize their dreams. Customers trust Dell to deliver technology solutions that help them do and achieve more, whether they're at home, work, school or anywhere in their world. Learn more about Dell's story, purpose and people behind its customer-centric approach.
One of the most compelling promises of the cloud is that you can pull out a credit card and be working in minutes. No purchase orders to fill out, no equipment to wait for on the loading dock. Just instant access to the resources you need, when you need them. But accessibility comes at a price, and an unintentional consequence may be that you create yet another orphaned identity silo. Enterprise IT has spent years consolidating its mishmash of directories, only to discover that cloud now threatens to turn back their hard-won victories. In his session at the 12th International Cloud Expo, Scott Morrison, CTO and Chief Architect at Layer 7 Technologies, will look at strategies to incorporate identity into cloud applications. Enterprise identity or social login can both be a part of your go-to-cloud strategy, but you must plan for this upfront, rather than try to retrofit identity and access control at a later date.
Cloud Expo, Cloud Expo East, Cloud Expo West, Cloud Expo Silicon Valley, Cloud Expo Europe, Cloud Expo Tokyo, Cloud Expo Prague, Cloud Expo Hong Kong, Cloud Expo Sao Paolo are trademarks and /or registered trademarks (USPTO serial number 85009040) of Cloud Expo, Inc.
The World's Most Influential Blogs