In 1996, I joined the CLARiiON team to work on a new Storage Resource Management product. It was a leap in management software to go along with the leap forward from SCSI to Fibre Channel. We looked at everything that was “wrong” about the existing solutions, took into account new requirements based on the scalability of the new hardware, updated our products to use leading-edge technologies, and created something entirely new – Navisphere. It was a huge splash for CLARiiON, and helped define everything I think of as successful in a software project.
Fifteen years later, I’m writing about a new big splash for EMC in the SRM space – ProSphere 1.0. I’ll stop you right here and tell you that you need to go read Chuck’s post on the product. I can’t out-do his deep-dive into the industry angles and why it’s such a big deal, so I won’t even try.
What I will tell you is why working on this product was so different from any other product I’ve touched at EMC, and why I’m so proud to be able to announce it here. Just like fifteen years ago, it was a chance to look at everything “wrong” while also looking in new directions, just as the industry is making another leap in scale with cloud environments. This has been some of the hardest work I’ve done here at EMC. But seeing it get out the door is making it all worth it.
EMC ControlCenter has a decade-plus roller-coaster history. It’s been through some challenging times, on many fronts, but we’ve never stopped trying to improve it for our customers. We often talked about things we’d like to do differently, given the chance. By the middle of 2008, a vision for the future was firming up – and we had a name for it: “SRM 7,” reflecting our goal to avoid simply “growing” ControlCenter into a new major version after 6.x.
We handed out T-shirts with a stylized “7” inside a diamond shape (which might evoke a certain superhero). The (perhaps dangerous) implication? SRM 7 was going to fly in and save the day. We were all a bit skeptical. But today we’re announcing the first shipping release of that product – ProSphere 1.0.
Why do I think it’s such a big deal?
We didn’t rebuild ControlCenter
Early on, we faced a critical decision – do we rebuild ControlCenter piece by piece, or do we build a new solution from the ground up? We knew part of the issue with ControlCenter was feature creep. We wanted to focus on critical customer use cases and build the application that could do those, and resist the temptation to build a giant unwieldy Swiss Army Knife. That philosophy bled into everything. We didn’t build a giant infrastructure that could address all our needs; we architected an extensible solution and implemented enough of it to get us through the use cases we were attacking. We avoided writing “just in case” code to support possible future features. We didn’t build what we could reuse. We knew we couldn’t ever finish this in time if we wrote it all from scratch, so we pulled in proven components from other shipping EMC products and integrated them.
Beyond that, we didn’t want to be the single best interface for every deep use case – we knew you wanted to use the right tools for those jobs. So we made a plan: find those tools, link to them, launch them in context, and make your sign-in transparent. In the process, we eliminated thousands of lines of code that would otherwise need to be tested, debugged, upgraded, and so on. It’s a win on every front.
You’re never going to have a “lean” piece of software to do the giant job of managing the storage for your entire enterprise. But ProSphere is downright svelte compared to ControlCenter, and we intend to keep it that way.
It was OK to look outside
Another shift we made in ProSphere was to look outside our traditional sources of software components. We didn’t want to write, maintain, and test unneeded software. We didn’t want to architect and design unneeded components. We pulled in open source software, we used open standards, and we got creative. It was a learning curve for the development team, but in the end we have a product that communicates using known web technologies and patterns, and which we hope will serve as the foundation for an open, extensible management solution.
This extends deep into the product’s DNA; it’s not just a superficial claim about what our APIs look like. Eliminating traditional agents, using HTTP to communicate between our machines, going to a virtual appliance model, using industry-standard system monitoring … all these things make the system higher quality and more extensible, with less code to carry.
We got Agile
One of our early decisions was to abandon the waterfall software method we had more or less down to a science in ControlCenter, and replace it with an enterprise-scale Agile development approach. In my personal opinion, two excellent things came out of that decision: we “forced” our development managers to take ownership of use cases within the product, and we created cross-functional (development, quality engineering, documentation, product management, user experience design) teams to tackle those use cases in close collaboration with each other. In my fifteen years of software development, this is the closest cooperation I’ve seen between functions on a project team of this scale.
We built a safety net
Last, but most certainly not least, this product has the most aggressive safety net I’ve encountered. Every day, thousands of automated tests run against the system in various forms – unit tests against the code, integration tests against the REST interfaces, automated UI tests against the finished product in a test environment (deployed automatically after every finished build), and even more automated UI tests against the finished product in a “real” environment. Individual developers had access to a simple web-based tool to deploy a build, patch it, and run a suite of hundreds of automated regressions against it prior to delivery – all from their desks, without ever seeing an installation screen.
We combined this with static analysis tools constantly analyzing the source trees to check for bugs, security lapses, stylistic violations, and undesirable complexity, with web-based dashboards for everyone to see. Finally, there’s a dedicated team doing manual run-throughs of customer use cases in a variety of real-world environments. You take all that and combine it with an organizational mandate not to tolerate technical debt, not to tolerate little bugs accumulating in the system, and you’ve got a team that understands what quality means and has an extremely low rate of regressions.
Part of what took us so long was building all that scaffolding, and changing the culture of our development and test organizations to get us here. And now that it’s built (not that it’s ever “done!”), it will pay for itself for years to come.
I’ve been involved with this product for about three years now … the same amount of time I’ve been a father. And like any Proud Papa, I can’t wait to show off.
We built a solid foundation – a highly deployable, serviceable, and usable application – and concentrated our efforts on a small, tight family of use cases using that foundation. We’re already working on what’s next. I think we’ve changed the game here. I’m proud to be a part of that. I was just one manager, with one small team, working on little bits and pieces of this giant project. I’m grateful to my team for consistently seeing the big picture and working nonstop to get us there. I can’t begin to explain how proud I am of the work they did.
It hasn’t been easy; a lot of blood, sweat, and tears have flowed in the hallways. But nothing worth achieving is easy, and there are a lot of smiles today as we finally take the wraps off this.
ProSphere 1.0 is just the first step. It’s your turn now. Check it out. Tell us what we’re doing right, what we’re doing wrong. Help us take this to the next level. I promise I’ll listen and do my part to make sure your voice is heard.