William Scherlis is a professor of computer science at Carnegie Mellon University and director of the Institute for Software Research there. He specialises in software assurance, software evolution and technology to support software teams. He has a long association with NASA and the US Department of Defense. Scherlis spoke with Gary Anthes about progress in software development.
The performance of hardware - be it microprocessors, storage systems or networks - has increased exponentially over the years. Why has progress in software been so slow?
Sometimes people think we are at a plateau with software, but I'd like to refute that. Software is making enormous strides, and on a pretty steady basis.
It's been 50 years since John Backus, the inventor of Fortran, wrote his seminal paper on "automatic programming" to describe the translation from Fortran to machine code. Why "automatic programming"? Because at the time, Fortran seemed so highly descriptive and problem-oriented. Now the old Fortran seems very low-level and mechanistic. This same cycle has happened in the world of information systems and databases - remember the "automatic programming" promise of 4GLs [fourth-generation programming languages]?
Fortran and 4GLs were enormous advancements, but with these kinds of advances, our ambitions correspondingly increase. So the commoditisation and standardisation never completely take over - the market drives us to create new value, and so we're improving tools, languages and processes at the top end just as quickly as we "routinise" and automate at the low end. The magic of software is that, because there are no limits of physics, we can keep advancing the technology to meet our ambitions. I call this the "endless value spiral."
What are some more recent advances?
Object-oriented programming was a similar leap forward, and that is manifest in C++ and Java and C#. Object- oriented programming has allowed us to do things that previously we couldn't do, and one of the most important is building software frameworks - application servers, e-commerce frameworks like J2EE and .Net, and the ERP frameworks like SAP's NetWeaver. More recently, there has been a parallel development in an area called functional programming and a language at Microsoft called F#. They have built that into their .Net Framework.
There are also giant-scale programming models for high performance. Google and Yahoo both use the "MapReduce" model to handle parallel processing across multiterabyte data sets, rather than SQL, for example.
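The MapReduce model can be sketched in miniature as a single-process word count: a map function emits key/value pairs, the framework groups them by key (the "shuffle"), and a reduce function folds each group. This is a toy sketch of the programming model, not Google's implementation - a real system distributes these phases across many machines:

```python
from collections import defaultdict

def map_phase(doc):
    # Emit (key, value) pairs for one input record.
    for word in doc.split():
        yield (word.lower(), 1)

def reduce_phase(key, counts):
    # Fold all values that were grouped under one key.
    return (key, sum(counts))

def map_reduce(docs):
    groups = defaultdict(list)
    for doc in docs:                       # map + shuffle
        for key, value in map_phase(doc):
            groups[key].append(value)
    return dict(reduce_phase(k, v) for k, v in groups.items())

docs = ["the quick brown fox", "the lazy dog", "the fox"]
print(map_reduce(docs))
```

The appeal for multiterabyte data sets is that map and reduce invocations are independent, so the framework can scatter them across thousands of nodes without the programmer writing any coordination code.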
What's so important about those things for the future?
The framework technologies allow the very clean separation of infrastructure provisioning from infrastructure usage. To a bank CIO, that means it's possible to outsource the infrastructure and in-source exactly that software development that differentiates the bank from some other bank. In the bad old days, you had to roll your own for the infrastructure as well.
A software framework like .Net or J2EE provides a rich array of services but also binds you to a particular architecture and control flow. If you are going to build a website to sell T-shirts, you can wholly appropriate large chunks of infrastructure and simply fill in the blanks to sell T-shirts. The work you need to do is very much lower than it used to be.
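That "fill in the blanks" division of labour can be sketched as inversion of control: the framework owns control flow (routing, dispatch, error responses), and the application supplies only the handlers that differentiate it. All names below are invented for illustration, not any real framework's API:

```python
# A minimal, hypothetical mini-framework. The framework decides when
# code runs; the application only fills in the blanks it registers.
class WebFramework:
    def __init__(self):
        self.routes = {}

    def route(self, path):
        # Decorator used by the application to fill in a "blank".
        def register(handler):
            self.routes[path] = handler
            return handler
        return register

    def handle(self, path):
        # Framework-owned control flow: lookup, dispatch, error case.
        handler = self.routes.get(path)
        if handler is None:
            return "404 Not Found"
        return "200 OK: " + handler()

app = WebFramework()

@app.route("/tshirts")
def list_tshirts():
    # The only application-specific code: selling T-shirts.
    return "blue tee, red tee"

print(app.handle("/tshirts"))
print(app.handle("/mugs"))
```

The T-shirt seller writes one small handler; everything else - and this is the point about appropriating large chunks of infrastructure - comes from the framework.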
Are service-oriented architectures part of this future?
SOA, construed in the broadest possible way, is really a vision of a future whereby we rethink the way we compose various services into overall capabilities. The vision is to "regularise" the taking of pieces and putting them together. The SOA idea is to construct a set of protocols - ways to transact - that support the very rich, flexible frameworks model. That flexibility is important because using these frameworks and ERP models is a little like having a spinal cord transplant. It's an enormous commitment. So SOA is appealing because it gives you a sense that you can plug and play. You can swap [application code] in and out relatively easily.
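The plug-and-play idea can be sketched as clients depending only on a service contract, so implementations swap in and out without touching client code. The payment-service names here are illustrative, not part of any real SOA stack:

```python
from typing import Protocol

class PaymentService(Protocol):
    # The contract - the only thing client code may depend on.
    def charge(self, account: str, amount: float) -> str: ...

class InHousePayments:
    def charge(self, account, amount):
        return f"in-house charged {account} {amount:.2f}"

class OutsourcedPayments:
    def charge(self, account, amount):
        return f"vendor charged {account} {amount:.2f}"

def checkout(svc: PaymentService, account: str, amount: float) -> str:
    # The client sees only the contract, never the implementation.
    return svc.charge(account, amount)

print(checkout(InHousePayments(), "acct-1", 19.99))
print(checkout(OutsourcedPayments(), "acct-1", 19.99))
```

Swapping providers means constructing a different implementation; `checkout` and everything above it are untouched, which is the "swap [application code] in and out" property in miniature.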
So with all this software reuse, will our bank CIO still need a programming staff?
He'll always need a programming staff. That's one of the lessons of Backus' automatic programming. The moment you automate, you need to move on to the next level of value, because you are competing with your peers.
But the CIO will need a different kind of programmer, one who can configure all the pieces and then write that little bit of code that does the special thing that differentiates the bank. The idea of the guy sitting alone in a cubicle writing code on a clean sheet of paper won't happen anymore. The model will be somebody who is broadly aware of a constant flow of new capabilities, assessing them, assessing risk and then putting everything together.
Do we need still more programming languages?
We are constantly getting surprised by languages. Before Java showed up, everybody thought we were done. The community resists change because it seems to be very costly and risky to move to a new language. But, yes, I think we will continue to make leaps forward in languages, and the leaps will have to do with how easy it is to express powerful thoughts as programmers. Some of the leaps will come from the language itself, but others have to do with the tools and models that surround the language.
What are some examples of these tools and models?
Microsoft, for example, is at the forefront of something called deep program analysis, and they are deploying it internally in a very aggressive way. These are tools that analyse your code and tell you important things about it. In security, for example, a tool will say, "Here's a potential buffer overflow," or, "Here's where you are not checking for a null reference." These code analysis tools and models are often slowly brought into the language itself.
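A toy version of such a checker can be sketched with Python's `ast` module: a flow-insensitive pass that flags attribute access on any variable that was assigned `None`, so it over-approximates (it may warn even where a later assignment makes the access safe). Real analysis tools are vastly more precise; the `SOURCE` program is invented for illustration:

```python
import ast

SOURCE = """
def handler(req):
    user = None
    if req:
        user = req
    print(user.name)
"""

class NullDerefChecker(ast.NodeVisitor):
    def __init__(self):
        self.maybe_none = set()   # names assigned None anywhere
        self.warnings = []

    def visit_Assign(self, node):
        # Record variables that are ever assigned the constant None.
        if isinstance(node.value, ast.Constant) and node.value.value is None:
            for target in node.targets:
                if isinstance(target, ast.Name):
                    self.maybe_none.add(target.id)
        self.generic_visit(node)

    def visit_Attribute(self, node):
        # Warn on attribute access through a possibly-None name.
        if isinstance(node.value, ast.Name) and node.value.id in self.maybe_none:
            self.warnings.append(
                f"line {node.lineno}: '{node.value.id}' may be None here")
        self.generic_visit(node)

checker = NullDerefChecker()
checker.visit(ast.parse(SOURCE))
for warning in checker.warnings:
    print(warning)
```

Here the warning is genuine: if `req` is falsy, `user` is still `None` at the `print`. The point of the sketch is that the tool reports this from the code's structure alone, before any test is run.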
Are software developers ready to exploit the concurrent processing possible in multicore chips?
I think it will be a lot of work to get where we need to be, partly because we have a culture of quality that's based on testing and [code] inspection. When you run in a concurrent environment, your tests are not repeatable, because you have intermittent problems. A problem might occur with a probability of one in a million, so if you run a test case, you have a very small chance of catching it. But if it runs every millisecond on every blade in your datacentre, the error might occur every few minutes.
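The arithmetic behind that observation can be made concrete with illustrative numbers - a one-in-a-million defect, one run per millisecond on each of ten blades (the rates are assumptions for the sketch, not figures from the interview):

```python
# Expected time between failures for a rare intermittent defect
# once the code runs continuously at datacentre rates.
p_failure = 1e-6                 # probability of hitting the race per run
runs_per_sec_per_blade = 1000    # one run per millisecond
blades = 10

runs_per_sec = runs_per_sec_per_blade * blades
mean_secs_between_failures = 1 / (p_failure * runs_per_sec)
print(f"expected failure every {mean_secs_between_failures / 60:.1f} minutes")
```

A defect that a test suite of thousands of runs would almost never observe surfaces here roughly every couple of minutes, which is why testing alone cannot establish confidence in concurrent code.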
A technology that I have been working on for some years is called analysis-based verification. Instead of running some code repeatedly, hoping you'll catch an error, you do a mathematical analysis that can tell you something about the full universe of all possible runs. That's been a dream for many years, and the challenge has been to make it scalable to big, messy programs and also to be usable by working developers. We have written software to do this and have tested it on commercial software at scale. It can say, "Here in your concurrent code is where a deadlock could occur," for example.
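One classic instance of this kind of static reasoning can be sketched as lock-order analysis: model "lock B acquired while holding lock A" as a directed edge, and report any cycle as a potential deadlock across all possible runs - without executing the program. The lock names and orderings below are invented for illustration, and this is only a sketch of the idea, not the tool Scherlis describes:

```python
def find_cycle(edges):
    # Depth-first search for a cycle in a directed lock-order graph.
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visiting, done = set(), set()

    def dfs(node, path):
        if node in done:
            return None
        if node in visiting:
            return path[path.index(node):]   # cycle found
        visiting.add(node)
        for nxt in graph.get(node, ()):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        visiting.discard(node)
        done.add(node)
        return None

    for start in list(graph):
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

# Hypothetical lock orderings extracted from two code paths:
edges = [("accounts", "audit_log"),   # path 1: holds accounts, takes log
         ("audit_log", "accounts")]   # path 2: holds log, takes accounts
print("potential deadlock:", find_cycle(edges))
```

Because the analysis covers the graph rather than individual runs, it reports the hazard even though no single test execution may ever interleave the two paths badly - exactly the property that testing lacks.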
Are there other important trends under way in software?
The whole outsourcing and globalisation phenomenon is widely misunderstood. People think it's all about cost, and cost certainly was an initial driver. But it's turned into a story about agility, about access to expertise, and about the technological and business enablers.
It's also about organisational enablers. For example, there's Conway's Law, which says that a piece of software will reflect the organisational structure that produced it. My colleague [Carnegie Mellon computer science professor] Jim Herbsleb has taken that bit of folklore and turned it into a science of how you can align your software architecture with your organisational structure to minimise coordination across organisational interfaces. That's an amazingly cool thing.
Perhaps the biggest issue going forward is how to manage enterprise-scale architectural innovation. Across the industry, we've become really good at small team development. But the secret sauce at so many companies is the managed architectural innovation that frames the activity of the many small teams that operate seemingly independently. Amazon is a great example.