More than 100 IT professionals have been working for two years on bringing the 210-year-old Census technologically up-to-date, with the most obvious advance being the creation of the first online version of the survey.
An online survey raises a whole load of security questions for the organisers to address, on top of accessibility and data processing issues.
The convenience of an online survey should make it a popular way to submit answers, yet based on experiences in other countries, the organisers of Census 2011 expect only a quarter of the population to fill in their survey online. The five-yearly censuses of Australia, New Zealand and Canada recorded online response rates of between seven percent (New Zealand) and 18 percent (Canada).
“We are expecting about 25 percent to do it online, but we have allowed a greater capacity than that,” says Ian Cope, Census operations director.
He refuses to say exactly how much more capacity, only that the database can handle “significantly more” than the projected 25 percent, and that if more people fill it in online “that’s great”.
But it is not the capacity that concerns Cope.
“The day-by-day profile is more of an issue. It’s more to do with when people are on, rather than how many. The database can probably cope with 100 percent [of the UK population submitting online],” he explains.
The organisers are monitoring usage of the Census website, which is itself a new development for this year, on a minute-by-minute basis.
They expect to see peak demand on Census Day, 27 March, and have a contingency plan in place to manage the traffic if demand exceeds expectations.
“If the website proves to be popular, it will let people who have already started [filling in surveys] to finish them, and new people would be told to come back later when there is sufficient capacity,” says Cope, quickly adding: “We are not expecting to need to do that – we designed it, but we don’t know that it is going to be necessary.”
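The contingency Cope describes amounts to a simple admission rule: anyone who has already started may carry on, while newcomers are let in only when there is room. A minimal sketch of that rule follows. The class name, the capacity limit and the session identifiers are all hypothetical, and nothing here is taken from the actual Census system.

```python
from dataclasses import dataclass, field


@dataclass
class AdmissionGate:
    """Admit new sessions up to a limit; always let started sessions continue."""

    max_active: int                            # hypothetical capacity limit
    active: set = field(default_factory=set)   # sessions currently in progress

    def request(self, session_id: str) -> str:
        # Respondents who have already started are always allowed to carry on.
        if session_id in self.active:
            return "continue"
        # New respondents are admitted only while there is spare capacity.
        if len(self.active) < self.max_active:
            self.active.add(session_id)
            return "admitted"
        return "come back later"

    def finish(self, session_id: str) -> None:
        # Submitting a questionnaire frees a slot for someone else.
        self.active.discard(session_id)


gate = AdmissionGate(max_active=2)
results = [gate.request(s) for s in ("a", "b", "c", "a")]
gate.finish("b")                     # "b" submits, freeing one slot
results.append(gate.request("c"))    # "c" tries again
print(results)
```

In this toy run the third respondent is turned away while the site is full, and is admitted on a later attempt once one of the first two has finished.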
By now, all of the 26 million households in the UK should have received their paper questionnaire, which also provides the details needed to fill in the survey online – an internet access code made up of 20 alphanumeric characters.
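A code of that shape is easy to check for form before it is looked up. The sketch below assumes, purely for illustration, that spaces and hyphens typed between groups of characters are ignored and that letter case does not matter; the article does not say how the real site treats either.

```python
import re

# 20 alphanumeric characters, as described for the internet access code.
ACCESS_CODE = re.compile(r"[A-Za-z0-9]{20}")


def normalise(raw: str) -> str:
    # Assumption: separators are ignored and case is not significant.
    return re.sub(r"[\s-]", "", raw).upper()


def is_well_formed(raw: str) -> bool:
    # Checks the shape of the code only, not whether it was ever issued.
    return ACCESS_CODE.fullmatch(normalise(raw)) is not None


print(is_well_formed("ABCD-1234-EFGH-5678-IJKL"))
print(is_well_formed("ABCD1234"))
```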
According to tests, a family of four takes an average of 40 minutes to complete the survey online, which Cope says has intelligence built into it, so that if, for example, one of the respondents is a child, it will automatically skip the employment questions.
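This kind of built-in intelligence is usually called skip logic: each question carries a rule that decides whether it applies to a given respondent. A minimal sketch follows, in which the question names and the age threshold are invented for illustration and are not the real Census routing rules.

```python
ADULT_AGE = 16  # assumed threshold, for illustration only

# Each entry pairs a hypothetical question id with a rule that decides
# whether the question applies to a given person.
QUESTIONS = [
    ("name", lambda person: True),
    ("date_of_birth", lambda person: True),
    ("employment_status", lambda person: person["age"] >= ADULT_AGE),
    ("hours_worked", lambda person: person["age"] >= ADULT_AGE),
]


def questions_for(person: dict) -> list:
    # Keep only the questions whose rule accepts this person.
    return [qid for qid, applies in QUESTIONS if applies(person)]


child = questions_for({"age": 8})
adult = questions_for({"age": 35})
print(child)
print(adult)
```

Keeping the rules beside the questions, rather than scattering them through the page code, makes it straightforward to add a question or change a condition later.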
The organisers have also tried to ensure the accessibility of the website, building it to government compliance standards for easy usage. For instance, they tried out screen readers with the Royal National Institute of Blind People (RNIB).
“We did usability testing with a wide range of people. Young, elderly, ethnic, people with visual impairments, and we made a whole lot of changes around that. It’s amazing what people don’t know,” says Cope.
“I observed one [test case] where the button to click [to sign into the survey] was on the right-hand side and people couldn’t see it, because they read from left to right. So we moved it to the top left corner.”
The online census interface was developed by WTG Ltd, which is also responsible for the census online help facility. One of the big changes is that translation booklets are now available to download in more than 50 languages, compared with just 24 previously.
Security for a mammoth project like the census is crucial, and it is for this reason that no cloud computing services, with their “inherent risks”, have been used at all, Cope says.
The census system is standalone and bespoke, with servers located in a data centre in Manchester. Steria is providing systems administration, while Cable and Wireless is responsible for hosting and managing the internet data capture platform, including DDoS (distributed denial-of-service) monitoring.
“We’ve got a lot of firewalls, we are complying with government security standards, such as ISO 27001, and there has been lots of testing around the security applications. It is encrypted with SSL, similar to online banking,” says Cope.
A census rehearsal with 100,000 households also took place in October 2009.
“It’s a bit like the Olympics,” Cope says. “You only get one chance to get it right. You have to test it in development and on real people.”
The organisers worked with Logica for security advice, and although the company did a lot of testing with developers, the organisers commissioned their own attack and penetration testing “just to make doubly sure”.
The result was ‘The Independent Information Assurance Review’, published last month, which Cope believes is a strong endorsement of the Census team’s work around the security of the online questionnaire.
“The review team are satisfied that the Information Assurance (IA) measures put in place for this relatively new aspect of census operations are appropriate and can be expected to be effective,” the report says.
“We are confident that they are capable of delivering their IA objectives and that information will be held in secure environment and that will be handled in line with best practice and government standards. The public can be assured that the information they provide to the 2011 Censuses will be well-protected.”
The security measures extend to the decommissioning of the IT equipment, which Cope says will be done securely to government standards. Also, depending on their condition at the end of the Census, servers will be either destroyed or overwritten.
Technology is also enabling questionnaire tracking, and innovation around recruitment of the field force.
This year, every paper questionnaire features a barcode. The Royal Mail scans these barcodes, visible through a window of the sealed envelope, using flatbed scanners at 24 of its sorting offices, and the data is used to track how many questionnaires have been returned. This automatically updates an address list that the field force uses, so that they only chase households that have not completed the questionnaire.
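In outline, the tracking step is a matter of removing every household whose barcode has been scanned from the list the field force works from. A minimal sketch follows, using made-up barcodes and addresses and a hypothetical record layout.

```python
def outstanding_households(address_list, scanned_barcodes):
    """Return the households whose questionnaire has not yet been scanned."""
    returned = set(scanned_barcodes)  # set membership keeps each lookup cheap
    return [h for h in address_list if h["barcode"] not in returned]


# Made-up data: one barcode per household questionnaire.
address_list = [
    {"barcode": "Q0001", "address": "1 Example Street"},
    {"barcode": "Q0002", "address": "2 Example Street"},
    {"barcode": "Q0003", "address": "3 Example Street"},
]
scanned_at_sorting_office = ["Q0002"]

to_chase = outstanding_households(address_list, scanned_at_sorting_office)
print([h["address"] for h in to_chase])
```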
The forms are then sent to the data capture centre in Manchester, where the government has 10 high-speed scanners which use Optical Character Recognition and Optical Mark Recognition software. These machines can scan each 32-page questionnaire in a quarter of a second, and 600 million sides of A4 are expected to be scanned in total.
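Those figures allow a rough estimate of the pure scanning time involved. The calculation below assumes that each of the 32 pages counts as one side of A4, that all 10 scanners run in parallel at the quoted rate, and that nothing is lost to paper handling, jams or rescans, so it is a lower bound rather than a forecast.

```python
SIDES_TOTAL = 600_000_000        # sides of A4 expected in total
SIDES_PER_QUESTIONNAIRE = 32     # assumption: one page equals one side
SECONDS_PER_QUESTIONNAIRE = 0.25
SCANNERS = 10

questionnaires = SIDES_TOTAL / SIDES_PER_QUESTIONNAIRE
scanner_seconds = questionnaires * SECONDS_PER_QUESTIONNAIRE
hours_per_scanner = scanner_seconds / SCANNERS / 3600

print(f"{questionnaires:,.0f} questionnaires")
print(f"about {hours_per_scanner:.1f} hours of scanning per machine")
```

On those assumptions the total corresponds to 18.75 million questionnaires and roughly 130 hours of continuous scanning per machine.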
Once the data has been captured and given a standard classification number, a process referred to as ‘coding’, the database is passed to the England and Wales census headquarters in Hampshire, where the data undergoes further validation and quality assurance. The processing is expected to be completed in December 2011.
The Office for National Statistics has developed the aggregate ‘downstream processing’ system (that is, the cleansing, quality assessment and anonymisation of the gathered data) in-house, using Oracle databases with SAS routines built in. Statistical analysis is only carried out on the anonymised data.
Finally, technology has also improved the efficiency of the census staff recruitment process, which is being operated by Capita.
A total of 35,000 staff are being recruited for four weeks, and according to Cope, 98 percent of applications for these positions were made online.
There is also an online assessment and e-learning tool, and when the staff have been hired, they will submit their timesheets online.