With the recent failure of RBS banking systems, there is a question on the lips of many security professionals, not to mention the more aware members of the public: “Are today's banking systems, and the management applied to them, fit for purpose?”

This article was prompted by the June 2015 debacle at RBS, which saw the meltdown of supporting computers – this, notwithstanding a past pledge by the bank to invest £150m a year to deliver resilience into its operational estate. Yet again, the user base suffered the impact, in the shape of some 600,000 failed transactions, which imposed real hardship on people who do not enjoy the backfill of amassed annual bonuses to keep them afloat. Here, the pain of impact was once more felt at the very bottom of the economic ladder, with those awaiting tax credits and economic support in the guise of disability allowances taking the big hit. On this occasion, Mr. McNamara, the RBS IT Chief, went on record in an attempt to defuse the situation, saying: “It is not feasible to run 100% faultless systems,” adding that “Technology will on occasion fail.”

When it comes to what are termed ‘occasional failures’, a historical comparator lies in the recent past – the glitch which occurred at RBS in 2012, when 6.5 million customers were locked out of their accounts for days. That episode resulted in the bank suffering a fine of £56m, with the regulators observing that the bank had failed to keep its systems up to date, despite spending £1bn a year on system refresh. So, based on the aforementioned comments of McNamara, I am hopeful that the anticipated frequency of failure is not being assessed at an acceptable cycle of approximately 2.5 years between big show-stopper outages.

Another big-time critical systems failure that impacted the public came on 20 October 2014, when the Bank of England (BoE) suffered an outage of RTGS (Real-Time Gross Settlement) lasting approximately nine hours. Following restoration of the service later the same day, the Governor of the Bank of England, Mark Carney, apologised for any problems caused by the outage and announced an independent review. This downtime, which the BoE acknowledged had delayed hundreds of thousands of payments – including those of homebuyers waiting for money to be transferred to pay for their new homes – had an enormous real-world, real-time impact on people.

However, the real concern here would seem to be a lack of process in the area of Change Management – so maybe the BoE should have followed the well-tried-and-tested ISO/IEC 27001 controls at A.10.1.2, A.10.1.4 and A.12.5.2, to name just three examples which may help them in the future when sticking their logical spanners into the innards of a system upon which so many depend – and let us not forget PCI DSS. (A minimal sketch of what such a change-control gate might look like follows the legal comment below.) And to bring a real-world opinion out of qualified legal services comment to outline the facts, here are the words of Holly Smith, a practising Solicitor with Buckles LLP:
“The system downtime of CHAPS resulted in failures to meet contractual deadlines. Most contracts contain provisions for interest and penalties on late payment, and parties had to rely on the goodwill of their counterparties not to enforce such penalties. A survey conducted by the Law Society indicated that 30 percent of residential transactions were unable to complete until the next day or later – this system outage had a very real impact on businesses and individuals alike.”
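To make the change-management point a little more concrete, the controls cited above reduce, at their simplest, to a gate that no change should pass on its way into production without approval, testing and a way back. Below is a minimal, hypothetical sketch in Python – the ChangeRequest fields and checks are my own illustrative assumptions, not any bank's actual process nor the wording of the ISO/IEC 27001 controls themselves:

```python
from dataclasses import dataclass

@dataclass
class ChangeRequest:
    """Illustrative change record -- fields are assumptions, not a real bank's schema."""
    summary: str
    approved_by: str = ""            # independent approval (cf. A.10.1.2, change management)
    tested_in_staging: bool = False  # exercised outside production (cf. A.10.1.4)
    review_planned: bool = False     # post-change technical review (cf. A.12.5.2)
    rollback_plan: str = ""          # how to back the change out if it goes wrong

def change_gate(cr: ChangeRequest) -> list[str]:
    """Return the reasons a change may NOT proceed; an empty list means it may go ahead."""
    blockers = []
    if not cr.approved_by:
        blockers.append("no independent approval recorded")
    if not cr.tested_in_staging:
        blockers.append("not exercised in a separate test environment")
    if not cr.review_planned:
        blockers.append("no post-change technical review scheduled")
    if not cr.rollback_plan:
        blockers.append("no rollback plan")
    return blockers

if __name__ == "__main__":
    cr = ChangeRequest(summary="Patch RTGS settlement module")  # hypothetical change
    for reason in change_gate(cr):
        print(f"BLOCKED: {reason}")
```

None of this is sophisticated – which is rather the point: the discipline is cheap relative to a nine-hour outage of a national settlement system.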
Of course, there have been many other failures of such financial services critical systems, but these examples cover what is, in my opinion, a very condensed period in which such an implied acceptability of outages occurred, so I think the point has been driven home. Against the backdrop of the above public examples, we may start to appreciate a picture of what may be considered tolerable by the banking institutions, but which may also be considered alien to the end-user public and small businesses.

However, let us be very clear here: there are many more areas of hidden concern that sit out of the public gaze, within the private confines of the covert corporate world, where one may notice other unacceptable practices which can verge on the criminal, the fraudulent and, of course, the non-compliant. For instance, if you were to learn that at least two ‘High Street Banks’ have suffered from ghost transactions manifesting as untraceable, unauthorised transfers, with an unrecovered £50m lost from their internal systems, would you believe it could really happen? Well, it did! And what about the bank which had a policy of not using cloud services, but then discovered that an entire PCI DSS-related data set had been regularly hosted en route at an ISP sitting just off the M1, mixed in with other data sets? Or maybe we should think about the financial institution which hosted a wide-open, insecure Samba share at the very heart of its operations, through which – you guessed it – in-clear client data and PCI DSS assets transited every single day. But then, please forgive me, as I digress; these are only security issues, so a little off focus in our current context, other than that such practices tend to imply a lack of control, governance and compliance – what may be considered the application of Cowboy Methodologies to look after their clients’ (your) interests.

But let us get back on track to the main topic, and that is the anticipation of assurance around the resilience to recover after a system failure or outage. It is in this area where I feel it is not just a matter of corporate negligence or bad practice in play, but more a case of the fraudulent selling of services which did not exist – what does amount to criminal acts by a well-known agency within the service portion of the financial sector. In this case, a ‘High Street Bank’ had taken steps to assure that, for a particular data set relating to credit references, a Midlands-based financial services global brand was contracted to provision a backup service, at a cost of around £1m+ per annum, out of its East Midlands-based data centre. The problem, however, was that this service, which had been sold under contract, never actually existed; when the Director of Operations was asked what he would do were this service ever called into play, he responded, “We will have to go looking to lash something up.” In the meantime, he added, “We are £1m+ better off year-on-year, so let’s just go with it and hope!” The obvious countermeasure – regularly exercising the contracted recovery path – is sketched below.
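The only reliable answer to a phantom recovery service is to exercise the contracted restore path on a schedule and compare what comes back against what went in. The sketch below assumes a hypothetical vendor-restore command line and file paths – deliberately so, since the exercise is precisely about proving that such tooling exists at all:

```python
import hashlib
import subprocess
from pathlib import Path

def sha256(path: Path) -> str:
    """Checksum a file so a restored copy can be compared byte-for-byte with the original."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_restore(source: Path, restore_cmd: list[str], restored: Path) -> bool:
    """Run the contracted restore procedure and confirm the data actually comes back."""
    subprocess.run(restore_cmd, check=True)  # fails loudly if the 'service' does not exist
    return sha256(source) == sha256(restored)

if __name__ == "__main__":
    ok = verify_restore(
        source=Path("/data/credit_refs.db"),        # hypothetical protected data set
        restore_cmd=["vendor-restore", "--latest",  # hypothetical provider tooling
                     "--to", "/restore/credit_refs.db"],
        restored=Path("/restore/credit_refs.db"),
    )
    print("restore verified" if ok else "RESTORE FAILED -- escalate")
```

Had anything like this run even quarterly, the £1m-a-year gap between contract and reality would have surfaced on day one, rather than in a director’s candid aside.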
But it is not just about the systems themselves; it is also about the integration, or the lack of it. Here, I am considering the many call centres where operatives suffer the daily pain of jumping between old-style VT terminals, various mainframe screens and Wintel applications in pursuit of provisioning a joined-up service to the calling client – not to mention the reliance on legacy systems which still run Windows NT 4.0 SP6a to service critical operations. And let’s leave XP to one side, as it may all just get far too depressing.

The ultimate conclusion here must surely be that these are critical systems. They support the everyday lives of individuals, right up to big businesses and the economy, and by implication ‘we’ should expect and demand more when it comes to just how often they are allowed to fail. Consider, on a risk-assessment basis, the multiple mechanics of playgrounds such as Alton Towers, mixed in with the physical attributes of the intermixed public: there is a natural expectation that incidents will occur on a regular basis, yet they do not! Notwithstanding the devastating, press-grabbing implications that such events naturally attract when they manifest into reality, they are rare when you look at the operational cycles of failure. Maybe that is why one may hear the unsympathetic comment made after a catastrophic failure of a banking system: “At the end of the day, nobody died.” I am left wondering – if they did, would it serve as a game changer to drive a change of mindset? In the meantime, we may expect more of the same: more prospective failures, and the rewarding of those incumbents who are, have been, or will be overseeing the same old in future years.

Editor’s Note: The opinions expressed in this guest author article are solely those of the contributor, and do not necessarily reflect those of Tripwire, Inc.