{\huge Is Computer Security Possible?}

The breaks keep on coming... \vspace{.5in}

Holly Dragoo, Yacin Nadji, Joel Odom, Chris Roberts, and Stone Tillotson are experts in computer security. They were recently featured in the GIT newsletter Cybersecurity Commentary.

Today, Ken and I consider how their comments raise a basic issue about cybersecurity. Simply put:

Is security possible?

In the column, they discuss various security breaks that recently happened to real systems. Here are some excerpts from their reports:

The last is an attempt to make attacks harder by using randomization to move around key pieces of system data. It seems like a good idea, but Dan Boneh and his co-authors Hovav Shacham, Eu-Jin Goh, Matthew Page, Nagendra Modadugu, and Ben Pfaff have shown that it can be broken. We talk about the first item at length, plus a fifth item by Odom on the breaking of a famous hash function.

Security Break-Ins

With all due respect to ``The Beat Goes On,'' the famous song by Sonny Bono and Cherilyn Sarkisian: I have changed it some, but I think it captures the current situation in cybersecurity.

The breaks go on, the breaks go on
Drums keep pounding
A rhythm to the brain
La de da de de, la de da de da

Laptops were once the rage, uh huh
History has turned the page, uh huh
The iPhone's the current thing, uh huh
Android is our newborn king, uh huh

[Chorus]

Insanity?

A definition of insanity ascribed to Albert Einstein goes:

Insanity is doing the same thing over and over again and expecting different results.

Lately I wonder whether we are all insane when it comes to security. Break-ins to systems continue; if anything, they are increasing in frequency. Some of the attacks are so basic that it is incredible they still succeed. One example is an attack on a company whose business is supplying security to its customers. Some of the attacks use methods that have been known for decades.

Ken especially joined me in being shocked about one low-level detail in the recent ``Cloudbleed'' bug. The company affected, Cloudflare, posted an article tracing the breach ultimately to these two lines of code that were auto-generated using a well-known parser-generator called Ragel:

if ( ++p == pe )
    goto _test_eof;

The pointer {\tt p} is in end-user hands, while {\tt pe} is a system pointer marking the end of a buffer. It looks like {\tt p} can only be incremented one memory unit at a time, so that it will eventually compare equal to {\tt pe} and cause control to jump out of the region where the end-user has control over the HTML code being generated. Wrong. Other parts of the code made it possible to enter this test with {\tt p > pe}, which allowed undetected reads past the end of the buffer. Not only did raw memory contents leak out, but those contents could include private information. The bug was avoidable by writing the code-generator to give:

if ( ++p >= pe )
    goto _test_eof;
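To see the failure mode concretely, here is a minimal sketch of our own, not Cloudflare's actual parser. The function name and the {\tt skip} parameter, which stands in for whatever other code advanced the cursor too far, are our inventions for illustration:

#include <stdio.h>

/* Toy version of the generated scanner loop, using indices so the
   sketch itself stays well-defined.  'skip' models a parsing action
   elsewhere that has already advanced the cursor too far. */
static void scan(const char *buf, size_t len, size_t skip, int fixed) {
    size_t p  = skip;    /* cursor after the "other code" has run */
    size_t pe = len;     /* one past the end of the buffer        */

    for (;;) {
        /* The original guard (==) misses a cursor already past pe
           and keeps reading; the fixed guard (>=) catches it. */
        if (fixed ? (++p >= pe) : (++p == pe)) break;
        putchar(buf[p]); /* out-of-bounds reads would leak memory here */
    }
    putchar('\n');
}

int main(void) {
    const char buf[] = "hello";
    scan(buf, 5, 0, 0);  /* well-behaved input: prints "ello"        */
    scan(buf, 5, 7, 1);  /* cursor already past pe: >= stops at once */
    /* scan(buf, 5, 7, 0) would march through unrelated memory. */
    return 0;
}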

But we have a more basic question:

Why are such low-level bits of 1960s-vintage code carrying such high-level responsibility for security?

There are oodles of such lines in deployed applications. They are not even up to the level of the standard C++ library, which gives only {\tt ==} and {\tt !=} tests for basic iterators but at least requires that the iterator stay within the bounds of the data structure or sit at the {\tt end}. Sophisticated analyzers help to find many bugs, but can they keep pace with the sheer volume of code?
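For contrast, here is a minimal sketch (ours, purely for illustration) of the standard C++ idiom. The {\tt !=} test is the same style of equality guard, but the loop only ever advances one step from a valid position, so the library's contract keeps the iterator inside the range:

#include <cstdio>
#include <vector>

int main() {
    std::vector<char> buf = {'h', 'e', 'l', 'l', 'o'};
    // The idiom advances one step at a time from begin(), so the
    // iterator is always within [begin(), end()].  Stepping past
    // end() is outside the contract, and debug/checked iterator
    // builds can flag it.
    for (auto it = buf.begin(); it != buf.end(); ++it)
        std::putchar(*it);
    std::putchar('\n');
    return 0;
}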

Note: this code was auto-generated, so we not only have to debug actual code but potential code as well. The article makes clear that the bug turned from latent to actual only after a combination of other changes in the code and its usage patterns. It concludes with ``Some Lessons'':

The engineers working on the new HTML parser had been so worried about bugs affecting our service that they had spent hours verifying that it did not contain security problems.

Unfortunately, it was the ancient piece of software that contained a latent security problem and that problem only showed up as we were in the process of migrating away from it. Our internal infosec team is now undertaking a project to fuzz older software looking for potential other security problems.

While admitting our lack of expertise in this area, we feel bound to query:

How do we know that today's software won't be tomorrow's ``older software'' that will need to be ``fuzzed'' to look for potential security problems?

We are still writing in low-level code. That's the ``insanity'' part.

SHA Na Na

My GIT colleagues also comment on Google's announcement two weeks ago of the feasible production of collisions in the SHA-1 hash function. The researchers fashioned two PDF files with identical hashes, meaning that once a system has accepted one, the other can be maliciously substituted. They say:

It is now practically possible to craft two colliding PDF files and obtain a SHA-1 digital signature on the first PDF file which can also be abused as a valid signature on the second PDF file... [so that e.g.] it is possible to trick someone to create a valid signature for a high-rent contract by having him or her sign a low-rent contract.

Now SHA-1 had been under clouds for a dozen years already, since the first demonstration that collisions can be found in expected time faster than brute force. It is, however, still being used. For instance, Microsoft's sunset plan called for its phase 2-of-3 to be enacted in mid-2017. Google, Mozilla, and Apple have been doing similarly with their browser certificates. Perhaps the new exploit will force the sunsets into a total eclipse.
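For scale, here is our rough summary of the published figures. A generic birthday attack finds a collision in an $n$-bit hash after about $2^{n/2}$ evaluations, so for SHA-1's 160 bits:

\[
\underbrace{2^{80}}_{\text{generic birthday bound}}
\;\gg\;
\underbrace{2^{69}}_{\text{first demonstration, 2005}}
\;\gg\;
\underbrace{2^{63.1}}_{\text{the new exploit, as reported}}.
\]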

Besides SHA-2 there is SHA-3, which is the current gold standard. As with SHA-2, it comes in several digest sizes: 224, 256, 384, or 512 bits, whereas SHA-1 gives only 160 bits. Doubling the digest size does exponentially ramp up the time for all attacks conceived so far. Still, the exploit shows what theoretical advances plus unprecedented computational power can do. Odom shows the big picture in a foreboding chart.
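Concretely, and assuming no better-than-generic attack on the larger functions, the birthday bound puts the expected collision costs at:

\[
\begin{array}{lcl}
\text{SHA-1 at 160 bits} & : & 2^{80} \text{ evaluations},\\
\text{SHA-2/SHA-3 at 256 bits} & : & 2^{128} \text{ evaluations},\\
\text{SHA-2/SHA-3 at 512 bits} & : & 2^{256} \text{ evaluations}.
\end{array}
\]

Doubling the digest size squares the generic cost, which is the exponential ramp-up just noted.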

Open Problems

Is security really possible? Or are we all insane?

Ken thinks there are two classes of parallel universes. In one class, the sentient beings first developed programming languages in which variables were mutable by default and an extra fussy and forgettable keyword like {\tt const} had to be used to make them constant. In the other class, they first thought of languages in which identifiers denoted ideal Platonic objects and the keyword {\tt mutable} had to be added to make them changeable.
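For concreteness, a tiny sketch of the asymmetry in our universe's C++; Rust's {\tt let} versus {\tt let mut} is the closest real example of the other convention:

int main() {
    int x = 1;        // mutable by default: nothing stops later writes
    const int y = 2;  // safety requires remembering the fussy keyword
    x = 42;           // fine
    // y = 42;        // error: constancy had to be opted into above
    return x + y;
}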

The latter enjoyed the advantage that safer and sleeker code became the lazy coder's default. The mutable strain was treated as a logical subclass in accord with the substitution principle. Logical relations like {\tt Square} ``Is-A'' {\tt Rectangle} held without entailing that {\tt Square.Mutable} be related to {\tt Rectangle.Mutable}, and this further permitted retroactive abstraction via ``superclassing.'' They developed safe structures for security and dominated their light cone. The first class was doomed.
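A minimal sketch of the point, with names of our own choosing: the immutable reading of {\tt Square} ``Is-A'' {\tt Rectangle} is sound, while the mutable reading breaks substitution the moment a width can change independently of a height:

#include <cassert>

// Immutable reading: a Square really is a Rectangle, because nothing
// can disturb the width == height invariant after construction.
class Rectangle {
public:
    Rectangle(int w, int h) : w_(w), h_(h) {}
    int width()  const { return w_; }
    int height() const { return h_; }
private:
    int w_, h_;
};

class Square : public Rectangle {
public:
    explicit Square(int s) : Rectangle(s, s) {}
};

// Mutable reading: setWidth() would let a caller holding a mutable
// Rectangle reference silently break a Square's invariant, so
// Square.Mutable must NOT be a subclass of Rectangle.Mutable.
class MutableRectangle {
public:
    MutableRectangle(int w, int h) : w_(w), h_(h) {}
    void setWidth(int w)  { w_ = w; }
    void setHeight(int h) { h_ = h; }
    int width()  const { return w_; }
    int height() const { return h_; }
private:
    int w_, h_;
};

int main() {
    Square s(3);
    const Rectangle& r = s;  // substitution is safe: r cannot mutate s
    assert(r.width() == r.height());
    return 0;
}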