Cloud & Cryptanalysis: how the cloud could modify the hacking customs in a close future
Why the Cloud will become one of the main data security issues? Study conducted by HTTPCS cybersecurity experts.
Everyday and everywhere, people is debating about cryptographic systems and their security. To decide what level of security one needs for one’s data, it is first necessary to be aware about cryptanalysis methods that are used nowadays. Indeed, one of the most important issues in security is to know how many resources an attacker would be able to deploy in order to decrypt any data.
After a presentation of brute forcing, we will discuss about the improvments in computation efficiency by the last years, namely CPU power, FPGA dedicated machines, GPU parallel calculators and especially the recent use of the Cloud. The object here is to explain how the Cloud could radically facilitate the way of hacking cryptosystems.
What is a brute force attack?
The first idea coming in mind when talking about brute force is a robot trying to login on a server exhaustively with every password possible, or at least with a list of common passwords. Fortunately, this way of doing is rarely efficient because most of servers are equipped with firewalls which would blacklist such an attacker after some failures. More, that method would be extremely slow and very noisy on the server.
More commonly, brute forcing consists in retrieving a plain password from a hash result, previously got using another flaw, such as an SQL injection, an XSS vulnerability, or more easily a safe listenning of a communication between a client and a server.
There are several methods to do a brute force, some preferring time efficiency, or memory economy and others doing a compromise between both.
How is a hash function made?
One of the goals of a hash function is to store private data, such as passwords, in a blurred way. Indeed, we don’t want an attacker who would have any access to our database to obtain the clients’ plain passwords. Let’s consider a simple hash function converting a password in a hash result composed of the two last characters of this password :
The main attribute of a hash function is its one-way characteristic, in other words, it is simple to find the hash result from a password (here just take the two last characters), but it has to be very hard to retrieve the password from a hash given (here from “wd”, we can’t retrieve the password “mypasswd”).
For a hash function given, the size of the result is always fixed. Given that there are much more possible passwords (because the size can vary) than possible hash results, a hash function always admits collisions (several passwords giving the same hash), for our example :
However, a hash function is expected to minimize the number of collisions and make as harder as possible to find any collision. Actually, a hash function is considered good if an attacker can’t retrieve any possible input leading to a hash result given. For our example, an attacker can simply choose “passwd” if he has the hash result “wd”. Therefore, our hash function is not really good. Thankfully, there are plenty of good hash functions which respect this expectations, among the most famous ones : MD4, MD5, SHA1, SHA256, SHA3, SHA512.
Now, let’s think about it from the perspective of an attacker. We want to brute force a hash result, assuming we know which hash function is used. The simplest way is to compute the hash result corresponding to each possible password, until finding the good one.
Okay, in this way we should theoritically retrieve the plain password (or a collision of it). However, if there are lots of possible passwords, the computation could last too much time. For exemple, if we assume that the password is composed of exactly 9 alpha-numeric characters (lower or upper), it raises to 62 possibilities per character, or 629 possibilities, what roughly corresponds to 254. Assuming we can compute 32 millions of hashes by second (≈ 254hashes/s), this exhaustive attack would last approximately 500 millions of seconds (≈ 229 s), that is about 17 years (this example uses an Intel i5-5300U CPU, 2.3 GHz, with 2 cores).
Another way would be to do this previous computation one time with a big calculator, and store the pairs (plain, hash) in a table. So when we want to retrieve a plain password, we just have to look for the hash result in the table and take the corresponding plain.
With this method, the problem is to store the table, which contains 254 pairs of words (plain, hash) with the previous example. Considering that a password represents 9 characters (so 9 bytes) and with a hash size of 16 bytes (e.g. MD5), we get 254 times 25 bytes to store, that is more than 258 bytes, or roughly 250,000 TBytes !
These issues lead us to other methods for brute forcing. One of them consists in creating dictionnaries containing the most common passwords with all the imaginable declinations, staying with 9 characters (Figure 3).
With that method, we definitely reduce the cost of a brute force attack. But obviously, this method is not exhaustive, if a client uses an uncommon password (e.g. : ag14d23a4), it will have a small probability to be in our dictionnary.
There is actually another way to do, making an efficient compromise between time and memory, but staying with an exhaustive research. Imagine a non-exhaustive dictionnary composed of 2 stored columns (passwords in the first and hash results in the other), but with hidden intermediate columns which are not stored but can be computed when required by applying intermediate functions, and finally all these columns compose an exhaustive table, called Rainbow Table. With an ingenious way, Rainbow Tables allow attackers to significantly reduce memory and time complexity, for our precedent example, the quantity of stored data change from 250,000 TBytes to 690 GBytes, that is a factor gain of roughly 5,000.
Although these last method, which permits to do a good compromise betwenn time and memory, the brute force complexity stay important, in particular if you want to hack a bigger password than 9-character size. That leads to wonder what have been the best ways to accelerate computations in the last years.
Recent Improvments in attacking methods
The Moore’s Law, enunciated in 1975 by Gordon Moore (co-founder of Intel), announced that CPU (Central Process Unit) capacity would double every two years. Through the last forty years we can verify that this law has been approximatively true. Obviously, this evolution allows attackers to double their efficiency every two years. However, new methods using parallel systems have considerably improved even more these performances.
FPGA (Field-Programmable Get Array) dedicated machines have been the first ones to use parallel computation. In terms of performances, FPGA allow to be closest to the machine language, ignoring the useless tools on your computer. A famous example of that machines is Copacobana, of which we can see a machine (Figure 4).
FPGA was used until the end of the 2000’s, in particular to do exhaustive searches on keys. Capable to compute up to 264 operations in a reasonnable time, it was able to find a DES key (56-bit security) in less than one week. For example, we can see how is mounted the FPGA machine for an exhaustive search on a DES key with 4 search units, from “Cryptanalysis with Copacobana” (Figure 5).
GPU (Graphical Process Unit) parallel calculators allow a great gain of performance in big computations. GPU uses the graphics card of a PC, so it is accessible to everyone who have a graphics card. Here, the method is to attribute the onerous parts of a process to the GPU, which will execute this workload in parallel using its thousands of dedicated cores. The rest of the process stays executed by the CPU, what permits all the GPU power to be concentrated on the heavy part. OpenCL and CUDA are two languages which permit to program such systems, the first is open source while the second is proprietary of NVIDIA.
Sagitta is a company which propose powerful multi-GPU machines programmed with CUDA. For example, it is possible to buy Brutalis (Figure 6), composed of eight Nvidia GTX GPUs, two Intel Xeon E5-2600V3 CPUs, and 768 GBytes of registered ECC memory, for roughly $18,000. To compare with our computer, which computes 32 millions of MD5 hashes per second, Brutalis can compute approximately 130 billions of hashes per second, that is 5,000 times over.
Use the Cloud, what for?
Nowadays, we can choose in the Cloud instances of machines working with GPU, hence attackers are using more and more GPU via the Cloud to improve their performances in cryptanalysis and save money. For example, in order to execute the drown attack in TLS using the SSLv2 protocol, researchers have compared performances of an optimized GPU parallel execution with their proper material and via the Cloud.
With their material (32 Nvidia GTX 980 cards, and 8 AMD R9 290X cards), the attack lasted 18 hours and cost $18,000. Via the Cloud, using Amazon EC2, with 200 instances : 150 g2.2xlarge instances, each of which containing one high-performance NVIDIA GPU and 50 g2.8xlarge instances, each containing four of these GPUs, the attack took 8 hours and cost $440.
The main default of using the Cloud is that you have to pay for the instances every time you execute an attack, but it is almost 100 times cheaper than buying a proper GPU cluster (for same efficiency). Moreover, we remark the uselessness of permises to stock a big calculator as well as all the costs and technical contraints generated by it (electricity, rent, etc…).
Those reasons lead to consider the Cloud as the substitute of proper big calculator in the future. One of the main challenges would be in that case to secure the communications with the server as best as possible in order to prevent any ill-intentionned listener from getting informations.