
Keyfile processing weaknesses and solution

description

According to common knowledge
https://en.wikipedia.org/wiki/Cyclic_redundancy_check
"CRCs are specifically designed to protect against common types of errors on communication channels, where they can provide quick and reasonable assurance of the integrity of messages delivered. _However, they are not suitable for protecting against intentional alteration of data."

Also read
https://www.privacy-cd.org/downloads/truecrypt_7.0a-analysis-en.pdf
Part 1, "An Attack on TrueCrypt Keyfiles", quoted here as a warning:
    '
    There were three major weaknesses in the TrueCrypt keyfile algorithm which facilitated the attack. The
    first weakness was the application of the cryptographically insecure CRC-32 algorithm. Its input
    can easily be crafted to yield any desired output. The second weakness was the linearity of the
    operations by which the pool bytes were changed, that is the additions. This made the attack a
    simple application of linear algebra. The third weakness was the local nature of changes to the
    pool. Every byte of the keyfile only changes four bytes of the pool. This enabled us to do the attack
    successively making one byte of the pool after the other zero.
    '
    '
    The use of keyfiles is insecure. They don't weaken the security supplied by the password
    used in conjunction with a keyfile, but if a weak or even empty password is used with
    keyfiles, you are no longer secure.
    '
    None of these issues has been addressed so far, even though the document is dated August 2011!
The TrueCrypt 7.1a implementation of keyfile processing is:
take the 64 characters of the original password and add to them a 64-byte pool filled with CRC-32 values computed over successive bytes of the keyfile.

first (A)
#define KEYFILE_POOL_SIZE   64
#define KEYFILE_MAX_READ_LEN    (1024*1024)
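/* (A) For each keyfile byte, update a running CRC-32 and add its four
   output bytes into the 64-byte pool at a cyclically advancing write
   position, stopping after at most 1 MB of keyfile data. */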
for (i = 0; i < bytesRead; i++)
{
        crc = UPDC32 (buffer[i], crc);

        keyPool[writePos++] += (unsigned __int8) (crc >> 24);
        keyPool[writePos++] += (unsigned __int8) (crc >> 16);
        keyPool[writePos++] += (unsigned __int8) (crc >> 8);
        keyPool[writePos++] += (unsigned __int8) crc;

        if (writePos >= KEYFILE_POOL_SIZE)
            writePos = 0;

        if (++totalRead >= KEYFILE_MAX_READ_LEN)
            goto close;
    }
then... (B)
/* Mix the keyfile pool contents into the password */
for (i = 0; i < sizeof (keyPool); i++)
{
    if (i < password->Length)
        password->Text[i] += keyPool[i];
    else
        password->Text[i] = keyPool[i];
}
I see the following issues:
  1. Because of KEYFILE_MAX_READ_LEN, only one MB of each file is processed, while the most reasonable supply of semi-random yet 100%-personal files are... photo RAW or JPG files. This calls for a processing length of at least several MB, say 40 MB, in order to use their full length.
  2. CRC-32 spreads bits around, but it is not a perfect random number generator. I feel that something with a wide output, e.g. the 512 bits of SHA-512, would be more appropriate. Otherwise we deposit more or less random CRC-32 bits in a periodic pattern along the pool, while keyfile content is also often periodic: there are QWORD-aligned structures, data blocks, and most values have only their lower bits set.
  3. Section (B) completely defeats the point of using LONG keyfiles, because you can PRECOMPUTE the pools for an entire directory of keyfiles, then mix them cheaply with (B) into your password list during a brute-force attack (a sketch of this attack follows this list).
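To make issue 3 concrete, here is a minimal sketch of the precomputation attack; compute_keypool is a hypothetical name standing for step (A) run once per keyfile:

#include <string.h>

#define KEYFILE_POOL_SIZE 64

/* Hypothetical: runs step (A) once per keyfile. Note that it does not
   depend on any password, so its 64-byte result can be stored and
   reused across all password guesses. */
extern void compute_keypool(const char *keyfile_path,
                            unsigned char keyPool[KEYFILE_POOL_SIZE]);

/* Per-guess cost during brute force: just 64 byte additions (step B). */
static void apply_pool(unsigned char text[KEYFILE_POOL_SIZE], int pw_len,
                       const unsigned char keyPool[KEYFILE_POOL_SIZE])
{
    int i;
    for (i = 0; i < KEYFILE_POOL_SIZE; i++) {
        if (i < pw_len)
            text[i] += keyPool[i];  /* same mixing as step (B) */
        else
            text[i] = keyPool[i];
    }
}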
In my opinion:
The limit on the number of bytes processed should be removed completely. If the attacker finds a directory of files of 1 KB, 1 KB, 20 MB, 500 MB, and then another 300 files of 1 MB, let him not assume he can ignore the 500 MB file after the first month of dictionary-based hashing with the now-known-to-him keyfiles has failed. The point is that it becomes possible to have a simple, frequently used password for less sensitive data that is effectively PROTECTED by a stronger keyfile. It would also reinforce the two-factor strategy: you can record a password, you can record a screenshot, but even with an evil-maid attack you must copy ALL suspected keyfiles, which might be gigabytes and cannot pass undetected. At present that is not necessary, as you can precompute the digest of all files on the computer and send to a remote machine only those 64 bytes (#define KEYFILE_POOL_SIZE 64).
I think one should compute a keyfile in the following manner:
Define bof[i] as pseudocode meaning 'the i-th block of 64 successive bytes of the keyfile', with i running from 0 to file_length/64.
Then compute (see the sketch after the list below):
pool = HASH(HASH(password, salt))
pool = HASH(pool + HASH(bof[0]))
pool = HASH(pool + HASH(bof[1]))
...
pool = HASH(pool + HASH(bof[last]))
This way we ensure that:
-the operation is one-way, because HASH is by definition one-way and cryptographically strong
-it starts with the password, therefore you cannot precompute a digest of potential keyfiles independently of the password
-since this time we use the keyfile to strengthen a weak or maybe even empty password, we must bring the SALT into play, so that with every password change from 'simple' to 'simple' you still get password reinforcement from the keyfile
-the security of this depends on HASH, which should be secure with 64 bytes (512 bits) of output. I propose a cascade of SHA-512 and Whirlpool, to ensure that if either algorithm has a flaw (including implementation-wise, see the Heartbleed SSL bug), the strength of keyfile processing does not collapse entirely. Additionally, when implementing the cascade we must ensure that even when a short (less than 64-byte) keyfile is used, both Whirlpool and SHA-512 are applied to its data
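A minimal sketch of this chain, assuming OpenSSL's one-shot SHA512() as a stand-in for the full SHA-512 + Whirlpool cascade, and concatenation where the formula's '+' appears; keyfile_chain and its parameters are illustrative, not an actual VeraCrypt API:

#include <openssl/sha.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_LEN 64  /* bof[i]: 64 successive bytes of the keyfile */

static int keyfile_chain(const unsigned char *password, size_t pw_len,
                         const unsigned char *salt, size_t salt_len,
                         FILE *keyfile,
                         unsigned char pool[SHA512_DIGEST_LENGTH])
{
    unsigned char seed[1024];
    unsigned char buf[BLOCK_LEN];
    unsigned char link[2 * SHA512_DIGEST_LENGTH];
    size_t n;

    if (pw_len + salt_len > sizeof seed)
        return -1;

    /* pool = HASH(HASH(password || salt)): the chain starts from the
       password and salt, so no keyfile digest can be precomputed
       independently of them. */
    memcpy(seed, password, pw_len);
    memcpy(seed + pw_len, salt, salt_len);
    SHA512(seed, pw_len + salt_len, pool);
    SHA512(pool, SHA512_DIGEST_LENGTH, pool);

    /* pool = HASH(pool || HASH(bof[i])) for every 64-byte block,
       with NO upper limit on the keyfile length. */
    while ((n = fread(buf, 1, BLOCK_LEN, keyfile)) > 0) {
        memcpy(link, pool, SHA512_DIGEST_LENGTH);
        SHA512(buf, n, link + SHA512_DIGEST_LENGTH);
        SHA512(link, sizeof link, pool);
    }
    return 0;
}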
There was a flaw mentioned above where the CRC-32 output was byte-aligned with the POOL LENGTH. I propose the following implementation (sketched in code after the list below): we take a hash with an input and output length of 64 bytes (equal to the maximum password length, as today). 61 is the largest prime smaller than 64, and their least common multiple is 3904 bytes. If we keep the hash pool buffer length at 64 and pre-fill it with arbitrary nonzero values (since most theoretical attacks on hashes are facilitated by all-zero input, this would cut off misguided speculation), we should fill that buffer in a circular way, 61 characters at a time, then apply a hash to it together with the hash of the password. This will ensure that:
  1. files shorter than 64 bytes will get two hash iterations,
  2. more importantly, any periodic pattern with a 64-byte or 8-byte period will not coincide with the first input byte of the hash until a reasonably long period has passed,
  3. each changed bit of the input file will affect all 64 bytes of the keyfile entropy pool.
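A sketch of that circular fill, under stated assumptions: hash64() is a placeholder for the proposed 64-byte-output cascade, keyfile bytes are mixed in by XOR, and all names are illustrative:

#include <stddef.h>
#include <string.h>

#define POOL_LEN 64
#define STEP     61  /* largest prime below 64; lcm(61, 64) = 3904 */

/* Placeholder for the proposed 64-byte-output hash cascade
   (e.g. SHA-512 followed by Whirlpool). */
extern void hash64(const unsigned char *in, size_t in_len,
                   unsigned char out[POOL_LEN]);

static void fill_pool(const unsigned char *kf, size_t kf_len,
                      const unsigned char pw_hash[POOL_LEN],
                      unsigned char pool[POOL_LEN])
{
    unsigned char mix[2 * POOL_LEN];
    size_t pos = 0, i, j;

    memset(pool, 0xA5, POOL_LEN);  /* arbitrary nonzero pre-fill */

    /* Write 61 bytes at a time at a circularly advancing offset, then
       re-hash the pool together with the password hash. Since
       gcd(61, 64) = 1, an 8- or 64-byte periodic pattern in the keyfile
       does not realign with the same pool offset until 3904 bytes in. */
    for (i = 0; i < kf_len; i += STEP) {
        size_t chunk = (kf_len - i < STEP) ? kf_len - i : STEP;
        for (j = 0; j < chunk; j++)
            pool[(pos + j) % POOL_LEN] ^= kf[i + j];
        pos = (pos + STEP) % POOL_LEN;

        memcpy(mix, pw_hash, POOL_LEN);
        memcpy(mix + POOL_LEN, pool, POOL_LEN);
        hash64(mix, sizeof mix, pool);
    }
}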
The final goal is that the user is able to supply the remaining bits of entropy using something that cannot be easily captured and transmitted over a network, nor hidden in a deserted sector by an evil-maid attack, yet is reasonably fast to compute locally, but ONLY starting from a password.
I remind you that even a very well made (fully random!) 20-character a-z, A-Z, 0-9 password gives only log2((26+26+10)^20) = 20*log2(62) ≈ 119 bits of entropy. The keyfile is our method to supply the remaining bits in order to match the overall 256-bit cipher strength, besides the salt.
My favourite idea is to use something like a USB drive with its own keyboard and a short (not necessarily very secure) PIN, like the Carbide or Aegis Padlock. This could be used as a keyfile holder that is beyond the reach of the machine's operating system - it would then be true two-factor authentication, but it works ONLY if keyfile hashing is done seriously, as presented. Even when the operating system can access the USB content, transmitting or storing it would be very hard because of the sheer volume of those files.

How similar software deals with the keyfile digest pre-computation attack:
-KeePass uses SHA-256 on the entire keyfile, SHA-256 on the password, then SHA-256 on the concatenated password and keyfile hashes. This is also weak, because you can precompute a 256-bit (32-byte) digest of all keyfiles.
-TrueCrypt and VeraCrypt need only the 64-byte pool coming from CRC-32.
-DiskCryptor computes SHA512(SHA512(block[0]), block[1]) and so on, so at the end you can still precompute a 64-byte digest of a keyfile, yet it is the best-implemented open-source keyfile processing method.
All open-source solutions are therefore vulnerable to precomputed-digest attacks.

comments

idrassi wrote Mar 21, 2015 at 3:38 PM

Thank you for sharing this interesting analysis of the keyfile mixing algorithm. Indeed, it presents some weaknesses.
Your ideas go in the correct direction, although they would add a huge processing time depending on the keyfiles used, and the mixing algorithm is not intended to have one-way features. The idea is to have a diffusion algorithm for the keyfile bits. Moreover, the password/keyfile mix is fed to PBKDF2, which performs the cryptographically secure key derivation, so doing extra hashes on the password/keyfile outside of it is not needed.
As such, a good alternative could be to use the AES primitive to do the job instead of CRC.
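For illustration only, here is a minimal sketch of one possible reading of that idea, assuming a password-derived 128-bit key, OpenSSL's low-level AES API, and additive pool mixing as in the current code; this is not a concrete proposal from this thread:

#include <openssl/aes.h>
#include <string.h>

#define KEYFILE_POOL_SIZE 64

static void mix_keyfile_aes(const unsigned char pw_key[16],
                            const unsigned char *buffer, size_t len,
                            unsigned char keyPool[KEYFILE_POOL_SIZE])
{
    AES_KEY key;
    unsigned char in[16], out[16];
    size_t i, j, writePos = 0;

    AES_set_encrypt_key(pw_key, 128, &key);  /* password-derived key */

    for (i = 0; i < len; i += 16) {
        size_t chunk = (len - i < 16) ? len - i : 16;
        memset(in, 0, sizeof in);
        memcpy(in, buffer + i, chunk);
        AES_encrypt(in, out, &key);  /* non-linear diffusion per block */

        /* Additive mixing into the pool, as in the CRC-32 variant */
        for (j = 0; j < sizeof out; j++) {
            keyPool[writePos++] += out[j];
            if (writePos >= KEYFILE_POOL_SIZE)
                writePos = 0;
        }
    }
}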

In TrueCrypt (and VeraCrypt), the effective keyfile length is limited to 1 MB in order to bound the maximum processing time per keyfile, and I think this idea should be preserved for efficiency (although it could also be made configurable, leading to a specific format for each specified limit).

Also, removing the possibility of pre-computation by the attacker is another interesting security measure. This definitely has to be implemented. One quick modification that would introduce this feature is to include the password in the CRC-32 computation:
    for (i = 0; i < bytesRead; i++)
    {
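        /* Feed the four password bytes at the current pool position
           into the CRC along with the keyfile byte, so the resulting
           pool depends on the password and cannot be precomputed from
           the keyfile alone. */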
        crc = UPDC32 (password->Text[writePos] + buffer[i], crc);
        crc = UPDC32 (password->Text[writePos + 1] , crc);
        crc = UPDC32 (password->Text[writePos + 2] , crc);
        crc = UPDC32 (password->Text[writePos + 3] , crc);

        keyPool[writePos++] += (unsigned __int8) (crc >> 24);
        keyPool[writePos++] += (unsigned __int8) (crc >> 16);
        keyPool[writePos++] += (unsigned __int8) (crc >> 8);
        keyPool[writePos++] += (unsigned __int8) crc;

        if (writePos >= KEYFILE_POOL_SIZE)
            writePos = 0;

        if (++totalRead >= KEYFILE_MAX_READ_LEN)
            goto close;
    }
Any comments?

I'm waiting for the next report of the Open Crypto Audit project concerning the cryptographic primitives, as I expect them to tackle the keyfile mixing algorithm. It would be interesting to have their own feedback on this subject.

kbosak wrote Mar 21, 2015 at 3:53 PM

'In TrueCrypt (and VeraCrypt), the effective keyfile length is limited to 1 MB in order to bound the maximum processing time per keyfile, and I think this idea should be preserved for efficiency (although it could also be made configurable, leading to a specific format for each specified limit).'
Let the user configure it by choosing a proper keyfile! The user interface may create a new thread to process the keyfile when it is first selected, so the user can press cancel if the time is unacceptably long. This will be self-scaling, instead of adding primitive constants all around.

'I'm waiting for the next report of the Open Crypto Audit project concerning the cryptographic primitives, as I expect them to tackle the keyfile mixing algorithm.'
What are you waiting for? All alternative software has something better, and keyfile processing was already bashed in August 2011 in the document above. This looks much more serious than the theoretical discussion about the 64-byte password limit. I would say it goes in line with the official statement: 'Using TrueCrypt is not secure as it may contain unfixed security issues'. IT DOES. HERE YOU HAVE IT. KEYFILE PROCESSING.

idrassi wrote Mar 21, 2015 at 5:24 PM

Thank you for the link. I know this weakness and things are evolving in VeraCrypt step by step.

When I said I'm waiting for the audit report, it was not as if I need external validation of the known weakness, but rather to see if they bring any new ideas to enhance the keyfile mixing algorithm. As I said in my answer, your idea is secure, but a simpler approach would be just as secure, since we only need to diffuse the keyfile bits; hence my proposed modification to include the password in the computation and use the AES function to perform diffusion.

Concerning your latest idea to dynamically compute the limit using the keyfile and adding a cancel button, I don't see how this could work, since we don't store any information about this in the clear in the header. When mounting a volume, we need to know how much data needs to be read.
Basically, your new proposal means that we read all the keyfile(s) with no limit. This is what was proposed in your first message, and my approach is to let the user define his own upper limit for the effective keyfile content, for efficiency purposes.

Any comment on the use of AES primitive? Do you see any problem with that?

kbosak wrote Mar 22, 2015 at 4:28 PM

I was thinking about AES - but no, don't use it. AES, by design, is a two-way mapping IF you have the key for decoding. A cryptographic hash function is a one-way ticket, no matter what additional info you have. Therefore hash is the way to go. Since we don't know which hash might theoretically fall in the years to come, a cascade of two is welcome, since it will stop speculations like: 'did the NSA plant something in SHA, and is Whirlpool only perceived as secure because it is nonstandard and less tested?'.

kbosak wrote Mar 22, 2015 at 4:32 PM

I am thinking about the stage at which the SALT is introduced. IMO we must assume the user might want to use a very simple password like 'chicken', change it 20 times from the same password to another, and yet every such change should make a potential keyfile digest obsolete, even when your attacker knows:
-your first password
-that the password didn't change and was entered again as the same (maybe by mistake, maybe 4 years after the first version of the password was sniffed)
-that the keyfile didn't change
but the salt has changed.
This means the SALT must be introduced into the hash chain EARLY, before the keyfile.

kbosak wrote Mar 22, 2015 at 4:35 PM

Why is all this important: such an attack on keyfiles can be executed very easily through an online antivirus.