MarpX Privacy

What is in an Encrypted File?

  1. Size of the files -- original, encrypted, and decrypted:
  2. Run a byte-by-byte comparison:
  3. Examine the byte distribution of the encrypted file:
  4. Look directly for patterns:

Size of the files -- original, encrypted, and decrypted:


Assignment7.pdf is an example of a PDF file that was encrypted and later decrypted. Here it is in a DOS directory listing before it was encrypted. DOS reports sizes to the exact byte: Assignment7.pdf is 1,425,184 bytes.



The encrypted version, assignment.pdf.enc, is 771,414 bytes (a 46 percent saving in archive space). The decrypted file assignment7.decrypted.pdf is exactly the same size as the original -- 1,425,184 bytes.
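The 46 percent figure can be checked from the two byte counts. Here is a minimal Python sketch of the arithmetic; the function name percent_saving is ours for illustration and is not part of any Marpex tool.

```python
def percent_saving(original_size: int, encrypted_size: int) -> float:
    """Return the space saved, as a percentage of the original size."""
    return 100.0 * (original_size - encrypted_size) / original_size


# Byte counts quoted in the text above.
saving = percent_saving(1_425_184, 771_414)
print(round(saving))  # prints 46
```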

Run a byte-by-byte comparison:


fc stands for "file compare"; it's a standard DOS program for comparing files. It shows that the original file and the decrypted file are identical.
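For readers without fc at hand, the same kind of check can be sketched in a few lines of Python. This is only an illustration of a byte-by-byte comparison, not how fc itself is implemented; the names first_difference and files_identical are hypothetical.

```python
import io


def first_difference(stream_a, stream_b, chunk_size=65536):
    """Return the offset of the first differing byte, or None if identical.

    Both arguments are binary streams (files opened with mode "rb").
    """
    offset = 0
    while True:
        a = stream_a.read(chunk_size)
        b = stream_b.read(chunk_size)
        if a != b:
            for i, (x, y) in enumerate(zip(a, b)):
                if x != y:
                    return offset + i
            # One stream ended before the other.
            return offset + min(len(a), len(b))
        if not a:
            return None  # both streams ended with no difference found
        offset += len(a)


def files_identical(path_a, path_b):
    """True when the two files have exactly the same bytes."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return first_difference(fa, fb) is None


# Small in-memory demonstration.
print(first_difference(io.BytesIO(b"abc"), io.BytesIO(b"abd")))  # prints 2
```

With the files from this example, files_identical("Assignment7.pdf", "assignment7.decrypted.pdf") would be expected to return True.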

Examine the byte distribution of the encrypted file:


Bytes is a tiny Marpex Inc. utility program to assess the distribution of bytes in any file whatsoever. The /d argument tells the program to provide detail -- the byte value in printable form or in octal (base 8), the byte value in hexadecimal (base 16), the frequency, percent occurrence, and first eight positions of each byte value from 0 through 255.
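The Bytes utility itself is not shown here, but the kind of table it produces can be approximated in Python. The sketch below is an assumption about the layout based only on the description above (label, hexadecimal value, frequency, percent, first eight positions); the function name byte_report and the tuple layout are ours.

```python
def byte_report(data: bytes, max_positions: int = 8):
    """Build one row per byte value 0..255.

    Each row is (label, hex value, frequency, percent, first positions).
    The label is the printable character, or the value in octal otherwise.
    """
    counts = [0] * 256
    positions = [[] for _ in range(256)]
    for offset, value in enumerate(data):
        counts[value] += 1
        if len(positions[value]) < max_positions:
            positions[value].append(offset)

    total = len(data)
    rows = []
    for value in range(256):
        label = chr(value) if 32 <= value < 127 else f"{value:03o}"
        percent = 100.0 * counts[value] / total if total else 0.0
        rows.append((label, f"{value:02X}", counts[value], percent, positions[value]))
    return rows


# Usage sketch: read a file and show the first few rows.
# with open("assignment.pdf.enc", "rb") as f:
#     for row in byte_report(f.read())[:5]:
#         print(row)
```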


Here's the top of the list...


and a section in the middle...



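and the end of the list. Inspection of the entire list shows every byte value making up about 0.4 percent of the file, which is what an even spread across 256 possible values would give (1/256 is roughly 0.39 percent).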



The frequencies are very tightly grouped. The mean frequency is 3013. The standard deviation indicates that about 95 percent of the frequencies lie within 108 of the mean, between 2905 and 3121. In other words, the distribution is exceptionally even. No single measure can prove that a file is random, but a nearly even distribution like this suggests that few patterns remain within the encrypted file. Our little "pattern pulverizers" have done their job well.
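As a rough cross-check, these figures can be compared with what uniformly random bytes would produce. The Python sketch below assumes each of the 771,414 bytes is equally likely to take any of 256 values, so each frequency follows a binomial distribution; the function names are ours, and reading the "minus or plus 108" figure as two standard deviations is our assumption.

```python
import math
import statistics


def frequency_stats(counts):
    """Mean and population standard deviation of a list of byte frequencies."""
    return statistics.mean(counts), statistics.pstdev(counts)


def expected_stdev(total_bytes: int, alphabet: int = 256) -> float:
    """Standard deviation of one byte value's count in uniformly random data."""
    p = 1.0 / alphabet
    return math.sqrt(total_bytes * p * (1.0 - p))


total = 771_414                            # size of the encrypted file in bytes
print(round(total / 256))                  # prints 3013, the expected mean frequency
print(round(expected_stdev(total), 1))     # prints 54.8
```

Two standard deviations come to about 110, close to the spread of 108 reported above. That is, the measured spread is about what truly random data of this size would show.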

Look directly for patterns:


ByteSeq is another Marpex Inc. utility that takes the time to look for every recurrence of any four-byte pattern within a file. In the 771,414 bytes of assignment.pdf.enc, the ByteSeq program looked at each of the 771,411 four-byte sequences (one starting at every byte position except the last three) and counted instances in which a pattern occurred at least three times. That's a lot of work, but the program is efficient; it took four seconds. Its finding:



No patterns: not a single four-byte pattern occurred more than twice. This is typical of MarpxPrivacy encrypted files in this size range. Larger files may show very occasional instances of repeated four-byte patterns; that's all.
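The idea behind this kind of search can be sketched in Python. This is not the ByteSeq program, whose internals are not described here; it is a simple illustration that slides a four-byte window over the data and keeps the patterns seen at least three times. The name repeated_patterns is ours.

```python
from collections import Counter


def repeated_patterns(data: bytes, width: int = 4, min_count: int = 3):
    """Return {pattern: count} for every pattern seen at least min_count times.

    A file of n bytes has n - width + 1 overlapping windows of the given width.
    """
    windows = (data[i:i + width] for i in range(len(data) - width + 1))
    counts = Counter(windows)
    return {pattern: n for pattern, n in counts.items() if n >= min_count}


# A deliberately repetitive sample: "abcd" occurs three times.
print(repeated_patterns(b"abcdabcdabcd"))  # prints {b'abcd': 3}
```

An empty result for an encrypted file would correspond to the "no patterns" finding reported above.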


One more way of looking at patterns -- a byte dump. Here is a sample chosen at an arbitrary starting point within assignment.pdf.enc.



The 8 digits to the left are the offset. Each line shows 16 bytes in hexadecimal, repeated to the right with printable characters shown as themselves and non-printable bytes shown as periods. No matter where you look within an encrypted file, this is the kind of meaningless, patternless distribution you will see.
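A dump in this layout is easy to produce. The Python sketch below follows the description above (8-digit offset, 16 bytes in hexadecimal, then the printable view); the exact spacing and the use of a hexadecimal offset are our assumptions, and hex_dump is a hypothetical name.

```python
def hex_dump(data: bytes, start: int = 0, width: int = 16):
    """Return dump lines: offset, hex bytes, then printable characters.

    Non-printable bytes appear as periods in the right-hand column.
    """
    pad = width * 3 - 1  # width of a full hex column, e.g. 47 for 16 bytes
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{start + i:08X}  {hex_part:<{pad}}  {text_part}")
    return lines


# Usage sketch: dump 64 bytes from an arbitrary offset in a file.
# with open("assignment.pdf.enc", "rb") as f:
#     f.seek(0x1000)
#     print("\n".join(hex_dump(f.read(64), start=0x1000)))
```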


What's the point of all this? A patternless file leaves a hacker with no clues for cutting down the number of steps a brute-force attack would need to decipher a file made private using MarpxPrivacy.

