This post is about encrypting data in a storage server. When the storage server in question was architected and implemented, data at rest and in transit was raw (unencrypted). The main reason was that clients and servers were deployed in the same facility.
Years went by and the requirements called for encrypting data in transit while the data at rest was left raw. Encrypting data in transit could be done with HTTPS, with secure sockets, or by having the server encrypt the data on retrieval and the client decrypt it on receipt. Storing data would be the opposite. The initial decision, given that the client and server were under our control, was to encrypt transmissions using the Advanced Encryption Standard (AES), which is based on the Rijndael cipher designed by Vincent Rijmen and Joan Daemen back in 1998.
A couple years later, the requirements shifted to encrypt the data at rest and use HTTPS for network transfers. The current version of the storage server operates like this.
When the data was stored unencrypted, clients could use an extensive Application Programming Interface (API) and the associated Command Line Interface (CLI) to retrieve complete or partial files. One could specify a data offset and a data length and retrieve just that section of the file. Things have changed now that the data is kept encrypted at rest.
Following are some console screen captures that illustrate different retrievals of the same file. Let’s start by storing a text file, which will allow us to better understand what is going on. The client is running on a Windows machine.
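For raw (unencrypted) data, a partial read amounts to a seek followed by a bounded read. The following is a minimal Python sketch of that idea, not the storage server's code; the function name read_partial and its parameters are hypothetical.

```python
import io


def read_partial(stream, data_offset, data_length):
    """Return up to data_length bytes starting at data_offset (zero-based)."""
    if data_offset < 0 or data_length < 0:
        raise ValueError("offset and length must be non-negative")
    stream.seek(data_offset)         # jump straight to the requested byte
    return stream.read(data_length)  # may return fewer bytes near end of file


# Hypothetical raw object held in memory for the example.
raw = io.BytesIO(b"0123456789abcdef")
print(read_partial(raw, 4, 6))  # b'456789'
```

This works only because byte N of the stored object is byte N of the original file, which stops being true once the object is encrypted.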
C:\>casstore c:\temp\TextFiles\guid_list.txt
casstore <<< STORING files using CASOpen(), CASWrite() and CASClose() line: 625
14feb5f0be5a26e887b05c1155ea99f7 - 0.09 seconds
The casstore CLI is invoked to store the text file c:\temp\TextFiles\guid_list.txt using the following APIs: CASOpen(), CASWrite() and CASClose(). Given that the file is quite small each of the APIs should have been called once. You can see the POSIX resemblance in the design and implementation of the API.
When done, a Global Unique Identifier (GUID) is returned. Please note that a GUID in the storage system is not the same as a GUID in the Windows world. A GUID is a 16-byte value displayed as a string of 32 hexadecimal characters. Each object in the store is assigned a unique GUID. Objects are not editable. If a user needs to edit a file, the file should be retrieved, edited and then stored again. The resulting file will have a different unique GUID.
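The relationship between the 16-byte value and its 32-character display form can be checked in a few lines of Python. This only illustrates the hexadecimal encoding; how the store actually generates GUIDs is not covered here.

```python
guid_text = "14feb5f0be5a26e887b05c1155ea99f7"  # GUID printed by casstore

guid_bytes = bytes.fromhex(guid_text)  # 32 hex characters -> 16 raw bytes
assert len(guid_text) == 32
assert len(guid_bytes) == 16
assert guid_bytes.hex() == guid_text   # the conversion round-trips

print(len(guid_bytes), guid_bytes.hex())
```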
C:\>casretrieve 14feb5f0be5a26e887b05c1155ea99f7 c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt
casretrieve <<< RETRIEVING bifile using CASOpen(), CASRead() AND CASClose() line: 628
casretrieve <<< c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt - 0.03 seconds
We used the casretrieve CLI to retrieve the bitfile (that is what a stored file is called). Similar to how the file was stored, the bitfile is retrieved by using CASOpen(), CASRead() and CASClose(). We have specified the name of the output file with an extension. This makes it easy to manipulate using Windows utilities, which seem to rely on file extensions.
C:\>type c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt 14feb5f0be5a26e887b0570d03190011 14feb5f0be5a26e887b0570d03190012 14feb5f0be5a26e887b0570d03190013 14feb5f0be5a26e887b0570d031a0014 14feb5f0be5a26e887b0570d031b0015 14feb5f0be5a26e887b0570d031c0016 14feb5f0be5a26e887b0570d031d0017 14feb5f0be5a26e887b0570d031d0018 14feb5f0be5a26e887b0570d031e0019 14feb5f0be5a26e887b0570d031f001a 14feb5f0be5a26e887b0570d0320001b 14feb5f0be5a26e887b0570d0320001c 14feb5f0be5a26e887b0570d0321001d 14feb5f0be5a26e887b0570d0321001e 14feb5f0be5a26e887b0570d0321001f 14feb5f0be5a26e887b0570d03220020 14feb5f0be5a26e887b0570d03220021 14feb5f0be5a26e887b0570d03220022 14feb5f0be5a26e887b0570d03230023 14feb5f0be5a26e887b0570d03230024 14feb5f0be5a26e887b0570d03240025 14feb5f0be5a26e887b0570d03240026 14feb5f0be5a26e887b0570d03240027 14feb5f0be5a26e887b0570d03250028 14feb5f0be5a26e887b0570d032a0029 14feb5f0be5a26e887b0570d032a002a 14feb5f0be5a26e887b0570d032b002b 14feb5f0be5a26e887b0570d032c002c 14feb5f0be5a26e887b0570d032d002d 14feb5f0be5a26e887b0570d0331002e 14feb5f0be5a26e887b0570d0332002f 14feb5f0be5a26e887b0570d03320030 14feb5f0be5a26e887b0570d03320031 14feb5f0be5a26e887b0570d03330032 14feb5f0be5a26e887b0570d03330033 14feb5f0be5a26e887b0570d03330034 14feb5f0be5a26e887b0570d03330035 14feb5f0be5a26e887b0570d03330036 14feb5f0be5a26e887b0570d03340037 14feb5f0be5a26e887b0570d03340038 14feb5f0be5a26e887b0570d03340039 14feb5f0be5a26e887b0570d0334003a 14feb5f0be5a26e887b0570d0334003b 14feb5f0be5a26e887b0570d0334003c 14feb5f0be5a26e887b0570d0334003d 14feb5f0be5a26e887b0570d0334003e 14feb5f0be5a26e887b0570d0335003f 14feb5f0be5a26e887b0570d03350040 14feb5f0be5a26e887b0570d03350041 14feb5f0be5a26e887b0570d03350042 14feb5f0be5a26e887b0570d03350043 14feb5f0be5a26e887b0570d03350044 14feb5f0be5a26e887b0570d03350045 14feb5f0be5a26e887b0570d03360046 14feb5f0be5a26e887b0570d03360047 14feb5f0be5a26e887b0570d03360048 14feb5f0be5a26e887b0570d03370049 14feb5f0be5a26e887b0570d0337004a 
14feb5f0be5a26e887b0570d0337004b 14feb5f0be5a26e887b0570d0337004c 14feb5f0be5a26e887b0570d0337004d 14feb5f0be5a26e887b0570d0337004e 14feb5f0be5a26e887b0570d0337004f 14feb5f0be5a26e887b0570d03380050 14feb5f0be5a26e887b0570d03380051 14feb5f0be5a26e887b0570d03380052 14feb5f0be5a26e887b0570d033a0053 14feb5f0be5a26e887b0570d033b0054 14feb5f0be5a26e887b0570d033b0055 14feb5f0be5a26e887b0570d033b0056 14feb5f0be5a26e887b0570d033c0057 14feb5f0be5a26e887b0570d033c0058 14feb5f0be5a26e887b0570d033c0059 14feb5f0be5a26e887b0570d033c005a 14feb5f0be5a26e887b0570d033d005b 14feb5f0be5a26e887b0570d033e005c 14feb5f0be5a26e887b0570d033f005d 14feb5f0be5a26e887b0570d0340005e 14feb5f0be5a26e887b0570d0340005f 14feb5f0be5a26e887b0570d03410060 14feb5f0be5a26e887b0570d03420061 14feb5f0be5a26e887b0570d03430062 14feb5f0be5a26e887b0570d03440063 14feb5f0be5a26e887b0570d03450064 14feb5f0be5a26e887b0570d03460065 14feb5f0be5a26e887b0570d03470066 14feb5f0be5a26e887b0570d03480067 14feb5f0be5a26e887b0570d03480068 14feb5f0be5a26e887b0570d03490069 14feb5f0be5a26e887b0570d034b006a 14feb5f0be5a26e887b0570d034b006b 14feb5f0be5a26e887b0570d034b006c 14feb5f0be5a26e887b0570d034b006d 14feb5f0be5a26e887b0570d034b006e 14feb5f0be5a26e887b0570d034c006f 14feb5f0be5a26e887b0570d034c0070 14feb5f0be5a26e887b0570d034c0071
To make sure that the retrieved file matches the file we stored, the file has been printed. The store has CLIs that may be used to exercise a single API or a set of APIs. Some perform a byte-by-byte comparison between the file to be stored and the retrieved file. For simplicity we will not print the original file (they do match).
C:\>casretrieve 14feb5f0be5a26e887b05c1155ea99f7 c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt -do 0 -dl 136
casretrieve <<< RETRIEVING bifile using CASOpen(), CASRead() AND CASClose() line: 628
casretrieve <<< c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt - 0.06 seconds
The last console screen capture illustrates the casretrieve CLI with additional arguments. We have added -do, which stands for data offset, and -dl, which stands for data length. In other words, we are going to retrieve a partial file starting at offset zero (the beginning) with a length of 136 bytes. Why 136 bytes? As you have probably figured out, each GUID is represented by 32 hexadecimal characters. In Windows a line terminator is represented by \r\n (two bytes), making a line of text 34 bytes long, and four lines come to 4 * 34 = 136 bytes. Let’s verify our assumption by printing (typing) the retrieved file.
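The offset and length arithmetic can be written down as a small Python helper. This is only a sketch of the calculation; line_range is a hypothetical name and is not part of the casretrieve CLI or the API.

```python
GUID_CHARS = 32                        # hexadecimal characters per GUID
TERMINATOR = 2                         # carriage return + line feed on Windows
LINE_BYTES = GUID_CHARS + TERMINATOR   # 34 bytes per line of text


def line_range(first_line, line_count):
    """Return (data_offset, data_length) for a run of lines (zero-based)."""
    return first_line * LINE_BYTES, line_count * LINE_BYTES


print(line_range(0, 4))  # (0, 136): the first four GUIDs
print(line_range(1, 3))  # (34, 102): the second, third and fourth GUIDs
```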
C:\>type c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt
14feb5f0be5a26e887b0570d03190011
14feb5f0be5a26e887b0570d03190012
14feb5f0be5a26e887b0570d03190013
14feb5f0be5a26e887b0570d031a0014
As expected, the first four GUIDs were returned and all is well.
C:\>casretrieve 14feb5f0be5a26e887b05c1155ea99f7 c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt -do 34 -dl 102
casretrieve <<< RETRIEVING bifile using CASOpen(), CASRead() AND CASClose() line: 628
casretrieve <<< c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt - 0.04 seconds
The last command line is very similar to the previous one. This time we provided a data offset of 34 bytes (reading starts at the 35th byte of the file, which is the first byte of the second line) and a data length of 102 bytes, which is equal to 34 * 3. The storage server should return the 102 bytes that make up the second, third and fourth GUIDs. Let’s take a look by printing the retrieved file:
C:\>type c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt ï#╧┬╦²ò⌠y╦ çOµÖ9½ä╢FsJσNq,¼D}F5>t╥&æ ±║0íΣ╔ë┘÷±▒¬╝≥ú√«>«è4Xδ░≡╗╨JñXò≈ÖlÜ╨
Whoops, some type of gibberish was returned by the storage server. What happened?
The answer is based on the fact that AES (also known as Rijndael) is a block cipher. Each block is 16 bytes long. Before starting encryption the server initializes the algorithm and then feeds it raw blocks of that size. When done, the encrypted file might be up to 15 bytes longer than the original if a single byte was left over after the last full block. When the file is decrypted, the same key is used, starting at the first block and progressing until the last block is processed. The last block returns the exact number of bytes that were in the original file. All this is handled by the storage server when storing and retrieving files. In this last case, we asked the server to start at byte offset 34, which falls in the middle of a 16-byte block rather than at the start of the encrypted file, so what came back was gibberish.
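The effect can be reproduced with a small, self-contained Python sketch. It is not AES and not the server's code: it uses a toy 16-byte Feistel block cipher built from SHA-256 so that it runs with the standard library only. The chaining of blocks, the padding scheme (which always adds 1 to 16 bytes, so the growth can differ from the server's), the key, the initialization vector and all function names are assumptions made for illustration.

```python
import hashlib

BLOCK = 16  # bytes per cipher block, the same block size AES uses


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key, rnd, half):
    # Feistel round function: 8 bytes derived from key, round number and input.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]


def encrypt_block(key, block):
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right


def decrypt_block(key, block):
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right


def encrypt(key, iv, data):
    pad = BLOCK - len(data) % BLOCK          # always 1..16 padding bytes
    data += bytes([pad]) * pad
    out, prev = bytearray(), iv
    for i in range(0, len(data), BLOCK):
        prev = encrypt_block(key, _xor(data[i:i + BLOCK], prev))  # chain blocks
        out += prev
    return bytes(out)


def decrypt_blocks(key, iv, data):
    """Decrypt whole blocks only; a trailing partial block is ignored."""
    out, prev = bytearray(), iv
    for i in range(0, len(data) - BLOCK + 1, BLOCK):
        block = data[i:i + BLOCK]
        out += _xor(decrypt_block(key, block), prev)
        prev = block
    return bytes(out)


def decrypt(key, iv, data):
    plain = decrypt_blocks(key, iv, data)
    return plain[:-plain[-1]]                # strip the padding bytes


key, iv = b"demo key, not secret", bytes(BLOCK)
plaintext = "".join(f"{n:032x}\r\n" for n in range(1, 7)).encode()  # 6 lines
ciphertext = encrypt(key, iv, plaintext)

print(len(plaintext), len(ciphertext))                    # 204 208
wanted = decrypt(key, iv, ciphertext)[34:34 + 102]        # decrypt from start
naive = decrypt_blocks(key, iv, ciphertext[34:34 + 102])  # start mid-block
print(wanted == plaintext[34:136])                        # True
print(naive == plaintext[34:34 + len(naive)])             # False
```

Decrypting from the first byte and then slicing gives the three expected lines, while decrypting a slice of the ciphertext that starts at offset 34 does not line up with any block boundary and produces unrelated bytes.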
The way to make the -do and -dl flags work is to detect that the bitfile is encrypted and start decryption at the first byte. When the requested byte shows up, the API would start sending bytes back until the specified data length is reached. At this point this feature has not been developed. As soon as customers request it, and the request is approved, the code will be updated to match the requirements.
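A minimal Python sketch of that approach follows. It assumes some decryption routine already produces the plaintext in order, one chunk at a time, starting from the first byte; the name read_range and the plain_chunks parameter are hypothetical and do not exist in the current API.

```python
def read_range(plain_chunks, data_offset, data_length):
    """Yield data_length bytes starting at data_offset from decrypted chunks."""
    position = 0                     # offset of the next decrypted byte
    remaining = data_length
    for chunk in plain_chunks:
        if remaining <= 0:
            break                    # requested length reached; stop decrypting
        start = max(data_offset - position, 0)
        position += len(chunk)
        if start >= len(chunk):
            continue                 # still before the requested offset
        piece = chunk[start:start + remaining]
        remaining -= len(piece)
        yield piece


# Stand-in for decrypted output: 200 bytes delivered as 16-byte chunks.
data = bytes(range(200))
chunks = (data[i:i + 16] for i in range(0, len(data), 16))
result = b"".join(read_range(chunks, 34, 102))
print(result == data[34:136])  # True
```

The cost of this design is that a request near the end of a large bitfile still decrypts everything before it, which is the trade-off for keeping the stored format unchanged.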
During my professional career I have exchanged messages with three very well known cryptographers. I have read a dozen or so books authored by them. I am a firm believer that encryption algorithms should be left to cryptographers and be tested by computer scientists. The rest of us should learn to develop algorithms to encrypt and decrypt data keeping it secure. In general that is not an easy feat.
If you have comments or questions regarding this post, please leave me a note below. I will try to respond as soon as possible.
Enjoy;
John
Follow me on Twitter: @john_canessa
John, you should take a look at the NIST TCSEC NSA Class A1 cybersecurity rules for secure operating systems and kernels. Class A1 is the only code that has never been hacked. It was developed by a friend of mine, Dr. Roger R. Schell (Ph.D., MIT), former Deputy Director of NSA computing and cybersecurity. Roger retired and started a company that sells the GEMSOS operating system and kernel. The kernel can be a microkernel or a hypervisor in some instances. GEMSOS has never been attacked successfully by any witted aggressor or adversary. It has operated at CIA, DOD, Lockheed, Grumman, Boeing, and others.