Encrypted Store

This post is about encrypting data in a storage server. When the storage server in question was architected and implemented, data at rest and in transit was raw (unencrypted). The main reason was that clients and servers were deployed in the same facility.

Years went by, and the requirements called for encrypting data in transit while leaving the data at rest raw. Data in transit can be encrypted by using HTTPS or secure sockets, or by having the server encrypt the data on retrieval and the client decrypt it on receipt; storing data would work in the opposite direction. Given that both the client and the server were under our control, the initial decision was to encrypt transmissions using the Advanced Encryption Standard (AES), which is based on the Rijndael cipher designed by Vincent Rijmen and Joan Daemen back in 1998.

A couple of years later, the requirements shifted to encrypting the data at rest and using HTTPS for network transfers. The current version of the storage server operates this way.

When the data was stored unencrypted, clients could request the retrieval of complete or partial files via an extensive Application Programming Interface (API) and an associated Command Line Interface (CLI). One could specify a data offset and a data length and retrieve that section of the file. Things have changed now that the data is kept encrypted at rest.

Following are some console screen captures that illustrate different retrievals of the same file. Let’s start by storing a text file, which will allow us to better understand what is going on. The client is running on a Windows machine.

C:\>casstore c:\temp\TextFiles\guid_list.txt
casstore <<< STORING files using CASOpen(), CASWrite() and CASClose() line: 625
14feb5f0be5a26e887b05c1155ea99f7 -  0.09 seconds

The casstore CLI is invoked to store the text file c:\temp\TextFiles\guid_list.txt using the following APIs: CASOpen(), CASWrite() and CASClose(). Given that the file is quite small each of the APIs should have been called once. You can see the POSIX resemblance in the design and implementation of the API.

When done, a Globally Unique Identifier (GUID) is returned. Please note that a GUID in the storage system is not the same as a GUID in the Windows world. A GUID is a 16-byte value displayed as a string of 32 hexadecimal characters. Each object in the store is assigned a unique GUID. Objects are not editable. If a user needs to edit a file, the file should be retrieved, edited and then stored. The resulting file will have a different unique GUID.
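To make the 16-byte versus 32-character relationship concrete, here is a minimal Python sketch. It only illustrates the hexadecimal encoding; how the storage server actually generates or parses GUIDs is not shown in this post, so nothing here should be read as the store's own code.

```python
# Illustration only: the store's real GUID handling is not shown in this post.
guid_text = "14feb5f0be5a26e887b05c1155ea99f7"  # GUID returned by casstore above

# Every pair of hexadecimal characters encodes one byte.
guid_bytes = bytes.fromhex(guid_text)

print(len(guid_text))    # number of characters in the displayed string
print(len(guid_bytes))   # number of raw bytes in the value

# Converting the raw bytes back gives the same lowercase string.
round_trip = guid_bytes.hex()
print(round_trip == guid_text)
```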

C:\>casretrieve 14feb5f0be5a26e887b05c1155ea99f7 c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt
casretrieve <<< RETRIEVING bifile using CASOpen(), CASRead() AND CASClose() line: 628
casretrieve <<< c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt -  0.03 seconds

We used the casretrieve CLI to retrieve the bitfile (that is what a stored file is called). Similar to how the file was stored, the bitfile is retrieved by using CASOpen(), CASRead() and CASClose(). We have specified the name of the output file with an extension. This makes it easy to manipulate using Windows utilities, which seem to rely on file extensions.

C:\>type c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt
14feb5f0be5a26e887b0570d03190011
14feb5f0be5a26e887b0570d03190012
14feb5f0be5a26e887b0570d03190013
14feb5f0be5a26e887b0570d031a0014
14feb5f0be5a26e887b0570d031b0015
14feb5f0be5a26e887b0570d031c0016
14feb5f0be5a26e887b0570d031d0017
14feb5f0be5a26e887b0570d031d0018
14feb5f0be5a26e887b0570d031e0019
14feb5f0be5a26e887b0570d031f001a
14feb5f0be5a26e887b0570d0320001b
14feb5f0be5a26e887b0570d0320001c
14feb5f0be5a26e887b0570d0321001d
14feb5f0be5a26e887b0570d0321001e
14feb5f0be5a26e887b0570d0321001f
14feb5f0be5a26e887b0570d03220020
14feb5f0be5a26e887b0570d03220021
14feb5f0be5a26e887b0570d03220022
14feb5f0be5a26e887b0570d03230023
14feb5f0be5a26e887b0570d03230024
14feb5f0be5a26e887b0570d03240025
14feb5f0be5a26e887b0570d03240026
14feb5f0be5a26e887b0570d03240027
14feb5f0be5a26e887b0570d03250028
14feb5f0be5a26e887b0570d032a0029
14feb5f0be5a26e887b0570d032a002a
14feb5f0be5a26e887b0570d032b002b
14feb5f0be5a26e887b0570d032c002c
14feb5f0be5a26e887b0570d032d002d
14feb5f0be5a26e887b0570d0331002e
14feb5f0be5a26e887b0570d0332002f
14feb5f0be5a26e887b0570d03320030
14feb5f0be5a26e887b0570d03320031
14feb5f0be5a26e887b0570d03330032
14feb5f0be5a26e887b0570d03330033
14feb5f0be5a26e887b0570d03330034
14feb5f0be5a26e887b0570d03330035
14feb5f0be5a26e887b0570d03330036
14feb5f0be5a26e887b0570d03340037
14feb5f0be5a26e887b0570d03340038
14feb5f0be5a26e887b0570d03340039
14feb5f0be5a26e887b0570d0334003a
14feb5f0be5a26e887b0570d0334003b
14feb5f0be5a26e887b0570d0334003c
14feb5f0be5a26e887b0570d0334003d
14feb5f0be5a26e887b0570d0334003e
14feb5f0be5a26e887b0570d0335003f
14feb5f0be5a26e887b0570d03350040
14feb5f0be5a26e887b0570d03350041
14feb5f0be5a26e887b0570d03350042
14feb5f0be5a26e887b0570d03350043
14feb5f0be5a26e887b0570d03350044
14feb5f0be5a26e887b0570d03350045
14feb5f0be5a26e887b0570d03360046
14feb5f0be5a26e887b0570d03360047
14feb5f0be5a26e887b0570d03360048
14feb5f0be5a26e887b0570d03370049
14feb5f0be5a26e887b0570d0337004a
14feb5f0be5a26e887b0570d0337004b
14feb5f0be5a26e887b0570d0337004c
14feb5f0be5a26e887b0570d0337004d
14feb5f0be5a26e887b0570d0337004e
14feb5f0be5a26e887b0570d0337004f
14feb5f0be5a26e887b0570d03380050
14feb5f0be5a26e887b0570d03380051
14feb5f0be5a26e887b0570d03380052
14feb5f0be5a26e887b0570d033a0053
14feb5f0be5a26e887b0570d033b0054
14feb5f0be5a26e887b0570d033b0055
14feb5f0be5a26e887b0570d033b0056
14feb5f0be5a26e887b0570d033c0057
14feb5f0be5a26e887b0570d033c0058
14feb5f0be5a26e887b0570d033c0059
14feb5f0be5a26e887b0570d033c005a
14feb5f0be5a26e887b0570d033d005b
14feb5f0be5a26e887b0570d033e005c
14feb5f0be5a26e887b0570d033f005d
14feb5f0be5a26e887b0570d0340005e
14feb5f0be5a26e887b0570d0340005f
14feb5f0be5a26e887b0570d03410060
14feb5f0be5a26e887b0570d03420061
14feb5f0be5a26e887b0570d03430062
14feb5f0be5a26e887b0570d03440063
14feb5f0be5a26e887b0570d03450064
14feb5f0be5a26e887b0570d03460065
14feb5f0be5a26e887b0570d03470066
14feb5f0be5a26e887b0570d03480067
14feb5f0be5a26e887b0570d03480068
14feb5f0be5a26e887b0570d03490069
14feb5f0be5a26e887b0570d034b006a
14feb5f0be5a26e887b0570d034b006b
14feb5f0be5a26e887b0570d034b006c
14feb5f0be5a26e887b0570d034b006d
14feb5f0be5a26e887b0570d034b006e
14feb5f0be5a26e887b0570d034c006f
14feb5f0be5a26e887b0570d034c0070
14feb5f0be5a26e887b0570d034c0071

To make sure that the retrieved file matches the file we stored, the retrieved file has been printed. The store has CLIs that may be used to exercise a single API or a set of APIs. Some perform a byte-by-byte comparison between the file to be stored and the retrieved file. For simplicity we will not print the original file (they do match).
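A byte-by-byte comparison of this kind can be sketched in a few lines of Python. This is not the code used by the store's CLIs; the function name files_identical and the chunk size are assumptions made for illustration.

```python
def files_identical(path_a: str, path_b: str, chunk_size: int = 64 * 1024) -> bool:
    """Return True when both files hold exactly the same bytes.

    Hypothetical helper; the store's own comparison CLIs are not shown in the post.
    """
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                return False   # contents (or lengths) differ
            if not chunk_a:
                return True    # both files ended at the same point
```

For the files in this post, the call would look like files_identical(r"c:\temp\TextFiles\guid_list.txt", r"c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt").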

C:\>casretrieve 14feb5f0be5a26e887b05c1155ea99f7 c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt -do 0 -dl 136
casretrieve <<< RETRIEVING bifile using CASOpen(), CASRead() AND CASClose() line: 628
casretrieve <<< c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt -  0.06 seconds

The last console screen capture illustrates the casretrieve CLI with additional arguments. We have added -do, which stands for data offset, and -dl, which stands for data length. In other words, we are going to retrieve a partial file starting at offset zero (the beginning) with a length of 136 bytes. Why 136 bytes? As you have probably figured out, each GUID is represented by 32 hexadecimal characters. In Windows a line terminator is represented by \r\n (two bytes), making a line of text 34 bytes long, and 4 * 34 = 136. Let’s verify our assumption by printing (typing) the retrieved file.

C:\>type c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt
14feb5f0be5a26e887b0570d03190011
14feb5f0be5a26e887b0570d03190012
14feb5f0be5a26e887b0570d03190013
14feb5f0be5a26e887b0570d031a0014

As expected, the first four GUIDs were returned and all is well.
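The arithmetic behind the -do and -dl values can be written down as a small Python sketch. The helper name line_range is made up for illustration and is not part of the store's API; it simply assumes every line is a 32-character GUID followed by a two-byte Windows line terminator.

```python
GUID_CHARS = 32        # hexadecimal characters per GUID
TERMINATOR_BYTES = 2   # "\r\n" on Windows
LINE_BYTES = GUID_CHARS + TERMINATOR_BYTES

def line_range(first_line: int, line_count: int) -> tuple[int, int]:
    """Return (data offset, data length) for line_count lines,
    starting at the zero-based line first_line.

    Hypothetical helper used only to show the arithmetic.
    """
    return first_line * LINE_BYTES, line_count * LINE_BYTES

print(line_range(0, 4))  # first four GUIDs: (0, 136)
print(line_range(1, 3))  # second through fourth GUIDs: (34, 102)
```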

C:\>casretrieve 14feb5f0be5a26e887b05c1155ea99f7 c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt -do 34 -dl 102
casretrieve <<< RETRIEVING bifile using CASOpen(), CASRead() AND CASClose() line: 628
casretrieve <<< c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt -  0.04 seconds

The last command line is very similar to the previous one. This time we have provided a data offset of 34 bytes, so reading starts right after the first 34-byte line of the file, and a data length of 102 bytes, which is equal to 34 * 3. The storage server should return the 102 bytes associated with the second, third and fourth GUIDs. Let’s take a look by printing the retrieved file:

C:\>type c:\temp\14feb5f0be5a26e887b05c1155ea99f7.txt
ï#╧┬╦²ò⌠y╦ çOµÖ9½ä╢FsJσNq,¼D}F5>t╥&æ
±║0íΣ╔ë┘÷±▒¬╝≥ú√«>«è4Xδ░≡╗╨JñXò≈ÖlÜ╨ 

Whoops, some type of gibberish was returned by the storage server. What happened?

The answer is based on the fact that AES (also known as Rijndael) is a block cipher. Each block is 16 bytes long. Before starting encryption, the server initializes the algorithm and then feeds it raw blocks of that size. When done, the encrypted file might be up to 15 bytes longer than the original; the worst case occurs when a single byte is left over for the last block. When the file is decrypted, the same key is used, starting at the first block and progressing until the last block is processed. The last block returns the exact number of bytes that were in the original file. All this is handled by the storage server when storing and retrieving files. In this last case, we asked the server to start at byte offset 34, which falls in the middle of the third block of the encrypted file, instead of decrypting from the first block, so gibberish was returned.
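The block arithmetic can be sketched in Python as follows. This is a sketch under stated assumptions: it takes the description above at face value (the encrypted size is the original size rounded up to a whole number of 16-byte blocks), and the function names encrypted_size and locate are hypothetical. It performs no encryption at all.

```python
BLOCK_SIZE = 16  # AES block size in bytes

def encrypted_size(plain_size: int) -> int:
    """Size after rounding up to a whole number of blocks.

    Assumed padding behaviour, based on the description in the post.
    """
    return -(-plain_size // BLOCK_SIZE) * BLOCK_SIZE

def locate(offset: int) -> tuple[int, int]:
    """Return (zero-based block index, byte position inside that block)."""
    return divmod(offset, BLOCK_SIZE)

# One byte left over for the last block costs 15 extra bytes.
print(encrypted_size(17) - 17)   # 15

# Offset 34 lands two bytes into the third block (index 2),
# so it is not even aligned to a block boundary.
print(locate(34))                # (2, 2)
```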

The way to make the -do and -dl flags work would be to detect that the bitfile is encrypted and start decryption at the first byte. When the requested byte shows up, the API would start sending bytes back until the specified data length is reached. At this point in time this feature has not been developed. As soon as customers request it, and the request is approved, the code will be updated to match the requirements.
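The approach described above (decrypt from the first byte, discard everything before the offset, then return bytes until the length is reached) can be sketched in Python. This is not the storage server's code, since the feature does not exist yet; read_range is a hypothetical name, and the 16-byte slices of plain text below merely stand in for blocks that a real decryptor would produce.

```python
from typing import Iterable, Iterator

def read_range(decrypted_chunks: Iterable[bytes], offset: int, length: int) -> Iterator[bytes]:
    """Yield `length` bytes starting at `offset` from a stream that is
    decrypted from its first byte. Hypothetical sketch, not the store's API."""
    position = 0          # plaintext bytes seen so far
    remaining = length    # bytes still owed to the caller
    for chunk in decrypted_chunks:
        if remaining <= 0:
            break
        chunk_start = position
        position += len(chunk)
        if position <= offset:
            continue      # the whole chunk lies before the requested range
        begin = max(offset - chunk_start, 0)
        piece = chunk[begin:begin + remaining]
        remaining -= len(piece)
        yield piece

# Stand-in data: the first four lines of the stored file, cut into 16-byte
# pieces as if they had just come out of the decryptor.
lines = [
    "14feb5f0be5a26e887b0570d03190011",
    "14feb5f0be5a26e887b0570d03190012",
    "14feb5f0be5a26e887b0570d03190013",
    "14feb5f0be5a26e887b0570d031a0014",
]
plaintext = "".join(line + "\r\n" for line in lines).encode("ascii")
blocks = [plaintext[i:i + 16] for i in range(0, len(plaintext), 16)]

result = b"".join(read_range(blocks, 34, 102))
print(result.decode("ascii"), end="")  # second, third and fourth GUIDs
```

The cost of this design is that a request near the end of a large bitfile still requires decrypting everything before it.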

During my professional career I have exchanged messages with three very well-known cryptographers. I have read a dozen or so books authored by them. I am a firm believer that encryption algorithms should be left to cryptographers and be tested by computer scientists. The rest of us should learn to use those algorithms to encrypt and decrypt data while keeping it secure. In general that is not an easy feat.

If you have comments or questions regarding this post, please leave me a note below. I will try to respond as soon as possible.

Enjoy;

John

Follow me on Twitter: @john_canessa

One thought on “Encrypted Store”

  1. John, you should take a look at the NIST TCSEC NSA Class A1 cybersecurity rules for secure operating systems and kernels. Class A1 is the only code that has never been hacked. It was developed by a friend of mine, Dr. Roger R. Schell (Ph.D., MIT), former Deputy Director of NSA computing and cybersecurity. Roger retired and started a company that sells the GEMSOS operating system and kernel. The kernel can be a microkernel or a hypervisor in some instances. GEMSOS has never been attacked successfully by any witted aggressor or adversary. It has operated at CIA, DOD, Lockheed, Grumman, Boeing, and others.
