Yesterday I spent time attempting to port a C DLL that I wrote some time ago to generate MD5 digests for a storage server. At the time I started from code provided by RSA Data Security, Inc. and designed and implemented a set of functions that applications and servers could call directly to generate MD5 digests for strings and files. When done, I packaged the results into a DLL. The library has been in production for a long time. I used the C programming language for performance, and the code was built for 32-bit processors.
A snippet for the md5.h file follows:
SENCOR_EXPORT int __stdcall MD5Compare (
    char *fileName,
    char *otherFileName,
    unsigned char digest[MD5_DIGEST_LEN],
    BOOL *identical
    );

SENCOR_EXPORT int __stdcall MD5DigestToString (
    unsigned char digest[MD5_DIGEST_LEN],
    char md5[MD5_DIGEST_STRING_LEN]
    );

SENCOR_EXPORT int __stdcall MD5Init (
    MD5_CTX *context
    );

SENCOR_EXPORT int __stdcall MD5Final (
    unsigned char digest[MD5_DIGEST_LEN],
    MD5_CTX *context
    );

SENCOR_EXPORT int __stdcall MD5Print (
    unsigned char digest[MD5_DIGEST_LEN]
    );

SENCOR_EXPORT int __stdcall MD5Update (
    MD5_CTX *context,
    unsigned char *input,
    unsigned int inputLen
    );
As I have mentioned in prior posts, I am currently working on porting some storage server APIs from C to C++ on a 64-bit architecture. Yesterday I tried, without success, to get the original C code to compile, build and run. Last night I woke up and decided to use the C++ Boost libraries. I then went back to sleep.
This morning I checked my current version of Visual Studio 2017 and noticed that I did not have the latest version of Boost installed, so I had to install it. It is always a good idea to verify that all is well, so I also wrote some code. This post is intended to document what I did in case some developer out there needs some simple steps and code to verify that the MD5 hash works.
I would like to note that if I were writing a storage server from scratch and did not care about backwards compatibility, I would have considered a stronger hash from the SHA family, such as SHA-2 or SHA-3 (the older SHA-0 and SHA-1 are no longer considered secure). You can read more about this family of secure hash algorithms here.
The first step was to download the Boost library onto my Windows 10 machine. The C and C++ code for the storage server resides on my Windows machine, so it makes sense to write and test on it.
I read How to Use BOOST in Visual Studio 2017 and paid attention to the following paragraph:
“After installing one of the versions of the C++ Boost library you need to know how to use it in VS2017. Remember, only Boost 1.64 and up works with Visual Studio 2017.”
My version of Visual Studio follows:
Microsoft Visual Studio Enterprise 2017
Version 15.9.11
VisualStudio.15.Release/15.9.11+28307.586
Microsoft .NET Framework
Version 4.7.03056

Installed Version: Enterprise

Visual C++ 2017   XXXXX-XXXXX-XXXXX-XXXXX
Microsoft Visual C++ 2017
I then located the downloads page on the Boost website. At the time of this writing the latest version appears to be 1.70.0, which is higher than the recommended minimum of 1.64. After reading the page I decided, for simplicity, to follow the link to the prebuilt Windows binaries. I found the folder for 1.70.0 and took a look. It seems that the boost_1_70_0-msvc-14.1-64.exe installer is what the doctor ordered. I downloaded it to C:\Temp\boost_1_70_0-msvc-14.1-64.exe. Once the download completed (it took around 10 minutes), I started the installer. I accepted the installer's default options, which installed the software in C:\local\boost_1_70_0 as illustrated by the following screen capture:
C:\local>dir
 Volume in drive C is OS
 Volume Serial Number is 26E8-87B0

 Directory of C:\local

05/24/2019  06:13 AM    <DIR>          .
05/24/2019  06:13 AM    <DIR>          ..
05/19/2016  04:08 PM    <DIR>          boost_1_61_0
05/24/2019  06:20 AM    <DIR>          boost_1_70_0
               0 File(s)              0 bytes
               4 Dir(s)  588,331,474,944 bytes free
It seems that version 1.61.0 was installed some time ago; according to the article, that version is not recommended for Visual Studio 2017. I will set Visual Studio to use the new version of Boost.
The next step is to use the MD5 code in Boost. I started by adding some header files, which initially could not be found. I located the post How to add boost library 1_65 or 1_64 to Visual Studio 2017 project on Stack Overflow and followed the instructions:
1. Open the View -> Other Windows -> Property Manager menu.
2. Click on the project and select the Microsoft.Cpp.x64.user item.
3. Right click and select the Properties menu.
4. Open Common Properties and select the VC++ Directories item.
5. Add the directories where you installed Boost to Include Directories:
   C:\local\boost_1_70_0
   C:\local\boost_1_70_0\boost
6. Add the directory where you built the Boost libraries to Library Directories:
   C:\local\boost_1_70_0\lib64-msvc-14.1
7. Click on <OK> in order to get all the way out.
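A quick way to confirm that the Include Directories setting took effect is to ask Boost for its own version. The following is a minimal sketch and not part of the original test program: the function names formatBoostVersion(), meetsVs2017Minimum() and detectedBoostVersion() are made up for illustration, and it assumes a compiler that supports __has_include (recent Visual Studio 2017 updates, GCC and Clang do). BOOST_VERSION is the macro defined in boost/version.hpp and is encoded as major * 100000 + minor * 100 + patch.

```cpp
#include <string>

// Pull in Boost's version header only when the compiler can find it, so this
// file still compiles (and can report the problem) when the paths are wrong.
#if defined(__has_include)
#  if __has_include(<boost/version.hpp>)
#    include <boost/version.hpp>
#  endif
#endif

// Decode Boost's numeric version (major * 100000 + minor * 100 + patch)
// into the familiar dotted form.
std::string formatBoostVersion(int version) {
    return std::to_string(version / 100000) + "." +
           std::to_string(version / 100 % 1000) + "." +
           std::to_string(version % 100);
}

// True when the version is at least 1.64.0, the minimum quoted in the
// article for Visual Studio 2017.
bool meetsVs2017Minimum(int version) {
    return version >= 106400;
}

// Returns the Boost version found on the include path, or an empty string
// when boost/version.hpp could not be located.
std::string detectedBoostVersion() {
#ifdef BOOST_VERSION
    return formatBoostVersion(BOOST_VERSION);
#else
    return std::string();
#endif
}
```

With the directories above in place, printing detectedBoostVersion() from main() should show the installed version; an empty string suggests the include path is still not being picked up.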
I then wrote a simple test program to verify that the code would generate correct MD5 hashes. The entire C++ code, written in the Visual Studio 2017 IDE, follows:
#include "pch.h"

#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>

#include <boost/uuid/detail/md5.hpp>
#include <boost/algorithm/hex.hpp>

using boost::uuids::detail::md5;

using namespace std;

/*
 * Convert MD5 digest to string.
 */
string digestToString(const md5::digest_type &digest)
{
    // **** convert to char[] ****
    const auto charDigest = reinterpret_cast<const char *>(&digest);

    // **** output string ****
    string str;

    // **** convert char[] digest to string ****
    boost::algorithm::hex(charDigest, charDigest + sizeof(md5::digest_type), std::back_inserter(str));

    // **** return digest string ****
    return str;
}

/*
 * Test scaffolding.
 */
int main()
{
    // **** general purpose string ****
    string str;

    // **** welcome message ****
    cout << "MD5 using boost!!!\n\n";

    // **** prompt for first string ****
    cout << "main >>> str: ";

    // **** loop processing input ****
    while (getline(cin, str)) {

        // **** check if we are done ****
        if (str.compare(string("-1")) == 0) {
            cout << "main <<< bye bye\n";
            break;
        }

        // **** display string ****
        cout << "main <<< str ==>" << str << "<==\n";

        // **** generate MD5 digest ****
        md5 hash;
        md5::digest_type digest;
        hash.process_bytes(str.data(), str.size());
        hash.get_digest(digest);

        // **** convert to string and display MD5 digest ****
        cout << "main <<< digest ==>" << digestToString(digest) << "<==\n";

        // **** prompt for next string ****
        cout << "main >>> str: ";
    }
}
The idea is quite simple. One can enter a string with some text and the program will generate the associated MD5 hash. Note that the MD5 hash is also called a digest. The digest is a 16-byte (128-bit) binary value, so to display or store it we typically convert it to a hexadecimal string. That operation is performed by the digestToString() function. I am very familiar with this kind of code; as you saw in the md5.h header file snippet earlier in this post, I implemented a similar production function, MD5DigestToString(), which produces the same results.
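To make the conversion itself explicit, here is a minimal sketch that turns 16 digest bytes into 32 uppercase hexadecimal characters using only the standard library. It is an illustration of the idea, not the code from the DLL or from Boost; the Md5Bytes alias and the digestBytesToHex() name are assumptions introduced for this example.

```cpp
#include <array>
#include <string>

// An MD5 digest is always 16 bytes (128 bits).
using Md5Bytes = std::array<unsigned char, 16>;

// Convert a 16-byte digest to 32 uppercase hexadecimal characters,
// two characters per byte, in byte order.
std::string digestBytesToHex(const Md5Bytes &digest) {
    static const char hexDigits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(digest.size() * 2);
    for (unsigned char byte : digest) {
        out.push_back(hexDigits[byte >> 4]);    // high nibble
        out.push_back(hexDigits[byte & 0x0F]);  // low nibble
    }
    return out;
}
```

Working on explicit bytes avoids the reinterpret_cast in digestToString(), at the cost of first copying the digest into a byte array.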
The following code snippet written in C is part of a set of tests that are used to test the operation of the md5.c module:
case 75:

    // **** prompt the user for the file name ****
    printf(">>> fileName [%s]: ", fileName);

    // **** get the response from the user ****
    if (fgets(buffer, BUFSIZ, stdin) == NULL) {
        EventLog(EVENT_ERROR, "UtilityFunctions <<< fgets line: %d file ==>%s<==\n", __LINE__, __FILE__);
        retVal = WAR_INTERNAL_ERROR;
        goto done;
    }

    // **** remove the trailing newline (if any) and check for blank responses ****
    buffer[strcspn(buffer, "\r\n")] = '\0';
    if (*buffer != '\0')
        strcpy(fileName, buffer);

    // **** get the size of the file (in bytes) ****
    status = FileGetSize(fileName, &fileSize);
    CDP_CHECK_STATUS("UtilityFunctions <<< FileGetSize", status);

    // **** compute the MD5 digest ****
    status = GetMD5FromFileName(fileName, fileSize, digest);
    CDP_CHECK_STATUS("UtilityFunctions <<< GetMD5FromFileName", status);

    // **** convert the MD5 digest into a string ****
    status = MD5DigestToString(digest, digestString);
    CDP_CHECK_STATUS("UtilityFunctions <<< MD5DigestToString", status);

    // **** inform the user what is going on ****
    EventLog(EVENT_INFO, "UtilityFunctions <<< digestString ==>%s<== line: %d\n", digestString, __LINE__);
    break;
Note that the code reads the contents of the file and generates the associated MD5 digest. In the C++ code we just entered a string and computed the MD5 hash.
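For completeness, the file case could be handled in C++ along the following lines. This is a hedged sketch, not the code I will use in the storage server: hashStream(), hashFile() and RecordingHasher are hypothetical names. The helpers are templates that accept any object offering process_bytes(const void *, std::size_t), which is the call the Boost md5 class provides, so the block compiles even without Boost. RecordingHasher is only a stand-in that stores the bytes it receives.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <string>

// Feed every byte of `in` to `hasher` in fixed-size chunks and return the
// number of bytes processed. Reading in chunks keeps memory use constant
// regardless of the size of the input.
template <typename Hasher>
std::size_t hashStream(std::istream &in, Hasher &hasher) {
    char buffer[64 * 1024];
    std::size_t total = 0;
    while (in) {
        in.read(buffer, sizeof(buffer));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;
        hasher.process_bytes(buffer, static_cast<std::size_t>(got));
        total += static_cast<std::size_t>(got);
    }
    return total;
}

// Open `path` in binary mode (so no newline translation takes place on
// Windows) and feed its contents to `hasher`.
// Returns false if the file could not be opened.
template <typename Hasher>
bool hashFile(const std::string &path, Hasher &hasher) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    hashStream(in, hasher);
    return true;
}

// Hypothetical stand-in for a real hasher: it records the bytes it is
// given so the helpers above can be exercised without Boost.
struct RecordingHasher {
    std::string bytes;
    void process_bytes(const void *data, std::size_t size) {
        bytes.append(static_cast<const char *>(data), size);
    }
};
```

With Boost configured, one would pass a boost::uuids::detail::md5 object to hashFile(), then call get_digest() on it and convert the result with digestToString() as in the test program.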
A run of such code follows:
C:\>type c:\temp\hello_world.txt
hello world

[75] >>> 75
>>> fileName []: c:\temp\hello_world.txt
UtilityFunctions <<< digestString ==>5EB63BBBE01EEED093CB22BB8F5ACDC3<== line: 24738

C:\>type c:\temp\that_is_all_folks.txt
that is all folks

[75] >>> 75
>>> fileName [c:\temp\hello_world.txt]: c:\temp\that_is_all_folks.txt
UtilityFunctions <<< digestString ==>6738E0FE00CCA8869D6B502E86BD36A8<== line: 24738
We can then use a web site to compare results. I found MD5 Hash Generator and used it to test as follows:
Your Hash: 5eb63bbbe01eeed093cb22bb8f5acdc3
Your String: hello world

main >>> str: hello world
main <<< str ==>hello world<==
main <<< digest ==>5EB63BBBE01EEED093CB22BB8F5ACDC3<==

Your Hash: 6738e0fe00cca8869d6b502e86bd36a8
Your String: that is all folks

main >>> str: that is all folks
main <<< str ==>that is all folks<==
main <<< digest ==>6738E0FE00CCA8869D6B502E86BD36A8<==
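As you can see, the same text produces the same MD5 hash in the original C code, in the Boost code and on the web site (apart from the letter case of the hexadecimal digits).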
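Since the web site prints lowercase hexadecimal digits and both programs print uppercase, an automated comparison needs to ignore case. A minimal sketch follows; sameDigest() is a hypothetical helper name and not part of the DLL or of Boost.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Compare two hexadecimal digest strings, ignoring letter case.
bool sameDigest(const std::string &a, const std::string &b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Cast to unsigned char first; passing a negative char to
        // std::toupper is undefined behavior.
        const int ca = std::toupper(static_cast<unsigned char>(a[i]));
        const int cb = std::toupper(static_cast<unsigned char>(b[i]));
        if (ca != cb) return false;
    }
    return true;
}
```

For example, sameDigest() treats the lowercase web site hash for hello world and the uppercase digest printed by the test program as equal.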
At this point I will finish this post. I will not cover the code that I will be developing for the storage server APIs in C++.
For ease of use, if interested, you may find the C++ code for this post in my GitHub repository.
If you have comments or questions regarding this or any other post in this blog, or if you would like me to help with any phase in the SDLC (Software Development Life Cycle) of a product or service, please do not hesitate and leave me a note below. Requests for help will remain private.
Keep on reading and experimenting. It is the best way to learn!
John
Follow me on Twitter: @john_canessa