Block Storage vs. File Storage – Part 1

In this post we will not be solving a problem yet. This post is about code I wrote to experiment with differences between file and block storage.

In this post we will start by writing some data into individual files. This will lay the groundwork for some differences that arise when you access similar data in block mode, possibly in a cloud storage setting.

This code has been written in C on a Windows platform. For ease of development, I have used some previous code found in a set of libraries which at this point I am not allowed to disclose. That said, when we encounter such calls I will suggest ways you can replace them with much simpler code. Sorry for the inconvenience this may cause.

In this post we will write binary data to three files. When all is said and done we will take a quick look at the contents of the files. We will discuss what makes sense and, most importantly, what works for some operations when using independent files.

In block storage, the data is stored without any metadata e.g. data format, type, ownership, etc. 
The ability to store data in blocks delivers structured workloads such as databases, applications, etc. ... 
Consequently, this makes block storage faster than other storage.

I do not recall where I found these sentences. The idea is that block storage can, and in most applications should, be used over file storage. In general it provides faster reads and writes, which these days seems to be a very desirable feature in most applications and systems.

>>> specify a path [c:\temp]:
fileStorage <<< path ==>c:\temp<== line: 61913
>>> number of threads / files [3]:
fileStorage <<< threadCount: 3 line: 61934
fileStorage <<< all threads have finished !!! line: 62010
fileStorage <<< n: 32 line: 62030
displayContents <<< contents of fileName ==>c:\temp\a.bin<== follow:
0 1 2 3 4 5 6 7
8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
:::: :::: :::: ::::
96 97 98 99 100 101 102 103
104 105 106 107 108 109 110 111
112 113 114 115 116 117 118 119
120 121 122 123 124 125 126 127
displayContents <<< contents of fileName ==>c:\temp\b.bin<== follow:
1 3 5 7 9 11 13 15
17 19 21 23 25 27 29 31
33 35 37 39 41 43 45 47
49 51 53 55 57 59 61 63
:::: :::: :::: ::::
193 195 197 199 201 203 205 207
209 211 213 215 217 219 221 223
225 227 229 231 233 235 237 239
241 243 245 247 249 251 253 255
displayContents <<< contents of fileName ==>c:\temp\c.bin<== follow:
0 2 4 6 8 10 12 14
16 18 20 22 24 26 28 30
32 34 36 38 40 42 44 46
48 50 52 54 56 58 60 62
:::: :::: :::: ::::
192 194 196 198 200 202 204 206
208 210 212 214 216 218 220 222
224 226 228 230 232 234 236 238
240 242 244 246 248 250 252 254

This is output from test code I wrote on a test scaffold I have been working on and enhancing for many years. I took out the entry function and moved it to a short and simple main.c file. This is why the line numbers shown in the output do not match the lines in the main.c file. I just could not move my test scaffold to GitHub.com, but a single file did make sense.

Our test scaffold calls the fileStorage() function of interest. That function prompts for a folder in which we will be writing three files with some binary data. In this example I just accepted the default path.

The function of interest then prompts for the number of threads we wish to use. The default is three, so I took it. As we will see in the code shortly, some of the prompts may ignore your input. I did so for this first pass and will make modifications in future implementations as needed.

The function of interest starts three threads that run the same code. The arguments, as we will shortly see, are different.

// **** block and file storage ****
#define	STORAGE_FILE_SIZE			1024
#define	STORAGE_MIN_IO_SIZE			7
#define	STORAGE_MAX_IO_SIZE			31
#define STORAGE_THREAD_COUNT		3

#define STORAGE_ODD_AND_EVEN		0
#define STORAGE_ODD					1
#define	STORAGE_EVEN				2

#define FILE_STORE_THREAD_LEN		1024
#define	STORAGE_WAIT_FOR_THREADS	(1024L * 3L)
#define STORAGE_HEAD_TAIL_LEN		32

The first thread creates and populates a binary file with the values in the range [0 : 127], in monotonically ascending order. The data is written to the a.bin file. The second thread writes the odd values in the range [1 : 255] to the b.bin file. Finally, the third thread writes the even values in the range [0 : 254] to the c.bin file. As we will find out when we take a look at the actual code, the end values may differ between runs because the number of values written on each pass is random and the random number generator is seeded from the clock.

In conclusion we have three threads that write values to three separate files. The values are in ascending order.
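The relationship between a position in a file and the value stored there can be sketched as follows. This is only an illustration; nthValue is a hypothetical helper that does not appear in the original code, and the three selector values simply mirror the constants shown above.

```c
/* Same selector values as the constants shown above. */
#define STORAGE_ODD_AND_EVEN 0
#define STORAGE_ODD          1
#define STORAGE_EVEN         2

/* Hypothetical helper (not part of the original code): returns the value
   stored at zero-based position `index` in the file written for `mode`. */
long long nthValue(int mode, long long index)
{
    switch (mode)
    {
    case STORAGE_ODD:
        return 2 * index + 1;       /* 1, 3, 5, ... */
    case STORAGE_EVEN:
        return 2 * index;           /* 0, 2, 4, ... */
    default:
        return index;               /* 0, 1, 2, ... */
    }
}
```

In the run shown above each file ended up holding 128 values (1024 bytes at 8 bytes per value), so position 127 gives 127, 255 and 254, which matches the last entry displayed for a.bin, b.bin and c.bin.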

int __cdecl	main	(
					int		argc,
					char	*argv[]
					)

//	***************************************************************@@
//	- Test the operation of a specific function.
//	*****************************************************************

{

int     retVal,									            // returned by this function
        status;									            // returned by function calls

// **** initialization ****
retVal  = 0;												// hope all goes well

// **** call the function of interest ****
status = fileStorage();
if (status != 0)
    {
    SysLog(EVENT_ERROR,
    "main <<< fileStorage status: %d line: %d file ==>%s<==\n",
    status, __LINE__, __FILE__);
    retVal = status;
    goto done;
    }  

// **** clean up ****
done:

// **** inform user what went on ****
if (traceExecution != 0 || retVal != 0)
	SysLog(EVENT_INFO,
	"main <<< retVal: %d line: %d file ==>%s<==\n",
	retVal, __LINE__, __FILE__);

// **** inform caller what went on ****
return retVal;
}

I created these constants based on what I perceived the parameters for the first pass would be. Each thread writes a group of values on each pass, and the number of values in a group falls in the range [STORAGE_MIN_IO_SIZE : STORAGE_MAX_IO_SIZE]. In other words, when the threads write to the files, they write a group of values at a time. At this point that is not very important, but it will be of interest in later posts, when all threads write to the same file to simulate a block file.

The rest of the definitions will make sense as we look at the code for the different functions.
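As a minimal sketch of how a group size in the closed range [STORAGE_MIN_IO_SIZE : STORAGE_MAX_IO_SIZE] could be drawn, consider the following. The helper name pickIoCount is an assumption made for illustration; it is not a function from the project.

```c
#include <stdlib.h>

#define STORAGE_MIN_IO_SIZE 7
#define STORAGE_MAX_IO_SIZE 31

/* Hypothetical helper: number of values to write on one pass.
   The result always falls in [STORAGE_MIN_IO_SIZE : STORAGE_MAX_IO_SIZE]. */
unsigned long pickIoCount(void)
{
    /* number of distinct sizes in the closed range */
    unsigned long span = STORAGE_MAX_IO_SIZE - STORAGE_MIN_IO_SIZE + 1;

    return (unsigned long)rand() % span + STORAGE_MIN_IO_SIZE;
}
```

The modulo keeps the random offset in [0 : span - 1], and adding the minimum shifts it into the desired range.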

As I mentioned earlier, I am using a test scaffold that I started developing some time ago. For simplicity I just copied and pasted some code and created this main.c file. Let’s take a few moments to describe what is going on.

Our test code declares a couple of variables and initializes retVal to 0. If all goes well, our test scaffold returns 0; otherwise it returns a non-zero value to indicate that something failed. This is typical on most platforms when you invoke a program that returns a status code.

The function of interest is called. It returns a status value. If the status is 0 all is well so far; otherwise we display a message that contains the function in which the issue occurred, the name of the function that failed, the status value it returned, and the line and file name where the issue was detected. Note that we make a call to a function named SysLog. That function is used to write different level messages to a log file. The message with additional information is appended to a file with a specific name. When the file reaches a certain capacity, a new log file is created and the messages are directed to it. The number of log files per day can be adjusted. The number of days the log files are kept is also configurable. In our case, just replace the function with printf and drop the first argument.
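If you prefer to keep the call sites unchanged instead of editing each one, a small stand-in along these lines might do. This is a sketch under assumptions: SysLogLite is a hypothetical name, and the numeric values of EVENT_INFO and EVENT_ERROR are made up because the real library headers are not shown in this post. It has no log files and no rotation.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical severity levels; the real library's values are not shown. */
#define EVENT_INFO  0
#define EVENT_ERROR 1

/* Minimal stand-in for SysLog: formats the message like printf and sends
   errors to stderr and everything else to stdout.
   Returns the number of characters written, or a negative value on error. */
int SysLogLite(int level, const char *format, ...)
{
    va_list args;
    FILE    *out = (level == EVENT_ERROR) ? stderr : stdout;
    int     written;

    va_start(args, format);
    written = vfprintf(out, format, args);
    va_end(args);

    return written;
}
```

With this in place, a call such as SysLogLite(EVENT_INFO, "fileStorage <<< n: %d line: %d\n", n, __LINE__) behaves like the printf replacement described above.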

The retVal variable is set to the status value, and the code jumps to the done: label. In our case we do not have any specific cleanup, so our code displays a message if one of the conditions is met and returns retVal.

int __stdcall	fileStorage	(
							void
							)

//	***************************************************************@@
//	- This function exercises file storage operations using three
//  threads.
//	*****************************************************************

{

#ifdef _CODE_DEBUG
//int					traceExecution = 1;						// for testing only
#endif

char					buffer[BUFSIZ],							// general purpose buffer
						errorString[BUFSIZ],					// error string
						path[BUFSIZ];							// file path

FILE_STORE_THREAD_ARG	*threadArg;								// thread argument

HANDLE					threadHandles[STORAGE_THREAD_COUNT];	// thread handle

int						lastError,								// last error
						retVal,									// returned by this function
						status;									// returned by function calls

longlong				fileSize;								// size of file to write

unsigned int			threadIDs[STORAGE_THREAD_COUNT];		// thread IDs

unsigned long			threadCount;							// number of threads to use

// **** initialization ****
retVal		= 0;												// hope all goes well

fileSize	= STORAGE_FILE_SIZE;								// for starters
lastError	= 0;												// for starters
threadArg	= NULL;												// for starters
threadCount = STORAGE_THREAD_COUNT;								// for starters

memset((void*)buffer,			(int)0x00, (size_t)sizeof(buffer));
memset((void*)errorString,		(int)0x00, (size_t)sizeof(errorString));
memset((void*)path,				(int)0x00, (size_t)sizeof(path));
memset((void*)threadHandles,	(int)0x00, (size_t)sizeof(threadHandles));

memset((void*)threadIDs,		(int)0x00, (size_t)sizeof(threadIDs));

// ???? ????
if (traceExecution != 0)
	{
	SysLog(EVENT_INFO, "fileStorage <<< sizeof(threadHandles): %lu line: %d\n", sizeof(threadHandles), __LINE__);
	SysLog(EVENT_INFO, "fileStorage <<<     sizeof(threadIDs): %lu line: %d\n", sizeof(threadIDs), __LINE__);
	}

// **** prompt for the path for the file(s) ****
strcpy(path, "c:\\temp");
printf(">>> specify a path [%s]: ", path);

// **** get the response from the user ****
if (fgets(buffer, BUFSIZ, stdin) == NULL)
	{
	SysLog(EVENT_ERROR,
	"fileStorage <<< fgets line: %d file ==>%s<==\n",
	__LINE__, __FILE__);
	retVal = WAR_INTERNAL_ERROR;
	goto done;
	}

// **** remove the CR ****
buffer[strlen(buffer) - 1] = '\0';
if (*buffer != '\0')
	strcpy(path, buffer);
else
	strcpy(path, "c:\\temp");

// ???? ????
SysLog(EVENT_INFO, "fileStorage <<< path ==>%s<== line: %d\n", path, __LINE__);

// **** prompt for the number of threads ****
printf(">>> number of threads / files [%lu]: ", threadCount);

// **** get the response from the user ****
if (fgets(buffer, BUFSIZ, stdin) == NULL)
	{
	SysLog(EVENT_ERROR, 
	"fileStorage <<< fgets line: %d file ==>%s<==\n",__LINE__, __FILE__);
	retVal = WAR_INTERNAL_ERROR;
	goto done;
	}

// **** remove the CR ****
buffer[strlen(buffer) - 1] = '\0';
if (*buffer != '\0')
	threadCount = atol(buffer);
threadCount = STORAGE_THREAD_COUNT;								// first pass: ignore the reply and use the default

// ???? ????
SysLog(EVENT_INFO, "fileStorage <<< threadCount: %lu line: %d\n", threadCount, __LINE__);

// **** loop starting a file storage thread at a time ****
for (int i = 0; i < threadCount; i++)
	{

	// **** allocate the thread argument ****
	threadArg = (FILE_STORE_THREAD_ARG*)calloc(	(size_t)1,
												(size_t)sizeof(FILE_STORE_THREAD_ARG));
	if (threadArg == (FILE_STORE_THREAD_ARG*)NULL)
		{
		SysLog(EVENT_ERROR,
		"fileStorage <<< calloc threadArg line: %d file ==>%s<==\n",
		__LINE__, __FILE__);
		retVal = -1;
		goto done;
		}

	// **** set the thread argument ****
	threadArg->fileSize = fileSize;
	strncpy(threadArg->path, path, strlen(path));

	// **** set the odd or even flag ****
	switch (i)
		{
		case 0:
			threadArg->oddOrEven = STORAGE_ODD_AND_EVEN;
		break;

		case 1:
			threadArg->oddOrEven = STORAGE_ODD;
		break;

		case 2:
			threadArg->oddOrEven = STORAGE_EVEN;
		break;

		default:
			SysLog(EVENT_ERROR,
			"fileStorage <<< UNEXPECTED i: %d line: %d file ==>%s<==\n",
			i, __LINE__, __FILE__);
			retVal = -1;
			goto done;
		break;
		}
	
	// **** start this thread ****
	threadHandles[i] = (HANDLE)_beginthreadex(	(void*)NULL,	// security descriptor
												(unsigned)0,	// stack size
												(SENCOR_THREAD_START)FileStoreThread,
												(void*)threadArg,

												0,				// running
												&threadIDs[i]);

	// **** check if something went wrong ****
	if (threadHandles[i] == (HANDLE)0)
		{
		strcpy(errorString, strerror(errno));
		SysLog(EVENT_ERROR,
		"fileStorage <<< _beginthreadex FileStoreThread errno: %d errorString ==>%s<== line: %d file ==>%s<==\n",
		errno, errorString, __LINE__, __FILE__);
		retVal = -1;
		goto done;
		}
	}

// **** wait for all threads to exit ****
status = WaitForMultipleObjects(sizeof(threadHandles) / sizeof(HANDLE),	// number of handles in array
								threadHandles,					// object-handle array
								(BOOL)(1 == 1),					// wait option
								STORAGE_WAIT_FOR_THREADS);		// time-out interval
switch (status)
	{
	case WAIT_OBJECT_0:
	case WAIT_OBJECT_0 + 1:
		SysLog(EVENT_INFO, "fileStorage <<< all threads have finished !!! line: %d\n", __LINE__);
	break;

	case WAIT_ABANDONED_0:
	case WAIT_FAILED:
	default:
		lastError = GetLastError();								// get error code
		PrintError(lastError);									// display string for error
		SysLog(EVENT_ERROR,
		"fileStorage <<< WaitForMultipleObjects status: %d lastError: %d line: %d file ==>%s<==\n", 
		status, lastError, __LINE__, __FILE__);
		retVal = status;
		goto done;
	break;
	}

// **** display `n` top and bottom contents of each file ****
int n = STORAGE_HEAD_TAIL_LEN;

// ???? ????
SysLog(EVENT_INFO, "fileStorage <<< n: %d line: %d\n", n, __LINE__);

for (int i = 0; i < 3; i++)
	{
	status = displayContents(i, n);
	if (status != 0)
		{
		SysLog(EVENT_ERROR,
		"fileStorage <<< displayContents status: %d i: %d line: %d file ==>%s<==\n",
		status, i, __LINE__, __FILE__);
		retVal = status;
		goto done;
		}
	}

// **** clean up ****
done:

// **** inform the user what is going on ****
if (traceExecution != 0 || retVal != 0)
	SysLog(EVENT_INFO,
	"fileStorage <<< retVal: %d line: %d file ==>%s<==\n",
	retVal, __LINE__, __FILE__);

// **** inform the caller what went on ****
return retVal;
}

Our function declares and initializes a few variables. As a convention in our test scaffold, we use the traceExecution variable in each file (in our case just a single variable) to enable or disable trace messages that display on the console and also write an entry in the log file.

Note that each function also declares a traceExecution variable, but it is commented out inside the block guarded by the _CODE_DEBUG symbol, which is globally defined. If you are interested in enabling the trace messages in a single function, you can do so manually by uncommenting the associated definition of traceExecution, which is already set to one.

Our code prompts for a path in which to create the binary files. The default, which should have been placed in the definitions section, is c:\temp. We can change it as needed.

The code then prompts for the number of threads to use. The default is three. Note that at this point in time the reply is ignored and the value is reset to three.
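Should a later pass decide to honor the reply, the parsing could look something like the sketch below. The function parseThreadCount is hypothetical and not part of the project; it falls back to the default when the reply is empty, non-numeric, or not positive.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: turn the reply read by fgets() into a thread count.
   Falls back to defaultCount when the reply is empty or not a positive number. */
unsigned long parseThreadCount(const char *reply, unsigned long defaultCount)
{
    char    text[64];
    char    *end = NULL;
    long    value;

    /* work on a copy and strip the trailing line terminator, if any */
    strncpy(text, reply, sizeof(text) - 1);
    text[sizeof(text) - 1] = '\0';
    text[strcspn(text, "\r\n")] = '\0';

    /* an empty reply means the user accepted the default */
    if (text[0] == '\0')
        return defaultCount;

    /* reject anything that is not a complete positive decimal number */
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || value <= 0)
        return defaultCount;

    return (unsigned long)value;
}
```

Keep in mind that in the code above the handle and ID arrays are sized for STORAGE_THREAD_COUNT entries and the switch only knows three cases, so a caller would still need to cap the result at three until those are generalized.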

We then enter a loop in which we allocate an argument variable for each thread. The argument is initialized and the thread is started.

We check if something goes wrong with starting the thread. Note that the structure of this function follows a pattern that initializes variables, performs the work, cleans up and returns a retVal value. If all is well it returns a 0.

After the loop has started the three threads, it waits until all the threads are done. 

At this point we are ready to display the top n and the bottom n values in the three files that were generated by the three threads. We do this in a loop which calls the displayContents function. The file of interest is selected by the first argument and the number of values to display by the second.

After all is said and done, the function displays a message if one of the conditions is met and returns the value held by the retVal variable.

int __stdcall	FileStoreThread		(
									FILE_STORE_THREAD_ARG	*threadArg
									)

//	***************************************************************@@
//	- This thread populates a file with longlong numbers in ascending
//	order.
//	*****************************************************************

{

#ifdef _CODE_DEBUG
//int			traceExecution = 1;								// for testing only
#endif

char			fileName[BUFSIZ];								// file name

int				fd,												// file descriptor
				retVal,											// returned by this thread
				status;											// returned by function calls

longlong		data[STORAGE_MAX_IO_SIZE],						// data to write to file
				fileSize,										// size of file to write
				value;											// value to write to file

unsigned int	size;											// size of data to write to file

unsigned long	count,											// count of values to write
				threadID;										// our thread ID

// **** initialization ****
retVal			= 0;											// hope all goes well

count			= 0L;											// for starters
fd				= -1;											// for starters
size			= 0;											// for starters
status			= 0;											// for starters

threadID		= 0;											// for starters
value			= 0;											// for starters

memset((void*)data,		(int)0x00, (size_t)sizeof(data));
memset((void*)fileName,	(int)0x00, (size_t)sizeof(fileName));

// **** for ease of use ****
fileSize = threadArg->fileSize;

// ???? ????
if (traceExecution != 0)
	SysLog(EVENT_INFO, "FileStoreThread <<< fileSize: %I64d line: %d\n", fileSize, __LINE__);

// **** get the thread ID ****
threadID = GetCurrentThreadId();

// **** generate the full path file name ****
status = sprintf(	fileName,
					"%s\\",
					threadArg->path);

// **** generate the file name ****
if (threadArg->oddOrEven == STORAGE_ODD_AND_EVEN)
	strcat(fileName, "a.bin");
else if (threadArg->oddOrEven == STORAGE_ODD)
	strcat(fileName, "b.bin");
else if (threadArg->oddOrEven == STORAGE_EVEN)
	strcat(fileName, "c.bin");
else
	{
	SysLog(EVENT_ERROR,
	"FileStoreThread <<< UNEXPECTED threadArg->oddOrEven: %d line: %d file ==>%s<==\n",
	threadArg->oddOrEven, __LINE__, __FILE__);
	retVal = threadArg->oddOrEven;
	goto done;
	}

// ???? ????
if (traceExecution != 0)
	SysLog(EVENT_INFO, "FileStoreThread <<< fileName ==>%s<== line: %d\n", fileName, __LINE__);

// **** remove output file (to start fresh) ****
status = _unlink(fileName);
if (status != 0 && errno != ENOENT)
	{
	SysLog(EVENT_ERROR,
	"FileStoreThread <<< _unlink status: %d fileName ==>%s<== errno: %d line: %d file ==>%s<==\n",
	status, fileName, errno, __LINE__, __FILE__);
	retVal = status;
	goto done;
	}

// **** create /open output file ****
fd = _open(	fileName,
			_O_CREAT | _O_RDWR | _O_BINARY | _O_EXCL,
			0777);
if (fd == -1)
	{
	SysLog(EVENT_ERROR,
	"FileStoreThread <<< _open fileName ==>%s<== errno: %d line: %d file ==>%s<==\n",
	fileName, errno, __LINE__, __FILE__);
	retVal = fd;
	goto done;
	}

// **** determine the first value to store in the file ****
if (threadArg->oddOrEven == STORAGE_ODD)
	value = 1;
else
	value = 0;

// **** seed the random number generator ****
srand((unsigned)time(NULL));

// ***** loop writing data to the output file ****
for (longlong i = 0; i < fileSize; )
	{

	// **** number of values to write ****
	count = rand() % (STORAGE_MAX_IO_SIZE - STORAGE_MIN_IO_SIZE + 1) + STORAGE_MIN_IO_SIZE;

	// ???? ????
	if (traceExecution != 0)
		SysLog(EVENT_INFO, "FileStoreThread <<< i: %I64d count: %lu line: %d\n", i, count, __LINE__);

	// **** populate array with values ****
	for (int j = 0; j < count; j++)
		{

		// **** set value in array ****
		data[j] = value;

		// **** increment value ****
		value++;
		if (threadArg->oddOrEven == STORAGE_ODD || threadArg->oddOrEven == STORAGE_EVEN)
			value++;
		}

	// **** write values to file ****
	size = count * sizeof(longlong);
	status = _write(fd,
					data,
					size);
	if (status != (int)size)
		{
		SysLog(EVENT_ERROR, 
		"FileStoreThread <<< _write status: %d size: %u errno: %d line: %d file ==>%s<==\n",
		status, size, errno, __LINE__, __FILE__);
		retVal = status;
		goto done;
		}

	// **** increment i ****
	i += (longlong)count * sizeof(longlong);
	}

// **** clean up ****
done:

// **** close output file ****
if (fd != -1)
	{
	status = _close(fd);
	if (status != 0)
		{
		SysLog(EVENT_ERROR, 
		"FileStoreThread <<< _close fd: %d errno: %d line: %d file ==>%s<== \n",
		fd, errno, __LINE__, __FILE__);
		retVal = -1;
		}
	}

// **** free the thread argument ****
if (threadArg != (FILE_STORE_THREAD_ARG*)NULL)
	free((void*)threadArg);

// **** inform the user what is going on ****
if (traceExecution != 0 || retVal != 0)
	SysLog(EVENT_INFO,
	"FileStoreThread <<< threadID: %ld retVal: %d line: %d file ==>%s<==\n",
	threadID, retVal, __LINE__, __FILE__);

// **** allow other threads to execute ****
//SLOW_DOWN;

// **** exit this thread (started with _beginthreadex, so use _endthreadex) ****
_endthreadex((unsigned)retVal);
return retVal;
}

This is the code for the thread that writes values to a specific file. The name of the file is determined by the oddOrEven field in the thread argument.

The threadID is used to identify the thread executing.

The function generates the name for the file it will write to. This thread writes all its data to a single file.

The function removes the previous instance of the file (if any).

The target file is created. The initial value to be written is set.

The random number generator is seeded. Note that time(NULL) does return the current time; passing NULL only means the result is not also stored through a pointer, so the seed changes from run to run.
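For reference, time(NULL) returns the current calendar time, and the NULL argument only tells it not to store the result through a pointer as well. What matters for this experiment is that the same seed always produces the same sequence. The tiny sketch below shows this; firstRandForSeed is a hypothetical helper used only for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: seed the generator and return the first value drawn.
   Calling it twice with the same seed returns the same number. */
int firstRandForSeed(unsigned int seed)
{
    srand(seed);
    return rand();
}
```

One consequence is that threads which seed from the clock within the same second may end up drawing identical group sizes, which could explain why the three files in the run above have the same length.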

We enter a loop in which our function will write enough data to reach a minimum size of 1024 bytes. The idea is to have different threads write different numbers of values.

The function decides on how many values will be written to the file on each pass. There is a minimum and a maximum number. The data array is populated with the desired number of fresh values.

The contents of the array are written to the file of interest. The loop counter is then incremented.
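If you do not have access to the Windows low-level I/O calls, the same loop can be approximated with standard C streams. The following is a sketch, not the project's code: writeAscending and its parameters are assumptions, and it takes a first value and a step instead of the oddOrEven selector.

```c
#include <stdio.h>
#include <stdlib.h>

#define STORAGE_MIN_IO_SIZE 7
#define STORAGE_MAX_IO_SIZE 31

/* Hypothetical portable variant of the write loop.
   Writes ascending values (first, first + step, ...) in random-sized groups
   until at least fileSize bytes are in the file.
   Returns the number of bytes written, or -1 on error. */
long long writeAscending(const char *fileName, long long first,
                         long long step, long long fileSize)
{
    long long   data[STORAGE_MAX_IO_SIZE];
    long long   value = first;
    long long   written = 0;
    FILE        *fp = fopen(fileName, "wb");

    if (fp == NULL)
        return -1;

    while (written < fileSize)
    {
        /* number of values in this group, in [MIN : MAX] */
        size_t count = (size_t)rand() % (STORAGE_MAX_IO_SIZE - STORAGE_MIN_IO_SIZE + 1)
                       + STORAGE_MIN_IO_SIZE;

        /* populate the group with fresh values */
        for (size_t j = 0; j < count; j++, value += step)
            data[j] = value;

        /* write the whole group with a single call */
        if (fwrite(data, sizeof(long long), count, fp) != count)
        {
            fclose(fp);
            return -1;
        }

        written += (long long)(count * sizeof(long long));
    }

    if (fclose(fp) != 0)
        return -1;

    return written;
}
```

Because the loop only checks the size before each group, the file can overshoot the requested size by up to one group, which is why the size is best thought of as a minimum.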

When all is said and done the output file is closed and the memory allocated for the thread argument is freed. The data array lives on the stack, so it needs no explicit cleanup.

If one of the conditions is met, a message is displayed and the thread exits.

int __stdcall	displayContents	(
								int		i,
								int		n
								)

//	***************************************************************@@
//	- This function dumps the specified number of lines on the top
//  and bottom of the associated file.
//	*****************************************************************

{

#ifdef _CODE_DEBUG
//int			traceExecution = 1;								// for testing only
#endif

char			errorString[BUFSIZ],							// error string
				fileName[BUFSIZ];								// full path of file to be displayed

int				fd,												// file descriptor
				retVal,											// returned by this thread
				status;											// returned by function calls

longlong		*data;											// to read from file

struct _stati64	statBuffer;										// file status data structure

unsigned long	bytesRead,										// number of bytes read
				bytesToRead,									// number of bytes to read
				offset;											// file offset

// **** initialization ****
retVal = 0;														// hope all goes well

data	= NULL;													// for starters
fd		= -1;													// for starters
offset	= (unsigned long)0L;									// for starters

memset((void*)errorString,	(int)0x00, (size_t)sizeof(errorString));
memset((void*)fileName,		(int)0x00, (size_t)sizeof(fileName));
memset((void*)&statBuffer,	(int)0x00, (size_t)sizeof(statBuffer));

// **** compute number of bytes to read ****
bytesToRead = n * sizeof(longlong);

// ???? ????
if (traceExecution != 0)
	SysLog(EVENT_INFO, "displayContents <<< bytesToRead: %lu line: %d\n", bytesToRead, __LINE__);

// **** allocate data buffer ****
data = (longlong*)calloc(	(size_t)1,
							(size_t)bytesToRead);
if (data == (longlong*)NULL)
	{
	SysLog(EVENT_ERROR,
	"displayContents <<< calloc bytesToRead: %lu line: %d file ==>%s<==\n",
	bytesToRead, __LINE__, __FILE__);
	retVal = -1;
	goto done;
	}

// **** generate name of file ****
if (i == STORAGE_ODD_AND_EVEN)
	status = sprintf(fileName, "c:\\temp\\a.bin");
else if (i == STORAGE_ODD)
	status = sprintf(fileName, "c:\\temp\\b.bin");
else if (i == STORAGE_EVEN)
	status = sprintf(fileName, "c:\\temp\\c.bin");
else
	{
	SysLog(EVENT_ERROR,
	"displayContents <<< UNEXPECTED i: %d line: %d file ==>%s<==\n",
	i, __LINE__, __FILE__);
	retVal = i;
	goto done;
	}

// ???? ????
if (traceExecution != 0)
	SysLog(EVENT_INFO, "displayContents <<< fileName ==>%s<== line: %d\n", fileName, __LINE__);

// **** open file of interest ****
fd = _open(	fileName,
			_O_RDONLY | _O_BINARY | _O_EXCL);
if (fd == -1)
	{
	SysLog(EVENT_ERROR,
	"displayContents <<< _open fileName ==>%s<== errno: %d line: %d file ==>%s<==\n",
	fileName, errno, __LINE__, __FILE__);
	retVal = errno;
	goto done;
	}

// **** display header info ****
printf("displayContents <<< contents of fileName ==>%s<== follow:\n", fileName);

// **** read top n lines ****
bytesRead = _read(	fd,
					data,
					bytesToRead);								// number of bytes read
if (bytesRead != bytesToRead)
	{
	SysLog(EVENT_ERROR,
	"displayContents <<< _read bytesRead: %lu fileName ==>%s<== errno: %d line: %d file ==>%s<==\n",
	bytesRead, fileName, errno, __LINE__, __FILE__);
	retVal = errno;
	goto done;
	}

// **** loop displaying top n lines ****
for (int i = 1; i <= n; i++)
	{

	// **** ****
	printf("%I64d ", data[i - 1]);

	// **** line separator ****
	if (i != 0 && i % 8 == 0)
		printf("\n");
	}

// **** terminate last line ****
if (n % 8 != 0)
	printf("\n");

// ***** get the size of the file ****
status = _stati64(	fileName,
					&statBuffer);
if (status != 0)
	{
	strcpy(errorString, strerror(errno));
	EventLog(EVENT_ERROR,
	"displayContents <<< _stati64 fileName ==>%s<== errno: %d errorString ==>%s<== line: %d file ==>%s<==\n",
	fileName, errno, errorString, __LINE__, __FILE__);
	retVal = errno;
	goto done;
	}

// ???? ????
if (traceExecution != 0)
	SysLog(EVENT_INFO, "displayContents <<< statBuffer.st_size: %I64d line: %d\n", statBuffer.st_size, __LINE__);

// **** seek to the start of the bottom n lines in this file ****
offset = _lseek(fd,
				-(long)bytesToRead,
				SEEK_END);
if (offset == (unsigned long)-1L)
	{
	strcpy(errorString, strerror(errno));
	EventLog(EVENT_ERROR,
	"displayContents <<< _lseek offset: %ld bytesToRead: %ld errno: %d errorString ==>%s<== line: %d file ==>%s<==\n", 
	offset, (long)bytesToRead, errno, errorString, __LINE__, __FILE__);
	retVal = errno;
	goto done;
	}

// ???? ????
if (traceExecution != 0)
	SysLog(EVENT_INFO, "displayContents <<< offset: %lu fileName ==>%s<== line: %d\n", offset, fileName, __LINE__);

// **** read bottom n values ****
bytesRead = _read(	fd,
					data,
					bytesToRead);								// number of bytes read
if (bytesRead != bytesToRead)
	{
	strcpy(errorString, strerror(errno));
	SysLog(EVENT_ERROR,
	"displayContents <<< _read bytesRead: %lu bytesToRead: %lu fileName ==>%s<== errno: %d errorString ==>%s<== line: %d file ==>%s<==\n",
	bytesRead, bytesToRead, fileName, errno, errorString, __LINE__, __FILE__);
	retVal = errno;
	goto done;
	}

// **** display separator line ****
printf(":::: :::: :::: ::::\n");

// **** loop displaying bottom n lines ****
for (int i = 1; i <= n; i++)
	{

	// **** display value ****
	printf("%I64d ", data[i - 1]);

	// **** line separator ****
	if (i != 0 && i % 8 == 0)
		printf("\n");
	}

// **** terminate last line ****
if (n % 8 != 0)
	printf("\n");

// **** clean up ****
done:

// **** close output file ****
if (fd != -1)
	{
	status = _close(fd);
	if (status != 0)
		{
		SysLog(EVENT_ERROR,
		"displayContents <<< _close fd: %d errno: %d line: %d file ==>%s<== \n",
		fd, errno, __LINE__, __FILE__);
		retVal = -1;
		}
	}

// **** free the data buffer ****
if (data != (longlong*)NULL)
	free((void*)data);

// **** inform the user what is going on ****
if (traceExecution != 0 || retVal != 0)
	SysLog(EVENT_INFO,
	"displayContents <<< retVal: %d line: %d file ==>%s<==\n",
	retVal, __LINE__, __FILE__);

// **** inform the caller what went on ****
return retVal;
}

This is the function used to display the top n and bottom n values in the specified file.

After declarations and initialization the function computes the number of bytes it needs to read. A buffer is allocated.

The name of the file is generated. Note that the c:\temp folder is hard-coded here, so the function only finds the files when the default path was accepted. The file is then opened.

The first n entries in the file are read into the data buffer. The values are displayed using a space as a separator.

We then get the size of the file of interest. Since the contents of the file were written by a single thread, all the values are unique and in order.

The code then seeks to the proper offset to read the last n values. The values are then displayed using the same approach we used for the first n values. I guess we could refactor the display loop into a single function. I will do so for the next post.
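One possible shape for that shared display function is sketched below. The name printValues is an assumption, and the sketch uses the standard %lld conversion instead of the Microsoft-specific %I64d so that it also builds with other compilers.

```c
#include <stdio.h>

/* Hypothetical helper: print n values separated by spaces, eight per row.
   Returns the number of rows printed. */
int printValues(const long long *data, int n)
{
    int rows = 0;

    for (int i = 1; i <= n; i++)
    {
        printf("%lld ", data[i - 1]);

        /* end the row after every eighth value */
        if (i % 8 == 0)
        {
            printf("\n");
            rows++;
        }
    }

    /* terminate a partially filled last row */
    if (n % 8 != 0)
    {
        printf("\n");
        rows++;
    }

    return rows;
}
```

The function could then be called once after reading the first n values and once more after reading the last n values.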

The function then releases the allocated resources, checks if a message needs to be displayed / logged, and returns the value in the retVal variable.

I call your attention to the fact that we can seek to an offset that depends only on the size of the file and display the last n values. What if all threads were writing to a single file (simulating a block file) and we wanted to get the last n values written by each of the three threads? We will explore that problem in part three of this set of posts.
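To make the point concrete, here is a minimal portable sketch of reading the last n values from a file with a single writer. It relies on standard C streams rather than the _lseek and _read calls used above, and readTail is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical helper: read the last n values of a file written by a single
   writer. Only the end of the file is needed to locate them.
   Returns 0 on success, or -1 on failure. */
int readTail(const char *fileName, long long *out, size_t n)
{
    int     result = -1;
    long    offset = (long)(n * sizeof(long long));
    FILE    *fp = fopen(fileName, "rb");

    if (fp == NULL)
        return -1;

    /* a negative offset from SEEK_END lands on the first of the last n values */
    if (fseek(fp, -offset, SEEK_END) == 0 &&
        fread(out, sizeof(long long), n, fp) == n)
        result = 0;

    fclose(fp);
    return result;
}
```

This works because the values were laid out contiguously by one writer. Once several threads interleave their writes in one file, the end of the file alone no longer tells us which values belong to which thread.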

In the next post we will implement the code for all threads to write to a single file and will keep track of the offsets and number of records on each write. I agree, things are getting complicated but performance might be improving.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository named BlockVsFileStorage.

Please note that the code presented here might not be the best possible solution. In most cases, after solving the problem, you can take a look at the best accepted solution provided by the different websites (e.g., Codility, HackerRank, LeetCode). In this post this is not the case.

If you have comments or questions regarding this, or any other post in this blog, please do not hesitate and leave me a note below. I will reply as soon as possible.

Keep on reading and experimenting. It is one of the best ways to learn, become proficient, refresh your knowledge and enhance your developer / engineering toolset.

Thanks for reading this post. Feel free to connect with me, John Canessa, on LinkedIn.

Enjoy;

John
