To recap on the log file sub system, the initial software was able to log from any thread in any process in a single computer messages which included values of variables. Following is a snippet from a log file:
06/14/17 07:24:19 0x0000128c - SDM <<< pollForRemoteGUIDs: 0 (bool) controlled by PollForRemoteGUIDs line: 5623 06/14/17 07:24:19 0x0000128c - SDM <<< casXferEncrypted: 0 (bool) controlled by CASXferEncrypted line: 5742 06/14/17 07:24:19 0x0000128c - 06/14/17 07:24:19 0x0000128c - SDM <<< !!!! The SDM service; part of the iCAS; is READY to process requests !!!! line: 5756 06/14/17 07:24:19 0x0000128c - 06/14/17 07:24:40 0x0000128c - SDMEnQueueRequest <<< ourIP ==>192.168.1.111<== line: 6882 06/14/17 07:24:40 0x00001890 - RequestThread <<< STARTING LoadBalancingThread line: 1896 ... 06/14/17 07:24:40 0x00001890 - RequestThread <<< STARTED LoadBalancingThread line: 1914 06/14/17 07:24:40 0x000001b4 - DoIsAlive <<< serverIP ==>192.168.1.111<== line: 7539 06/14/17 07:24:40 0x000001b4 - DoIsAlive <<< serverPort: 4444 line: 7551
The first three fields (date, time and thread ID) are inserted by the log file sub system. The text is created by client applications / servers / threads by making a single call to the sub system. By convention (not enforced by the software) WARNING messages contain the source line number that generated the message. ERROR entries should contain the line number and the name of the source file. Following is a snippet of a log file illustrating ERROR messages:
06/14/17 07:31:43 0x00001548 - SockToSendReceive <<< connect WSAETIMEDOUT remoteIP ==>192.168.1.106<== localIP ==>192.168.1.111<== port: 4444 line: 2430 file ==>c:\sencorsource\source\sock.c<== 06/14/17 07:31:43 0x00001548 - SockSendCommand <<< SockToSendReceive WAR_SOCK_TIMEOUT serverIP ==>192.168.1.106<== serverPort: 4444 requestType: 00000020 line: 3749 file ==>c:\sencorsource\source\sock.c<== 06/14/17 07:32:15 0x00001c30 - DoGetProperties <<< SemaphoreWaitFor WAR_OPERATION_TIMEOUT PROPERTIES_THREAD_DELAY: 180 seconds threadHandle: 0x00000634 guid ==>11111111112222222222333333333344<== line: 24460 file ==>c:\sdm\source\requestthread.c<==
When the entire log system stops accepting new messages because a single rouge thread is generating them out of control, it seems like an opportunity to improve the software is calling for an improvement. The sub system should easily implement a hash map with the list of threads in order to keep the times and counts. That way, the software would be able to only stop writing messages for the specified thread when a threshold is reached.
For obvious reasons I am not going to put the code that implements such feature in this post. Please keep in mind that the mechanism to determine how many messages are sent by a single thread would be the same as the one used to determine the count for all threads.
In general, software developers at all levels should keep their eyes open for opportunities to improve the software. It is quite difficult and requires some experience to determine what needs to be improved once the tasks for the current sprint are completed and there might be some time left over. Keep in mind that the team needs to know when unscheduled modifications are being implemented. In Agile environments the development pace is estimated based on what the team accomplishes. Communications is an important activity for scheduling and improving the software.
If you have a comment or questions regarding this post please leave me a note in the comments section. I will address it as soon as I get a chance.
Follow me on Twitter: @john_canessa