I tend to write about programming challenges, books that I have read, database engines, etc. Most of the posts deal with a single subject. On occasion I extend a particular topic over several posts. I have mentioned that I would like to do a few posts on developing projects that fall outside work. This post starts a set of posts on developing the different components that will allow a user to collect a set of specific images, preprocess them, train machine learning (ML) algorithms, test them and deploy the best algorithm. We can then compare the results against what a human (me in this case) produces, and against a feature that I developed for a project at work that produces the same result using a different approach.
Before we get into the high level architecture / design for our project, I wanted to comment on a feature that user experience (UX) engineers should consider for most products that interact with humans. The motivation for this comment was the article “Visual Studio Code hits 1.42, with revamped preview, and raft of …previews” by Joe Fay. It covers some of the features that have been added and some fixes.
As you might have noticed, for years I used several versions of the Eclipse IDE. I also purchased a license for the IntelliJ IDEA IDE and used it for a while. About a year ago I read about Visual Studio Code. I have been using Visual Studio for a long time to develop software for several commercial products. On one of my development machines I keep four different versions of the IDE.
As I was reading the article, I ran into some features / items that I use or that I could use in the foreseeable future. I fully understand that the product must cover many features that do not overlap, because VS Code supports a multitude of programming languages and a large number of extensions.
My suggestion for UX engineers is to separate the features into levels. That way developers who jump from one language to another, or who use different IDEs, see only the most common features, which they probably use, while the more complex ones stay hidden. I would separate the features into three levels. Not sure if I would give the levels specific names. I would just call them level 1, 2 and 3. When a new user is introduced to the product, she can learn what is available without being distracted, and probably intimidated, by so many options that at the time might seem obscure. The Settings in VS Code are an example of too many options presented at once.
When I started my first job developing software, I decided to always write how features should work before starting to code. After the code was completed I would go back and update the documentation. When documentation is short, contains simple diagrams, and is to the point, it is very useful for maintaining the product.
I have been using Microsoft Word and Visio to generate requirements, architecture, design, and testing documentation. Both products are very capable, but I just need a small set of features which could all be included in level 1. If I look for something new and do not find it at the current level, I could move up a level. I enjoy learning things, but becoming an expert on Word or Visio is not in my current set of priorities.
If there is a UX designer / manager reading this, I would appreciate getting some feedback on this suggestion. I always try to develop software that is simple to use no matter the level of the person interfacing with it.
An automated stacker provides a robotic mechanism, one or more bins, a USB color camera, and one or more disc drives. In our case the user will put some discs in the input bin and our software will move the discs, one at a time, from the input bin to the extended tray of a drive. Our stacker interface will instruct the USB camera pointing at the extended tray to take a picture of the media. The JPEG image will be processed (cropped) and, under control of the web interface, sent to the system running the web interface.
The stacker exposes a set of APIs that allow software to manipulate the picker, drives and camera. There is a DLL written in C/C++ that implements all the functionality we will require to move the discs, take and process pictures.
The web interface will expose a simple API which will allow users to manipulate the stacker and obtain a single cropped picture in JPEG format. The image will be sent back to the caller for storage and rotation. This code will be implemented in C#.
Images that may need to be rotated
The cropped images of the discs will be stored. Most of them will need to be rotated in order to get them to a right-side-up position.
We will initially use the KNIME Analytics Platform to experiment with the images. This will be a manual process. After I get a basic idea of which models I will use, I will move on to Azure ML Studio and work on the final set of models. The set might grow or shrink based on experimentation with Azure ML. The reason this task is split into two is cost. I know myself; I tend to get carried away and do not want to end up with a large bill. If this were a work project I would skip the use of KNIME.
ML models are mathematical representations of some real-world process. You can read more about it in the “Some Key Machine Learning Definitions” article by Joydeep Bhattacharjee. The idea is to select a set of model types and train them. Once they are trained we will be able to test them with a percentage of the images (about 20%) that have not been used in training the model. For any image of a CD / DVD / Blu-ray cover, the model should be able to return its orientation in degrees. Not sure how accurate we want the value to be. It would be great to make it any integer in the range [0 : 359], but that might require more data than if we only get values in the same range in steps of 5 degrees.
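To make the last two ideas a bit more concrete, here is a minimal Python sketch of holding out about 20% of the images for testing and of mapping an angle to the nearest 5 degree step. The function names, the file names and the generated angles are all made up for this illustration; the actual experiments will be done in KNIME and Azure ML, not with this code.

```python
import random


def angle_to_class(angle_degrees, step=5):
    """Map an angle to the nearest multiple of `step` in the range [0, 360)."""
    return (round(angle_degrees / step) * step) % 360


def split_dataset(items, test_fraction=0.2, seed=42):
    """Shuffle a copy of `items` and hold out `test_fraction` of it for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    test_count = round(len(shuffled) * test_fraction)
    return shuffled[test_count:], shuffled[:test_count]


# Hypothetical labeled data: (file name, orientation in degrees).
labeled = [(f"disc_{i:03d}.jpg", (i * 37) % 360) for i in range(100)]

train, test = split_dataset(labeled)
train_labels = [angle_to_class(angle) for _, angle in train]
```

With steps of 5 degrees the model has to choose among 72 classes instead of 360, which is why it should need fewer images per class.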
Determine rotation angle
This module, based on the model we choose, will take a look at an image and should return the rotation angle needed to get the image right-side-up. We will probably implement this module using Python.
The rotation module will receive the JPEG image and the angle returned by the ML model. The module will rotate the image as specified. I will implement this module using C# or C++. I already have a module that rotates images implemented in C++. Not sure if I should use it for this project.
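The model returns an orientation, while this module needs to return the rotation that undoes it. A minimal sketch of that step follows. It assumes the orientation is measured counterclockwise from the upright position; that convention and the function name are my assumptions for this example and are not settled yet.

```python
def correction_angle(predicted_orientation):
    """Return the counterclockwise rotation, in [0, 360), that undoes the orientation.

    Assumes `predicted_orientation` is how far the image is currently turned
    counterclockwise away from upright (an assumption made for this sketch).
    """
    return (360 - predicted_orientation % 360) % 360


# An upright image needs no rotation; a quarter turn needs three more quarter turns.
for orientation in (0, 90, 355):
    print(orientation, "->", correction_angle(orientation))
```

If the opposite convention ends up being used, the module could simply pass the predicted value through unchanged.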
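To illustrate what rotating an image by a given angle involves, here is a small pure Python sketch that rotates a grid of pixel values about its center using inverse mapping and nearest-neighbor sampling. This is not my existing C++ module, and a real implementation would use an imaging library; the function name and the tiny 3 x 3 "image" are assumptions for this example.

```python
import math


def rotate_grid(pixels, angle_degrees, fill=0):
    """Rotate a 2D list of pixel values counterclockwise about its center.

    Uses inverse mapping with nearest-neighbor sampling. The output keeps the
    input size; cells with no source pixel are set to `fill`.
    """
    height, width = len(pixels), len(pixels[0])
    cx, cy = (width - 1) / 2, (height - 1) / 2
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # For each output cell, find the source cell it came from.
            # The image y axis points down, hence the signs below.
            src_x = round(cx + dx * cos_t - dy * sin_t)
            src_y = round(cy + dx * sin_t + dy * cos_t)
            if 0 <= src_x < width and 0 <= src_y < height:
                rotated[y][x] = pixels[src_y][src_x]
    return rotated


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
quarter_turn = rotate_grid(grid, 90)
half_turn = rotate_grid(grid, 180)
```

Walking over the output cells and looking up the source, instead of the other way around, avoids leaving holes in the rotated image.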
After the images are rotated we will write them to a folder in the file system. Not much to say here. I will implement it using C#.
As time goes by and we go through the implementation the descriptions of the modules will be enhanced and the diagram will include additional information. I will attempt to include in each associated post an update after each module is completed.
If you have comments or questions regarding this, or any other post in this blog, or if you would like me to be of assistance with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate to leave me a note below. If you prefer, send me a private message using the following address: firstname.lastname@example.org. I will reply as soon as possible.
Keep on reading and experimenting. It is the best way to learn, refresh your knowledge and enhance your developer toolset!