My Journey to Artificial Intelligence

cahyati sangaji (cahya)
Mar 21, 2021


Cahyati S. Sangaji, Solution Specialist IoT and AI — Sinergi Wahana Gemilang

To start, it helps to know what Artificial Intelligence (AI) is. AI is a discipline within computer science that aims to make computers mimic humans. It is a generic term covering many related technologies.

AI has a sub-level called Machine Learning, which builds AI models, or AI “brains,” that learn from experience. A trained model can differentiate objects, for example whether something is a person, a table, or something else. This is very much like how humans learn to understand things.

Going further is deep learning, which is a subset of Machine Learning. Deep learning is a technology within machine learning that has been popular in the last 8+ years, and in fact, is at the center of the latest AI development. Deep learning is based on neural networks, which learn directly from data. In general, the more data you give deep learning models, the more accurate they become.

You often find keywords like AI models, training, big data, frameworks, and GPUs (Graphics Processing Units) in AI. AI models generated by the training process capture patterns that let them predict things according to the type of recognition they are trained for, for example image classification, object detection, or action detection.

Another example is a model built from structured data. For example, you can create a model to identify patterns of customers who are likely to buy a product, analyzing features such as age, gender, salary, needs, or habits. Such a model can be used for personalized and targeted marketing.

You can use many patterns for the AI model. In simple terms, the pattern to be created is one that replicates the function of the human senses. Typically, we need a substantial amount of data to build an AI model that is considered good.

AI frameworks make it easier to create deep learning models or neural networks, thus providing faster time to market. Training an AI model needs high-performance hardware, for example to perform complex matrix and arithmetic calculations. The more data we feed to the training process, the longer it takes to create the model.

The hardware-accelerated solution is the GPU (Graphics Processing Unit). The GPU has become one of the most important types of computing technology, for both personal and business computing, first for gaming and now for AI. Designed for parallel processing, GPUs are used in various applications, including graphics and video rendering. A computer equipped with a GPU can train a model many times faster than one that relies only on a Central Processing Unit (CPU).

AI has been around since the 1950s but started to advance rapidly following the availability of GPUs, large datasets (big data), and algorithms such as the CNN (Convolutional Neural Network) for processing images.

How can you create Artificial Intelligence?

A typical workflow to create an AI model consists of data preparation, training, and deployment. You can see the process in Illustration-1. Data preparation is the process of collecting and labeling data.

In general, the dataset is split 80%:20% into training and test datasets. Then, from the 80% set aside for training, you use 80% for training and 20% for validation. After you finish collecting data, you need to label the dataset for training. This is the case for Supervised Learning.
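The two-stage split described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the function name `split_dataset`, the file names, and the dataset size of 1,000 are assumptions made for the example, not part of any particular framework.

```python
import random


def split_dataset(items, test_fraction=0.2, val_fraction=0.2, seed=42):
    """Split items into train, validation, and test lists.

    test_fraction is taken from the full dataset; val_fraction is then
    taken from what remains for training.
    """
    shuffled = list(items)
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(shuffled)

    # First stage: hold out the test set (20% of everything).
    n_test = round(len(shuffled) * test_fraction)
    test = shuffled[:n_test]
    remaining = shuffled[n_test:]

    # Second stage: hold out validation data (20% of the remaining 80%).
    n_val = round(len(remaining) * val_fraction)
    val = remaining[:n_val]
    train = remaining[n_val:]
    return train, val, test


# Hypothetical file names standing in for a labeled image dataset.
samples = [f"image_{i:04d}.jpg" for i in range(1000)]
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))  # 640 160 200
```

With these ratios, the final proportions of the whole dataset are 64% training, 16% validation, and 20% test.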

During training, the computer repeatedly learns the relationship between the input and the output (label). The computer will then build the pattern.

When training completes, you get an accuracy figure (the training result). If the accuracy is good, you can move on to the next step. But if the accuracy does not match what you want, you may need to add more data or do data augmentation. The training hyperparameters, such as the learning rate or training batch size, may also need to be adjusted.
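To make the learning rate and batch size more concrete, here is a toy sketch of a training loop. It is not how a deep learning framework is implemented: it fits a single straight line, y = w*x + b, with mini-batch gradient descent in plain Python, and the function name `train_linear`, the data, and the default values are all made up for illustration.

```python
import random


def train_linear(data, learning_rate=0.05, batch_size=4, epochs=200, seed=0):
    """Fit y = w*x + b with mini-batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Visit the labeled examples in a new random order each epoch.
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradients of the mean squared error over this batch.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # The learning rate controls how large each update step is.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Toy labeled data following y = 2x + 1 (input x, label y).
points = [(i / 10, 2 * (i / 10) + 1) for i in range(20)]
w, b = train_linear(points)
print(w, b)  # should end up close to 2 and 1
```

Changing `learning_rate` or `batch_size` here changes how quickly, and how smoothly, the values approach the true line, which is the same kind of adjustment described above for real models.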

Illustration-1: Typical workflow to create an AI model

Following the completion of training, you can then deploy the trained model. Use the test data to check the accuracy of the prediction.
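Checking accuracy on the test data simply means comparing the model's predictions with the known labels. A minimal sketch, where the function name `accuracy` and the example labels are hypothetical:

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical test labels and model predictions for five test images.
test_labels = ["person", "table", "person", "chair", "table"]
predicted = ["person", "table", "chair", "chair", "table"]
print(accuracy(predicted, test_labels))  # 0.8
```

Four of the five predictions match their labels, so the accuracy is 0.8 (80%).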

After preparing the initial dataset, the data preparation process starts by splitting the data into training, validation, and test data. In the data preparation process, you also label the dataset for training. The next step is modeling/training, which gives you a trained model. You can repeat the process to get better accuracy.

Illustration-2: A Typical AI Data Pipeline (Andi Sama, 2019a)

Once you get good (acceptable) accuracy from the trained model, deploy the model and test it with the test dataset. We can then see the deployed model's predictions. All of these processes run on software libraries and AI frameworks using GPU devices.

My experience learning Artificial Intelligence

This is an interesting story for me. When I first learned AI, I also learned the Linux OS and Python programming, and got to know Graphics Processing Units. I first learned AI when my office bought an NVIDIA Jetson TX2, which is a GPU device.

Starting from that, I ran a sample application with a sample AI model, and along the way I also learned more about Python programming.

I also learned about the AI process and AI algorithms, got to know some frameworks and GPU devices, and created an AI-powered showcase for an event: an image captioning model on the Jetson TX2. I first tested the image captioning model with still images, and then modified the base program to support live-stream image captioning with a camera.

Illustration-3: Image Captioning on Jetson TX2

Something interesting happened when I tried to train a model on the Jetson TX2: I got an “out of memory” error. After that, I realized that the Jetson TX2 could not be used for training the model.

More on this can be found in “Eksplorasi Image Captioning, NVIDIA Jetson TX2 Super Computer for enabling Artificial Intelligence @Edge” from SWG insight Q2, 2018, and in the article “Show & Tell, Image Captioning on Mobile Device, Image Captioning with Inference Engine running at Edge,” also from SWG insight Q2, 2018. I also showed the image captioning showcase at a few events.

Next, I ran a face detection model on test images, and then modified the base program to support a live-stream data feed from the camera for face detection. Then my partner Andrew Widjaja and I modified the base program into a face recognition showcase.

We could detect the face and recognize the name, and in combination with the IBM Visual Recognition service, we could also detect age and gender. More on this in the following articles:

  • “Face Recognition (Video), IoT Edge with AI, NVIDIA Jetson TX2 Supercomputer (OpenCV, Python and Deep Learning),” SWG insight Q1, 2019.
  • “Real-time Face Recognition powered by Watson & IoT. IBM Cloud, Watson and IoT on Edge with Jetson TX2 Supercomputer,” SWG insight Q2, 2019.

This showcase has also been exhibited at a few events.

Illustration-4: Face Recognition with NVIDIA Jetson TX2

I also tried PoseNet on the Jetson TX2, testing it with images and with real-time detection from a camera.

Learning AI with IBM POWER9 and IBM PowerAI Vision

After learning AI using the Jetson TX2, I studied using more devices provided by SWG, such as the Jetson Nano and Jetson AGX Xavier. Then, because the Jetson family devices cannot be used to train the model, I learned how to create an AI model with IBM POWER9 equipped with the high-end NVIDIA Tesla V100 GPU.

IBM POWER9 is a powerful IBM server machine designed for AI workloads. On top of IBM POWER9, I used IBM PowerAI Vision, software that automates the AI workflow with no coding. IBM PowerAI Vision is the original name of IBM Visual Insights, which was later renamed IBM Maximo Visual Inspection, its current product name. I learned about doing image classification, object detection, and action detection.

I also learned about H2O Driverless AI (H2ODAI), software that trains models with structured data. I could use H2ODAI to some extent, but I barely understood the details of the software.

I mainly used PowerAI Vision (Maximo Visual Inspection) with unstructured data. I also tried to run the exported PowerAI Vision model on the Jetson family devices.

I tried with FR-CNN and Detectron. Details can be found in the following article: “AI Model Deployment on Things (AIoT), Object Detection with IBM PowerAI Vision on IBM AC922,” SWG insight Q2, 2020.

Using POWER9, I also experimented with using multiple GPUs and system memory on a server for training. More in the article: “Optimizing AI Training Platform with Large Model Support, Apply Large Model Support (LMS) using Pytorch & Tensorflow deep learning frameworks to extend GPU Memory to System Memory on IBM Power AC922,” SWG insight Q3, 2020.

Training, Webinar, and Course

While learning AI, I indirectly also learned about Python programming and data science, including the analysis process. I read a lot of AI and data science articles. I attended training and webinars by AI experts, took courses, and earned badges, such as from IBM Data Science and Coursera.

Illustration-5: Webinar participation certificate, badges from Cognitive AI, and badges from Coursera

Another interesting experience was when I wanted to get the Data Science Professional Certificate: I did a project related to the ongoing COVID-19 pandemic.

The project identified which areas need masks the most and which hospitals need medical devices for COVID-19 treatment. It may not be related to AI, but it is related to the implementation of data science. It is documented in “A Visual Approach to determine Strategic Locations for Masks & Medical Device Distribution for COVID-19 treatment” from SWG insight Q3, 2020.

From my experience, to understand AI, you have to read a lot, learn from the experts, and try a lot to implement AI in everyday life. Although AI replicates part of what humans can do, AI is only a tool to help you automate daily activities.