Artificial intelligence is currently one of the most popular high technologies. It could be called the “tall, rich and handsome” of the science and technology industry: practitioners enjoy high salaries and a glamorous image. Most importantly, it is a national strategic development direction and a sunrise industry of the future. Artificial intelligence has gradually penetrated all walks of life and has begun to affect our food, clothing, housing and transportation, and this impact will become more obvious in the future. It is fair to say that anything without AI now seems unfashionable, and if you don’t know how to use AI products, you risk being left behind by this era.
Market and talent demand in the field of computer vision
The applications in the AI field are dazzling. New application products appear almost all the time, and sometimes a single product contains a seemingly endless number of AI applications. Take Douyin as an example: its face-swapping, beauty, and special-effects features change every day, which means that the related AI application technologies are constantly being updated. So how should people who are interested in AI, or who want to learn more about its technical principles, get started? Next, I will give an overall analysis, starting from the classification of applications.
Looking at the current application directions of the entire AI field, they can be roughly divided into four categories: image vision, natural language, voice signal, and automation. Among these four major areas, the most widely used products are in image vision. According to the iResearch Consulting report, based on statistics of downstream industry demand, computer vision products hold the largest share of China's artificial intelligence industry market for 2020, at 57%. This means that more than half of the domestic AI product market consists of application products in the image vision field, mainly because market demand for image vision products is greater than for products in other directions.
Because market demand for AI products is high, most of the technical staff that AI product research and development companies are looking for work on image vision technology, which has further aggravated the scarcity of talent in the image vision field. The talent situation is covered in the “Artificial Intelligence Industry Talent Development Report” released by the Ministry of Industry and Information Technology in 2020.
Task classification in the field of computer vision applications
As mentioned earlier, computer vision accounts for the largest proportion of artificial intelligence applications, yet many people have an unclear understanding of what image vision applications involve. In image vision, tasks are generally divided according to their specific objectives into detection (regression fitting) tasks, classification tasks, generation tasks, and segmentation tasks. In this article we discuss the detection and classification tasks; the generation and segmentation tasks will be analyzed in a future article.
The detection task is to determine whether an image contains a certain target. The model learns the labeled coordinates of the target and fits them by regression, which is how the target is detected. In face detection, for example, what the computer needs to learn is the coordinates of the upper-left and lower-right corners of the face rectangle; it then draws the rectangle based on the learned coordinates to complete the face detection task.
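To make the idea of regressing box coordinates concrete, here is a minimal sketch of how the error between a predicted face rectangle and its label could be measured. The `Box` layout, the function name `l1_box_loss`, and the sample coordinates are assumptions made for illustration; real detectors use their own box encodings and loss functions.

```python
from typing import Tuple

# A box is (x1, y1, x2, y2): the upper-left and lower-right corners.
Box = Tuple[float, float, float, float]


def l1_box_loss(pred: Box, label: Box) -> float:
    """Mean absolute error between predicted and labeled corner coordinates."""
    return sum(abs(p - t) for p, t in zip(pred, label)) / 4.0


# Hypothetical labeled face rectangle and a model prediction for it.
label_box: Box = (40.0, 30.0, 120.0, 140.0)
pred_box: Box = (44.0, 28.0, 118.0, 150.0)

print(l1_box_loss(pred_box, label_box))
```

Training pushes this kind of error toward zero, so the predicted corners move toward the labeled corners.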
As the face detection example shows, the detection task is actually completed through regression of the label points. If a task needs pure regression only, such as key point detection, there is no need to draw a rectangular box; the key points are simply marked after regression. In face detection, the most common practice is to regress the key positions of the face while detecting the face, which helps improve detection accuracy.
First, let’s look at the single-class multi-target detection task, which is easy to understand: the detected targets belong to only one category, but there are multiple objects. For example, a picture containing multiple faces is a single-class multi-target detection task. So how different in difficulty are single-class single-target detection and single-class multi-target detection? The answer is that there is a big gap. Put simply, single-class single-target detection only needs to keep the target with the highest confidence in the image. Single-class multi-target detection, however, must not only detect all targets but also remove targets that have been detected repeatedly, which requires techniques such as IoU (intersection over union) and NMS (non-maximum suppression).
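The removal of repeated detections can be sketched as follows. This is a minimal pure-Python version of IoU and greedy NMS; the `(x1, y1, x2, y2)` box layout, the 0.5 threshold, and the sample boxes are assumptions for illustration, not values from any particular detector.

```python
from typing import List, Sequence, Tuple

# A box is (x1, y1, x2, y2): the upper-left and lower-right corners.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept, highest confidence first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two overlapping detections of the same face plus one separate face.
boxes = [(10.0, 10.0, 50.0, 50.0), (12.0, 12.0, 52.0, 52.0),
         (100.0, 100.0, 140.0, 140.0)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```

In this example the second box overlaps heavily with the first and has lower confidence, so it is dropped, while the third box is a different face and is kept.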
The only difference between multi-class multi-target detection and single-class multi-target detection is that the detected targets do not all belong to the same category; they may fall into two or more categories. The following figure is an example: the detected targets include multiple categories, and each category has one or more objects. This type of detection is known as multi-class multi-target detection.
In most cases, classification tasks accompany detection tasks, and they are relatively easy to understand. Simply put, the targets are sorted into categories. Depending on the number of categories, classification tasks can be divided into binary (two-class) classification tasks and multi-class classification tasks.
The binary classification task, as the name implies, classifies all data into two categories. It is generally used to judge whether a target meets a certain standard. The most intuitive example is face recognition, that is, determining whether the target is a particular person's face. The specific method is to compare the current face with the faces in the face database one by one. If they are similar, the target is the recognized face; otherwise it is not. This is a very intuitive binary classification problem. Besides face recognition, there are many other similar binary classification cases, such as judging whether an email is spam or whether an image violates rules against pornographic content.
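The one-by-one comparison against a face database can be sketched as below. The sketch assumes each face has already been turned into a feature vector by some model, and it uses cosine similarity with a threshold of 0.8; the vectors, names, threshold, and the function `match_face` are all hypothetical choices for illustration.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match_face(query: Sequence[float],
               database: Dict[str, Sequence[float]],
               threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching name, or None if no face is similar enough."""
    best_name, best_sim = None, threshold
    for name, embedding in database.items():
        sim = cosine_similarity(query, embedding)
        if sim >= best_sim:
            best_name, best_sim = name, sim
    return best_name


# Hypothetical face database of precomputed feature vectors.
face_db = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}

print(match_face([0.9, 0.1, 0.0], face_db))  # close to alice's vector
print(match_face([0.0, 0.0, 1.0], face_db))  # similar to nobody
```

Each comparison is a yes-or-no decision (similar enough or not), which is what makes this a binary classification problem.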
The multi-classification task generally classifies many different targets at the same time, such as classifying and identifying the different targets in a photo. These tasks are usually performed together with the detection task. Taking deep learning as an example, the most typical model is the YOLO series, which performs detection and classification at the same time.
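As a rough illustration of the classification half of such a model, the sketch below turns a list of raw class scores for one detected target into probabilities with softmax and picks the most likely class. The scores and class names are made up, and this is a generic multi-class step, not the actual output format of any YOLO version.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores: Sequence[float],
             class_names: Sequence[str]) -> Tuple[str, float]:
    """Return the most probable class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return class_names[best], probs[best]


# Hypothetical raw scores for one detected target.
names = ["person", "car", "dog"]
label, confidence = classify([2.0, 0.5, -1.0], names)
print(label, round(confidence, 3))
```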
In addition, according to whether the processed data has label information, machine learning can also be divided into several types, such as supervised learning, unsupervised learning, and semi-supervised learning.
Supervised learning first labels the data so that each input corresponds one-to-one with a label, then lets the machine learn from a large number of labeled samples to train a model, so that the model produces the corresponding output for a given input.
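A very small instance of this process is fitting a straight line to labeled pairs with least squares. The data below and the function name `fit_line` are invented for illustration; the point is only that the model is trained on input-label pairs and then produces an output for a new input.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = w * x + b to labeled (x, y) pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


# Each input has exactly one label (hypothetical training data).
inputs = [1.0, 2.0, 3.0, 4.0]
labels = [3.0, 5.0, 7.0, 9.0]

w, b = fit_line(inputs, labels)
print(w * 5.0 + b)  # the trained model's output for a new input
```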
Unsupervised learning builds a model directly from training samples that carry no class labels, in order to reveal the inherent nature and regularities of the data. Specifically, the data set is divided into several disjoint subsets so that, under a certain measure, the elements within each subset are highly similar to one another.
Dividing the data into subsets in this way is called clustering. Clustering algorithms include K-means, K-modes, K-medoids, Gaussian Mixture Model (GMM), hierarchical clustering, EM, and others.
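As one example from this list, K-means can be sketched in a few lines. This version makes simplifying assumptions for illustration: it works on 2D points, uses the first k points as the starting centers instead of a random choice, and runs a fixed number of iterations; the sample points are made up.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance, the similarity measure used here."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points: Sequence[Point], k: int,
            iterations: int = 10) -> Tuple[List[int], List[Point]]:
    """Split points into k disjoint subsets; return labels and centers."""
    centers: List[Point] = list(points[:k])  # simple deterministic start
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(coord) / len(members)
                                   for coord in zip(*members))
    return labels, centers


# Unlabeled sample data forming two visible groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
        (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
cluster_labels, cluster_centers = k_means(data, k=2)
print(cluster_labels)
```

No labels are given to the algorithm; the two groups emerge only from the distances between the points.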