Video annotation

What is video annotation?

Video annotation is the process of labeling objects of interest in video data so that computer vision models can be trained to identify them. Annotators attach information tags to those objects, frame by frame, giving algorithms labeled examples to learn from.

Types of video annotation

There’s a wide range of annotation methods. Some of the most popular include the following.

2D bounding boxes: Annotators draw and label boxes around objects of interest as they move across video frames. This process helps to improve object detection in autonomous vehicles, drones and more.
3D bounding boxes/cuboids: This type of annotation is used to depict the length, width and approximate depth of objects, as well as to track objects across multiple frames of video.
Semantic segmentation: Every pixel in a frame is assigned a category label, so that a computer vision algorithm can learn which groups of pixels belong to which category.
Lidar 3D point cloud annotation: Objects are visualized, labeled and tracked across frames in 3D point clouds for all types of lidar.
Landmark and keypoint annotation: A sequence of points is used to capture shape variations of objects, from minute to large. This type of video annotation is commonly used for facial features, expressions, emotions and human body movements.
Polygon annotation: Useful for training object localization and detection algorithms, polygon annotation is well-suited for annotating objects with irregular shapes, like street signs in traffic images or houses in aerial imagery, with a high degree of accuracy.
Line and polyline annotations: These define lane lines in drivable areas for vehicle perception models.
Frame classification: Time stamps are tagged with attributes such as time of day, weather, environment, object/border occlusion and more for autonomous vehicle perception models.
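
To make the methods above more concrete, here is a minimal sketch of how two of them, 2D bounding boxes and frame classification, might be stored as data. The schema is an assumption made for illustration: the class names (Box2D, ObjectTrack, FrameTags), the field names and the pixel-coordinate convention are all hypothetical and do not come from any particular annotation tool or file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Box2D:
    """One box drawn on one frame (hypothetical schema, pixel coordinates)."""
    frame: int      # index of the video frame the box was drawn on
    x: float        # left edge
    y: float        # top edge
    width: float
    height: float


@dataclass
class ObjectTrack:
    """One labeled object of interest, followed across frames."""
    track_id: int
    label: str
    boxes: List[Box2D] = field(default_factory=list)

    def box_at(self, frame: int) -> Optional[Box2D]:
        """Return the box annotated on `frame`, or None if there is none."""
        for box in self.boxes:
            if box.frame == frame:
                return box
        return None


@dataclass
class FrameTags:
    """Frame-level attributes, as used in frame classification."""
    frame: int
    attributes: Dict[str, str]


# A pedestrian annotated on two consecutive frames.
track = ObjectTrack(track_id=1, label="pedestrian")
track.boxes.append(Box2D(frame=0, x=100, y=50, width=40, height=90))
track.boxes.append(Box2D(frame=1, x=104, y=50, width=40, height=90))

# Attributes that describe frame 0 as a whole rather than one object.
tags = FrameTags(frame=0, attributes={"time_of_day": "night", "weather": "rain"})
```

Keeping a track ID on each object is one simple way to record that boxes on different frames refer to the same object. Other methods in the list, such as polygons or keypoints, could follow the same pattern with a list of points in place of the box fields.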

Video annotation use cases

The range of use cases for video annotation is broad and spans industries. A few of the verticals using video annotation today include the following.

Autonomous technology: Video annotation is used to train self-driving vehicles to detect and identify objects like pedestrians, other vehicles, road signs, traffic lights and more.
Commerce: Use cases include monitoring how customers react to products, tracking shopper movements throughout the store to determine optimal product placement and more.
Healthcare: Video annotation is used to enhance medical imaging diagnostics.
Government: Many cities rely on smart traffic management systems to help improve traffic congestion, alert authorities to collisions and more. Video annotation is used to build machine learning models for this purpose.