authors are vetted experts in their fields and write on topics in which they have demonstrated experience. All of our content is peer reviewed and validated by Toptal experts in the same field.
Michael Karchevsky
Verified Expert in Engineering

Michael is an experienced Python, OpenCV, and C++ developer. He is especially interested in machine learning and computer vision.

PREVIOUSLY AT

SLB

Competitions are a great way to level up your machine learning skills. Not only do you get access to quality datasets, you are also given clear goals. This helps you focus on the important part: designing quality solutions for the problems at hand.

A friend of mine and I recently took part in the N+1 fish, N+2 fish competition. This machine learning competition, which involves a lot of image processing, requires you to process video clips of fish being identified, measured, and kept or thrown back into the sea.

an abstract image of machine learning used to identify and measure fish

In this article, I will walk you through how we approached the competition problem using standard image processing techniques and pretrained neural network models. Submissions were scored with an aggregated metric covering all of the competition's tasks. With our solution, we finished in 11th place.

For a brief introduction to machine learning, you can refer to this article.

About the Competition

We were given videos, each showing one or more fish. These videos were captured on different boats fishing for ground fish in the Gulf of Maine.

The videos were collected from fixed-position cameras placed to look down on a ruler. A fish is placed on the ruler, the fisherman removes his hands from the ruler, and then the fisherman either discards or keeps the fish based on its species and size.

A sample video clip from the project

Performance Metric

This project had three important tasks. The ultimate goal was to create an algorithm that automatically generates annotations for the video files, where each annotation consists of:

  • The sequence in which the fish appear
  • The species of each fish that appears in the video
  • The length of each fish that appears in the video

The organizers of the competition created an aggregated metric that gave a general sense of performance on all of these tasks. The metric was a simple weighted combination of an individual metric for each task. Although each task carried a certain weight, the organizers recommended that we focus on a well-rounded algorithm that contributed to all of the tasks.

You can learn more about how the overall performance metric is calculated from the metrics of the individual tasks on the official competition webpage.
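As a rough illustration of what a "simple weighted combination" can look like, here is a minimal sketch. The task names, the weights, the scores, and the normalization by the total weight are all assumptions made for the example; the official weights and per-task metrics are defined on the competition webpage.

```python
def combined_score(task_scores, weights):
    """Weighted average of per-task scores (each assumed to lie in [0, 1])."""
    if task_scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same tasks")
    total_weight = sum(weights.values())
    return sum(weights[t] * task_scores[t] for t in weights) / total_weight


# Placeholder numbers for illustration only; not the official weights or scores.
example_scores = {"sequence": 0.5, "species": 0.75, "length": 1.0}
example_weights = {"sequence": 2.0, "species": 1.0, "length": 1.0}
print(combined_score(example_scores, example_weights))
```

With a metric of this shape, a solution that does very well on one task but poorly on the others is pulled down by the remaining terms, which is one way to read the organizers' advice to stay well rounded.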

Designing a Machine Learning Solution

When working on machine learning projects that deal with pictures or videos, you will most likely be using convolutional neural networks. But before we could use convolutional neural networks, we had to preprocess the frames and solve some other subtasks through different strategies.

For training, we used a single NVIDIA 1080 Ti GPU. A good chunk of our time was lost trying to optimize our code in order to stay relevant in the competition. As a result, we ended up spending less time where it would have mattered more.

Stage 0: Finding the Number of Unique Boats

With silhouette analysis, finding the number of boats became a fairly trivial task. The steps were as follows and relied on very standard techniques:

  1. Take a few random frames from each video.
  2. Compute statistics and Speeded-Up Robust Features (SURF) for each image.
  3. Use silhouette analysis on K-means clustering to find the number of boats.

SURF detects points of interest in an image and generates feature descriptors for them. The approach is robust to many image transformations, such as scaling and rotation.

Once the features of the points of interest in the images are known, K-means clustering is performed, followed by silhouette analysis to determine an approximate number of boats in the images.
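The clustering step can be sketched in a few lines of plain Python. This is not our competition code: the 2-D points stand in for the much larger SURF descriptors, the farthest-point initialization is a choice made to keep the example deterministic, and all function names are hypothetical. In practice, a library such as scikit-learn would normally do this work.

```python
import math


def farthest_point_init(points, k):
    """Pick k starting centers: the first point, then repeatedly the point
    farthest from every center chosen so far (deterministic)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (assumes at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


def best_cluster_count(points, candidates):
    """Run K-means for each candidate k and keep the k with the highest silhouette."""
    scores = {k: silhouette_score(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Toy 2-D "descriptors": three tight groups standing in for frames from three boats.
offsets = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5), (0.0, 0.0)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]
best_k, scores = best_cluster_count(points, range(2, 6))
print(best_k)
```

The idea is that the candidate k with the highest mean silhouette gives the tightest, best separated clusters, and that k is taken as the approximate number of boats.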

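Stage 1: Identifying Repeating Frames

Although the dataset contained separate video files, each video seemed to overlap with other videos in the dataset. This is possibly because the videos were split from one long video and thus ended up having a few common frames at the start or end of each video.

A graphical representation of common frames

To identify such frames and remove them where necessary, we applied some fast hash functions to the frames.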
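A minimal sketch of the idea follows. It assumes each frame is available as a bytes object and uses BLAKE2 from Python's standard library as the hash; the function names and the tiny stand-in frames are hypothetical, and the hash functions we actually used are not specified here.

```python
import hashlib
from collections import defaultdict


def frame_digest(frame_bytes):
    """Hash the raw bytes of one frame; identical frames give identical digests."""
    return hashlib.blake2b(frame_bytes, digest_size=16).hexdigest()


def find_shared_frames(videos):
    """Map each digest seen in more than one video to its (video, frame index) locations.

    `videos` is a dict of video name -> list of frames, each frame a bytes object.
    """
    seen = defaultdict(list)
    for name, frames in videos.items():
        for index, frame in enumerate(frames):
            seen[frame_digest(frame)].append((name, index))
    return {
        digest: places
        for digest, places in seen.items()
        if len({name for name, _ in places}) > 1
    }


# Tiny stand-in frames; real frames would be decoded pixel buffers.
videos = {
    "clip_a": [b"frame-1", b"frame-2", b"frame-3"],
    "clip_b": [b"frame-3", b"frame-4"],
}
shared = find_shared_frames(videos)
print(sorted(shared.values()))
```

An exact hash like this only matches frames that are identical byte for byte. If the overlapping frames had been re-encoded, a perceptual hash computed on downscaled frames would be a more forgiving choice.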

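Stage 2: Locating the Ruler

By applying some standard image processing methods, we determined the position and orientation of the ruler. We then rotated and cropped the image to position the ruler in a consistent manner across all frames. This also allowed us to reduce the frame size by a factor of ten.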
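The geometry behind the rotation step can be sketched as follows, assuming the two ends of the ruler have already been found. The endpoint coordinates and function names are hypothetical, and the detection itself is not shown.

```python
import math


def ruler_angle(p1, p2):
    """Angle of the ruler axis in radians, measured from the image x-axis."""
    return math.atan2(p2[1] - p1[1], p2[0] - p1[0])


def rotate_point(point, center, angle):
    """Rotate `point` around `center` by `-angle`, so the ruler axis becomes horizontal."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    return (center[0] + dx * cos_a - dy * sin_a, center[1] + dx * sin_a + dy * cos_a)


# Hypothetical ruler endpoints in pixel coordinates.
start, end = (100.0, 100.0), (400.0, 400.0)
center = ((start[0] + end[0]) / 2, (start[1] + end[1]) / 2)
angle = ruler_angle(start, end)
aligned_start = rotate_point(start, center, angle)
aligned_end = rotate_point(end, center, angle)
print(aligned_start, aligned_end)
```

After the rotation, both endpoints share the same y coordinate, so a fixed-height horizontal strip around them can be cropped from every frame. With OpenCV, the same transform would typically be applied to whole images with `cv2.getRotationMatrix2D` and `cv2.warpAffine`.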

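Detected ruler (drawn on the mean frame):

A visual depiction of the ruler detection process

Cropped and rotated region:

A photo of the cropped ruler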

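Stage 3: Determining the Sequence of the Fish

Implementing this stage to determine the sequence of the fish took a majority of my time during this competition. Training a new convolutional neural network seemed too expensive, so we decided to use pretrained neural networks.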

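For this, we chose the following neural networks:

  • VGG16
  • VGG19
  • ResNet50
  • Xception
  • InceptionV3

These neural network models were pretrained on the ImageNet dataset.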

We extracted only the convolutional layers of the models and passed the competition dataset through them. The output was a fairly compact array of features.

Then, we trained neural networks consisting only of fully connected dense layers on these features and predicted results with each pretrained model. After that, we averaged the results, and they turned out quite poor.

We decided to use long short-term memory (LSTM) neural networks for better prediction, where the input data was a sequence of five frames transformed with the pretrained models.
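To make the LSTM input concrete, here is a small sketch of how per-frame feature vectors might be grouped into sequences of five. The overlapping, stride-one windows are an assumption made for the example, as are the function name and the stand-in features.

```python
def frame_windows(features, window=5):
    """Return every run of `window` consecutive per-frame feature vectors."""
    return [features[i:i + window] for i in range(len(features) - window + 1)]


# Stand-in features: one short vector per frame (real ones come from the pretrained models).
features = [[float(i), float(i) * 2.0] for i in range(8)]
windows = frame_windows(features)
print(len(windows), len(windows[0]))
```

Each window would then be one training sample for the LSTM, which lets the network use the frames around a moment in the video rather than a single frame in isolation.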

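To merge the outputs of all the models, we used the geometric mean.

The fish detection pipeline was:

  1. Generate features with the pretrained models.
  2. Predict the probability of a fish appearing with the dense neural networks.
  3. Generate LSTM features with the pretrained models.
  4. Predict the probability of a fish appearing with the LSTM neural networks.
  5. Merge the models using the geometric mean.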
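The last step of the pipeline above can be sketched like this. The probabilities are made up and the function names are hypothetical; the logarithms are used only to keep the product numerically stable.

```python
import math


def geometric_mean(probabilities, eps=1e-12):
    """Geometric mean of probabilities, computed in log space for stability."""
    logs = [math.log(max(p, eps)) for p in probabilities]
    return math.exp(sum(logs) / len(logs))


def merge_models(per_model_probs):
    """Combine per-frame probabilities from several models, frame by frame."""
    return [geometric_mean(frame_probs) for frame_probs in zip(*per_model_probs)]


# Made-up per-frame probabilities from three hypothetical models.
dense_a = [0.9, 0.2, 0.5]
dense_b = [0.8, 0.1, 0.5]
lstm = [0.7, 0.4, 0.5]
merged = merge_models([dense_a, dense_b, lstm])
print([round(p, 3) for p in merged])
```

Compared with an arithmetic average, the geometric mean is never higher and drops sharply when any single model gives a low probability, so a frame is only rated highly when the models broadly agree.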

The result for one video looked like this:

a sample fish detection result shown on a graph with frame index along the x-axis and probability along the y-axis

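Stage 4: Identifying the Species of the Fish

After spending a majority of the contest duration implementing the previous stage, we tried to make up for the lost time by reusing the models from the previous stage to identify the species of the fish.

Our approach was roughly as follows:

  1. Add dense layers to the pretrained convolutional models VGG16, VGG19, ResNet50, Xception, and InceptionV3 (the weights of the convolutional layers were frozen).
  2. Train the models with light image augmentation.
  3. Predict the species with each model.
  4. Consolidate the models by voting.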
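The voting step in the list above can be as simple as a plurality vote. The sketch below assumes each model outputs a single species label per fish; the labels, the tie-breaking rule, and the function name are made up for the example.

```python
from collections import Counter


def vote(predictions):
    """Return the most common label; ties go to the label that appears first."""
    return Counter(predictions).most_common(1)[0][0]


# Made-up species predictions from five models for one fish.
model_predictions = ["species_a", "species_b", "species_a", "species_c", "species_a"]
print(vote(model_predictions))
```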

Stage 5: Measuring the Length of the Fish

To determine the length of the fish, we used two neural networks. One of them was trained to identify fish heads and the other was trained to identify fish tails. The length of each fish was approximated as the distance between the two points identified by the two neural networks.
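Once the head and tail points are known, the length estimate is a single distance calculation. In the sketch below, the coordinates and the `pixels_per_unit` scale are hypothetical; how pixels map to real-world units depends on the camera setup and is not covered here.

```python
import math


def fish_length(head, tail, pixels_per_unit=1.0):
    """Distance between the head and tail points, scaled from pixels to real units."""
    return math.dist(head, tail) / pixels_per_unit


# Hypothetical detections in pixel coordinates and a hypothetical scale.
head, tail = (120.0, 80.0), (420.0, 480.0)
print(fish_length(head, tail, pixels_per_unit=10.0))
```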

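A photo showing the distance between the head and the tail of a fish

Complete Scheme

Here is the overall scheme of all the stages:

A flowchart depicting the complete scheme

The overall design was fairly simple: video frames were passed through the stages outlined above, and the separate results were then combined.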

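Understanding the Basics

  • What is silhouette analysis?

    Silhouette analysis is a technique for measuring how well separated the clusters in a dataset are. Each data point is scored by how close it is to its own cluster compared with the nearest other cluster, which helps in choosing the number of clusters.

  • What is a machine learning model?

    A machine learning model is the product of training a machine learning algorithm on data. The model can later be used to produce relevant output for similar inputs.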

Michael Karchevsky

Verified Expert in Engineering

Batumi, Adjara, Georgia

Member since September 12, 2016
