Moments in Time Challenge Results 2018

Thank you to everyone who participated in the 2018 Challenge! Across the two tracks, a total of 123 participants formed 24 registered teams and made a combined 151 valid submissions. Each team was allowed one submission per day and 10 in total over the entire competition. Teams were ranked by the score of their best submission, where the score is the average of the Top-1 accuracy and the Top-5 accuracy. The 2018 winners are listed below:

Full Track Winners

  1. DEEP-HRI of Hikvision (0.5291)
  2. Megvii (0.5126)
  3. Qiniu (0.5006)

Mini Track Winners

  1. SYSU_isee of Sun Yat-Sen University (0.4772)
  2. beihang university of Beihang University (0.4549)
  3. MiRA of National Taiwan University (0.4510)

Congratulations to all the teams! See below for the official leaderboard and submission reports.
Reports are informal essays optionally submitted by participants for academic exchange only; they are not considered proceedings papers or publications.

Full Track

Rank Team Name Entry Description Top-1 Acc. Top-5 Acc. Score
1 DEEP-HRI Ensemble-C3D [report] 0.3864 0.6719 0.5291
2 Megvii Spatial Temporal: 2D + 3D + flow + audio [report] 0.3750 0.6503 0.5126
3 Qiniu [report] 0.3641 0.6371 0.5006
4 Alibaba-Venus Video Analysis structure: i3d, nonlocal, trn, vlad+; modality: rgb, flow, acoustic [report] 0.3551 0.6366 0.4959
5 Xtract AI Xtract Boosted Fusion [report] 0.3199 0.5983 0.4591
6 SSS v2 3 models [report] 0.3195 0.5756 0.4476
7 CMU-AML [report] 0.3103 0.5842 0.4473
8 UNSW-Data-Science [report] 0.3038 0.5490 0.4398
9 fengwuxuan trn: inceptionv3 0.2861 0.5490 0.4176
10 SYSU_isee Dynamic fusion [report] 0.2731 0.5386 0.4058
11 Moments in Time Team Pretrained InceptionV3 Temporal Relation Network (TRNmultiscale with 8 segments) 0.2731 0.5386 0.4047
12 STAIR Lab Our method combines multiple predictions from different models. Models are 2DCNN, 3DCNN, Audio, and Caption, respectively. Combination function is either average, MLP or SVM. [report] 0.2721 0.5357 0.4039
13 AR Team I3D 0.2959 0.5074 0.4016
14 mmmm RGB Stream 0.2704 0.5282 0.3993
15 FR - 0.2525 0.5098 0.3811
16 SIAT_MMLAB [report] 0.2496 0.5093 0.3794
17 w452261940 3D Conv [report] 0.2590 0.4983 0.3787
18 SCZH CNN 0.2454 0.4953 0.3704
19 pingchuan pingchuan network architecture 0.2340 0.4831 0.3585
20 Lee99 - 0.2315 0.4718 0.3516
21 LQQ - 0.2281 0.4656 0.3468
22 cdy MIT team [report] 0.2247 0.4574 0.3411
23 j4f RGB 0.2507 0.3661 0.3084
24 TYY Ensemble method 0.1897 0.4063 0.2980
25 IBM ARL Action recognition using deep 3D conv nets. It is based on DenseNet, pre-trained with ImageNet, but is extended to 3D (spatial + temporal dimensions). 0.1749 0.3953 0.2851
26 AIST 3D ResNeXt pretrained on Kinetics-400 [report] 0.1800 0.3843 0.2821
27 Indy_500 C3D_svm - Spatio-temporal features are extracted from image. Linear svm classifier trained. Image features are extracted from videos. Classifier is then trained. 0.1130 0.2676 0.1903
28 mms_2000 We use a combination of audio and video features to train the classifier. 0.1073 0.2376 0.1724
29 MB Resnet Feature extraction with temporal model learning 0.0033 0.0152 0.0092

Mini Track

Rank Team Name Entry Description Top-1 Acc. Top-5 Acc. Score
1 SYSU_isee [report] 0.3316 0.6228 0.4772
2 beihang university CNN 0.3132 0.5966 0.4549
3 MiRA Earlyfusion [report] 0.3159 0.5861 0.4510
4 cdy MIT team TRN: InceptionV3——model fuse [report] 0.3059 0.5813 0.4436
5 The Dragon Warrior TRN based method: using trn and p3d to classify MiT dataset 0.3046 0.5792 0.4419
6 Cardinal Vision Muti-Stream Pipeline: ensemble model with multiple prediction stream, including Resnet, TRN, and YOLO. 0.2892 0.5196 0.4044
7 j4f 0.2765 0.5282 0.4024
8 Moments in Time Team 0.2422 0.4803 0.3613
9 HERO_AN Multiple Segments Relation Network (MSRN) [report] 0.2096 0.4504 0.3300
10 Big FIsh Pretrained I3D with imagenet and kinetics, finetune on moment in time. 0.1886 0.3956 0.2921
11 Activity Recognition in Large Scale Short Videos Visual text features based of ResNext architecture [report] 0.1838 0.3816 0.2827
12 USTC Refined I3D 0.0036 0.0247 0.0142
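
The Score column in the leaderboards above follows the rule stated at the top of the page: the average of Top-1 and Top-5 accuracy, with each team ranked by its best submission. The Python sketch below illustrates that rule. It is not the organizers' evaluation code. The function names (top_k_accuracy, challenge_score, rank_teams), the input layout (a ranked list of predicted class ids per video), and the team names in the usage note are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def top_k_accuracy(ranked_predictions: Sequence[Sequence[int]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of examples whose true label is among the first k predictions.

    Assumes each entry of ranked_predictions lists class ids ordered from
    most to least confident (a hypothetical input format).
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    hits = sum(1 for preds, label in zip(ranked_predictions, labels)
               if label in preds[:k])
    return hits / len(labels)


def challenge_score(top1: float, top5: float) -> float:
    """Average of Top-1 and Top-5 accuracy, as described on this page."""
    return (top1 + top5) / 2.0


def rank_teams(
    submissions: Dict[str, List[Tuple[float, float]]]
) -> List[Tuple[str, float]]:
    """Rank teams by the score of their best submission, highest first.

    submissions maps a team name to its (top1, top5) pairs, one per submission.
    """
    best = {
        team: max(challenge_score(t1, t5) for t1, t5 in subs)
        for team, subs in submissions.items()
        if subs  # skip teams with no valid submission
    }
    return sorted(best.items(), key=lambda item: item[1], reverse=True)
```

As a check against a published row, challenge_score(0.3864, 0.6719) gives about 0.52915, which agrees with the 0.5291 listed for DEEP-HRI to within rounding in the last digit. With made-up inputs, rank_teams({"team_a": [(0.30, 0.50), (0.32, 0.56)], "team_b": [(0.35, 0.50)]}) places team_a first, because only its better submission counts.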