[PDF] UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild | Semantic Scholar (2024)

Topics

  • UCF101
  • Playing Musical Instrument
  • HMDB51
  • UCF50
  • Action Classes
  • Action Recognition Datasets
  • Action Recognition
  • Action Recognition Method
  • Unconstrained Videos
  • Camera Motion

5,399 Citations

CNN-LSTM Architecture for Action Recognition in Videos
    Carlos Ismael Orozco, M. Buemi, J. J. Berlles

    Computer Science

  • 2019

A CNN–LSTM architecture in which a pre-trained VGG16 convolutional neural network extracts the features of the input video and an LSTM classifies the video into a particular class.

  • 7
Spatial Attention Adapted to a LSTM Architecture with Frame Selection for Human Action Recognition in Videos
    Carlos Ismael Orozco, M. Buemi, J. J. Berlles

    Computer Science

    LatinX in AI at International Conference on…

  • 2021

This work proposes an attention mechanism adapted to a CNN–LSTM base architecture that can be used for action recognition in videos and evaluates the performance of the system using accuracy as the evaluation metric.

  • 1
  • Highly Influenced
  • PDF
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
    João Carreira, Andrew Zisserman

    Computer Science

    2017 IEEE Conference on Computer Vision and…

  • 2017

A new Two-Stream Inflated 3D ConvNet (I3D), based on 2D ConvNet inflation, is introduced; after pre-training on Kinetics, I3D models considerably improve upon the state of the art in action classification, reaching 80.2% on HMDB-51 and 97.9% on UCF-101.

ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition
    Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng

    Computer Science

    ArXiv

  • 2024

A novel Cross-modality and Cross-action Modeling (CoCo) framework for ZSAR that significantly outperforms the state-of-the-art on three popular ZSAR benchmarks (i.e., Kinetics-ZSAR, UCF101 and HMDB51) under two different learning protocols in ZSAR.

Human Action Recognition in Videos using a Robust CNN LSTM Approach
    Carlos Ismael Orozco, Eduardo Xamena, M. Buemi, J. J. Berlles

    Computer Science

    Ciencia y Tecnología

  • 2020

A CNN–LSTM architecture is implemented in which a pre-trained VGG16 convolutional neural network first extracts the features of the input video, and an LSTM then classifies the video into a particular class.

  • 5
  • PDF
The Kinetics Human Action Video Dataset
    W. Kay, João Carreira, Andrew Zisserman

    Computer Science

    ArXiv

  • 2017

The dataset, its statistics, and how it was collected are described, and some baseline performance figures are given for neural network architectures trained and tested for human action classification on this dataset.

Video Action Transformer Network
    Rohit Girdhar, João Carreira, Carl Doersch, Andrew Zisserman

    Computer Science

    2019 IEEE/CVF Conference on Computer Vision and…

  • 2019

The Action Transformer model for recognizing and localizing human actions in video clips is introduced and it is shown that by using high-resolution, person-specific, class-agnostic queries, the model spontaneously learns to track individual people and to pick up on semantic context from the actions of others.

TaiChi: A Fine-Grained Action Recognition Dataset
    Shan Sun, Feng Wang, Qi Liang, Liang He

    Computer Science

    ICMR

  • 2017

TaiChi consists of unconstrained user-uploaded web videos containing camera motion and partial occlusions which pose new challenges to fine-grained action recognition compared to the existing datasets.

  • 10
  • Highly Influenced
Revisiting hand-crafted feature for action recognition: a set of improved dense trajectories
    K. Matsui, Toru Tamaki, Gwladys Auffret, B. Raytchev, K. Kaneda

    Computer Science

    ArXiv

  • 2017

Experimental results on the UCF50, UCF101, and HMDB51 action datasets demonstrate that TS is comparable to the state of the art, and outperforms many other methods; for HMDB the accuracy of 85.4%, compared to the best accuracy obtained by a deep method.

A Study of Action Recognition Problems: Dataset and Architectures Perspectives
    Bassel S. Chawky, A. S. Elons, A. Ali, Howida A. Shedeed

    Computer Science

  • 2018

Different action recognition datasets are explored to highlight their ability to evaluate different models, and a usage is proposed for each dataset based on the content and format of data it includes, the number of classes and challenges it covers.

  • 4

...

...

13 References

HMDB: A large video database for human motion recognition
    Hilde Kuehne, Hueihan Jhuang, Estíbaliz Garrote, T. Poggio, Thomas Serre

    Computer Science

    2011 International Conference on Computer Vision

  • 2011

This paper uses the largest action video database to-date with 51 action categories, which in total contain around 7,000 manually annotated clips extracted from a variety of sources ranging from digitized movies to YouTube, to evaluate the performance of two representative computer vision systems for action recognition and explore the robustness of these methods under various conditions.

  • 3,481
  • PDF
Recognizing realistic actions from videos “in the wild”
    Jingen Liu, Jiebo Luo, M. Shah

    Computer Science

    2009 IEEE Conference on Computer Vision and…

  • 2009

This paper presents a systematic framework for recognizing realistic actions from videos “in the wild”, using motion statistics to acquire stable motion features and clean static features, and PageRank to mine the most informative static features.

  • 1,034
  • PDF
Recognizing human actions: a local SVM approach
    Christian Schüldt, I. Laptev, B. Caputo

    Computer Science

    Proceedings of the 17th International Conference…

  • 2004

This paper constructs video representations in terms of local space-time features, integrates such representations with SVM classification schemes for recognition, and presents results of action recognition.

  • 3,989
  • PDF
Actions in context
    Marcin Marszalek, I. Laptev, C. Schmid

    Computer Science

    2009 IEEE Conference on Computer Vision and…

  • 2009

This paper automatically discovers relevant scene classes and their correlation with human actions, shows how to learn selected scene classes from video without manual supervision, develops a joint framework for action and scene recognition, and demonstrates improved recognition of both in natural video.

  • 1,352
  • PDF
Action Recognition from Arbitrary Views using 3D Exemplars
    Daniel Weinland

    Computer Science

  • 2007

A new framework is proposed where actions are modeled using three-dimensional occupancy grids, built from multiple viewpoints, in an exemplar-based HMM; a 3D reconstruction is not required during the recognition phase, as learned 3D exemplars are instead used to produce 2D image information that is compared to the observations.

  • 5
  • PDF
Action Recognition from Arbitrary Views using 3D Exemplars
    Daniel Weinland, Edmond Boyer, Rémi Ronfard

    Computer Science

    2007 IEEE 11th International Conference on…

  • 2007

A new framework is proposed where actions are modeled using three-dimensional occupancy grids, built from multiple viewpoints, in an exemplar-based HMM; a 3D reconstruction is not required during the recognition phase, as learned 3D exemplars are instead used to produce 2D image information that is compared to the observations.

  • 516
  • PDF
Actions as space-time shapes
    M. Blank, Lena Gorelick, Eli Shechtman, M. Irani, R. Basri

    Computer Science

    Tenth IEEE International Conference on Computer…

  • 2005

The method is fast, does not require video alignment and is applicable in many scenarios where the background is known, and the robustness of the method is demonstrated to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action and low quality video.

  • 2,316
  • PDF
Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification
    Juan Carlos Niebles, Chih-Wei Chen, Li Fei-Fei

    Computer Science

    ECCV

  • 2010

A framework for modeling motion by exploiting the temporal structure of the human activities, which represents activities as temporal compositions of motion segments, and shows that the algorithm performs better than other state of the art methods.

  • 795
  • PDF
Action MACH a spatio-temporal Maximum Average Correlation Height filter for action recognition
    Mikel D. Rodriguez, J. Ahmed, M. Shah

    Computer Science

    2008 IEEE Conference on Computer Vision and…

  • 2008

This paper generalizes the traditional MACH filter to video (3D spatiotemporal volume), and vector valued data, and analyzes the response of the filter in the frequency domain to avoid the high computational cost commonly incurred in template-based approaches.

  • 1,321
  • PDF
Detecting Carried Objects in Short Video Sequences
    D. Damen, David C. Hogg

    Computer Science

    ECCV

  • 2008

A new method for detecting objects such as bags carried by pedestrians depicted in short video sequences by comparing the temporal templates against view-specific exemplars generated offline for unencumbered pedestrians, which yields a segmentation of carried objects using the MAP solution.

  • 230
  • PDF

...

...
