See my updates on LinkedIn



Aerial Videos (CVPR 2015)
  • Joint inference of groups, events and human roles in aerial videos [Paper] [Supp] [Slides] [Video] [Project Page]
    -Tianmin Shu, Dan Xie, Brandon Rothrock, Sinisa Todorovic and Song-Chun Zhu
    -Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. (Oral)
Dark Matter (ICCV 2013)
  • Inferring "Dark Matter" and "Dark Energy" from Videos [Paper] [Poster] [Project Page]
    -Dan Xie, Sinisa Todorovic and Song-Chun Zhu
    -Proc. International Conference on Computer Vision (ICCV), 2013.
Multiscale Activities (ECCV 2012)
  • Cost-Sensitive Top-down/Bottom-up Inference for Multiscale Activity Recognition [Paper] [Project Page]
    -Mohamed R. Amer, Dan Xie, Mingtian Zhao, Sinisa Todorovic and Song-Chun Zhu
    -Proc. European Conference on Computer Vision (ECCV), 2012. (Oral)

  • ABOUT ME: Dan Xie

    Since 2011, I have been a Ph.D. student in the Department of Statistics at the University of California, Los Angeles. I work in the Center for Vision, Cognition, Learning, and Autonomy (VCLA), and my advisor is Prof. Song-Chun Zhu. I work on modeling human group activities and the human mind. The self-driving car is a revolutionary technology that will change everything and reshape our cities (Link). However, according to an investigation report by KPMG, one major challenge is to interpret events and anticipate likely scenarios. For example, if a ball were to roll onto a road, a human might expect that a child could follow. Artificial intelligence cannot yet provide that level of inferential thinking, nor can it communicate in real time with the environment.

    From 2007 to 2011, I studied at Zhejiang University in China, where I received my B.Eng. degree in Software Engineering and an honours minor certificate from the Advanced Honour Class of Engineering Education (ACEE). In 2010, I won the Chu Kochen Award, the highest honour for undergraduate students at Zhejiang University (12/5000+). From 2010 to 2011, I spent 8 months as an exchange student in the Department of Electrical and Computer Engineering at the University of Waterloo in Canada. In 2011, I spent 3 months as an intern in the ads algorithm group at Alibaba (Congratulations! The Biggest IPO Ever) in Hangzhou, China.


    Inferring "Dark Matter" and "Dark Energy" from Videos

    Dan Xie¹     Sinisa Todorovic²     Song-Chun Zhu¹
    ¹University of California, Los Angeles       ²Oregon State University

    The goal of this paper is to detect functional objects in surveillance videos without specific knowledge of the objects in the scene. Our data consists of videos of public squares and courtyards, where functional objects cannot be reliably recognized by their appearance. These objects do, however, affect human behavior. We call these undetectable objects "dark matter", or sources of "dark energy" that attract people to approach them (e.g., vending machines, chairs, and food trucks) or repel trajectories so that people avoid them (e.g., grass lawns). As common appearance features provide poor cues about the functionality of objects, we approach our problem by analyzing human behavior in the video -- namely, people's intents and trajectories -- as our basic features. Leveraging Lagrangian mechanics, we treat human behavior in the scene as particle motion in the attraction and repulsion force fields of these invisible functional objects. Additionally, people exhibit an intent to deliberately approach "dark matter" to satisfy their needs (e.g., hunger, thirst). To account for this latent intent, we extend Lagrangian mechanics to allow people to select which of the "dark energy" fields to move along. We also assume that people are familiar with the layout of obstacles and "dark matter" in the scene, so that their trajectories can take globally optimal paths in the selected "dark energy" fields. Our model is formulated in a Bayesian framework and inferred using a data-driven Markov Chain Monte Carlo (MCMC) process. Evaluation on challenging benchmark datasets demonstrates the effectiveness of our technique in detecting functional objects. Furthermore, the prediction of human trajectories from our technique outperforms that of prior work.
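    As a toy illustration only (this is not the model or the code from the paper), the assumption that people take globally optimal paths toward a chosen source can be sketched on a small grid. Everything here is an assumption made for the example: the 5x5 grid, the cost values, the names `shortest_path` and `infer_goal`, and the use of plain Dijkstra search with a simple "extra cost" score in place of the Bayesian model and MCMC inference described above. Cells that people tend to avoid (the lawn) are given a higher step cost, and a partial trajectory is matched to the candidate source that it detours from the least.

```python
import heapq

WALK, LAWN = 1.0, 5.0  # assumed step costs: lawn cells are "repulsive"
GRID = [
    [WALK, WALK, WALK, WALK, WALK],
    [WALK, LAWN, LAWN, LAWN, WALK],
    [WALK, LAWN, LAWN, LAWN, WALK],
    [WALK, LAWN, LAWN, LAWN, WALK],
    [WALK, WALK, WALK, WALK, WALK],
]

def shortest_path(cost, start, goal):
    """Dijkstra on a 4-connected grid; cost[r][c] is paid when stepping onto (r, c)."""
    rows, cols = len(cost), len(cost[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist[cell]:
            continue  # stale heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get(nxt, float("inf")):
                    dist[nxt] = nd
                    prev[nxt] = cell
                    heapq.heappush(heap, (nd, nxt))
    # Walk the predecessor links back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1], dist[goal]

def infer_goal(cost, observed, candidates):
    """Pick the candidate whose optimal path the observed prefix deviates from least."""
    start, current = observed[0], observed[-1]
    walked = sum(cost[r][c] for r, c in observed[1:])

    def excess(goal):
        _, direct = shortest_path(cost, start, goal)
        _, remaining = shortest_path(cost, current, goal)
        return walked + remaining - direct  # 0 means "still on an optimal path"

    return min(candidates, key=lambda name: excess(candidates[name]))

path, total = shortest_path(GRID, (0, 0), (4, 4))
guess = infer_goal(GRID, [(0, 0), (1, 0), (2, 0)],
                   {"vending machine": (4, 4), "chair": (0, 4)})
```

    In this sketch the optimal path from one corner to the other walks around the lawn, and a person observed heading down the left edge is matched to the vending machine in the far corner instead of the chair in the top-right corner.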



    Contact Information

    xiedwill AT gmail DOT com


    Department news about my work

    Useful Open Source

    fancybox for displaying photos, example
    Slideshow to slide show images, example
    Vatic for video annotation

    Copyright 2016, website templates by