Tianfu (Matt) Wu

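"Heaven and earth have great beauty but do not speak of it; the four seasons have clear laws but do not debate them; all things have their own complete principles but do not explain them. The sage traces the beauty of heaven and earth and comprehends the principles of all things." -- Zhuangzi, "Knowledge Wandered North" ~~ The beauty and perfection of the world and of nature, as they are, are indescribable. One ultimate goal is to learn interpretable models and explainable algorithms.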

My Re-se-Arch Interests

My research focuses on computer Vis-Eye-on, often motivated by the task of building an explainable and improvable visual Turing test and robot autonomy through lifelong communicative learning. To accomplish my research goals, I am interested in pursuing a unified framework for machines to ALTER (Ask, Learn, Test, Explain and Refine) recursively in a principled way.

• Deep Perception of the Visible and Deep Understanding of the Dark Jointly

A picture is worth a thousand words. What are the words? They refer to concepts and models, both visible and invisible (including patterns, symbols and logics). What are the structures organizing the words? They refer to image/video and language grammars (hierarchical, compositional, reconfigurable, causal and explainable). In addition, "The more you look, the more you see" (quoted from Prof. Stuart Geman). My research focuses on (i) statistical learning of large-scale and highly expressive hierarchical and compositional models from heterogeneous (big) data including images, videos and text, (ii) statistical inference by learning near-optimal cost-sensitive decision policies, and (iii) statistical theory of performance-guaranteed learning algorithms and optimally scheduled inference procedures, i.e., maximizing the "gain" (accuracy) and minimizing the "pain" (computational costs).

• Lifelong Learning through ALTER Recursively (Demo)

Here, I use online object tracking as an example to articulate what I mean by machines that can ALTER recursively. Tracking is one of the innate capabilities that animals and humans use for learning concepts (Susan Carey, The Origin of Concepts. Oxford Univ. Press, 2011).

  • Ask: to predict object states (bounding boxes) in subsequent frames after only the first one is given.
  • Learn: to explore/unfold the space of latent object structures with the optimal models learned at different stages (e.g., Model1 and Model172).
  • Test: to perform tracking-by-parsing using learned models in a new frame with intrackability measured.
  • Explain: to account for structural and appearance variations explicitly based on intrackability (e.g., transition from Model1 to Model172 due to partial occlusion).
  • Refine: to re-learn the structure and/or parameters based on tracking results with accuracy and efficiency balanced.
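
As a rough illustration only, the five steps above can be arranged into a loop. The toy sketch below is not the tracking-by-parsing method described here: it assumes a 1-D object position instead of a bounding box, a constant-velocity `Model` instead of a learned object structure, and a prediction residual as a stand-in for intrackability. All names (`Model`, `Tracker`, `threshold`, `run`) are hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Model:
    """Toy constant-velocity model (assumption: stands in for a learned model)."""
    velocity: float = 0.0
    version: int = 1


@dataclass
class Tracker:
    position: float                        # object state given in the first frame
    threshold: float = 0.5                 # hypothetical intrackability threshold
    model: Model = field(default_factory=Model)
    log: List[Tuple[int, str]] = field(default_factory=list)

    def ask(self) -> float:
        """Ask: predict the object state in the next frame."""
        return self.position + self.model.velocity

    def learn(self, observed: float) -> Model:
        """Learn: fit a new model version from the latest displacement."""
        return Model(observed - self.position, self.model.version + 1)

    def test(self, predicted: float, observed: Optional[float]) -> float:
        """Test: toy intrackability, the residual between prediction and observation."""
        return float("inf") if observed is None else abs(observed - predicted)

    def explain(self, score: float) -> str:
        """Explain: attribute a high score to a cause."""
        if score == float("inf"):
            return "occlusion"
        return "motion change" if score > self.threshold else "ok"

    def refine(self, observed: Optional[float], reason: str) -> None:
        """Refine: re-learn the model only when the explanation calls for it."""
        if reason == "motion change" and observed is not None:
            self.model = self.learn(observed)

    def step(self, frame: int, observed: Optional[float]) -> float:
        predicted = self.ask()
        reason = self.explain(self.test(predicted, observed))
        self.log.append((frame, reason))
        self.refine(observed, reason)
        # During occlusion, fall back to the model's own prediction.
        self.position = predicted if observed is None else observed
        return self.position


def run(first: float, observations: List[Optional[float]]) -> Tracker:
    """Run the loop over frames 1..n; None marks a frame with no observation."""
    tracker = Tracker(position=first)
    for i, obs in enumerate(observations, start=1):
        tracker.step(i, obs)
    return tracker


if __name__ == "__main__":
    t = run(0.0, [1.0, 2.0, None, 4.0, 4.0])
    for frame, reason in t.log:
        print(frame, reason)
    print("model version:", t.model.version)
```

In this sketch the model is re-learned only when the explanation step flags a motion change, so model versions accumulate over time in loosely the same spirit as the transition from Model1 to Model172 mentioned above.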