
This project presents a method for learning novel visual concepts, and their interactions with other concepts, from a few images with sentence descriptions.

Our method builds on an image captioning module based on the m-RNN model, with several improvements. In particular, we propose a transposed weight sharing scheme, which not only improves image captioning performance but also makes the model better suited to the novel concept learning task. We also propose methods to prevent overfitting to the new concepts.
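As a rough illustration of what sharing weights between a word-embedding layer and a word-decoding layer can look like, the toy sketch below stores a single table and uses it in both directions: a row lookup on the input side and the transposed map on the output side. It is plain Python, not the m-RNN implementation. The function names, the toy sizes, and the omission of any intermediate layers are all assumptions made for illustration.

```python
import random

def init_embedding(vocab_size, dim, seed=0):
    """Return a vocab_size x dim table of small random weights (one row per word)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]

def embed(table, word_id):
    """Input side: look up the row for a word (a one-hot vector times the table)."""
    return table[word_id]

def decode_scores(table, hidden):
    """Output side: score every word with the same table, used as the transposed map.

    Each score is the dot product of one table row with the hidden vector,
    so no separate decoding matrix is stored.
    """
    return [sum(w * h for w, h in zip(row, hidden)) for row in table]

def add_word(table, seed=1):
    """Append one row for a new word; existing rows are left untouched."""
    rng = random.Random(seed)
    dim = len(table[0])
    table.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return len(table) - 1  # id of the new word

VOCAB, DIM = 6, 4  # toy sizes (assumptions, not values from the project)
table = init_embedding(VOCAB, DIM)
hidden = embed(table, 2)               # stand-in for a hidden state
before = decode_scores(table, hidden)  # one score per known word
new_id = add_word(table)               # a novel concept adds a single row
after = decode_scores(table, hidden)

untied_params = 2 * VOCAB * DIM  # separate embedding and decoding matrices
tied_params = VOCAB * DIM        # one shared table
```

In this toy setup, adding a word appends one row and leaves the scores of all existing words unchanged, and the tied version stores half as many word-related weights as two separate matrices would.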

In addition, we construct and release the Novel Visual Concept (NVC) dataset to validate our method.

Data and toolkit

The proposed Novel Visual Concept (NVC) dataset currently contains 11 new concepts: quidditch, t-rex, samisen, tai-ji, huangmei opera, kiss, rocket gun, tempura, waterfall, wedding dress, and windmill. We provide 100 images for each concept and 5 sentence annotations for each image.
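The dataset size implied by these numbers can be checked with a short sketch. The record layout below (`AnnotatedImage`, its field names, and the image id) is hypothetical and does not describe the released file format. The concept names, the per-concept counts, and the sample sentences are taken from this page.

```python
from dataclasses import dataclass
from typing import List, Tuple

CONCEPTS = [
    "quidditch", "t-rex", "samisen", "tai-ji", "huangmei opera", "kiss",
    "rocket gun", "tempura", "waterfall", "wedding dress", "windmill",
]
IMAGES_PER_CONCEPT = 100
SENTENCES_PER_IMAGE = 5

@dataclass
class AnnotatedImage:
    """Hypothetical in-memory record for one annotated image."""
    concept: str
    image_id: str  # placeholder id, not a real file name
    sentences: List[str]

def expected_totals() -> Tuple[int, int]:
    """Return (number of images, number of sentences) implied by the counts above."""
    n_images = len(CONCEPTS) * IMAGES_PER_CONCEPT
    return n_images, n_images * SENTENCES_PER_IMAGE

def all_sentences_mention_concept(record: AnnotatedImage) -> bool:
    """Check that every sentence contains the concept word (case-insensitive)."""
    return all(record.concept in s.lower() for s in record.sentences)

example = AnnotatedImage(
    concept="quidditch",
    image_id="example-0",  # hypothetical
    sentences=[
        "A quidditch team is taking a group photo with brooms in their hands.",
        "Ten people of the same quidditch team holding brooms.",
        "A quidditch team in red shirts posing with brooms and balls.",
        "Maroon shirt quidditch team are posing for a picture in the park "
        "in front of hoops on a partly cloudy day.",
        "The harvard quidditch team poses for a team photo while holding their brooms.",
    ],
)

total_images, total_sentences = expected_totals()
```

With 11 concepts, this gives 1,100 images and 5,500 sentences in total.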

The annotations, images, image features (VGGNet), and toolkit are available on the GitHub page. Here are some examples:

Quidditch


Annotations:
  • A quidditch team is taking a group photo with brooms in their hands.
  • Ten people of the same quidditch team holding brooms.
  • A quidditch team in red shirts posing with brooms and balls.
  • Maroon shirt quidditch team are posing for a picture in the park in front of hoops on a partly cloudy day.
  • The harvard quidditch team poses for a team photo while holding their brooms.

T-rex


Annotations:
  • Many people are standing under a t-rex in a museum.
  • A t-rex skeleton with on a stand with fence and people in the background.
  • A group of people walking through museum looking up at t-rex skeleton.
  • A side view of a t-rex fossil skeleton in a museum with many visitors.
  • A brown skeleton of a t-rex faces right in the center of a white museum.

Samisen


Annotations:
  • A man in grey kimono is playing samisen with a brown background.
  • A man in grey clothes is holding a samisen.
  • A man in grey playing samisen.
  • A man in a gray robe intensely concentrates on playing the samisen on a stage with a brown background.
  • A man in a gray kimono plays the samisen very quickly.

Tai-ji


Annotations:
  • A group of men in red is playing tai-ji with a man in white.
  • A man in white and a group of people in red doing tai-ji.
  • A group of people in red and white doing tai-ji behind man in white.
  • A man in a white uniform demonstrates tai-ji to several other red uniformed men in the grass surrounded by plants and trees.
  • A man in a white uniform leads a group of people in red and black uniforms in practicing tai-ji.

Kiss


Annotations:
  • A couple in a kiss makes a heart sign together.
  • A couple kiss as they make a heart symbol with their hands.
  • A couple kiss in a grassy field of a forest while holding up a heart sign with their hands.
  • A man in white and yellow and a woman in white sharing a kiss.
  • A couple kisses on a field of grass while forming a heart in front of them with their hands.

Rocket Gun


Annotations:
  • There is smoke behind the rocket gun being fired.
  • A green truck launching a rocket gun from the back up towards a sky full of smoke.
  • The rocket gun on the back of a green truck fires a missile making a lot of smoke.
  • A rocket gun firing a rocket on a truck.
  • A dark green rocket gun truck in a field of grass shoots a rocket in front of a large cloud of yellow smoke.

Huangmei Opera


Annotations:
  • Two performers stand on stage in a huangmei opera.
  • A huangmei opera actress wearing green and another actresss wearing pink with a cloak.
  • One woman in green costume smiles and talks to the audience with gestures as another reserved woman looks down for the huangmei opera.
  • A girl in green and a lady in pink in a huangmei opera.
  • Two women , one in green robes the other in pink , perform on stage for a huangmei opera.

Tempura


Annotations:
  • There are carved vegetables next to the tempura.
  • Four shrimp tempura on a white plate with cut vegetables next to a black teapot and two bottles of spices.
  • Four pieces of shrimp tempura are arranged with flowers and vegetables on a white plate.
  • Several shrimp tempura pieces are arranged on a napkin and white plate with spice containers behind the plate.

Waterfall


Annotations:
  • The waterfall flows through green plant.
  • A large waterfall with many paths streaming in different directions.
  • On the mountain a steep cliff holds a waterfall that pours down through the leftovers of trees.
  • A waterfall with trees in front.
  • A wide waterfall flows off a grassy cliff and splits off into several streams onto moss.

Wedding Dress


Annotations:
  • Two brides are wearing wedding dresses.
  • Two women are wearing veils and white wedding dresses and one woman has a bouquet of flowers.
  • Two models pose next to each other holding onto a flower bouquet and in elegant and detailed wedding dresses.
  • Two people in wedding dress.
  • Two women wear wedding dresses , one with a long veil and the other strapless.

Windmill


Annotations:
  • A windmill on a house with white flowers in front.
  • The windmill is in a field of tulips.
  • A dutch windmill in the background surrounded by trees and a bed of white, purple, yellow and red flowers.
  • A short windmill with a brown wooden base stands on the top of the hill among flowers and trees.
  • A small windmill stands behind a field of white tulips and yellow and purple flowers.


We thank the UCLA undergraduate annotators for their careful annotation of the NVC dataset, and Xiaochen Lian for his help with the annotation process. We also thank the anonymous ICCV 2015 reviewers for their comments and suggestions.
