Abstract: Large-scale image-text pre-trained models have shown promising transferability to various downstream tasks. Video-text retrieval benefits from it by transferring pre-trained CLIP to ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results