Return to Article Details
TAPER-WE: Transformer-Based Model Attention with Relative Position Encoding and Word Embedding for Video Captioning and Summarization in Dense Environment
Download
Download PDF