Meet Transformer in Transformer: A Visual Transformer That Captures Structural Information From Images
A team from Huawei, ISCAS and UCAS proposes the novel Transformer-iN-Transformer (TNT) architecture for modelling both patch-level and pixel-level representations.
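
To make the two levels of representation concrete, here is a minimal sketch of how an image might be carved into patches and each patch into smaller inner blocks. It is an illustration only, not the team's implementation: the helper names `split_grid` and `two_level_split`, the single-channel toy image, and the block sizes are all assumptions, the inner blocks merely stand in for pixel-level units, and no transformer layers are included.

```python
def split_grid(grid, size):
    """Split a 2D list (H x W) into non-overlapping size x size blocks, in row-major order."""
    h, w = len(grid), len(grid[0])
    if h % size or w % size:
        raise ValueError("grid dimensions must be divisible by the block size")
    blocks = []
    for top in range(0, h, size):
        for left in range(0, w, size):
            # Take `size` rows, and within each row the `size` columns of this block.
            blocks.append([row[left:left + size] for row in grid[top:top + size]])
    return blocks


def two_level_split(image, patch_size, sub_size):
    """Return the outer patches and, for each patch, its inner sub-patches (hypothetical helper)."""
    patches = split_grid(image, patch_size)               # patch-level units
    sub_patches = [split_grid(p, sub_size) for p in patches]  # finer units inside each patch
    return patches, sub_patches


# Toy 8x8 single-channel image whose pixel value encodes its position (assumed sizes).
image = [[r * 8 + c for c in range(8)] for r in range(8)]
patches, sub_patches = two_level_split(image, patch_size=4, sub_size=2)
print(len(patches), len(sub_patches[0]))  # 4 4
```

In a model along these lines, one sequence would presumably be built from the outer patches and a second, finer sequence from the blocks inside each patch, so that structure within a patch is not lost when the patch is flattened into a single token.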