Abstract: Action recognition in the dark is challenging in practice because robust action representations are difficult to learn from low-light environments. In addition, a domain gap separates dark scenes from the data used to train conventional pretrained models, so the standard pretrain-finetune approach yields suboptimal results, while pretraining from scratch is costly. To address this issue, a domain-adaptive pretraining method is proposed to improve action recognition performance in dark environments. The method integrates an external vision enhancement model for de-darkening, which introduces knowledge critical to processing dark scenes. It also employs a cross-domain self-distillation framework to reduce the gap between the visual representations of illuminated and dark scenes. In extensive experiments across several dark-environment action recognition settings, the proposed approach achieves a Top-1 accuracy of 97.19% on the dark dataset under full supervision. In source-free domain adaptation on the Daily-DA dataset, accuracy improves to 49.11%. In multi-source domain adaptation on the Daily-DA dataset, Top-1 accuracy reaches 54.63%.
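
The abstract does not spell out how the cross-domain self-distillation objective is built, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes a teacher branch that sees the de-darkened (enhanced) view, a student branch that sees the original dark view, a temperature-scaled cross-entropy between their output distributions, and an exponential-moving-average (EMA) teacher update. The function names, temperatures, and momentum value are all hypothetical and chosen for illustration.

```python
import math


def softmax(logits, temperature):
    """Numerically stable softmax over a list of logits, with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits,
                      student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy H(teacher, student) between the two output distributions.

    Assumption (not stated in the abstract): the teacher logits come from the
    enhanced view and the student logits from the dark view of the same clip.
    """
    p_teacher = softmax(teacher_logits, teacher_temp)
    p_student = softmax(student_logits, student_temp)
    # Small epsilon guards against log(0) when a student probability underflows.
    return -sum(t * math.log(s + 1e-12) for t, s in zip(p_teacher, p_student))


def ema_update(teacher_weights, student_weights, momentum=0.996):
    """Move each teacher weight toward the student weight by (1 - momentum)."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_weights, student_weights)]


if __name__ == "__main__":
    # Toy logits standing in for the projection-head outputs of each branch.
    teacher_out = [2.0, 0.0, 0.0]   # hypothetical output on the enhanced view
    agreeing = [2.0, 0.0, 0.0]      # student on the dark view, same peak
    disagreeing = [0.0, 2.0, 0.0]   # student on the dark view, different peak
    print(distillation_loss(agreeing, teacher_out))
    print(distillation_loss(disagreeing, teacher_out))
```

Under these assumptions, the loss is small when the dark-view representation agrees with the enhanced-view representation and large when it does not, which is the sense in which such an objective narrows the gap between the two domains. In a real setup the lists would be replaced by tensors from a video backbone, and gradients would flow only through the student branch.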