Chinese-Yi Font Generation Based on Few-Shot Font Style Transfer

屈星熠; 王申政; 吉木莫衣乃; 林龙新; 龚文勇

doi:10.12146/j.issn.2095-3135.20251107001

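Abstract: Chinese carries thousands of years of cultural heritage, while ethnic scripts such as Yi urgently need digital preservation. Traditional font production is inefficient, and style transfer methods based on Generative Adversarial Networks (GANs) suffer from stroke distortion, heavy data dependence, and weak cross-language generalization. This paper addresses few-shot font style transfer between Chinese characters and Yi script, with the aim of improving transfer quality. First, a layer attention network and a context-aware attention network are introduced, integrating the self-attention mechanism into the generator design. Second, a paired trilingual dataset of Chinese characters, English letters, and Yi script is constructed so that the model can learn style transfer from Chinese and English to Yi. Third, a many-to-many training strategy is adopted: rather than using only Heiti as a fixed style, each training iteration randomly selects a style font from the training set to improve generalization. Finally, a topology-aware loss function is introduced to reduce glyph distortion and improve the structural integrity of generated characters. Experiments evaluate the model's performance in transferring Chinese and English font styles to Yi script, as well as its ability to handle highly artistic Chinese cursive fonts. The results show that, compared with baseline models including Encoder Mixed Decoder (EMD), Deep Feature Similarity (DFS), and Font Transfer Generative Adversarial Network (FTransGAN), the proposed model achieves better stroke clarity and structural stability, effectively mitigating the shortcomings of existing models in cross-language font style transfer and artistic font generation.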

Abstract: Chinese characters embody millennia of cultural heritage, yet ethnic scripts such as Yi urgently need digital preservation. Traditional font production is slow, and style transfer methods based on Generative Adversarial Networks (GANs) suffer from stroke distortion, heavy data dependence, and weak cross-language generalization. This paper addresses few-shot font style transfer between Chinese characters and Yi script, with the aim of improving transfer quality. First, the self-attention mechanism is integrated into the generator design through a Layer Attention Network and a Context-aware Attention Network. Second, a paired trilingual dataset containing Chinese characters, English letters, and Yi script is constructed, enabling the model to learn style transfer from Chinese and English to Yi. Third, a many-to-many training strategy is adopted: instead of using only the Heiti typeface (SimHei) as the fixed source style, the model randomly selects style fonts from the training set in each iteration to enhance generalization. Finally, a topology-aware loss is introduced to reduce glyph distortion and improve the structural integrity of generated characters. Experiments verify the model’s performance in transferring Chinese and English font styles to Yi script, as well as its ability to handle highly artistic Chinese cursive fonts. The results show that the proposed model outperforms comparison models such as EMD, DFS, and the base FTransGAN model in stroke clarity and structural stability, effectively mitigating the shortcomings of existing models in cross-language font style transfer and artistic font generation.
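
The many-to-many training strategy summarized in the abstract can be illustrated with a minimal sampling sketch. It is not the authors' implementation: the function name `sample_training_pair`, its parameters, and the font and character lists are hypothetical placeholders, and the sketch assumes that "many-to-many" means drawing the source font at random in each iteration instead of fixing it to SimHei.

```python
import random


def sample_training_pair(fonts, chars, k_refs, rng, fixed_source=None):
    """Draw one training example: (source_font, style_font, content_char, reference_chars).

    Hypothetical helper. With fixed_source set, the source font never changes
    (the one-to-many baseline); with fixed_source=None, the source font is
    re-drawn on every call (the assumed many-to-many setting).
    """
    # Source font: fixed baseline font, or a fresh random draw per iteration.
    source_font = fixed_source if fixed_source is not None else rng.choice(fonts)

    # Style (target) font: any font other than the source font.
    style_font = rng.choice([f for f in fonts if f != source_font])

    # Content character to render, plus k few-shot reference characters
    # in the style font, excluding the content character itself.
    content_char = rng.choice(chars)
    ref_pool = [c for c in chars if c != content_char]
    reference_chars = rng.sample(ref_pool, k_refs)

    return source_font, style_font, content_char, reference_chars


if __name__ == "__main__":
    # Placeholder labels only; a real run would use the paper's font set.
    fonts = ["SimHei", "KaiTi", "FangSong", "SimSun"]
    chars = list("abcdefghij")
    rng = random.Random(0)

    for step in range(3):
        print(step, sample_training_pair(fonts, chars, k_refs=4, rng=rng))
```

Passing `fixed_source="SimHei"` reproduces the fixed-source baseline under the same assumptions, which makes it straightforward to compare the two sampling regimes with an otherwise identical training loop.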