Abstract:CUDA(Compute Unified Device Architecture) has a complex thread organization and multilevel memory modules, which makes it difficult to quantitatively evaluate time complexity of CUDA-based algorithms. In this paper, a Hierarchical Memory Machine (HMM) Model was investigated to solve this problem. HMM is a theoretical parallel computing model, which is capable of representing the essence of computing and memory structures on the GPU(Graphics Processing Units) devices. Based on the proposed HMM model, a parallel algorithm was presented for the approximate string matching problem. The proposed algorithm is evaluated and compared with existing approaches, to show a speedup ratio of more than 60.