This work is supported by the Shenzhen Basic Research Foundation (JCYJ20210324115614039).
Automatic summarization of judgment documents aims to have computers select, extract, and compress the important information in legal texts, reducing the workload of legal practitioners. Most current summarization algorithms based on pre-trained language models limit the length of the input text and therefore cannot effectively summarize long documents. This paper introduces an extractive summarization algorithm that uses a pre-trained language model to generate sentence vectors. A Transformer encoder then performs the summarization task on fused information comprising the sentence vectors together with the position and length of each sentence. Experimental results show that the algorithm can effectively summarize long texts. On the 2020 CAIL (Challenge of AI in Law) summarization dataset, the proposed model improves significantly over the baseline model on the ROUGE-1, ROUGE-2, and ROUGE-L metrics.
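The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that sentence vectors have already been produced by some pre-trained language model, that position and length are fused by simple concatenation, and that a single identity-projection self-attention step stands in for the full Transformer encoder. All function names (`fuse_features`, `self_attention`, `score_sentences`, `select_summary`) and the scorer weights are hypothetical.

```python
import math


def fuse_features(sent_vecs, lengths, max_len):
    """Append a normalized position and a normalized length to each sentence vector.

    Assumption: fusion by concatenation; the paper may fuse these signals differently.
    """
    n = len(sent_vecs)
    fused = []
    for i, (vec, length) in enumerate(zip(sent_vecs, lengths)):
        position = i / max(n - 1, 1)             # 0.0 for the first sentence, 1.0 for the last
        rel_length = min(length / max_len, 1.0)  # clipped to the range [0, 1]
        fused.append(list(vec) + [position, rel_length])
    return fused


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(rows):
    """Single-head scaled dot-product self-attention with identity projections.

    Assumption: a stand-in for a trained multi-layer Transformer encoder.
    """
    d = len(rows[0])
    out = []
    for q in rows:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in rows]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, rows)) for j in range(d)])
    return out


def score_sentences(fused, weights, bias=0.0):
    """Contextualize the fused features, then score each sentence with a logistic unit."""
    context = self_attention(fused)
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(weights, row)) + bias)))
        for row in context
    ]


def select_summary(scores, k):
    """Return the indices of the k highest-scoring sentences in document order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)
```

Because the encoder in this sketch operates on one vector per sentence rather than one per token, the sequence it sees is only as long as the number of sentences. This is one plausible reading of how such a design avoids the input-length limit of token-level pre-trained models. In practice the scorer weights would be learned from labeled extractive summaries rather than set by hand.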
WEN Jiabao, YANG Min. Extractive Summarization Algorithm for Chinese Legal Judgment Documents[J]. Journal of Integration Technology, 2024, 13(1): 62-71.