To enrich the data available for Chinese lexical choice, this paper constructs a Chinese test dataset, Chinese-Fill-in-the-Blank (CFITB), using the People's Daily corpus as the initial corpus source. The dataset covers target words of three parts of speech: nouns, verbs, and adjectives. CFITB contains 500 test samples. Each test sample has three parts: a "number", a "test sentence", and a "candidate" set. Each test sentence contains one target word, which is deleted and replaced with "__". The corresponding candidate set contains five Chinese words, and substituting each word into "__" yields five candidate sentences. The goal of modeling this dataset is to identify, among the five candidate sentences, the target word whose sentence is the most standard in semantics and grammar.
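The sample structure and the selection task described above can be sketched in Python as follows. This is only an illustrative sketch: the class and function names (`CFITBSample`, `candidate_sentences`, `choose_word`), the example sentence and candidate words, and the toy scoring function are all assumptions made for illustration and are not taken from the CFITB dataset or from any model in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List

BLANK = "__"  # placeholder that replaces the deleted target word


@dataclass
class CFITBSample:
    """Hypothetical container for one test sample (field names are assumed)."""
    number: int
    test_sentence: str     # sentence with the target word replaced by BLANK
    candidates: List[str]  # five candidate Chinese words


def candidate_sentences(sample: CFITBSample) -> List[str]:
    """Substitute each candidate word into the blank, one sentence per word."""
    return [sample.test_sentence.replace(BLANK, word, 1)
            for word in sample.candidates]


def choose_word(sample: CFITBSample, score: Callable[[str], float]) -> str:
    """Return the candidate whose filled-in sentence gets the highest score."""
    sentences = candidate_sentences(sample)
    best = max(range(len(sentences)), key=lambda i: score(sentences[i]))
    return sample.candidates[best]


# Made-up sample for illustration only; it does not come from CFITB.
sample = CFITBSample(
    number=1,
    test_sentence="他的发言得到了大家的__。",
    candidates=["支持", "支撑", "支援", "扶持", "维持"],
)


def toy_score(sentence: str) -> float:
    """Stand-in scorer; a real system would use a language model here."""
    return 1.0 if "大家的支持" in sentence else 0.0


print(choose_word(sample, toy_score))
```

Keeping the scorer as a plain function argument means any model that assigns a fluency or plausibility score to a full sentence could be plugged in without changing the sample structure.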