Abstract: A missing-value interpolation algorithm and an outlier detection algorithm are proposed for pipeline data. The data generated over the whole life cycle of the pipeline are screened and then preprocessed to identify missing and abnormal values. Missing values are filled in by a multivariate linear regression interpolation method. For the outliers in the pipeline data, a density-based local outlier detection algorithm is used, and its detection performance is evaluated in simulation. The positive detection rate of the Local Outlier Factor (LOF) outlier detection algorithm is 96%, which is 41.18% higher than that of the traditional K-means outlier detection algorithm. High detection accuracy is thus obtained, and the optimal detection model is established.
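The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the implementation evaluated here. It assumes that one numeric column contains the gaps and that the remaining columns are complete. The function names (`fit_linear_regression`, `impute_missing`, `lof_scores`), the sample columns (pressure, temperature, flow), and the sample values are hypothetical and chosen only for illustration. The LOF variant is simplified: it uses exactly k neighbours per point and assumes there are no more than k duplicate points.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(features, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, ..., coef_p].
    """
    rows = [[1.0] + list(f) for f in features]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def impute_missing(records, target_index):
    """Fill None entries in one column by regressing it on the other columns."""
    def others(rec):
        return [v for i, v in enumerate(rec) if i != target_index]

    complete = [r for r in records if r[target_index] is not None]
    beta = fit_linear_regression([others(r) for r in complete],
                                 [r[target_index] for r in complete])
    filled = []
    for rec in records:
        rec = list(rec)
        if rec[target_index] is None:
            rec[target_index] = beta[0] + sum(
                b * v for b, v in zip(beta[1:], others(rec)))
        filled.append(rec)
    return filled


def lof_scores(points, k):
    """Simplified Local Outlier Factor using exactly k nearest neighbours."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean neighbour density divided by the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


# Hypothetical records: (pressure, temperature, flow); the last flow is missing.
records = [(1.0, 10.0, 8.0), (2.0, 12.0, 11.0), (3.0, 11.0, 12.5),
           (4.0, 15.0, 16.5), (5.0, 13.0, None)]
filled = impute_missing(records, target_index=2)

# Hypothetical 2-D points: a tight cluster plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = lof_scores(points, k=3)
```

A score close to 1 means a point is about as dense as its neighbours, while a score well above 1 marks a point in a sparser region than its neighbours. Any cut-off for flagging outliers, and the choice of k, would have to be tuned on the actual pipeline data.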