TOC content prediction of organic-rich shale using the machine learning algorithm: comparative study of random forest, support vector machine, and XGBoost
Updated: 2023-04-08 15:33:50
Poster presentation
Abstract
The total organic carbon (TOC) content of organic-rich shale is a key parameter for screening potential source rocks and sweet spots of shale oil and gas. Traditional methods of determining TOC content have clear drawbacks: geochemical experiments are costly and inefficient, while empirical regression methods lack general applicability and accuracy. In this study, we apply three machine learning techniques to predict TOC content from well logs and compare their performance. First, the Decision Tree algorithm is used to identify the optimal set of well logs from a total of 15 commonly used well logs, and three machine learning algorithms, random forest (RF), support vector regression (SVR), and XGBoost, are used to predict the TOC content of organic-rich shale from this optimal well log set. Then, a total of 816 data points of well log data and TOC content data collected from five different shale formations are used to train and test the three models. Finally, the three models are used to predict unseen TOC content data from the Shahejie shale. The results show that RF provides the best prediction of TOC content, with R² = 0.9141, RMSE = 0.329, and MAE = 0.252, followed by XGBoost, while SVR gives the lowest predictive accuracy. Nevertheless, all three models outperform the traditional Schmoker gamma-ray log method, the multiple linear regression method, and the ΔlgR method.
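The three scores quoted in the abstract (R², RMSE, MAE) can be illustrated with a short, dependency-free sketch. It assumes R² means the coefficient of determination, 1 − SS_res/SS_tot; the abstract does not say which definition the authors used, and the squared correlation coefficient is also common. The function name `regression_metrics` and the sample TOC values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired measured and predicted values.

    Assumption: R^2 is the coefficient of determination, 1 - SS_res / SS_tot.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    mean_true = sum(y_true) / n

    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


# Hypothetical measured vs. predicted TOC values (wt%), for illustration only
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

r2, rmse, mae = regression_metrics(measured, predicted)
print(f"R2={r2:.4f}, RMSE={rmse:.3f}, MAE={mae:.3f}")
```

RMSE and MAE are in the same units as the TOC values themselves, so the reported RMSE = 0.329 and MAE = 0.252 read as typical prediction errors in those units, while R² is dimensionless.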
Keywords
TOC; Random forest; Support vector machine; XGBoost; Organic-rich shale