Comparing different data preprocessing methods for monitoring soil heavy metals based on soil spectral features
A. Gholizadeh, L. Borůvka, M.M. Saberioon, J. Kozák, R. Vašát, K. Němeček
https://doi.org/10.17221/113/2015-SWR
Citation: Gholizadeh A., Borůvka L., Saberioon M.M., Kozák J., Vašát R., Němeček K. (2015): Comparing different data preprocessing methods for monitoring soil heavy metals based on soil spectral features. Soil & Water Res., 10: 218-227.
Land near mining operations in the Czech Republic is subject to soil pollution with heavy metals. Excessive heavy metal concentrations not only severely degrade soil quality; because of their persistence and indefinite biological half-lives, potentially toxic metals can also accumulate in the food chain and eventually endanger human health. Monitoring these elements and mapping their spatial distribution requires a large number of samples and cumbersome, time-consuming laboratory measurements. A faster method was developed based on a multivariate calibration procedure using support vector machine regression (SVMR) with cross-validation to relate reflectance spectra in the visible-near infrared (Vis-NIR) region to the concentrations of Mn, Cu, Cd, Zn, and Pb in soil. Spectral preprocessing methods, namely first and second derivatives (FD and SD), standard normal variate (SNV), multiplicative scatter correction (MSC), and continuum removal (CR), were applied after Savitzky-Golay smoothing to improve the robustness and performance of the calibration models. Based on the criteria of maximal coefficient of determination (R²cv) and minimal root mean square error of prediction in cross-validation (RMSEPcv), SVMR with FD preprocessing was identified as the best method for predicting Cu, Mn, Pb, and Zn concentrations, whereas SVMR with CR preprocessing was selected for predicting Cd. Overall, the study indicated that Vis-NIR reflectance spectroscopy, combined with a continuously enriched soil spectral library and a suitable preprocessing method, could be a nondestructive alternative for monitoring the soil environment. The potential of multivariate calibration and preprocessing with real-time remote sensing data remains to be explored.
Keywords: heavy metals; preprocessing; support vector machine regression; visible-near infrared spectroscopy
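The workflow summarized in the abstract (Savitzky-Golay smoothing, then FD, SD, SNV, MSC, or CR, then SVMR scored by R²cv and RMSEPcv) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The choice of Python with NumPy, SciPy, and scikit-learn is made here and is not stated in the abstract. The spectra and concentrations are synthetic, and the window size, polynomial order, kernel, `C`, `epsilon`, and fold count are all assumed values chosen only so the example runs.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def smooth(X, window=11, poly=2):
    """Savitzky-Golay smoothing along the wavelength axis (assumed settings)."""
    return savgol_filter(X, window_length=window, polyorder=poly, axis=1)


def derivative(X, order, window=11, poly=2):
    """Savitzky-Golay derivative: smooths and differentiates in one pass."""
    return savgol_filter(X, window_length=window, polyorder=poly, deriv=order, axis=1)


def snv(X):
    """Standard normal variate: centre and scale each spectrum by its own mean and std."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)


def msc(X):
    """Multiplicative scatter correction against the mean spectrum.
    Simplification: the reference is computed on all samples, not per fold."""
    ref = X.mean(axis=0)
    out = np.empty_like(X)
    for i, row in enumerate(X):
        slope, intercept = np.polyfit(ref, row, 1)
        out[i] = (row - intercept) / slope
    return out


def continuum_removal(X, wavelengths):
    """Divide each (positive) spectrum by its upper convex hull, the continuum."""
    out = np.empty_like(X)
    for i, row in enumerate(X):
        hull = []  # indices of upper-hull vertices, left to right
        for j in range(len(wavelengths)):
            while len(hull) >= 2:
                a, b = hull[-2], hull[-1]
                cross = ((wavelengths[b] - wavelengths[a]) * (row[j] - row[a])
                         - (row[b] - row[a]) * (wavelengths[j] - wavelengths[a]))
                if cross >= 0:  # b lies on or below the chord a -> j
                    hull.pop()
                else:
                    break
            hull.append(j)
        out[i] = row / np.interp(wavelengths, wavelengths[hull], row[hull])
    return out


def cross_validate(Xp, y, folds=5):
    """Return (R2cv, RMSEPcv) for an RBF support vector regression (assumed settings)."""
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0, epsilon=0.1))
    cv = KFold(n_splits=folds, shuffle=True, random_state=0)
    pred = cross_val_predict(model, Xp, y, cv=cv)
    rmse = float(np.sqrt(np.mean((y - pred) ** 2)))
    r2 = float(1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2))
    return r2, rmse


# Synthetic stand-in data: NOT measurements from the study.
rng = np.random.default_rng(0)
wl = np.arange(350, 2501, 10, dtype=float)           # wavelengths in nm
y = rng.uniform(10, 100, size=60)                    # hypothetical concentration
base = 0.2 + 0.25 * (wl - wl[0]) / (wl[-1] - wl[0])  # rising baseline reflectance
dip = np.exp(-0.5 * ((wl - 900) / 60) ** 2)          # one absorption feature
scatter = rng.uniform(0.8, 1.2, size=(60, 1))        # multiplicative scatter
X = scatter * (base - 0.001 * y[:, None] * dip) + rng.normal(0, 0.002, (60, wl.size))

Xs = smooth(X)
candidates = {
    "smoothed": Xs,
    "FD": derivative(X, 1),
    "SD": derivative(X, 2),
    "SNV": snv(Xs),
    "MSC": msc(Xs),
    "CR": continuum_removal(Xs, wl),
}
results = {name: cross_validate(Xp, y) for name, Xp in candidates.items()}
for name, (r2, rmse) in results.items():
    print(f"{name:9s} R2cv={r2:6.3f}  RMSEPcv={rmse:7.3f}")
```

Selecting the preprocessing with the highest R²cv and lowest RMSEPcv mirrors the selection criteria named in the abstract. Which method ranks first on this synthetic data says nothing about real soil spectra.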