A new dataset for coffee rust detection in Colombian crops based on classifiers
DOI:
https://doi.org/10.18046/syt.v12i29.1802
Keywords:
Coffee Rust, Classifier, SVR, BPNN, M5
Abstract
Coffee production is the main agricultural activity in Colombia. More than 350,000 Colombian families depend on the coffee harvest. Coffee rust was first reported in the country in 1983, and since then these families have faced serious consequences. Recently, several machine-learning approaches have built datasets for monitoring coffee rust incidence, taking into account weather conditions and the physical properties of the crops. That research motivated the creation of a dataset for rust detection in Colombian crops, following the CRISP-DM data mining process. In this work, a dataset was defined with the aim of generating accurate classifiers; once built, the dataset was tested with three classifiers: support vector regression (SVR), backpropagation neural networks (BPNN), and regression trees (M5).
References
Alfaro, E., García, N., Gámez, M., & Elizondo, D. (2008). Bankruptcy forecasting: An empirical comparison of AdaBoost and neural networks. Decision Support Systems, 45(1), 110-122. doi: http://dx.doi.org/10.1016/j.dss.2007.12.002
Armstrong, J.S. & Collopy, F. (1992). Error measures for generalizing about forecasting methods: Empirical comparisons. International Journal of Forecasting, 8(1), 69-80. doi: http://dx.doi.org/10.1016/0169-2070(92)90008-W
Balasundaram, S. & Gupta, D. (2014). Training Lagrangian twin support vector regression via unconstrained convex minimization. Knowledge-Based Systems, 59(0), 85-96. doi: http://dx.doi.org/10.1016/j.knosys.2014.01.018
Becker, S. (1979). La propagación de la roya del cafeto. Eschborn, Germany: GTZ.
Bonakdar, L. & Etemad-Shahidi, A. (2011). Predicting wave run-up on rubble-mound structures using M5 model tree. Ocean Engineering, 38(1), 111-118. doi: http://dx.doi.org/10.1016/j.oceaneng.2010.09.015
Cintra, M.E., Meira, C.A.A., Monard, M.C., Camargo, H.A., & Rodrigues, L.H.A. (2011, 22-24 Nov.). The use of fuzzy decision trees for coffee rust warning in Brazilian crops. Paper presented at the 2011 11th International Conference on Intelligent Systems Design and Applications (ISDA).
Cristianini, N. & Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge, UK: Cambridge University Press.
Dietterich, T.G. (2000). An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization. Mach. Learn., 40(2), 139-157. doi: 10.1023/a:1007607513941
Ghosh, J. (2002). Multiclassifier systems: back to the future. Lecture Notes in Computer Sciences [Third International Workshop, MCS 2002 Cagliari, Italy, June 24-26, 2002 Proceedings], 2364, 1-15
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., & Witten, I. H. (2009). The WEKA data mining software: an update. SIGKDD Explor. Newsl., 11(1), 10-18. doi: 10.1145/1656274.1656278
Haykin, S.S. (2003). Neural networks: a comprehensive foundation. Prentice Hall.
Huitema, B.E. (1980). The Analysis of Covariance and Alternatives: John Wiley & Sons.
Hyndman, R.J. & Koehler, A.B. (2006). Another look at measures of forecast accuracy. International Journal of Forecasting, 22(4), 679-688. doi: http://dx.doi.org/10.1016/j.ijforecast.2006.03.001
Kim, Y. & Street, W.N. (2004). An intelligent system for customer targeting: a data mining approach. Decision Support Systems, 37(2), 215-228. doi: http://dx.doi.org/10.1016/S0167-9236(03)00008-3
Li, L., Zou, B., Hu, Q., Wu, X., & Yu, D. (2013). Dynamic classifier ensemble using classification confidence. Neurocomputing, 99(0), 581-591. doi: http://dx.doi.org/10.1016/j.neucom.2012.07.026
Luaces, O., Rodrigues, L.H.A., Alves-Meira, C.A., & Bahamonde, A. (2011). Using nondeterministic learners to alert on coffee rust disease. Expert Systems with Applications, 38(11), 14276-14283. doi: http://dx.doi.org/10.1016/j.eswa.2011.05.003
Luaces, O., Rodrigues, L.H.A., Meira, C.A.A., Quevedo, J.R., & Bahamonde, A. (2010). Viability of an alarm predictor for coffee rust disease using interval regression. In Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems, Cordoba, Spain, [Vol. 2] (pp. 337-346). Berlin, Germany: Springer-Verlag
Mannino, M., Yang, Y., & Ryu, Y. (2009). Classification algorithm sensitivity to training data with non representative attribute noise. Decision Support Systems, 46(3), 743-751. doi: http://dx.doi.org/10.1016/j.dss.2008.11.021
Meira, C., Rodrigues, L., & Moraes, S. (2008). Análise da epidemia da ferrugem do cafeeiro com árvore de decisão. Tropical Plant Pathology, 33(2), 114-124.
Meira, C.A.A., & Rodrigues, L.H.A. (2009). Árvore de decisão na análise de epidemias da ferrugem do cafeeiro [Paper - VI Simpósio de Pesquisa dos Cafés do Brasil]. Retrieved from: http://www.sbicafe.ufv.br/bitstream/handle/10820/3466/56.pdf?sequence=2
Meira, C.A.A., Rodrigues, L.H.A., & Moraes, S.A.d. (2009). Modelos de alerta para o controle da ferrugem-do-cafeeiro em lavouras com alta carga pendente. Pesquisa Agropecuária Brasileira, 44, 233-242.
Monedero, I., Biscarri, F., León, C., Guerrero, J. I., Biscarri, J., & Millán, R. (2012). Detection of frauds and other non-technical losses in a power utility using Pearson coefficient, Bayesian networks and decision trees. International Journal of Electrical Power & Energy Systems, 34(1), 90-98. doi: http://dx.doi.org/10.1016/j.ijepes.2011.09.009
Opitz, D. & Maclin, R. (1999). Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research, 11, 169-198.
Pérez-Ariza, C.B., Nicholson, A.E., & Flores, M.J. (2012). Prediction of Coffee Rust Disease Using Bayesian Networks, Proceedings of the Sixth European Workshop on Probabilistic Graphical Models, (pp.259-266). Available at http://arrow.monash.edu.au/hdl/1959.1/821316
Poh, H.L. (1991). A neural network approach for marketing strategies research and decision support [Ph.D Thesis], Stanford University
Ranawana, R. & Palade, V. (2006). Multi-Classifier systems: Review and a roadmap for developers. Int. J. Hybrid Intell. Syst., 3(1), 35-61
Refaeilzadeh, P., Tang, L., & Liu, H. (2009). Cross-Validation. In L. Liu & T. Özsu [Eds.], Encyclopedia of Database Systems (pp. 532-538): Springer
Rivillas-Osorio, C., Serna-Giraldo, C., Cristancho-Ardila, M., & Gaitán-Bustamante, A. (2011). La roya del cafeto en Colombia, impacto, manejo y costos de control. In S. Marín [Ed.], Avances Tecnicos Cenicafe. Chinchiná, Colombia: Cenicafé
Shieber, E. & Zentmyer, G. A. (1984). Coffee rust in the western hemisphere. Plant Disease, 68, 89-93
Smola, A. & Schölkopf, B. (2004). A tutorial on support vector regression. Statistics and Computing, 14(3), 199-222. doi: 10.1023/b:stco.0000035301.49549.88
Suhasini, A., Palanivel, S., & Ramalingam, V. (2011). Multimodel decision support system for psychiatry problem. Expert Systems with Applications, 38(5), 4990-4997. doi: http://dx.doi.org/10.1016/j.eswa.2010.09.152
Vapnik, V.N. (2000). The nature of statistical learning theory. New York, NY: Springer.
Vapnik, V.N. (1999). An overview of statistical learning theory. IEEE Transactions on Neural Networks, 10(5), 988-999. doi: 10.1109/72.788640
Wang, Y., & Witten, I.H. (1996). Induction of model trees for predicting continuous classes. Working Paper Series, 96(23). Retrieved from http://www.cs.waikato.ac.nz/pubs/wp/1996/uow-cs-wp-1996-23.pdf
Wei, C.-P., Chen, H.-C., & Cheng, T.-H. (2008). Effective spam filtering: A single-class learning and ensemble approach. Decision Support Systems, 45(3), 491-503. doi: http://dx.doi.org/10.1016/j.dss.2007.06.010
Wirth, R. (2000). CRISP-DM: Towards a standard process model for data mining. In Proceedings of the Fourth International Conference on the Practical Application of Knowledge Discovery and Data Mining, Manchester, UK (pp. 29-39).
Zapata, J.C. & Ruíz, G.M. (1988). La variedad Colombia: selección de un cultivar compuesto resistente a la roya del cafeto [Premio Nacional de Ciencias, Fundación Alejandro Angel Escobar, 1986]. Chinchiná, Colombia: Cenicafé
Zhang, D., & Tsai, J. J. P. (2007). Advances in machine learning applications in software engineering. Hershey, PA: Idea
Zhu, D. (2010). A hybrid approach for efficient ensembles. Decision Support Systems, 48(3), 480-487. doi: http://dx.doi.org/10.1016/j.dss.2009.06.007
Published
2014-06-30
Section
Original Research
License
This publication is licensed under the terms of the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0/deed.pt_BR).