R - Caret error using GBM, but not without caret
I've been using gbm through caret without problems, but when I removed some variables from my data frame it started to fail. I've tried both the GitHub and CRAN versions of the mentioned packages.
This is the error:
> fitrf = train(my_data[trainindex,vars_for_clust], clusterassignment[trainindex], method = "gbm", verbose=TRUE)
Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa
 Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA
 NA's   :9     NA's   :9
Error in train.default(my_data[trainindex, vars_for_clust], clusterassignment[trainindex], :
  Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In eval(expr, envir, enclos) :
  model fit failed for Resample01: shrinkage=0.1, interaction.depth=1, n.minobsinnode=10, n.trees=150
Error in gbm.fit(x = structure(list(relatedness_cottle = c(0, 0, 8, 6, :
  unused arguments (x = list(relatedness_cottle = c(0, 0, 8, 6, 0, 6, 8, 10, 10, 6, 6, 4, 4, 4, 0, 0, 0, 0, 18, 18, 18, 0, 0, 6, 6, 0, 18, 12, 0, 4, 4, 4, 0, 0, 0, 18, 18, 6, 4, 4, 4, 6, 8, 6, 6, 0, 14, 2, 0, 8, 6, 6, 0, 4, 0, 0, 0, 0, 0, 4, 8, 8, 8, 4, 18, 0, 0, 4, 10, 18, 6, 0, 0, 18, 10, 10, 6, 2, 4, 4, 10, 10, 10, 2, 8, 0, 0, 0, 0, 10, 6, 6, 0, 4, 4, 0, 0, 0, 0, 8, 0, 0, 4, 4, 6, 6, 10, 6, 0, 0, 6, 4, 4, 8, 0, 12, 6, 2, 2, 8, 8, 4, 4, 4, 4, 6, 2, 2, 4, 0, 6, 0, 0, 0, 12, 18, 8, 0, 0, 4, 4, 2, 0, 0, 0, 0, 18, 12, 6, 6, 4, 4, 12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 18, 0, 0, 18, 6, 4, 2, 2, 0, 0, 10, 0, 0, 0, 12, 4, 4, 4, 4, 4, 8, 18, 6, 18, 18, 12, 12, 12, 0, 0, 0, 0, 10, 12, 12, 12, 12, 12, 4, 4, 4, 6, 6, 6, 6, 12, 0, 6, 0, 0, 4, 4, 18, 18, 18, 0, 0, 4, 6, 6, 0, 0, 2, 0, 0, 0, 18, 12, 12, 0, 0, 0, 0, 0, 0, 18 [... truncated]
There are no missing values, the response is a 4-level factor, and the inputs are the following:
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 1165 obs. of 14 variables:
 $ relatedness_cottle       : num 0 0 8 8 0 6 0 6 6 0 ...
 $ dominance_cottle         : int 4 6 0 6 6 6 6 4 4 4 ...
 $ time_spent               : num 26832 20822 18893 13107 25406 ...
 $ num_color_changes        : num 3.33 2.33 1.33 1 1 ...
 $ num_selects              : num 1 0.667 2 0.667 1.667 ...
 $ show_select_match        : num 1 0.667 0.333 1 1 ...
 $ default_size             : num 0.667 0 0.667 0 0 ...
 $ select_order             : Factor w/ 6 levels "future_past_present",..: 1 4 4 2 5 1 4 6 6 4 ...
 $ order_x                  : Factor w/ 6 levels "future_past_present",..: 4 4 4 4 4 3 4 4 4 4 ...
 $ color_past               : Factor w/ 8 levels "black","blue",..: 5 1 6 8 5 7 1 6 6 5 ...
 $ color_present            : Factor w/ 8 levels "black","blue",..: 1 4 4 4 6 8 4 4 1 4 ...
 $ color_future             : Factor w/ 8 levels "black","blue",..: 2 2 2 2 2 2 1 2 8 2 ...
 $ dominance_cottle_future  : int 0 4 0 4 2 0 4 2 2 0 ...
 $ relatedness_cottle_future: int 0 2 4 4 0 4 0 2 4 0 ...
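The two claims above (no missing values, a 4-level factor response) and the class of the predictors can each be checked with a line of base R. This is only a minimal sketch on made-up stand-in objects (`toy_data` and `toy_response` are hypothetical names, not the `my_data` and `clusterassignment` objects from the question):

```r
# Hypothetical stand-in data; the real my_data / clusterassignment are not available here.
toy_data <- data.frame(
  relatedness_cottle = c(0, 0, 8, 8, 0, 6),
  dominance_cottle   = c(4L, 6L, 0L, 6L, 6L, 6L),
  color_past         = factor(c("red", "black", "blue", "red", "black", "blue"))
)
toy_response <- factor(c("a", "b", "c", "d", "a", "b"))

# TRUE if any predictor cell or any response value is NA
has_missing <- any(is.na(toy_data)) || any(is.na(toy_response))

# Number of classes in the outcome
n_classes <- nlevels(toy_response)

# Class vector of the predictors; for a dplyr table this would be
# c("tbl_df", "tbl", "data.frame") rather than just "data.frame"
predictor_class <- class(toy_data)

print(has_missing)
print(n_classes)
print(predictor_class)
```

The class vector is the detail that matters for this question: the `str()` output above shows the predictors carry the `tbl_df` and `tbl` classes in addition to `data.frame`.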
But if I call gbm directly on the data frame, it works:
summary(gbm(clusterassignment[trainindex] ~ ., data = my_data[trainindex,vars_for_clust]))
Distribution not specified, assuming multinomial ...
                                                var   rel.inf
color_present                         color_present 33.533673
dominance_cottle                   dominance_cottle 33.170138
default_size                           default_size 25.321566
dominance_cottle_future     dominance_cottle_future  5.674563
color_future                           color_future  2.300060
relatedness_cottle               relatedness_cottle  0.000000
time_spent                               time_spent  0.000000
num_color_changes                 num_color_changes  0.000000
num_selects                             num_selects  0.000000
show_select_match                 show_select_match  0.000000
select_order                           select_order  0.000000
order_x                                     order_x  0.000000
color_past                               color_past  0.000000
relatedness_cottle_future relatedness_cottle_future  0.000000
Edit: To reproduce, run the script found here.
For now, casting the data frame from a plyr/dplyr table to a normal data frame with as.data.frame() fixes the problem.
train(as.data.frame(issuedataframe), issueresponse, method="gbm")
See this issue.
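The cast can be wrapped in a small guard so it is applied only when the input actually carries the dplyr table classes. The sketch below is an illustration, not caret or dplyr API: `as_plain_df` is a hypothetical helper name, and the `tbl_df` object is simulated by setting the class attribute by hand so the example runs without dplyr installed.

```r
# Hypothetical helper (not part of caret or dplyr): drop the tbl_df/tbl
# classes so downstream code sees a plain data.frame.
as_plain_df <- function(x) {
  if (inherits(x, "tbl_df") || inherits(x, "tbl")) {
    x <- as.data.frame(x)
  }
  x
}

# Simulate a dplyr table by assigning the same class vector that str()
# reports in the question (assumption: this stands in for a real tbl_df).
fake_tbl <- data.frame(x = c(1, 2, 3), g = factor(c("a", "b", "a")))
class(fake_tbl) <- c("tbl_df", "tbl", "data.frame")

plain <- as_plain_df(fake_tbl)

print(class(fake_tbl))  # still carries tbl_df and tbl
print(class(plain))     # only data.frame remains
print(is.factor(plain$g))  # factor columns are kept as factors
```

With the objects from the question, the call would then presumably look like train(as_plain_df(my_data[trainindex, vars_for_clust]), clusterassignment[trainindex], method = "gbm"); that call has not been run here.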