Reliability analysis
Call: psych::alpha(x = podaci, check.keys = TRUE)

raw_alpha | std.alpha | G6(smc) | average_r | S/N | ase | mean | sd | median_r |
---|---|---|---|---|---|---|---|---|
0.75 | 0.83 | 0.94 | 0.41 | 4.9 | 0.011 | 17 | 4.3 | 0.37 |

95% confidence boundaries

method | lower | alpha | upper |
---|---|---|---|
Feldt | 0.67 | 0.75 | 0.81 |
Duhachek | 0.72 | 0.75 | 0.77 |

Reliability if an item is dropped:

item | raw_alpha | std.alpha | G6(smc) | average_r | S/N | alpha se | var.r | med.r |
---|---|---|---|---|---|---|---|---|
KVIZ | 0.73 | 0.82 | 0.91 | 0.43 | 4.5 | 0.013 | 0.061 | 0.37 |
KOL1 | 0.67 | 0.79 | 0.89 | 0.39 | 3.9 | 0.015 | 0.050 | 0.37 |
KOL2 | 0.69 | 0.82 | 0.89 | 0.43 | 4.5 | 0.015 | 0.043 | 0.37 |
Zadavanje | 0.73 | 0.81 | 0.93 | 0.41 | 4.2 | 0.012 | 0.060 | 0.43 |
Rjesavanje | 0.72 | 0.83 | 0.88 | 0.44 | 4.8 | 0.011 | 0.046 | 0.43 |
Procjena | 0.76 | 0.84 | 0.95 | 0.46 | 5.1 | 0.012 | 0.062 | 0.52 |
UKUPNO | 0.67 | 0.74 | 0.79 | 0.32 | 2.9 | 0.043 | 0.039 | 0.30 |

Item statistics

item | n | raw.r | std.r | r.cor | r.drop | mean | sd |
---|---|---|---|---|---|---|---|
KVIZ | 118 | 0.61 | 0.65 | 0.61 | 0.55 | 4.4 | 2.54 |
KOL1 | 118 | 0.80 | 0.76 | 0.77 | 0.72 | 19.8 | 5.51 |
KOL2 | 118 | 0.73 | 0.66 | 0.66 | 0.64 | 20.8 | 5.04 |
Zadavanje | 118 | 0.66 | 0.71 | 0.67 | 0.61 | 5.1 | 2.45 |
Rjesavanje | 118 | 0.61 | 0.61 | 0.61 | 0.46 | 6.5 | 5.75 |
Procjena | 118 | 0.44 | 0.56 | 0.44 | 0.41 | 1.3 | 0.98 |
UKUPNO | 118 | 1.00 | 0.98 | 1.01 | 0.99 | 58.2 | 15.06 |
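
The `std.alpha` and `S/N` columns above can be cross-checked from `average_r` alone. The sketch below is a rough check, not the `psych` implementation: it uses the standard formula for standardized alpha, k·r̄ / (1 + (k − 1)·r̄), and assumes that `S/N` is k·r̄ / (1 − r̄). The function names are illustrative, and because the printed `average_r` is rounded to two decimals, agreement is only expected at the printed precision.

```python
def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized alpha from k items and their mean inter-item correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def signal_to_noise(k: int, mean_r: float) -> float:
    """Signal-to-noise ratio (assumed definition: k * r / (1 - r))."""
    return k * mean_r / (1 - mean_r)


# Full scale: 7 items, average_r = 0.41 (first table above).
full_scale = standardized_alpha(7, 0.41)

# "If an item is dropped" row for UKUPNO: 6 remaining items, average_r = 0.32.
without_ukupno = standardized_alpha(6, 0.32)

print(round(full_scale, 2), round(without_ukupno, 2))
print(round(signal_to_noise(7, 0.41), 1))
```

Both rounded values agree with the `std.alpha` entries printed for the full scale and for the UKUPNO row.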
Predictors
Response - Classification of students based on the number of points earned on problem posing
Hyperparameters - the optimal combination of hyperparameters was selected with 10-fold cross-validation, trying 5000 randomly generated combinations of the hyperparameters mtry, min_n, and trees drawn by Latin hypercube sampling. The best model was chosen according to the roc_auc metric.
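
As a rough illustration of Latin hypercube sampling over integer hyperparameters, the sketch below draws one point from each of `n_samples` equal-width strata per dimension and shuffles the strata independently. It is not the tuning code behind this report: the function name `latin_hypercube` and the ranges for `mtry`, `min_n`, and `trees` are assumptions chosen only for the example.

```python
import random


def latin_hypercube(n_samples, ranges, seed=0):
    """Draw n_samples integer points so that each dimension is evenly stratified."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        # One uniform draw inside each of n_samples equal-width strata of [0, 1).
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(points)  # decouple the strata across dimensions
        width = high - low + 1
        columns[name] = [min(high, low + int(u * width)) for u in points]
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


# Hypothetical ranges; the ranges actually used are not given in the report.
ranges = {"mtry": (1, 4), "min_n": (2, 40), "trees": (1, 2000)}
grid = latin_hypercube(5000, ranges)
print(grid[0])
```

Compared with independent uniform draws, this design guarantees that every part of each hyperparameter's range is visited, which matters most for wide ranges such as `trees`.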
Predictors
Response - Classification of students based on the number of points earned on problem posing
Hyperparameters - the optimal combination of hyperparameters was selected with 10-fold cross-validation, trying 5000 randomly generated combinations of the hyperparameters tree_depth, min_n, and cost_complexity drawn by Latin hypercube sampling. The best model was chosen according to the roc_auc metric.
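
The 10-fold resampling scheme can be sketched as follows. This is a minimal, unstratified version under stated assumptions: 88 rows (the training-set size printed as `n= 88` in the rpart output for this response), a hypothetical helper `k_fold_indices`, and a `summarise` helper that assumes the `std_err` column in the tuning tables is the standard deviation of the fold scores divided by the square root of the number of folds.

```python
import random
import statistics


def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (analysis, assessment) row indices for each of k folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most 1
    for held_out in folds:
        held = set(held_out)
        analysis = [i for i in indices if i not in held]
        yield analysis, held_out


def summarise(fold_scores):
    """Mean and standard error of per-fold scores (assumed: sd / sqrt(n))."""
    n = len(fold_scores)
    return statistics.mean(fold_scores), statistics.stdev(fold_scores) / n ** 0.5


splits = list(k_fold_indices(88))
fold_sizes = [len(assessment) for _, assessment in splits]
print(fold_sizes)
```

Each candidate hyperparameter combination would be fitted on the analysis rows and scored on the assessment rows of every fold, and the ten scores would then be reduced with something like `summarise`.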
prob | KVIZ | KOL1 | KOL2 | Zadavanje | Rjesavanje | Procjena | UKUPNO |
---|---|---|---|---|---|---|---|
0% | 0.0000 | 0.0000 | 0.0000 | 0.00 | 0.0000 | 0.0000 | 0.0000 |
33% | 2.8883 | 18.3661 | 20.7949 | 4.00 | 0.0000 | 0.7493 | 53.7915 |
67% | 6.0312 | 23.3678 | 22.6717 | 6.39 | 9.2429 | 1.9778 | 64.5833 |
100% | 9.2700 | 27.4700 | 30.0000 | 9.00 | 17.4100 | 2.8000 | 93.2500 |
KVIZ | KOL1 | KOL2 | Zadavanje | Rjesavanje | Procjena | UKUPNO | |
---|---|---|---|---|---|---|---|
Min. :0.000 | Min. : 0.00 | Min. : 0.00 | Min. :0.000 | Min. : 0.000 | Min. :0.000 | Min. : 0.00 | |
1st Qu.:1.975 | 1st Qu.:17.49 | 1st Qu.:19.58 | 1st Qu.:3.000 | 1st Qu.: 0.000 | 1st Qu.:0.000 | 1st Qu.:50.80 | |
Median :4.405 | Median :21.12 | Median :21.82 | Median :5.000 | Median : 7.370 | Median :1.550 | Median :60.59 | |
Mean :4.438 | Mean :19.84 | Mean :20.83 | Mean :5.136 | Mean : 6.454 | Mean :1.272 | Mean :58.22 | |
3rd Qu.:6.537 | 3rd Qu.:23.89 | 3rd Qu.:23.48 | 3rd Qu.:7.000 | 3rd Qu.:11.918 | 3rd Qu.:2.170 | 3rd Qu.:68.11 | |
Max. :9.270 | Max. :27.47 | Max. :30.00 | Max. :9.000 | Max. :17.410 | Max. :2.800 | Max. :93.25 |
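
The 33% and 67% quantiles in the first table above are the natural cut points for turning a score into one of three classes. The sketch below assumes that scores at or below the 33% quantile form class 1, scores above the 67% quantile form class 3, and the rest form class 2. How boundary values were actually assigned is not stated in the report, so the tie handling and the names `CUTS` and `tertile_class` are assumptions.

```python
from bisect import bisect_left

# Cut points copied from the 33% and 67% rows of the quantile table.
CUTS = {
    "Zadavanje": (4.00, 6.39),
    "Rjesavanje": (0.0, 9.2429),
    "Procjena": (0.7493, 1.9778),
}


def tertile_class(score, cuts):
    """Return 1, 2 or 3; a score equal to a cut point falls in the lower class."""
    return bisect_left(cuts, score) + 1


for score in (4.0, 5.0, 9.0):
    print(score, tertile_class(score, CUTS["Zadavanje"]))
```

Note that the 33% quantile of Rjesavanje is 0, so under this rule class 1 for that response consists only of students with zero points.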
# A tibble: 20 × 8
trees min_n .metric .estimator mean n std_err .config
<int> <int> <chr> <chr> <dbl> <int> <dbl> <chr>
1 712 38 roc_auc hand_till 0.560 10 0.0449 Preprocessor1_Model0682
2 474 37 roc_auc hand_till 0.559 10 0.0455 Preprocessor1_Model1527
3 671 38 roc_auc hand_till 0.558 10 0.0463 Preprocessor1_Model3894
4 712 39 roc_auc hand_till 0.557 10 0.0475 Preprocessor1_Model2330
5 542 35 roc_auc hand_till 0.553 10 0.0466 Preprocessor1_Model4745
6 842 34 roc_auc hand_till 0.553 10 0.0456 Preprocessor1_Model3010
7 451 35 roc_auc hand_till 0.552 10 0.0508 Preprocessor1_Model0503
8 1106 37 roc_auc hand_till 0.552 10 0.0436 Preprocessor1_Model4584
9 656 35 roc_auc hand_till 0.551 10 0.0480 Preprocessor1_Model4049
10 622 39 roc_auc hand_till 0.551 10 0.0472 Preprocessor1_Model0384
11 628 36 roc_auc hand_till 0.550 10 0.0458 Preprocessor1_Model2749
12 679 38 roc_auc hand_till 0.550 10 0.0482 Preprocessor1_Model3765
13 1170 38 roc_auc hand_till 0.55 10 0.0468 Preprocessor1_Model0221
14 1006 36 roc_auc hand_till 0.55 10 0.0487 Preprocessor1_Model0626
15 453 39 roc_auc hand_till 0.55 10 0.0481 Preprocessor1_Model3012
16 1149 37 roc_auc hand_till 0.550 10 0.0472 Preprocessor1_Model1823
17 678 38 roc_auc hand_till 0.550 10 0.0475 Preprocessor1_Model2288
18 676 38 roc_auc hand_till 0.550 10 0.0449 Preprocessor1_Model3870
19 887 38 roc_auc hand_till 0.550 10 0.0451 Preprocessor1_Model4395
20 743 38 roc_auc hand_till 0.549 10 0.0439 Preprocessor1_Model0026
# A tibble: 9 × 4
.metric .estimator trening test
<chr> <chr> <dbl> <dbl>
1 sens macro 0.618 0.489
2 precision macro 0.666 0.488
3 spec macro 0.808 0.745
4 accuracy multiclass 0.625 0.5
5 f_meas macro 0.626 0.477
6 mcc multiclass 0.443 0.244
7 kap multiclass 0.430 0.239
8 roc_auc hand_till 0.850 0.612
9 mn_log_loss multiclass 0.903 1.04
# A tibble: 20 × 9
cost_complexity tree_depth min_n .metric .estimator mean n std_err
<dbl> <int> <int> <chr> <chr> <dbl> <int> <dbl>
1 2.71e- 6 2 3 roc_auc hand_till 0.609 10 0.0322
2 4.47e- 4 2 4 roc_auc hand_till 0.609 10 0.0322
3 1.71e- 2 2 5 roc_auc hand_till 0.609 10 0.0322
4 1.23e- 5 2 6 roc_auc hand_till 0.609 10 0.0322
5 4.82e- 9 2 6 roc_auc hand_till 0.609 10 0.0322
6 3.65e- 5 2 6 roc_auc hand_till 0.609 10 0.0322
7 7.07e- 6 2 6 roc_auc hand_till 0.609 10 0.0322
8 5.61e- 3 2 4 roc_auc hand_till 0.609 10 0.0322
9 3.37e- 3 2 7 roc_auc hand_till 0.609 10 0.0322
10 4.65e-10 2 7 roc_auc hand_till 0.609 10 0.0322
11 3.26e- 3 2 5 roc_auc hand_till 0.609 10 0.0322
12 1.87e- 8 2 7 roc_auc hand_till 0.609 10 0.0322
13 2.61e- 2 2 7 roc_auc hand_till 0.609 10 0.0322
14 5.35e- 6 2 6 roc_auc hand_till 0.609 10 0.0322
15 9.16e- 5 2 6 roc_auc hand_till 0.609 10 0.0322
16 8.18e- 3 2 5 roc_auc hand_till 0.609 10 0.0322
17 5.05e- 3 2 7 roc_auc hand_till 0.609 10 0.0322
18 4.46e- 4 2 7 roc_auc hand_till 0.609 10 0.0322
19 5.93e- 6 2 4 roc_auc hand_till 0.609 10 0.0322
20 2.92e- 3 2 2 roc_auc hand_till 0.609 10 0.0322
# ℹ 1 more variable: .config <chr>
# A tibble: 9 × 4
.metric .estimator trening test
<chr> <chr> <dbl> <dbl>
1 sens macro 0.507 0.507
2 precision macro 0.588 0.596
3 spec macro 0.750 0.757
4 accuracy multiclass 0.511 0.533
5 f_meas macro 0.510 0.471
6 mcc multiclass 0.274 0.337
7 kap multiclass 0.256 0.276
8 roc_auc hand_till 0.689 0.640
9 mn_log_loss multiclass 0.938 3.25
n= 88
node), split, n, loss, yval, (yprob)
* denotes terminal node
1) root 88 56 1 (0.36363636 0.30681818 0.32954545)
2) KVIZ< 6.765 73 45 1 (0.38356164 0.35616438 0.26027397)
4) KOL1>=23.01 24 10 2 (0.33333333 0.58333333 0.08333333) *
5) KOL1< 23.01 49 29 1 (0.40816327 0.24489796 0.34693878) *
3) KVIZ>=6.765 15 5 3 (0.26666667 0.06666667 0.66666667)
6) KOL1>=26.44 2 1 1 (0.50000000 0.50000000 0.00000000) *
7) KOL1< 26.44 13 3 3 (0.23076923 0.00000000 0.76923077) *
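
The printed tree can be read as a small set of nested rules. The sketch below transcribes the four terminal nodes by hand; the function name is hypothetical and the thresholds are the rounded values shown in the printout, so cases that fall exactly on a boundary may differ from the fitted model.

```python
def predict_zadavanje_class(kviz: float, kol1: float) -> int:
    """Predicted class (1, 2 or 3) following the printed splits."""
    if kviz < 6.765:                       # node 2
        return 2 if kol1 >= 23.01 else 1   # terminal nodes 4 and 5
    return 1 if kol1 >= 26.44 else 3       # terminal nodes 6 and 7


# A student at the sample means (KVIZ 4.4, KOL1 19.8) lands in node 5.
print(predict_zadavanje_class(4.4, 19.8))
```

Only KVIZ and KOL1 appear in the rules, and node 6 holds just 2 observations, so its prediction rests on very little data.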
Call:
rpart::rpart(formula = ..y ~ ., data = data, cp = ~2.71203228258191e-06,
maxdepth = ~2, minsplit = min_rows(3, data))
n= 88
CP nsplit rel error xerror xstd
1 1.071429e-01 0 1.0000000 1.1071429 0.07642812
2 1.785714e-02 2 0.7857143 0.8928571 0.08297503
3 2.712032e-06 3 0.7678571 0.8928571 0.08297503
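
The `rel error` and `xerror` columns are expressed relative to the root node's misclassification rate, which is 56/88 according to the root line of the tree printout. A short sketch of the conversion to absolute error rates (the variable and function names are illustrative):

```python
ROOT_ERROR = 56 / 88  # loss / n at the root node

# (nsplit, rel error, xerror) rows copied from the CP table above.
cp_table = [
    (0, 1.0000000, 1.1071429),
    (2, 0.7857143, 0.8928571),
    (3, 0.7678571, 0.8928571),
]


def absolute_errors(table, root_error):
    """Scale relative errors back to misclassification rates."""
    return [(nsplit, rel * root_error, xerr * root_error)
            for nsplit, rel, xerr in table]


for nsplit, train_err, cv_err in absolute_errors(cp_table, ROOT_ERROR):
    print(nsplit, round(train_err, 3), round(cv_err, 3))
```

With three splits the resubstitution error corresponds to 43 of 88 training cases misclassified, which matches the sum of the losses at the four terminal nodes (10 + 29 + 1 + 3).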
Variable importance
KOL1 KVIZ KOL2
54 38 8
Node number 1: 88 observations, complexity param=0.1071429
predicted class=1 expected loss=0.6363636 P(node) =1
class counts: 32 27 29
probabilities: 0.364 0.307 0.330
left son=2 (73 obs) right son=3 (15 obs)
Primary splits:
KVIZ < 6.765 to the left, improve=3.267933, (0 missing)
KOL2 < 24.47 to the left, improve=1.884779, (0 missing)
KOL1 < 23.01 to the left, improve=1.500941, (0 missing)
Surrogate splits:
KOL2 < 25.105 to the left, agree=0.864, adj=0.200, (0 split)
KOL1 < 25.575 to the left, agree=0.852, adj=0.133, (0 split)
Node number 2: 73 observations, complexity param=0.1071429
predicted class=1 expected loss=0.6164384 P(node) =0.8295455
class counts: 28 26 19
probabilities: 0.384 0.356 0.260
left son=4 (24 obs) right son=5 (49 obs)
Primary splits:
KOL1 < 23.01 to the right, improve=3.054795, (0 missing)
KOL2 < 24.47 to the left, improve=1.548301, (0 missing)
KVIZ < 6.71 to the left, improve=1.294231, (0 missing)
Surrogate splits:
KVIZ < 6.645 to the right, agree=0.699, adj=0.083, (0 split)
KOL2 < 24.47 to the right, agree=0.685, adj=0.042, (0 split)
Node number 3: 15 observations, complexity param=0.01785714
predicted class=3 expected loss=0.3333333 P(node) =0.1704545
class counts: 4 1 10
probabilities: 0.267 0.067 0.667
left son=6 (2 obs) right son=7 (13 obs)
Primary splits:
KOL1 < 26.44 to the right, improve=1.5846150, (0 missing)
KOL2 < 24.62 to the left, improve=1.0888890, (0 missing)
KVIZ < 8.71 to the right, improve=0.4727273, (0 missing)
Node number 4: 24 observations
predicted class=2 expected loss=0.4166667 P(node) =0.2727273
class counts: 8 14 2
probabilities: 0.333 0.583 0.083
Node number 5: 49 observations
predicted class=1 expected loss=0.5918367 P(node) =0.5568182
class counts: 20 12 17
probabilities: 0.408 0.245 0.347
Node number 6: 2 observations
predicted class=1 expected loss=0.5 P(node) =0.02272727
class counts: 1 1 0
probabilities: 0.500 0.500 0.000
Node number 7: 13 observations
predicted class=3 expected loss=0.2307692 P(node) =0.1477273
class counts: 3 0 10
probabilities: 0.231 0.000 0.769
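
The `improve` values for the chosen splits can be reproduced from the class counts in the node summaries, assuming the Gini index was the splitting criterion (the rpart default for classification; the call above does not override it). The improvement is taken here as n·Gini of the parent minus the same quantity summed over the two children. Function names are illustrative.

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)


def split_improvement(parent, left, right):
    """Reduction in n-weighted Gini impurity achieved by a split."""
    weighted = lambda counts: sum(counts) * gini(counts)
    return weighted(parent) - weighted(left) - weighted(right)


# Node 1 split on KVIZ < 6.765: children are nodes 2 and 3.
root = split_improvement([32, 27, 29], [28, 26, 19], [4, 1, 10])
# Node 2 split on KOL1 < 23.01: children are nodes 4 and 5.
node2 = split_improvement([28, 26, 19], [8, 14, 2], [20, 12, 17])
print(round(root, 6), round(node2, 6))
```

The two results agree with the `improve` values reported for the first primary split of nodes 1 and 2.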
count ncat improve index adj
KVIZ 88 -1 3.2679328 6.765 0.00000000
KOL2 88 -1 1.8847786 24.470 0.00000000
KOL1 88 -1 1.5009408 23.010 0.00000000
KOL2 0 -1 0.8636364 25.105 0.20000000
KOL1 0 -1 0.8522727 25.575 0.13333333
KOL1 73 1 3.0547945 23.010 0.00000000
KOL2 73 -1 1.5483010 24.470 0.00000000
KVIZ 73 -1 1.2942311 6.710 0.00000000
KVIZ 0 1 0.6986301 6.645 0.08333333
KOL2 0 1 0.6849315 24.470 0.04166667
KOL1 15 1 1.5846154 26.440 0.00000000
KOL2 15 -1 1.0888889 24.620 0.00000000
KVIZ 15 1 0.4727273 8.710 0.00000000
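
The "Variable importance" figures (KOL1 54, KVIZ 38, KOL2 8) can be recovered from this splits matrix. The sketch assumes the usual rpart accounting: a variable is credited with the `improve` of every split where it is the chosen primary variable, plus `adj` times that split's `improve` wherever it appears as a surrogate (the rows with `count` 0), and the totals are then scaled to sum to 100. The data layout and names below are illustrative.

```python
from collections import defaultdict

# For each internal node: (chosen variable, improve, [(surrogate, adj), ...]).
nodes = [
    ("KVIZ", 3.2679328, [("KOL2", 0.20000000), ("KOL1", 0.13333333)]),  # node 1
    ("KOL1", 3.0547945, [("KVIZ", 0.08333333), ("KOL2", 0.04166667)]),  # node 2
    ("KOL1", 1.5846154, []),                                            # node 3
]


def variable_importance(nodes):
    """Sum primary improvements and adj-weighted surrogate credit, scaled to 100."""
    raw = defaultdict(float)
    for variable, improve, surrogates in nodes:
        raw[variable] += improve
        for surrogate, adj in surrogates:
            raw[surrogate] += adj * improve
    total = sum(raw.values())
    return {name: 100 * value / total for name, value in raw.items()}


importance = variable_importance(nodes)
print({name: round(value) for name, value in importance.items()})
```

KOL2 never appears as a chosen split in this tree, so its entire importance comes from surrogate credit.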
Predictors
Response - Classification of students based on the number of points earned on problem solving
Hyperparameters - the optimal combination of hyperparameters was selected with 10-fold cross-validation, trying 500 randomly generated combinations of the hyperparameters mtry, min_n, and trees. The best model was chosen according to the roc_auc metric.
Predictors
Response - Classification of students based on the number of points earned on problem solving
Hyperparameters - the optimal combination of hyperparameters was selected with 10-fold cross-validation, trying 5000 randomly generated combinations of the hyperparameters tree_depth, min_n, and cost_complexity drawn by Latin hypercube sampling. The best model was chosen according to the roc_auc metric.
# A tibble: 20 × 9
mtry trees min_n .metric .estimator mean n std_err .config
<int> <int> <int> <chr> <chr> <dbl> <int> <dbl> <chr>
1 3 1317 19 roc_auc hand_till 0.806 10 0.0315 Preprocessor1_Model…
2 3 430 15 roc_auc hand_till 0.803 10 0.0272 Preprocessor1_Model…
3 3 1272 18 roc_auc hand_till 0.802 10 0.0304 Preprocessor1_Model…
4 3 765 16 roc_auc hand_till 0.802 10 0.0306 Preprocessor1_Model…
5 2 925 3 roc_auc hand_till 0.802 10 0.0308 Preprocessor1_Model…
6 2 540 2 roc_auc hand_till 0.802 10 0.0277 Preprocessor1_Model…
7 3 1457 12 roc_auc hand_till 0.802 10 0.0280 Preprocessor1_Model…
8 3 1198 15 roc_auc hand_till 0.801 10 0.0310 Preprocessor1_Model…
9 3 526 18 roc_auc hand_till 0.801 10 0.0324 Preprocessor1_Model…
10 3 797 16 roc_auc hand_till 0.801 10 0.0300 Preprocessor1_Model…
11 3 477 24 roc_auc hand_till 0.801 10 0.0340 Preprocessor1_Model…
12 3 1126 19 roc_auc hand_till 0.800 10 0.0333 Preprocessor1_Model…
13 3 661 28 roc_auc hand_till 0.800 10 0.0346 Preprocessor1_Model…
14 3 1072 26 roc_auc hand_till 0.8 10 0.0360 Preprocessor1_Model…
15 3 419 16 roc_auc hand_till 0.8 10 0.0282 Preprocessor1_Model…
16 3 892 24 roc_auc hand_till 0.800 10 0.0332 Preprocessor1_Model…
17 3 443 11 roc_auc hand_till 0.799 10 0.0265 Preprocessor1_Model…
18 3 1115 15 roc_auc hand_till 0.799 10 0.0305 Preprocessor1_Model…
19 3 835 16 roc_auc hand_till 0.799 10 0.0313 Preprocessor1_Model…
20 3 818 12 roc_auc hand_till 0.799 10 0.0305 Preprocessor1_Model…
# A tibble: 9 × 4
.metric .estimator trening test
<chr> <chr> <dbl> <dbl>
1 sens macro 0.813 0.674
2 precision macro 0.811 0.712
3 spec macro 0.907 0.841
4 accuracy multiclass 0.812 0.676
5 f_meas macro 0.811 0.685
6 mcc multiclass 0.720 0.522
7 kap multiclass 0.719 0.516
8 roc_auc hand_till 0.955 0.826
9 mn_log_loss multiclass 0.551 0.761
# A tibble: 20 × 9
cost_complexity tree_depth min_n .metric .estimator mean n std_err
<dbl> <int> <int> <chr> <chr> <dbl> <int> <dbl>
1 8.24e- 9 3 4 roc_auc hand_till 0.798 10 0.0319
2 4.65e- 8 3 4 roc_auc hand_till 0.798 10 0.0319
3 3.45e- 9 3 2 roc_auc hand_till 0.798 10 0.0319
4 4.08e- 3 3 3 roc_auc hand_till 0.798 10 0.0319
5 2.27e- 6 3 3 roc_auc hand_till 0.798 10 0.0319
6 8.09e- 7 3 4 roc_auc hand_till 0.798 10 0.0319
7 1.68e- 5 3 4 roc_auc hand_till 0.798 10 0.0319
8 1.94e- 5 3 4 roc_auc hand_till 0.798 10 0.0319
9 8.96e-10 3 2 roc_auc hand_till 0.798 10 0.0319
10 1.34e-10 3 4 roc_auc hand_till 0.798 10 0.0319
11 1.29e- 4 3 4 roc_auc hand_till 0.798 10 0.0319
12 4.83e- 7 3 4 roc_auc hand_till 0.798 10 0.0319
13 2.29e- 4 3 2 roc_auc hand_till 0.798 10 0.0319
14 8.85e- 5 3 4 roc_auc hand_till 0.798 10 0.0319
15 3.71e- 9 3 3 roc_auc hand_till 0.798 10 0.0319
16 9.28e- 3 3 2 roc_auc hand_till 0.798 10 0.0319
17 1.59e- 9 3 4 roc_auc hand_till 0.798 10 0.0319
18 1.12e-10 3 5 roc_auc hand_till 0.793 10 0.0367
19 2.50e- 3 3 8 roc_auc hand_till 0.793 10 0.0367
20 1.50e-10 3 6 roc_auc hand_till 0.793 10 0.0367
# ℹ 1 more variable: .config <chr>
# A tibble: 9 × 4
.metric .estimator trening test
<chr> <chr> <dbl> <dbl>
1 sens macro 0.768 0.705
2 precision macro 0.769 0.707
3 spec macro 0.886 0.854
4 accuracy multiclass 0.771 0.706
5 f_meas macro 0.768 0.705
6 mcc multiclass 0.656 0.560
7 kap multiclass 0.656 0.559
8 roc_auc hand_till 0.883 0.795
9 mn_log_loss multiclass 0.569 3.70
n= 96
node), split, n, loss, yval, (yprob)
* denotes terminal node
1) root 96 62 1 (0.35416667 0.31250000 0.33333333)
2) Zadavanje< 4.5 42 12 1 (0.71428571 0.11904762 0.16666667)
4) Zadavanje>=1 28 5 1 (0.82142857 0.17857143 0.00000000)
8) KVIZ< 8.89 27 4 1 (0.85185185 0.14814815 0.00000000) *
9) KVIZ>=8.89 1 0 2 (0.00000000 1.00000000 0.00000000) *
5) Zadavanje< 1 14 7 1 (0.50000000 0.00000000 0.50000000)
10) KOL1< 16.33 6 0 1 (1.00000000 0.00000000 0.00000000) *
11) KOL1>=16.33 8 1 3 (0.12500000 0.00000000 0.87500000) *
3) Zadavanje>=4.5 54 29 2 (0.07407407 0.46296296 0.46296296)
6) KOL1>=16.37 48 23 2 (0.08333333 0.52083333 0.39583333)
12) KOL1< 23.96 30 10 2 (0.06666667 0.66666667 0.26666667) *
13) KOL1>=23.96 18 7 3 (0.11111111 0.27777778 0.61111111) *
7) KOL1< 16.37 6 0 3 (0.00000000 0.00000000 1.00000000) *
Call:
rpart::rpart(formula = ..y ~ ., data = data, cp = ~8.23517753134381e-09,
maxdepth = ~3, minsplit = min_rows(4, data))
n= 96
CP nsplit rel error xerror xstd
1 3.387097e-01 0 1.0000000 1.1774194 0.06745250
2 9.677419e-02 1 0.6612903 0.8225806 0.07886132
3 4.838710e-02 3 0.4677419 0.6290323 0.07761442
4 1.612903e-02 5 0.3709677 0.5000000 0.07389418
5 8.235178e-09 6 0.3548387 0.5161290 0.07449681
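
The model in this report was selected by cross-validated roc_auc, but the CP table also supports the common "one standard error" reading: take the smallest tree whose `xerror` is within one `xstd` of the minimum. The sketch below applies that rule to the rows above purely as a diagnostic; it is not the selection procedure used here, and the names are illustrative.

```python
# (CP, nsplit, xerror, xstd) rows copied from the CP table above.
cp_table = [
    (3.387097e-01, 0, 1.1774194, 0.06745250),
    (9.677419e-02, 1, 0.8225806, 0.07886132),
    (4.838710e-02, 3, 0.6290323, 0.07761442),
    (1.612903e-02, 5, 0.5000000, 0.07389418),
    (8.235178e-09, 6, 0.5161290, 0.07449681),
]


def one_se_choice(table):
    """Smallest tree whose xerror is within one xstd of the best xerror."""
    best = min(table, key=lambda row: row[2])
    threshold = best[2] + best[3]
    for row in table:  # rows are ordered from the smallest tree to the largest
        if row[2] <= threshold:
            return row
    return best


cp, nsplit, xerror, xstd = one_se_choice(cp_table)
print(cp, nsplit)
```

Under this rule the 5-split tree would be retained, one split fewer than the fitted tree, because the sixth split does not lower the cross-validated error.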
Variable importance
Zadavanje KOL1 KOL2 KVIZ
35 28 19 18
Node number 1: 96 observations, complexity param=0.3387097
predicted class=1 expected loss=0.6458333 P(node) =1
class counts: 34 30 32
probabilities: 0.354 0.312 0.333
left son=2 (42 obs) right son=3 (54 obs)
Primary splits:
Zadavanje < 4.5 to the left, improve=14.551590, (0 missing)
KOL1 < 17.04 to the left, improve= 4.326923, (0 missing)
KVIZ < 0.33 to the left, improve= 2.612319, (0 missing)
KOL2 < 21.16 to the right, improve= 2.243252, (0 missing)
Surrogate splits:
KOL2 < 18.605 to the left, agree=0.635, adj=0.167, (0 split)
KOL1 < 17.04 to the left, agree=0.625, adj=0.143, (0 split)
KVIZ < 0.565 to the left, agree=0.615, adj=0.119, (0 split)
Node number 2: 42 observations, complexity param=0.0483871
predicted class=1 expected loss=0.2857143 P(node) =0.4375
class counts: 30 5 7
probabilities: 0.714 0.119 0.167
left son=4 (28 obs) right son=5 (14 obs)
Primary splits:
Zadavanje < 1 to the right, improve=3.5952380, (0 missing)
KOL1 < 17.125 to the left, improve=2.0761900, (0 missing)
KVIZ < 8.89 to the left, improve=1.3461090, (0 missing)
KOL2 < 23.19 to the right, improve=0.8701299, (0 missing)
Surrogate splits:
KVIZ < 0.785 to the right, agree=0.786, adj=0.357, (0 split)
KOL2 < 16.54 to the right, agree=0.786, adj=0.357, (0 split)
KOL1 < 5.81 to the right, agree=0.762, adj=0.286, (0 split)
Node number 3: 54 observations, complexity param=0.09677419
predicted class=2 expected loss=0.537037 P(node) =0.5625
class counts: 4 25 25
probabilities: 0.074 0.463 0.463
left son=6 (48 obs) right son=7 (6 obs)
Primary splits:
KOL1 < 16.37 to the right, improve=3.430556, (0 missing)
KOL2 < 21.12 to the right, improve=1.973203, (0 missing)
KVIZ < 1.4 to the left, improve=1.555556, (0 missing)
Zadavanje < 6.5 to the left, improve=1.421762, (0 missing)
Surrogate splits:
KOL2 < 14.97 to the right, agree=0.944, adj=0.5, (0 split)
Node number 4: 28 observations, complexity param=0.01612903
predicted class=1 expected loss=0.1785714 P(node) =0.2916667
class counts: 23 5 0
probabilities: 0.821 0.179 0.000
left son=8 (27 obs) right son=9 (1 obs)
Primary splits:
KVIZ < 8.89 to the left, improve=1.3994710, (0 missing)
KOL1 < 26.57 to the left, improve=1.3994710, (0 missing)
KOL2 < 14.805 to the right, improve=1.3994710, (0 missing)
Zadavanje < 2.5 to the left, improve=0.5952381, (0 missing)
Node number 5: 14 observations, complexity param=0.0483871
predicted class=1 expected loss=0.5 P(node) =0.1458333
class counts: 7 0 7
probabilities: 0.500 0.000 0.500
left son=10 (6 obs) right son=11 (8 obs)
Primary splits:
KOL1 < 16.33 to the left, improve=5.25, (0 missing)
KVIZ < 0.33 to the left, improve=2.80, (0 missing)
KOL2 < 7.55 to the left, improve=2.80, (0 missing)
Surrogate splits:
KVIZ < 0.33 to the left, agree=0.857, adj=0.667, (0 split)
KOL2 < 7.55 to the left, agree=0.857, adj=0.667, (0 split)
Node number 6: 48 observations, complexity param=0.09677419
predicted class=2 expected loss=0.4791667 P(node) =0.5
class counts: 4 25 19
probabilities: 0.083 0.521 0.396
left son=12 (30 obs) right son=13 (18 obs)
Primary splits:
KOL1 < 23.96 to the left, improve=3.058333, (0 missing)
KVIZ < 1.4 to the left, improve=2.194767, (0 missing)
KOL2 < 23.435 to the left, improve=1.872899, (0 missing)
Zadavanje < 6.5 to the left, improve=1.201522, (0 missing)
Surrogate splits:
KVIZ < 6.9 to the left, agree=0.792, adj=0.444, (0 split)
KOL2 < 24.305 to the left, agree=0.729, adj=0.278, (0 split)
Zadavanje < 8.5 to the left, agree=0.646, adj=0.056, (0 split)
Node number 7: 6 observations
predicted class=3 expected loss=0 P(node) =0.0625
class counts: 0 0 6
probabilities: 0.000 0.000 1.000
Node number 8: 27 observations
predicted class=1 expected loss=0.1481481 P(node) =0.28125
class counts: 23 4 0
probabilities: 0.852 0.148 0.000
Node number 9: 1 observations
predicted class=2 expected loss=0 P(node) =0.01041667
class counts: 0 1 0
probabilities: 0.000 1.000 0.000
Node number 10: 6 observations
predicted class=1 expected loss=0 P(node) =0.0625
class counts: 6 0 0
probabilities: 1.000 0.000 0.000
Node number 11: 8 observations
predicted class=3 expected loss=0.125 P(node) =0.08333333
class counts: 1 0 7
probabilities: 0.125 0.000 0.875
Node number 12: 30 observations
predicted class=2 expected loss=0.3333333 P(node) =0.3125
class counts: 2 20 8
probabilities: 0.067 0.667 0.267
Node number 13: 18 observations
predicted class=3 expected loss=0.3888889 P(node) =0.1875
class counts: 2 5 11
probabilities: 0.111 0.278 0.611
count ncat improve index adj
Zadavanje 96 -1 14.5515873 4.500 0.00000000
KOL1 96 -1 4.3269231 17.040 0.00000000
KVIZ 96 -1 2.6123188 0.330 0.00000000
KOL2 96 1 2.2432524 21.160 0.00000000
KOL2 0 -1 0.6354167 18.605 0.16666667
KOL1 0 -1 0.6250000 17.040 0.14285714
KVIZ 0 -1 0.6145833 0.565 0.11904762
Zadavanje 42 1 3.5952381 1.000 0.00000000
KOL1 42 -1 2.0761905 17.125 0.00000000
KVIZ 42 -1 1.3461092 8.890 0.00000000
KOL2 42 1 0.8701299 23.190 0.00000000
KVIZ 0 1 0.7857143 0.785 0.35714286
KOL2 0 1 0.7857143 16.540 0.35714286
KOL1 0 1 0.7619048 5.810 0.28571429
KVIZ 28 -1 1.3994709 8.890 0.00000000
KOL1 28 -1 1.3994709 26.570 0.00000000
KOL2 28 1 1.3994709 14.805 0.00000000
Zadavanje 28 -1 0.5952381 2.500 0.00000000
KOL1 14 -1 5.2500000 16.330 0.00000000
KVIZ 14 -1 2.8000000 0.330 0.00000000
KOL2 14 -1 2.8000000 7.550 0.00000000
KVIZ 0 -1 0.8571429 0.330 0.66666667
KOL2 0 -1 0.8571429 7.550 0.66666667
KOL1 54 1 3.4305556 16.370 0.00000000
KOL2 54 1 1.9732026 21.120 0.00000000
KVIZ 54 -1 1.5555556 1.400 0.00000000
Zadavanje 54 -1 1.4217625 6.500 0.00000000
KOL2 0 1 0.9444444 14.970 0.50000000
KOL1 48 -1 3.0583333 23.960 0.00000000
KVIZ 48 -1 2.1947674 1.400 0.00000000
KOL2 48 -1 1.8728992 23.435 0.00000000
Zadavanje 48 -1 1.2015217 6.500 0.00000000
KVIZ 0 -1 0.7916667 6.900 0.44444444
KOL2 0 -1 0.7291667 24.305 0.27777778
Zadavanje 0 -1 0.6458333 8.500 0.05555556
Predictors
Response - Classification of students based on the number of points earned on evaluating problem solutions
Hyperparameters - the optimal combination of hyperparameters was selected with 10-fold cross-validation, trying 5000 randomly generated combinations of the hyperparameters mtry, min_n, and trees drawn by Latin hypercube sampling. The best model was chosen according to the roc_auc metric.
Response - Classification of students based on the number of points earned on evaluating problem solutions
Hyperparameters - the optimal combination of hyperparameters was selected with 10-fold cross-validation, trying 5000 randomly generated combinations of the hyperparameters tree_depth, min_n, and cost_complexity drawn by Latin hypercube sampling. The best model was chosen according to the roc_auc metric.
# A tibble: 20 × 9
mtry trees min_n .metric .estimator mean n std_err .config
<int> <int> <int> <chr> <chr> <dbl> <int> <dbl> <chr>
1 3 606 10 roc_auc hand_till 0.750 10 0.0587 Preprocessor1_Model…
2 3 784 7 roc_auc hand_till 0.749 10 0.0581 Preprocessor1_Model…
3 3 1155 8 roc_auc hand_till 0.748 10 0.0570 Preprocessor1_Model…
4 3 472 10 roc_auc hand_till 0.748 10 0.0578 Preprocessor1_Model…
5 3 521 13 roc_auc hand_till 0.746 10 0.0546 Preprocessor1_Model…
6 3 457 14 roc_auc hand_till 0.746 10 0.0537 Preprocessor1_Model…
7 3 763 5 roc_auc hand_till 0.745 10 0.0579 Preprocessor1_Model…
8 3 686 7 roc_auc hand_till 0.745 10 0.0588 Preprocessor1_Model…
9 3 603 9 roc_auc hand_till 0.745 10 0.0573 Preprocessor1_Model…
10 3 489 5 roc_auc hand_till 0.745 10 0.0591 Preprocessor1_Model…
11 3 481 3 roc_auc hand_till 0.745 10 0.0608 Preprocessor1_Model…
12 3 443 11 roc_auc hand_till 0.745 10 0.0561 Preprocessor1_Model…
13 2 710 5 roc_auc hand_till 0.744 10 0.0610 Preprocessor1_Model…
14 3 1060 7 roc_auc hand_till 0.744 10 0.0574 Preprocessor1_Model…
15 3 1214 4 roc_auc hand_till 0.744 10 0.0572 Preprocessor1_Model…
16 3 475 6 roc_auc hand_till 0.744 10 0.0582 Preprocessor1_Model…
17 3 867 7 roc_auc hand_till 0.744 10 0.0576 Preprocessor1_Model…
18 3 508 4 roc_auc hand_till 0.744 10 0.0609 Preprocessor1_Model…
19 3 571 10 roc_auc hand_till 0.744 10 0.0568 Preprocessor1_Model…
20 3 559 10 roc_auc hand_till 0.743 10 0.0565 Preprocessor1_Model…
# A tibble: 9 × 4
.metric .estimator trening test
<chr> <chr> <dbl> <dbl>
1 sens macro 0.950 0.496
2 precision macro 0.946 0.515
3 spec macro 0.975 0.750
4 accuracy multiclass 0.948 0.5
5 f_meas macro 0.947 0.483
6 mcc multiclass 0.923 0.260
7 kap multiclass 0.922 0.247
8 roc_auc hand_till 0.996 0.687
9 mn_log_loss multiclass 0.418 0.964
# A tibble: 20 × 9
cost_complexity tree_depth min_n .metric .estimator mean n std_err
<dbl> <int> <int> <chr> <chr> <dbl> <int> <dbl>
1 1.92e- 2 12 10 roc_auc hand_till 0.758 10 0.0350
2 1.97e- 2 7 8 roc_auc hand_till 0.757 10 0.0347
3 2.67e- 2 7 9 roc_auc hand_till 0.757 10 0.0347
4 2.04e- 2 10 8 roc_auc hand_till 0.757 10 0.0347
5 2.31e- 2 15 10 roc_auc hand_till 0.757 10 0.0347
6 2.40e- 2 11 8 roc_auc hand_till 0.757 10 0.0347
7 2.63e- 2 7 9 roc_auc hand_till 0.757 10 0.0347
8 1.96e- 2 7 8 roc_auc hand_till 0.757 10 0.0347
9 2.00e- 2 9 8 roc_auc hand_till 0.757 10 0.0347
10 3.84e- 9 8 8 roc_auc hand_till 0.75 10 0.0382
11 2.37e- 8 8 8 roc_auc hand_till 0.75 10 0.0382
12 3.74e- 7 8 8 roc_auc hand_till 0.75 10 0.0382
13 1.56e-10 8 8 roc_auc hand_till 0.75 10 0.0382
14 2.23e- 5 8 8 roc_auc hand_till 0.75 10 0.0382
15 1.66e- 6 8 8 roc_auc hand_till 0.75 10 0.0382
16 1.62e- 6 8 8 roc_auc hand_till 0.75 10 0.0382
17 6.66e-10 8 8 roc_auc hand_till 0.75 10 0.0382
18 1.97e- 2 11 11 roc_auc hand_till 0.748 10 0.0358
19 2.77e- 2 7 17 roc_auc hand_till 0.746 10 0.0269
20 3.44e- 2 10 17 roc_auc hand_till 0.746 10 0.0269
# ℹ 1 more variable: .config <chr>
# A tibble: 9 × 4
.metric .estimator trening test
<chr> <chr> <dbl> <dbl>
1 sens macro 0.837 0.589
2 precision macro 0.853 0.593
3 spec macro 0.919 0.791
4 accuracy multiclass 0.844 0.588
5 f_meas macro 0.842 0.587
6 mcc multiclass 0.765 0.379
7 kap multiclass 0.762 0.377
8 roc_auc hand_till 0.939 0.784
9 mn_log_loss multiclass 0.424 3.89
n= 96
node), split, n, loss, yval, (yprob)
* denotes terminal node
1) root 96 59 1 (0.38541667 0.29166667 0.32291667)
2) Zadavanje< 2.5 23 2 1 (0.91304348 0.00000000 0.08695652) *
3) Zadavanje>=2.5 73 44 3 (0.21917808 0.38356164 0.39726027)
6) KOL2>=22.07 35 22 1 (0.37142857 0.34285714 0.28571429)
12) KVIZ< 8.485 28 15 1 (0.46428571 0.39285714 0.14285714)
24) Rjesavanje< 4.46 11 2 1 (0.81818182 0.09090909 0.09090909) *
25) Rjesavanje>=4.46 17 7 2 (0.23529412 0.58823529 0.17647059)
50) KVIZ< 4.79 8 4 1 (0.50000000 0.25000000 0.25000000) *
51) KVIZ>=4.79 9 1 2 (0.00000000 0.88888889 0.11111111) *
13) KVIZ>=8.485 7 1 3 (0.00000000 0.14285714 0.85714286) *
7) KOL2< 22.07 38 19 3 (0.07894737 0.42105263 0.50000000)
14) KOL2< 14.97 5 1 2 (0.20000000 0.80000000 0.00000000) *
15) KOL2>=14.97 33 14 3 (0.06060606 0.36363636 0.57575758)
30) KOL1>=20.005 23 11 2 (0.04347826 0.52173913 0.43478261)
60) KOL1< 21.875 6 0 2 (0.00000000 1.00000000 0.00000000) *
61) KOL1>=21.875 17 7 3 (0.05882353 0.35294118 0.58823529)
122) KOL1>=23.645 12 6 2 (0.08333333 0.50000000 0.41666667)
244) KOL2< 20.32 4 0 2 (0.00000000 1.00000000 0.00000000) *
245) KOL2>=20.32 8 3 3 (0.12500000 0.25000000 0.62500000) *
123) KOL1< 23.645 5 0 3 (0.00000000 0.00000000 1.00000000) *
31) KOL1< 20.005 10 1 3 (0.10000000 0.00000000 0.90000000) *
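
The node numbers in the printout follow a binary-heap convention: the children of node k are 2k and 2k + 1, which is why the numbering jumps to 244 and 245 in the deepest branch. A small sketch (function names are illustrative) recovers the depth of a node and its path from the root:

```python
def node_depth(node: int) -> int:
    """Depth of a node, with the root (node 1) at depth 0."""
    return node.bit_length() - 1


def path_from_root(node: int) -> list:
    """Node numbers visited from the root down to the given node."""
    path = []
    while node >= 1:
        path.append(node)
        node //= 2  # parent of node k is floor(k / 2)
    return path[::-1]


print(node_depth(244), path_from_root(244))
```

The deepest leaves (244 and 245) sit at depth 7, well within the `maxdepth` of 12 shown in the fitted call below.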
Call:
rpart::rpart(formula = ..y ~ ., data = data, cp = ~0.0191628354815078,
maxdepth = ~12, minsplit = min_rows(10, data))
n= 96
CP nsplit rel error xerror xstd
1 0.22033898 0 1.0000000 1.0000000 0.08082380
2 0.08474576 1 0.7796610 0.8644068 0.08287122
3 0.06779661 4 0.5254237 0.8135593 0.08303355
4 0.05084746 5 0.4576271 0.8135593 0.08303355
5 0.03389831 7 0.3559322 0.7796610 0.08296144
6 0.01916284 10 0.2542373 0.7118644 0.08238230
Variable importance
Zadavanje KOL1 KOL2 KVIZ Rjesavanje
26 22 21 17 15
Node number 1: 96 observations, complexity param=0.220339
predicted class=1 expected loss=0.6145833 P(node) =1
class counts: 37 28 31
probabilities: 0.385 0.292 0.323
left son=2 (23 obs) right son=3 (73 obs)
Primary splits:
Zadavanje < 2.5 to the left, improve=12.677450, (0 missing)
KVIZ < 6.845 to the left, improve= 5.412500, (0 missing)
KOL1 < 19.93 to the left, improve= 4.991071, (0 missing)
Rjesavanje < 1.845 to the left, improve= 3.268382, (0 missing)
KOL2 < 17.98 to the left, improve= 2.350289, (0 missing)
Surrogate splits:
Rjesavanje < 18.585 to the right, agree=0.812, adj=0.217, (0 split)
KVIZ < 0.295 to the left, agree=0.792, adj=0.130, (0 split)
KOL2 < 17.98 to the left, agree=0.792, adj=0.130, (0 split)
KOL1 < 2.44 to the left, agree=0.781, adj=0.087, (0 split)
Node number 2: 23 observations
predicted class=1 expected loss=0.08695652 P(node) =0.2395833
class counts: 21 0 2
probabilities: 0.913 0.000 0.087
Node number 3: 73 observations, complexity param=0.08474576
predicted class=3 expected loss=0.6027397 P(node) =0.7604167
class counts: 16 28 29
probabilities: 0.219 0.384 0.397
left son=6 (35 obs) right son=7 (38 obs)
Primary splits:
KOL2 < 22.07 to the right, improve=2.5065610, (0 missing)
KVIZ < 8.485 to the left, improve=2.4579850, (0 missing)
Rjesavanje < 8.77 to the left, improve=1.8379820, (0 missing)
KOL1 < 22.585 to the left, improve=1.1109250, (0 missing)
Zadavanje < 8.5 to the right, improve=0.7982613, (0 missing)
Surrogate splits:
KVIZ < 5.215 to the right, agree=0.671, adj=0.314, (0 split)
KOL1 < 18.315 to the right, agree=0.616, adj=0.200, (0 split)
Rjesavanje < 8.695 to the left, agree=0.589, adj=0.143, (0 split)
Zadavanje < 7.5 to the right, agree=0.562, adj=0.086, (0 split)
Node number 6: 35 observations, complexity param=0.08474576
predicted class=1 expected loss=0.6285714 P(node) =0.3645833
class counts: 13 12 10
probabilities: 0.371 0.343 0.286
left son=12 (28 obs) right son=13 (7 obs)
Primary splits:
KVIZ < 8.485 to the left, improve=4.4142860, (0 missing)
Rjesavanje < 4.46 to the left, improve=4.1057970, (0 missing)
KOL1 < 23.17 to the left, improve=1.7065790, (0 missing)
KOL2 < 23.435 to the left, improve=1.2915030, (0 missing)
Zadavanje < 6.5 to the left, improve=0.8666667, (0 missing)
Surrogate splits:
KOL1 < 26.44 to the left, agree=0.857, adj=0.286, (0 split)
Node number 7: 38 observations, complexity param=0.06779661
predicted class=3 expected loss=0.5 P(node) =0.3958333
class counts: 3 16 19
probabilities: 0.079 0.421 0.500
left son=14 (5 obs) right son=15 (33 obs)
Primary splits:
KOL2 < 14.97 to the left, improve=2.3505580, (0 missing)
Rjesavanje < 15.18 to the right, improve=1.9263160, (0 missing)
KVIZ < 5.42 to the left, improve=1.2120300, (0 missing)
Zadavanje < 5.5 to the left, improve=0.9922249, (0 missing)
KOL1 < 20.005 to the right, improve=0.9118230, (0 missing)
Surrogate splits:
KOL1 < 12.96 to the left, agree=0.947, adj=0.6, (0 split)
Node number 12: 28 observations, complexity param=0.08474576
predicted class=1 expected loss=0.5357143 P(node) =0.2916667
class counts: 13 11 4
probabilities: 0.464 0.393 0.143
left son=24 (11 obs) right son=25 (17 obs)
Primary splits:
Rjesavanje < 4.46 to the left, improve=3.9698240, (0 missing)
KVIZ < 5.46 to the left, improve=3.6047620, (0 missing)
KOL2 < 23.265 to the left, improve=1.1945050, (0 missing)
Zadavanje < 8.5 to the left, improve=0.9047619, (0 missing)
KOL1 < 18.66 to the left, improve=0.7077922, (0 missing)
Surrogate splits:
Zadavanje < 4.5 to the left, agree=0.750, adj=0.364, (0 split)
KOL2 < 23.265 to the left, agree=0.714, adj=0.273, (0 split)
KOL1 < 13.52 to the left, agree=0.679, adj=0.182, (0 split)
KVIZ < 1.385 to the left, agree=0.643, adj=0.091, (0 split)
Node number 13: 7 observations
predicted class=3 expected loss=0.1428571 P(node) =0.07291667
class counts: 0 1 6
probabilities: 0.000 0.143 0.857
Node number 14: 5 observations
predicted class=2 expected loss=0.2 P(node) =0.05208333
class counts: 1 4 0
probabilities: 0.200 0.800 0.000
Node number 15: 33 observations, complexity param=0.05084746
predicted class=3 expected loss=0.4242424 P(node) =0.34375
class counts: 2 12 19
probabilities: 0.061 0.364 0.576
left son=30 (23 obs) right son=31 (10 obs)
Primary splits:
KOL1 < 20.005 to the right, improve=3.4279310, (0 missing)
Rjesavanje < 8.77 to the left, improve=2.4868690, (0 missing)
KOL2 < 18.95 to the right, improve=2.3164980, (0 missing)
Zadavanje < 6.5 to the left, improve=1.5757580, (0 missing)
KVIZ < 5.42 to the left, improve=0.6279315, (0 missing)
Surrogate splits:
KOL2 < 18.95 to the right, agree=0.818, adj=0.4, (0 split)
Node number 24: 11 observations
predicted class=1 expected loss=0.1818182 P(node) =0.1145833
class counts: 9 1 1
probabilities: 0.818 0.091 0.091
Node number 25: 17 observations, complexity param=0.03389831
predicted class=2 expected loss=0.4117647 P(node) =0.1770833
class counts: 4 10 3
probabilities: 0.235 0.588 0.176
left son=50 (8 obs) right son=51 (9 obs)
Primary splits:
KVIZ < 4.79 to the left, improve=2.8692810, (0 missing)
Rjesavanje < 11.69 to the right, improve=1.0637250, (0 missing)
Zadavanje < 8.5 to the left, improve=0.9327731, (0 missing)
KOL2 < 23.79 to the left, improve=0.6192810, (0 missing)
KOL1 < 21.29 to the right, improve=0.4526144, (0 missing)
Surrogate splits:
KOL2 < 23.79 to the left, agree=0.824, adj=0.625, (0 split)
Zadavanje < 6 to the left, agree=0.765, adj=0.500, (0 split)
KOL1 < 18.52 to the left, agree=0.647, adj=0.250, (0 split)
Rjesavanje < 8.695 to the right, agree=0.647, adj=0.250, (0 split)
Node number 30: 23 observations, complexity param=0.05084746
predicted class=2 expected loss=0.4782609 P(node) =0.2395833
class counts: 1 12 10
probabilities: 0.043 0.522 0.435
left son=60 (6 obs) right son=61 (17 obs)
Primary splits:
KOL1 < 21.875 to the left, improve=3.406650, (0 missing)
Zadavanje < 5.5 to the left, improve=2.585921, (0 missing)
KOL2 < 21.67 to the left, improve=2.328218, (0 missing)
KVIZ < 4.585 to the left, improve=2.287220, (0 missing)
Rjesavanje < 8.77 to the left, improve=1.409365, (0 missing)
Surrogate splits:
Zadavanje < 4.5 to the left, agree=0.826, adj=0.333, (0 split)
Rjesavanje < 1.845 to the left, agree=0.826, adj=0.333, (0 split)
KVIZ < 4.585 to the left, agree=0.783, adj=0.167, (0 split)
Node number 31: 10 observations
predicted class=3 expected loss=0.1 P(node) =0.1041667
class counts: 1 0 9
probabilities: 0.100 0.000 0.900
Node number 50: 8 observations
predicted class=1 expected loss=0.5 P(node) =0.08333333
class counts: 4 2 2
probabilities: 0.500 0.250 0.250
Node number 51: 9 observations
predicted class=2 expected loss=0.1111111 P(node) =0.09375
class counts: 0 8 1
probabilities: 0.000 0.889 0.111
Node number 60: 6 observations
predicted class=2 expected loss=0 P(node) =0.0625
class counts: 0 6 0
probabilities: 0.000 1.000 0.000
Node number 61: 17 observations, complexity param=0.03389831
predicted class=3 expected loss=0.4117647 P(node) =0.1770833
class counts: 1 6 10
probabilities: 0.059 0.353 0.588
left son=122 (12 obs) right son=123 (5 obs)
Primary splits:
KOL1 < 23.645 to the right, improve=2.1078430, (0 missing)
KOL2 < 21.67 to the left, improve=2.1078430, (0 missing)
Rjesavanje < 9.61 to the right, improve=0.9108734, (0 missing)
Zadavanje < 5.5 to the left, improve=0.9027149, (0 missing)
KVIZ < 5.42 to the left, improve=0.6078431, (0 missing)
Surrogate splits:
KOL2 < 21.85 to the left, agree=0.824, adj=0.4, (0 split)
Node number 122: 12 observations, complexity param=0.03389831
predicted class=2 expected loss=0.5 P(node) =0.125
class counts: 1 6 5
probabilities: 0.083 0.500 0.417
left son=244 (4 obs) right son=245 (8 obs)
Primary splits:
KOL2 < 20.32 to the left, improve=2.5833330, (0 missing)
Zadavanje < 6.5 to the left, improve=1.1666670, (0 missing)
KOL1 < 25.15 to the right, improve=0.5833333, (0 missing)
KVIZ < 6.055 to the left, improve=0.4333333, (0 missing)
Rjesavanje < 5.68 to the right, improve=0.3888889, (0 missing)
Surrogate splits:
Rjesavanje < 15.18 to the right, agree=0.833, adj=0.50, (0 split)
KVIZ < 7.19 to the right, agree=0.750, adj=0.25, (0 split)
Zadavanje < 8 to the right, agree=0.750, adj=0.25, (0 split)
Node number 123: 5 observations
predicted class=3 expected loss=0 P(node) =0.05208333
class counts: 0 0 5
probabilities: 0.000 0.000 1.000
Node number 244: 4 observations
predicted class=2 expected loss=0 P(node) =0.04166667
class counts: 0 4 0
probabilities: 0.000 1.000 0.000
Node number 245: 8 observations
predicted class=3 expected loss=0.375 P(node) =0.08333333
class counts: 1 2 5
probabilities: 0.125 0.250 0.625
count ncat improve index adj
Zadavanje 96 -1 12.6774494 2.500 0.00000000
KVIZ 96 -1 5.4125000 6.845 0.00000000
KOL1 96 -1 4.9910714 19.930 0.00000000
Rjesavanje 96 -1 3.2683824 1.845 0.00000000
KOL2 96 -1 2.3502885 17.980 0.00000000
Rjesavanje 0 1 0.8125000 18.585 0.21739130
KVIZ 0 -1 0.7916667 0.295 0.13043478
KOL2 0 -1 0.7916667 17.980 0.13043478
KOL1 0 -1 0.7812500 2.440 0.08695652
KOL2 73 1 2.5065609 22.070 0.00000000
KVIZ 73 -1 2.4579849 8.485 0.00000000
Rjesavanje 73 -1 1.8379818 8.770 0.00000000
KOL1 73 -1 1.1109255 22.585 0.00000000
Zadavanje 73 1 0.7982613 8.500 0.00000000
KVIZ 0 1 0.6712329 5.215 0.31428571
KOL1 0 1 0.6164384 18.315 0.20000000
Rjesavanje 0 -1 0.5890411 8.695 0.14285714
Zadavanje 0 1 0.5616438 7.500 0.08571429
KVIZ 35 -1 4.4142857 8.485 0.00000000
Rjesavanje 35 -1 4.1057971 4.460 0.00000000
KOL1 35 -1 1.7065789 23.170 0.00000000
KOL2 35 -1 1.2915033 23.435 0.00000000
Zadavanje 35 -1 0.8666667 6.500 0.00000000
KOL1 0 -1 0.8571429 26.440 0.28571429
Rjesavanje 28 -1 3.9698243 4.460 0.00000000
KVIZ 28 -1 3.6047619 5.460 0.00000000
KOL2 28 -1 1.1945055 23.265 0.00000000
Zadavanje 28 -1 0.9047619 8.500 0.00000000
KOL1 28 -1 0.7077922 18.660 0.00000000
Zadavanje 0 -1 0.7500000 4.500 0.36363636
KOL2 0 -1 0.7142857 23.265 0.27272727
KOL1 0 -1 0.6785714 13.520 0.18181818
KVIZ 0 -1 0.6428571 1.385 0.09090909
KVIZ 17 -1 2.8692810 4.790 0.00000000
Rjesavanje 17 1 1.0637255 11.690 0.00000000
Zadavanje 17 -1 0.9327731 8.500 0.00000000
KOL2 17 -1 0.6192810 23.790 0.00000000
KOL1 17 1 0.4526144 21.290 0.00000000
KOL2 0 -1 0.8235294 23.790 0.62500000
Zadavanje 0 -1 0.7647059 6.000 0.50000000
KOL1 0 -1 0.6470588 18.520 0.25000000
Rjesavanje 0 1 0.6470588 8.695 0.25000000
KOL2 38 -1 2.3505582 14.970 0.00000000
Rjesavanje 38 1 1.9263158 15.180 0.00000000
KVIZ 38 -1 1.2120301 5.420 0.00000000
Zadavanje 38 -1 0.9922249 5.500 0.00000000
KOL1 38 1 0.9118230 20.005 0.00000000
KOL1 0 -1 0.9473684 12.960 0.60000000
KOL1 33 1 3.4279315 20.005 0.00000000
Rjesavanje 33 -1 2.4868687 8.770 0.00000000
KOL2 33 1 2.3164983 18.950 0.00000000
Zadavanje 33 -1 1.5757576 6.500 0.00000000
KVIZ 33 -1 0.6279315 5.420 0.00000000
KOL2 0 1 0.8181818 18.950 0.40000000
KOL1 23 -1 3.4066496 21.875 0.00000000
Zadavanje 23 -1 2.5859213 5.500 0.00000000
KOL2 23 -1 2.3282182 21.670 0.00000000
KVIZ 23 -1 2.2872200 4.585 0.00000000
Rjesavanje 23 -1 1.4093645 8.770 0.00000000
Zadavanje 0 -1 0.8260870 4.500 0.33333333
Rjesavanje 0 -1 0.8260870 1.845 0.33333333
KVIZ 0 -1 0.7826087 4.585 0.16666667
KOL1 17 1 2.1078431 23.645 0.00000000
KOL2 17 -1 2.1078431 21.670 0.00000000
Rjesavanje 17 1 0.9108734 9.610 0.00000000
Zadavanje 17 -1 0.9027149 5.500 0.00000000
KVIZ 17 -1 0.6078431 5.420 0.00000000
KOL2 0 -1 0.8235294 21.850 0.40000000
KOL2 12 -1 2.5833333 20.320 0.00000000
Zadavanje 12 -1 1.1666667 6.500 0.00000000
KOL1 12 1 0.5833333 25.150 0.00000000
KVIZ 12 -1 0.4333333 6.055 0.00000000
Rjesavanje 12 1 0.3888889 5.680 0.00000000
Rjesavanje 0 1 0.8333333 15.180 0.50000000
KVIZ 0 1 0.7500000 7.190 0.25000000
Zadavanje 0 1 0.7500000 8.000 0.25000000
---
title: "DSTG - projekt2023"
output:
  flexdashboard::flex_dashboard:
    social: menu
    orientation: columns
    vertical_layout: fill
    source_code: embed
---
```{css, echo=FALSE}
.sidebar { overflow: auto; }
.dataTables_scrollBody {
height:95% !important;
max-height:95% !important;
}
.chart-stage-flex {
overflow:auto !important;
}
```
```{r setup, include=FALSE}
library(psych)
library(tidyverse)
library(readxl)
library(tidymodels)
library(vip)
library(knitr)
library(kableExtra)
library(Boruta)
library(ggbump)
library(ggsankey)
library(ggridges)
library(corrplot)
library(rpart.plot)
library(DALEXtra)
rf_metrike <- metric_set(roc_auc, sens, precision, spec, accuracy, f_meas, mcc, kap, mn_log_loss)
update_geom_defaults(geom = "tile", new = list(color = "black"))
Zadavanje_fit <- readRDS("modeli/RF_fit_Zadavanje.rds")
Zadavanje_tuning <- readRDS("modeli/RF_tuning_Zadavanje.rds")
Zadavanje_work <- readRDS("modeli/RF_work_Zadavanje.rds")
Zadavanje_permfit <- readRDS("modeli/RFperm_fit_Zadavanje.rds")
Zadavanje_Boruta <- readRDS("modeli/Boruta_Zadavanje.rds")
Zadavanje_fit_tree <- readRDS("modeli/stablo_fit_Zadavanje.rds")
Zadavanje_tuning_tree <- readRDS("modeli/stablo_tuning_Zadavanje.rds")
Zadavanje_work_tree <- readRDS("modeli/stablo_work_Zadavanje.rds")
Rjesavanje_fit <- readRDS("modeli/RF_fit_Rjesavanje.rds")
Rjesavanje_tuning <- readRDS("modeli/RF_tuning_Rjesavanje.rds")
Rjesavanje_work <- readRDS("modeli/RF_work_Rjesavanje.rds")
Rjesavanje_permfit <- readRDS("modeli/RFperm_fit_Rjesavanje.rds")
Rjesavanje_Boruta <- readRDS("modeli/Boruta_Rjesavanje.rds")
Rjesavanje_fit_tree <- readRDS("modeli/stablo_fit_Rjesavanje.rds")
Rjesavanje_tuning_tree <- readRDS("modeli/stablo_tuning_Rjesavanje.rds")
Rjesavanje_work_tree <- readRDS("modeli/stablo_work_Rjesavanje.rds")
Procjena_fit <- readRDS("modeli/RF_fit_Procjena.rds")
Procjena_tuning <- readRDS("modeli/RF_tuning_Procjena.rds")
Procjena_work <- readRDS("modeli/RF_work_Procjena.rds")
Procjena_permfit <- readRDS("modeli/RFperm_fit_Procjena.rds")
Procjena_Boruta <- readRDS("modeli/Boruta_Procjena.rds")
Procjena_fit_tree <- readRDS("modeli/stablo_fit_Procjena.rds")
Procjena_tuning_tree <- readRDS("modeli/stablo_tuning_Procjena.rds")
Procjena_work_tree <- readRDS("modeli/stablo_work_Procjena.rds")
podaci <- read_excel("podaci/DSTG_projekt2023.xlsx") %>%
mutate_at(vars(KP1:Procjena), as.numeric) %>%
filter(Status == "redovni", Upis == "prvi upis") %>%
mutate(KVIZ = rowSums(across(KP1:KP10))) %>%
select(KVIZ, KOL1, KOL2, Zadavanje, Rjesavanje, Procjena, UKUPNO)
testRes <- cor.mtest(podaci, conf.level = 0.975)
# 0%, 33%, 67% and 100% quantiles of every variable; these are the class boundaries
qvant <- podaci %>%
  reframe(across(KVIZ:UKUPNO, ~ quantile(.x, probs = c(0, 0.33, 0.67, 1)))) %>%
  mutate(prob = c('0%', '33%', '67%', '100%')) %>% relocate(prob)
podaci_sankey <- podaci %>% replace(is.na(.), 0) %>%
  select(KOL1:KOL2, Zadavanje, Rjesavanje, Procjena) %>%
  # recode every score into class "1"/"2"/"3" using its 33% and 67% quantiles
  mutate(KOL1 = case_when(KOL1 <= pull(qvant[2, "KOL1"]) ~ "1",
                          KOL1 <= pull(qvant[3, "KOL1"]) ~ "2",
                          .default = "3"),
         KOL2 = case_when(KOL2 <= pull(qvant[2, "KOL2"]) ~ "1",
                          KOL2 <= pull(qvant[3, "KOL2"]) ~ "2",
                          .default = "3"),
         Zadavanje = case_when(Zadavanje <= pull(qvant[2, "Zadavanje"]) ~ "1",
                               Zadavanje <= pull(qvant[3, "Zadavanje"]) ~ "2",
                               .default = "3"),
         Rjesavanje = case_when(Rjesavanje <= pull(qvant[2, "Rjesavanje"]) ~ "1",
                                Rjesavanje <= pull(qvant[3, "Rjesavanje"]) ~ "2",
                                .default = "3"),
         Procjena = case_when(Procjena <= pull(qvant[2, "Procjena"]) ~ "1",
                              Procjena <= pull(qvant[3, "Procjena"]) ~ "2",
                              .default = "3")) %>%
  mutate_at(vars(KOL1:Procjena), ~fct_relevel(., c("1","2","3"))) %>%
  make_long(KOL1, KOL2, Zadavanje, Rjesavanje, Procjena)
# share of each class within every stage (x) of the diagram;
# .groups = "drop_last" keeps the grouping by x so that sum(n) is taken per stage
postoci <- podaci_sankey %>% group_by(x, node) %>%
  summarize(n = n(), .groups = "drop_last") %>%
  mutate(pct2 = n / sum(n) * 100, pct = round(pct2))
podaci_sankey <- podaci_sankey %>% left_join(postoci, by = c("x","node"))
podaci_long <- podaci %>%
pivot_longer(everything(), names_to = "varijabla", values_to = "vrijednost")
DSTG_blank <- data.frame(
varijabla = factor(rep(c("KOL1", "KOL2", "KVIZ", "Procjena",
"Rjesavanje", "UKUPNO", "Zadavanje"), each = 2)),
x = c(0,30,0,30,0,10,0,3,0,18,0,100,0,9),
y = 0
)
zadavanje_explainer <- explain_tidymodels(
Zadavanje_fit %>% extract_workflow(),
data = Zadavanje_work %>% select(KVIZ:KOL2),
y = Zadavanje_work %>% select(Zadavanje),
verbose = FALSE
)
pdp_zadavanje_all <- model_profile(
zadavanje_explainer,
variables = NULL,
N = NULL
)
cond_zadavanje_all <- model_profile(
zadavanje_explainer,
variables = NULL,
N = NULL,
type = "conditional"
)
acc_zadavanje_all <- model_profile(
zadavanje_explainer,
variables = NULL,
N = NULL,
type = "accumulated"
)
ob2_zadavanje <- predict_parts(
explainer = zadavanje_explainer,
new_observation = Zadavanje_work %>% select(KVIZ:KOL2) %>% slice(2)
)
ob45_zadavanje <- predict_parts(
explainer = zadavanje_explainer,
new_observation = Zadavanje_work %>% select(KVIZ:KOL2) %>% slice(45)
)
ob88_zadavanje <- predict_parts(
explainer = zadavanje_explainer,
new_observation = Zadavanje_work %>% select(KVIZ:KOL2) %>% slice(88)
)
ob2_zadavanje_shap <- predict_parts(
explainer = zadavanje_explainer,
new_observation = Zadavanje_work %>% select(KVIZ:KOL2) %>% slice(2),
type = "shap",
B = 20
)
ob45_zadavanje_shap <- predict_parts(
explainer = zadavanje_explainer,
new_observation = Zadavanje_work %>% select(KVIZ:KOL2) %>% slice(45),
type = "shap",
B = 20
)
ob88_zadavanje_shap <- predict_parts(
explainer = zadavanje_explainer,
new_observation = Zadavanje_work %>% select(KVIZ:KOL2) %>% slice(88),
type = "shap",
B = 20
)
krivulje_zadavanje <- as_tibble(pdp_zadavanje_all$agr_profiles) %>% mutate(vrsta = "PDP") %>%
bind_rows(as_tibble(cond_zadavanje_all$agr_profiles) %>% mutate(vrsta = "M")) %>%
bind_rows(as_tibble(acc_zadavanje_all$agr_profiles) %>% mutate(vrsta = "ALE"))
rjesavanje_explainer <- explain_tidymodels(
Rjesavanje_fit %>% extract_workflow(),
data = Rjesavanje_work %>% select(KVIZ:Zadavanje),
y = Rjesavanje_work %>% select(Rjesavanje),
verbose = FALSE
)
pdp_rjesavanje_all <- model_profile(
rjesavanje_explainer,
variables = NULL,
N = NULL
)
cond_rjesavanje_all <- model_profile(
rjesavanje_explainer,
variables = NULL,
N = NULL,
type = "conditional"
)
acc_rjesavanje_all <- model_profile(
rjesavanje_explainer,
variables = NULL,
N = NULL,
type = "accumulated"
)
ob2_rjesavanje <- predict_parts(
explainer = rjesavanje_explainer,
new_observation = Rjesavanje_work %>% select(KVIZ:Zadavanje) %>% slice(2)
)
ob45_rjesavanje <- predict_parts(
explainer = rjesavanje_explainer,
new_observation = Rjesavanje_work %>% select(KVIZ:Zadavanje) %>% slice(45)
)
ob88_rjesavanje <- predict_parts(
explainer = rjesavanje_explainer,
new_observation = Rjesavanje_work %>% select(KVIZ:Zadavanje) %>% slice(88)
)
ob2_rjesavanje_shap <- predict_parts(
explainer = rjesavanje_explainer,
new_observation = Rjesavanje_work %>% select(KVIZ:Zadavanje) %>% slice(2),
type = "shap",
B = 20
)
ob45_rjesavanje_shap <- predict_parts(
explainer = rjesavanje_explainer,
new_observation = Rjesavanje_work %>% select(KVIZ:Zadavanje) %>% slice(45),
type = "shap",
B = 20
)
ob88_rjesavanje_shap <- predict_parts(
explainer = rjesavanje_explainer,
new_observation = Rjesavanje_work %>% select(KVIZ:Zadavanje) %>% slice(88),
type = "shap",
B = 20
)
krivulje_rjesavanje <- as_tibble(pdp_rjesavanje_all$agr_profiles) %>% mutate(vrsta = "PDP") %>%
bind_rows(as_tibble(cond_rjesavanje_all$agr_profiles) %>% mutate(vrsta = "M")) %>%
bind_rows(as_tibble(acc_rjesavanje_all$agr_profiles) %>% mutate(vrsta = "ALE"))
procjena_explainer <- explain_tidymodels(
Procjena_fit %>% extract_workflow(),
data = Procjena_work %>% select(KVIZ:Rjesavanje),
y = Procjena_work %>% select(Procjena),
verbose = FALSE
)
pdp_procjena_all <- model_profile(
procjena_explainer,
variables = NULL,
N = NULL
)
cond_procjena_all <- model_profile(
procjena_explainer,
variables = NULL,
N = NULL,
type = "conditional"
)
acc_procjena_all <- model_profile(
procjena_explainer,
variables = NULL,
N = NULL,
type = "accumulated"
)
ob2_procjena <- predict_parts(
explainer = procjena_explainer,
new_observation = Procjena_work %>% select(KVIZ:Rjesavanje) %>% slice(2)
)
ob45_procjena <- predict_parts(
explainer = procjena_explainer,
new_observation = Procjena_work %>% select(KVIZ:Rjesavanje) %>% slice(45)
)
ob88_procjena <- predict_parts(
explainer = procjena_explainer,
new_observation = Procjena_work %>% select(KVIZ:Rjesavanje) %>% slice(88)
)
ob2_procjena_shap <- predict_parts(
explainer = procjena_explainer,
new_observation = Procjena_work %>% select(KVIZ:Rjesavanje) %>% slice(2),
type = "shap",
B = 20
)
ob45_procjena_shap <- predict_parts(
explainer = procjena_explainer,
new_observation = Procjena_work %>% select(KVIZ:Rjesavanje) %>% slice(45),
type = "shap",
B = 20
)
ob88_procjena_shap <- predict_parts(
explainer = procjena_explainer,
new_observation = Procjena_work %>% select(KVIZ:Rjesavanje) %>% slice(88),
type = "shap",
B = 20
)
krivulje_procjena <- as_tibble(pdp_procjena_all$agr_profiles) %>% mutate(vrsta = "PDP") %>%
bind_rows(as_tibble(cond_procjena_all$agr_profiles) %>% mutate(vrsta = "M")) %>%
bind_rows(as_tibble(acc_procjena_all$agr_profiles) %>% mutate(vrsta = "ALE"))
```
# Variable distributions {data-navmenu="DESKRIPTIVA"}
## Column 1
### Variable distributions
```{r warning=FALSE, fig.width=18, fig.height=9}
ggplot(podaci_long, aes(x = vrijednost)) +
geom_histogram(aes(y=after_stat(density)),
color="#ec8ae5", fill="#48d09b", alpha=0.7) +
geom_density(alpha=.2, fill="yellow") + geom_blank(data = DSTG_blank, aes(x=x,y=y)) +
scale_x_continuous(name = "points") +
facet_wrap(vars(varijabla), scales = "free", ncol = 4) +
geom_rug(alpha = 0.3)
```
# Correlations {data-navmenu="DESKRIPTIVA"}
## Column 1
### Correlations (full-time students, first enrollment)
```{r warning=FALSE, fig.width=10, fig.height=7}
corrplot(round(cor(podaci), 2), type = "lower",
diag=FALSE, addCoef.col = 'black', tl.srt = 45)
```
## Column 2
### p-values of the correlations
```{r warning=FALSE, fig.width=10, fig.height=7}
corrplot(round(cor(podaci), 2), type = "lower",
diag=FALSE, tl.srt = 45, p.mat = testRes$p, insig = 'p-value', sig.level = -1)
```
# Sankey diagram {data-navmenu="DESKRIPTIVA"}
## Column 1
### Sankey diagram (full-time students, first enrollment)
```{r warning=FALSE, fig.width=10, fig.height=7}
ggplot(podaci_sankey, aes(x = x, next_x = next_x, node = node, next_node = next_node,
fill = factor(node),
label = paste0(node, ' (', pct, '%)'))) +
geom_sankey(flow.alpha = 0.5, node.color = "black", show.legend = FALSE) +
geom_sankey_label(size = 3, color = "black", fill= "white", hjust = -0.35) +
theme_bw() + theme_sankey(base_size = 16) +
theme(axis.title = element_blank(), axis.text.y = element_blank(),
axis.ticks = element_blank(), panel.grid = element_blank())
```
# Cronbach alpha {data-navmenu="DESKRIPTIVA"}
## Column 1
### Cronbach alpha
```{r}
psych::alpha(podaci, check.keys = TRUE)
```
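
As a rough cross-check of the `raw_alpha` value reported by `psych::alpha()`, the raw coefficient can be computed by hand from the item covariance matrix. The sketch below uses base R only. The helper name `cronbach_alpha_raw` and the toy data are made up for illustration, and the sketch skips the key reversal that `check.keys = TRUE` performs, so it is not a replacement for the call above.

```r
# Raw Cronbach's alpha from the item covariance matrix (base R only).
# `cronbach_alpha_raw` is a hypothetical helper, not part of the psych package.
cronbach_alpha_raw <- function(items) {
  items <- stats::na.omit(as.data.frame(items))  # listwise deletion, for simplicity
  k <- ncol(items)                               # number of items
  cov_mat <- stats::cov(items)
  item_var_sum <- sum(diag(cov_mat))             # sum of the item variances
  total_var <- sum(cov_mat)                      # variance of the row-sum score
  (k / (k - 1)) * (1 - item_var_sum / total_var)
}

# Toy data, invented for illustration: three items that roughly track each other.
toy <- data.frame(a = c(1, 2, 3, 4, 5),
                  b = c(2, 3, 4, 5, 7),
                  c = c(1, 3, 2, 5, 4))
cronbach_alpha_raw(toy)
```

The closer the items move together, the smaller the share of the total variance that comes from the individual item variances, and the closer the coefficient gets to 1.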
# Model description {data-navmenu="ZADAVANJE"}
## Column 1
### Problem posing (random forest)
**Predictors**

- KVIZ - sum of all quiz points
- KOL1 - total points on the first midterm
- KOL2 - total points on the second midterm

**Response** - classification of students by the points they earned on problem posing (Zadavanje)

- **1** - the problem-posing score falls within the quantile interval [0%, 33%]
- **2** - the problem-posing score falls within the quantile interval (33%, 67%]
- **3** - the problem-posing score falls within the quantile interval (67%, 100%]

**Hyperparameters** - the best hyperparameter combination was selected with 10-fold
cross-validation over 5000 randomly generated combinations of the hyperparameters
*mtry*, *min_n* and *trees*, drawn by Latin hypercube sampling.
The best model was chosen by the *roc_auc* metric.

- *mtry* - always equal to 2
- *trees* - natural numbers between 400 and 1500
- *min_n* - natural numbers between 2 and 40
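
The tuning procedure described above could be set up roughly as follows in tidymodels. This is a sketch under stated assumptions, not the project's actual script: the `ranger` engine, the simulated `toy` data, the seed, the stratified folds and the small grid size are all assumptions made to keep the example self-contained. Newer `dials` releases mark `grid_latin_hypercube()` as deprecated and offer `grid_space_filling()` for the same purpose.

```r
library(tidymodels)

set.seed(2023)  # arbitrary seed, only so that the sketch is reproducible

# Simulated stand-in for the training data: the column names follow the
# dashboard, the values are random and carry no real signal.
toy <- tibble(
  KVIZ = runif(120, 0, 10),
  KOL1 = runif(120, 0, 30),
  KOL2 = runif(120, 0, 30),
  Zadavanje = factor(sample(c("1", "2", "3"), 120, replace = TRUE))
)

# mtry is fixed at 2; trees and min_n are left open for tuning.
rf_spec <- rand_forest(mtry = 2, trees = tune(), min_n = tune()) %>%
  set_engine("ranger") %>%   # assumed engine, not confirmed by the text
  set_mode("classification")

rf_wf <- workflow() %>%
  add_formula(Zadavanje ~ KVIZ + KOL1 + KOL2) %>%
  add_model(rf_spec)

# 10-fold cross-validation; stratifying by the response is an assumption
folds <- vfold_cv(toy, v = 10, strata = Zadavanje)

# Latin hypercube over the stated ranges; size = 5 keeps the sketch fast
# (the text reports 5000 combinations).
rf_grid <- grid_latin_hypercube(
  trees(range = c(400L, 1500L)),
  min_n(range = c(2L, 40L)),
  size = 5
)

rf_tuned <- tune_grid(rf_wf, resamples = folds, grid = rf_grid,
                      metrics = metric_set(roc_auc))
best_rf <- select_best(rf_tuned, metric = "roc_auc")
```

Because the toy response is random, the resulting ROC AUC values are meaningless; only the structure of the workflow is of interest here.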
### Problem posing (decision tree)
**Predictors**

- KVIZ - sum of all quiz points
- KOL1 - total points on the first midterm
- KOL2 - total points on the second midterm

**Response** - classification of students by the points they earned on problem posing (Zadavanje)

- **1** - the problem-posing score falls within the quantile interval [0%, 33%]
- **2** - the problem-posing score falls within the quantile interval (33%, 67%]
- **3** - the problem-posing score falls within the quantile interval (67%, 100%]

**Hyperparameters** - the best hyperparameter combination was selected with 10-fold
cross-validation over 5000 randomly generated combinations of the hyperparameters
*tree_depth*, *min_n* and *cost_complexity*, drawn by Latin hypercube sampling.
The best model was chosen by the *roc_auc* metric.

- *tree_depth* - natural numbers between 1 and 15
- *cost_complexity* - real numbers between 1e-10 and 0.1
- *min_n* - natural numbers between 2 and 40
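
One detail worth spelling out for the tree grid: in `dials`, `cost_complexity()` takes its range on the log10 scale, so the interval from 1e-10 to 0.1 is written as `c(-10, -1)`, while the generated grid holds the values on the original scale. A minimal sketch of such a grid follows; the seed and the small `size` are chosen only for illustration.

```r
library(tidymodels)

# All three hyperparameters are left open for tuning; rpart is the engine
# whose output appears elsewhere in this dashboard.
tree_spec <- decision_tree(cost_complexity = tune(),
                           tree_depth = tune(),
                           min_n = tune()) %>%
  set_engine("rpart") %>%
  set_mode("classification")

set.seed(2023)  # arbitrary seed
tree_grid <- grid_latin_hypercube(
  cost_complexity(range = c(-10, -1)),  # log10 scale: 1e-10 up to 0.1
  tree_depth(range = c(1L, 15L)),
  min_n(range = c(2L, 40L)),
  size = 20                             # the text reports 5000 combinations
)

# Grid values are returned in original units, not in log10 units
range(tree_grid$cost_complexity)
```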
## Column 2
### 3-quantiles used to create the classes
```{r}
qvant %>% kbl() %>%
kable_paper("hover", full_width = F)
```
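
How the three classes follow from the quantiles in the table above can be sketched in base R. The scores here are simulated and the names `scores` and `score_class` are hypothetical; the actual class-building code for the models is not part of this file, so treat this as one plausible reading of the interval definitions.

```r
# Simulated scores on a 0-9 scale, invented for illustration.
set.seed(1)
scores <- round(runif(30, 0, 9), 1)

# 0%, 33%, 67% and 100% quantiles, the same levels as in the table
q <- quantile(scores, probs = c(0, 0.33, 0.67, 1))

# [0%, 33%] -> "1", (33%, 67%] -> "2", (67%, 100%] -> "3"
score_class <- cut(scores, breaks = q, labels = c("1", "2", "3"),
                   include.lowest = TRUE, right = TRUE)
table(score_class)
```

`cut()` needs distinct breakpoints, so a variable whose neighbouring quantiles coincide would need a different rule, for example a `case_when()` chain like the one used for the Sankey diagram.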
### summary
```{r}
summary(podaci) %>% kbl() %>%
kable_paper("hover", full_width = F)
```
# hyperparameters - random forest {data-navmenu="ZADAVANJE"}
## Column 1
### Tested hyperparameters
```{r}
Zadavanje_tuning %>%
collect_metrics() %>%
filter(.metric == "roc_auc", trees > 0) %>%
pivot_longer(cols = trees:min_n) %>%
mutate(best_mod = mean == max(mean)) %>%
ggplot(aes(x = value, y = mean)) +
#geom_line(alpha = 0.5, size = 1.5) +
geom_point(aes(color = best_mod), size = 0.3) +
facet_wrap(~name, scales = "free_x") +
scale_x_continuous() +
labs(y = "roc auc", x = "", color = "Best Model")
```
## Column 2
### Top 20 models by the roc_auc metric
```{r}
print(Zadavanje_tuning %>% show_best(metric = 'roc_auc', n = 20), n = 20)
```
# performance - random forest {data-navmenu="ZADAVANJE"}
## Column 1
### Metrics on the test and training sets {data-height=200}
```{r}
tab1 <- Zadavanje_fit %>% collect_predictions() %>%
rf_metrike(truth = Zadavanje, estimate = .pred_class, .pred_1:.pred_3)
tab2 <- Zadavanje_work %>%
rf_metrike(truth = Zadavanje, estimate = .pred_class, .pred_1:.pred_3)
tab1 %>% inner_join(tab2, by = ".metric") %>%
select(.metric, .estimator = .estimator.x, trening = .estimate.y, test = .estimate.x)
```
### Confusion matrix
```{r}
Zadavanje_fit %>% collect_predictions() %>%
conf_mat(truth = Zadavanje, estimate = .pred_class) %>% autoplot("heatmap") +
scale_fill_gradient(low = "#87DEE7",
high = "#FFFFCC")
```
## Column 2
### ROC curve
```{r}
Zadavanje_fit %>% collect_predictions() %>% roc_curve(Zadavanje, .pred_1:.pred_3) %>% autoplot()
```
### Gain curve
```{r}
Zadavanje_fit %>% collect_predictions() %>% gain_curve(Zadavanje, .pred_1:.pred_3) %>% autoplot()
```
# predictor importance - random forest {data-navmenu="ZADAVANJE"}
## Column 1
### Gini
```{r}
Zadavanje_fit %>% extract_fit_parsnip() %>% vip()
```
### permutation
```{r}
Zadavanje_permfit %>% extract_fit_parsnip() %>% vip()
```
## Column 2
### Boruta
```{r}
plot(Zadavanje_Boruta, cex.axis=.7, las=2, xlab="",
colCode = c("green", "orange", "#f6546a", "#2acaea"))
```
### Boruta (history)
```{r}
plotImpHistory(Zadavanje_Boruta, colCode = c("green", "orange", "#f6546a", "#2acaea"))
```
# PDP-M-ALE plot - random forest {data-navmenu="ZADAVANJE"}
## Column {.tabset .tabset-fade}
### PDP plot
```{r fig.width=12}
plot(pdp_zadavanje_all, geom = "profiles")
```
### M plot
```{r fig.width=12}
plot(cond_zadavanje_all, geom = "profiles")
```
### ALE plot
```{r fig.width=12}
plot(acc_zadavanje_all, geom = "profiles")
```
### all
```{r fig.width=12}
# facet labels: map DALEX's per-class labels to readable class names
oznake <- c("class 1", "class 2", "class 3")
names(oznake) <- c("workflow.1", "workflow.2", "workflow.3")
ggplot(krivulje_zadavanje, aes(x = `_x_`, y = `_yhat_`, color = vrsta)) +
  geom_line() +
  facet_grid(cols = vars(`_vname_`), rows = vars(`_label_`), scales = "free_x",
             labeller = labeller(`_label_` = oznake)) +
  xlab("points") + ylab("predicted probability")
```
# Break-Down SHAP - random forest {data-navmenu="ZADAVANJE"}
## Column 1 {.tabset .tabset-fade}
### BD 2
```{r}
plot(ob2_zadavanje)
```
### BD 45
```{r}
plot(ob45_zadavanje)
```
### BD 88
```{r}
plot(ob88_zadavanje)
```
## Column 2 {.tabset .tabset-fade}
### SHAP 2
```{r}
plot(ob2_zadavanje_shap)
```
### SHAP 45
```{r}
plot(ob45_zadavanje_shap)
```
### SHAP 88
```{r}
plot(ob88_zadavanje_shap)
```
# hyperparameters - decision tree {data-navmenu="ZADAVANJE"}
## Column 1
### Tested hyperparameters
```{r}
Zadavanje_tuning_tree %>%
collect_metrics() %>%
filter(.metric == "roc_auc") %>%
pivot_longer(cols = cost_complexity:min_n) %>%
mutate(best_mod = mean == max(mean)) %>%
ggplot(aes(x = value, y = mean)) +
#geom_line(alpha = 0.5, size = 1.5) +
geom_point(aes(color = best_mod), size = 0.3) +
facet_wrap(~name, scales = "free_x") +
scale_x_continuous() +
labs(y = "roc auc", x = "", color = "Best Model")
```
## Column 2
### Top 20 models by the roc_auc metric
```{r}
print(Zadavanje_tuning_tree %>% show_best(metric = 'roc_auc', n = 20), n = 20)
```
# performance - decision tree {data-navmenu="ZADAVANJE"}
## Column 1
### Metrics on the test and training sets {data-height=200}
```{r}
tab1 <- Zadavanje_fit_tree %>% collect_predictions() %>%
rf_metrike(truth = Zadavanje, estimate = .pred_class, .pred_1:.pred_3)
tab2 <- Zadavanje_work_tree %>%
rf_metrike(truth = Zadavanje, estimate = .pred_class, .pred_1:.pred_3)
tab1 %>% inner_join(tab2, by = ".metric") %>%
select(.metric, .estimator = .estimator.x, trening = .estimate.y, test = .estimate.x)
```
### Confusion matrix
```{r}
Zadavanje_fit_tree %>% collect_predictions() %>%
conf_mat(truth = Zadavanje, estimate = .pred_class) %>% autoplot("heatmap") +
scale_fill_gradient(low = "#87DEE7",
high = "#FFFFCC")
```
## Column 2
### ROC curve
```{r}
Zadavanje_fit_tree %>% collect_predictions() %>% roc_curve(Zadavanje, .pred_1:.pred_3) %>% autoplot()
```
### Gain curve
```{r}
Zadavanje_fit_tree %>% collect_predictions() %>% gain_curve(Zadavanje, .pred_1:.pred_3) %>% autoplot()
```
# predictor importance - decision tree {data-navmenu="ZADAVANJE"}
## Column 1 {data-width=300}
### Predictor importance
```{r}
Zadavanje_fit_tree %>% extract_fit_parsnip() %>% vip()
```
## Column 2 {data-width=500 .tabset .tabset-fade}
### Tree plot
```{r fig.width=10}
Zadavanje_fit_tree %>%
extract_fit_engine() %>%
rpart.plot(roundint = FALSE, digits = 3)
```
### Tree text
```{r}
Zadavanje_fit_tree %>%
extract_fit_engine()
```
### Tree details
```{r}
info1 <- summary(Zadavanje_fit_tree %>%
extract_fit_engine())$splits
info1
```
# Model description {data-navmenu="RJEŠAVANJE"}
## Column 1
### Problem solving (random forest)
**Predictors**

- KVIZ - sum of all quiz points
- KOL1 - total points on the first midterm
- KOL2 - total points on the second midterm
- Zadavanje - total points on problem posing

**Response** - classification of students by the points they earned on problem solving (Rjesavanje)

- **1** - the problem-solving score falls within the quantile interval [0%, 33%]
- **2** - the problem-solving score falls within the quantile interval (33%, 67%]
- **3** - the problem-solving score falls within the quantile interval (67%, 100%]

**Hyperparameters** - the best hyperparameter combination was selected with 10-fold
cross-validation over 500 randomly generated combinations of the hyperparameters
*mtry*, *min_n* and *trees*. The best model was chosen by the *roc_auc* metric.

- *mtry* - values taken from the set {2,3}
- *trees* - natural numbers between 400 and 1500
- *min_n* - natural numbers between 2 and 40
### Problem solving (decision tree)
**Predictors**

- KVIZ - sum of all quiz points
- KOL1 - total points on the first midterm
- KOL2 - total points on the second midterm
- Zadavanje - total points on problem posing

**Response** - classification of students by the points they earned on problem solving (Rjesavanje)

- **1** - the problem-solving score falls within the quantile interval [0%, 33%]
- **2** - the problem-solving score falls within the quantile interval (33%, 67%]
- **3** - the problem-solving score falls within the quantile interval (67%, 100%]

**Hyperparameters** - the best hyperparameter combination was selected with 10-fold
cross-validation over 5000 randomly generated combinations of the hyperparameters
*tree_depth*, *min_n* and *cost_complexity*, drawn by Latin hypercube sampling.
The best model was chosen by the *roc_auc* metric.

- *tree_depth* - natural numbers between 1 and 15
- *cost_complexity* - real numbers between 1e-10 and 0.1
- *min_n* - natural numbers between 2 and 40
## Column 2
### 3-quantiles used to create the classes
```{r}
qvant %>% kbl() %>%
kable_paper("hover", full_width = F)
```
### summary
```{r}
summary(podaci) %>% kbl() %>%
kable_paper("hover", full_width = F)
```
# hyperparameters - random forest {data-navmenu="RJEŠAVANJE"}
## Column 1
### Tested hyperparameters
```{r}
Rjesavanje_tuning %>%
collect_metrics() %>%
filter(.metric == "roc_auc", trees > 0) %>%
pivot_longer(cols = mtry:min_n) %>%
mutate(best_mod = mean == max(mean)) %>%
ggplot(aes(x = value, y = mean)) +
#geom_line(alpha = 0.5, size = 1.5) +
geom_point(aes(color = best_mod), size = 0.3) +
facet_wrap(~name, scales = "free_x") +
scale_x_continuous() +
labs(y = "roc auc", x = "", color = "Best Model")
```
## Column 2
### Top 20 models by the roc_auc metric
```{r}
print(Rjesavanje_tuning %>% show_best(metric = 'roc_auc', n = 20), n = 20)
```
# performance - random forest {data-navmenu="RJEŠAVANJE"}
## Column 1
### Metrics on the test and training sets {data-height=200}
```{r}
tab1 <- Rjesavanje_fit %>% collect_predictions() %>%
rf_metrike(truth = Rjesavanje, estimate = .pred_class, .pred_1:.pred_3)
tab2 <- Rjesavanje_work %>%
rf_metrike(truth = Rjesavanje, estimate = .pred_class, .pred_1:.pred_3)
tab1 %>% inner_join(tab2, by = ".metric") %>%
select(.metric, .estimator = .estimator.x, trening = .estimate.y, test = .estimate.x)
```
### Confusion matrix
```{r}
Rjesavanje_fit %>% collect_predictions() %>%
conf_mat(truth = Rjesavanje, estimate = .pred_class) %>% autoplot("heatmap") +
scale_fill_gradient(low = "#87DEE7",
high = "#FFFFCC")
```
## Column 2
### ROC curve
```{r}
Rjesavanje_fit %>% collect_predictions() %>% roc_curve(Rjesavanje, .pred_1:.pred_3) %>% autoplot()
```
### Gain curve
```{r}
Rjesavanje_fit %>% collect_predictions() %>% gain_curve(Rjesavanje, .pred_1:.pred_3) %>% autoplot()
```
# predictor importance - random forest {data-navmenu="RJEŠAVANJE"}
## Column 1
### Gini
```{r}
Rjesavanje_fit %>% extract_fit_parsnip() %>% vip()
```
### permutation
```{r}
Rjesavanje_permfit %>% extract_fit_parsnip() %>% vip()
```
## Column 2
### Boruta
```{r}
plot(Rjesavanje_Boruta, cex.axis=.7, las=2, xlab="",
colCode = c("green", "orange", "#f6546a", "#2acaea"))
```
### Boruta (history)
```{r}
plotImpHistory(Rjesavanje_Boruta, colCode = c("green", "orange", "#f6546a", "#2acaea"))
```
# PDP-M-ALE plot - random forest {data-navmenu="RJEŠAVANJE"}
## Column {.tabset .tabset-fade}
### PDP plot
```{r fig.width=12}
plot(pdp_rjesavanje_all, geom = "profiles")
```
### M plot
```{r fig.width=12}
plot(cond_rjesavanje_all, geom = "profiles")
```
### ALE plot
```{r fig.width=12}
plot(acc_rjesavanje_all, geom = "profiles")
```
### All
```{r fig.width=12}
# Readable facet labels for the three class-specific profiles
oznake <- c("class 1", "class 2", "class 3")
names(oznake) <- c("workflow.1", "workflow.2", "workflow.3")
ggplot(krivulje_rjesavanje, aes(x = `_x_`, y = `_yhat_`, color = vrsta)) +
  geom_line() +
  facet_grid(cols = vars(`_vname_`), rows = vars(`_label_`), scales = "free_x",
             labeller = labeller(`_label_` = oznake)) +
  xlab("points") + ylab("predicted probability")
```
# Break-Down SHAP - random forest {data-navmenu="RJEŠAVANJE"}
## Column 1 {.tabset .tabset-fade}
### BD 2
```{r}
plot(ob2_rjesavanje)
```
### BD 45
```{r}
plot(ob45_rjesavanje)
```
### BD 88
```{r}
plot(ob88_rjesavanje)
```
## Column 2 {.tabset .tabset-fade}
### SHAP 2
```{r}
plot(ob2_rjesavanje_shap)
```
### SHAP 45
```{r}
plot(ob45_rjesavanje_shap)
```
### SHAP 88
```{r}
plot(ob88_rjesavanje_shap)
```
# Hyperparameters - decision tree {data-navmenu="RJEŠAVANJE"}
## Column 1
### Tested hyperparameters
```{r}
Rjesavanje_tuning_tree %>%
collect_metrics() %>%
filter(.metric == "roc_auc") %>%
pivot_longer(cols = cost_complexity:min_n) %>%
mutate(best_mod = mean == max(mean)) %>%
ggplot(aes(x = value, y = mean)) +
#geom_line(alpha = 0.5, size = 1.5) +
geom_point(aes(color = best_mod), size = 0.3) +
facet_wrap(~name, scales = "free_x") +
scale_x_continuous() +
labs(y = "roc auc", x = "", color = "Best Model")
```
## Column 2
### Top 20 models by the roc_auc metric
```{r}
print(Rjesavanje_tuning_tree %>% show_best(metric = 'roc_auc', n = 20), n = 20)
```
# Performance - decision tree {data-navmenu="RJEŠAVANJE"}
## Column 1
### Metrics on the test and training sets {data-height=200}
```{r}
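# Test-set metrics, computed from the predictions stored in the final fit
tab1 <- Rjesavanje_fit_tree %>% collect_predictions() %>%
  rf_metrike(truth = Rjesavanje, estimate = .pred_class, .pred_1:.pred_3)
# Training-set metrics
tab2 <- Rjesavanje_work_tree %>%
  rf_metrike(truth = Rjesavanje, estimate = .pred_class, .pred_1:.pred_3)
# Join by metric name: .x columns come from tab1 (test), .y columns from tab2 (training)
tab1 %>% inner_join(tab2, by = ".metric") %>%
  select(.metric, .estimator = .estimator.x, trening = .estimate.y, test = .estimate.x)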
```
### Confusion matrix
```{r}
Rjesavanje_fit_tree %>% collect_predictions() %>%
  conf_mat(truth = Rjesavanje, estimate = .pred_class) %>%
  autoplot(type = "heatmap") +
  scale_fill_gradient(low = "#87DEE7", high = "#FFFFCC")
```
## Column 2
### ROC curve
```{r}
Rjesavanje_fit_tree %>% collect_predictions() %>% roc_curve(Rjesavanje, .pred_1:.pred_3) %>% autoplot()
```
### Gain curve
```{r}
Rjesavanje_fit_tree %>% collect_predictions() %>% gain_curve(Rjesavanje, .pred_1:.pred_3) %>% autoplot()
```
# Predictor importance - decision tree {data-navmenu="RJEŠAVANJE"}
## Column 1 {data-width=300}
### Predictor importance
```{r}
Rjesavanje_fit_tree %>% extract_fit_parsnip() %>% vip()
```
## Column 2 {data-width=500 .tabset .tabset-fade}
### Tree plot
```{r fig.width = 12}
Rjesavanje_fit_tree %>%
extract_fit_engine() %>%
rpart.plot(roundint = FALSE, digits = 3)
```
### Tree text
```{r}
Rjesavanje_fit_tree %>%
extract_fit_engine()
```
### Tree details
```{r}
info2 <- summary(Rjesavanje_fit_tree %>%
extract_fit_engine())$splits
info2
```
# Model description {data-navmenu="PROCJENA"}
## Column 1
### Assessment of problem solutions (random forest)
**Predictors**
- KVIZ - sum of all points on the quizzes
- KOL1 - total points on the first midterm exam
- KOL2 - total points on the second midterm exam
- Zadavanje - total points on problem posing
- Rjesavanje - total points on problem solving
**Response** - Classification of students based on the points earned on the assessment of problem solutions
- **1** - if the total points on the assessment of problem solutions lie in the interval [0%, 33%]
- **2** - if the total points on the assessment of problem solutions lie in the interval (33%, 67%]
- **3** - if the total points on the assessment of problem solutions lie in the interval (67%, 100%]
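
The three response classes can be sketched in base R as follows. This is only an illustration under stated assumptions: the `scores` vector is made up, and the percentages are read as quantile levels of the score (as the 3-quantile table on this page suggests) rather than as shares of the maximum score.

```r
# Hypothetical scores; the real data are not reproduced here.
scores <- c(0, 1, 1, 2, 3, 3, 4, 5, 6)

# Breaks at the 0%, 33%, 67% and 100% quantile levels (assumed interpretation).
breaks <- quantile(scores, probs = c(0, 0.33, 0.67, 1))

# Right-closed intervals; include.lowest = TRUE also closes the first interval
# on the left, which gives [0%, 33%], (33%, 67%], (67%, 100%].
score_class <- cut(scores, breaks = breaks, labels = c("1", "2", "3"),
                   right = TRUE, include.lowest = TRUE)
table(score_class)
```

If the percentages instead denote a share of the maximum attainable score, the same `cut()` call applies with fixed breaks in place of the quantiles.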
**Hyperparameters** - to select the optimal combination of hyperparameters, 10-fold
cross-validation was performed, in which 5000 randomly chosen combinations of the
hyperparameters *mtry*, *min_n* and *trees* were tried using Latin hypercube sampling.
The best model was selected according to the *roc_auc* metric.
- *mtry* - values were taken from the set {2, 3}.
- *trees* - natural numbers between 400 and 1500 were used.
- *min_n* - natural numbers between 2 and 40 were used.
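
To illustrate what Latin hypercube sampling over these ranges means, here is a minimal base R sketch. It is not the code used to build the dashboard: the helper `latin_hypercube()`, the seed, and the name `grid` are assumptions made for this example only.

```r
set.seed(1)  # arbitrary seed, only to make this sketch reproducible

# Latin hypercube sample of n points in d dimensions on the unit cube:
# each dimension is cut into n equal strata and every stratum is used exactly once.
latin_hypercube <- function(n, d) {
  sapply(seq_len(d), function(j) (sample(n) - runif(n)) / n)
}

n <- 5000
u <- latin_hypercube(n, 3)

# Map the unit-cube coordinates onto the integer ranges listed above.
grid <- data.frame(
  mtry  = 2L + as.integer(floor(u[, 1] * 2)),       # values in {2, 3}
  trees = 400L + as.integer(floor(u[, 2] * 1101)),  # integers 400..1500
  min_n = 2L + as.integer(floor(u[, 3] * 39))       # integers 2..40
)
head(grid)
```

Because every stratum is visited once per dimension, each hyperparameter range is covered evenly even though the combinations themselves are random.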
### Assessment of problem solutions (decision tree)
**Predictors**
- KVIZ - sum of all points on the quizzes
- KOL1 - total points on the first midterm exam
- KOL2 - total points on the second midterm exam
- Zadavanje - total points on problem posing
- Rjesavanje - total points on problem solving
**Response** - Classification of students based on the points earned on the assessment of problem solutions
- **1** - if the total points on the assessment of problem solutions lie in the interval [0%, 33%]
- **2** - if the total points on the assessment of problem solutions lie in the interval (33%, 67%]
- **3** - if the total points on the assessment of problem solutions lie in the interval (67%, 100%]
**Hyperparameters** - to select the optimal combination of hyperparameters, 10-fold
cross-validation was performed, in which 5000 randomly chosen combinations of the
hyperparameters *tree_depth*, *min_n* and *cost_complexity* were tried using Latin hypercube sampling.
The best model was selected according to the *roc_auc* metric.
- *tree_depth* - natural numbers between 1 and 15 were used.
- *cost_complexity* - real numbers between 1e-10 and 0.1 were used.
- *min_n* - natural numbers between 2 and 40 were used.
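
The 10-fold cross-validation mentioned above can be sketched in base R as follows. The training-set size `n_obs`, the seed, and the names `fold_id` and `splits` are hypothetical and serve only this illustration.

```r
set.seed(1)  # arbitrary seed for this sketch

n_obs <- 100  # hypothetical number of training observations
k <- 10       # number of folds, as stated above

# Shuffle the fold labels 1..k so that fold sizes differ by at most one.
fold_id <- sample(rep(seq_len(k), length.out = n_obs))

# For fold i, rows with fold_id == i are held out for assessment
# and the remaining rows are used to fit the model.
splits <- lapply(seq_len(k), function(i) {
  list(analysis = which(fold_id != i), assessment = which(fold_id == i))
})

# Every observation is held out exactly once.
table(fold_id)
```

Each candidate combination of hyperparameters would be fitted on the ten analysis sets, scored on the matching assessment sets, and compared by its mean roc_auc across the folds.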
## Column 2
### 3-quantiles used to create the classes
```{r}
qvant %>% kbl() %>%
  kable_paper("hover", full_width = FALSE)
```
### summary
```{r}
summary(podaci) %>% kbl() %>%
  kable_paper("hover", full_width = FALSE)
```
# Hyperparameters - random forest {data-navmenu="PROCJENA"}
## Column 1
### Tested hyperparameters
```{r}
Procjena_tuning %>%
collect_metrics() %>%
filter(.metric == "roc_auc", trees > 0) %>%
pivot_longer(cols = mtry:min_n) %>%
mutate(best_mod = mean == max(mean)) %>%
ggplot(aes(x = value, y = mean)) +
#geom_line(alpha = 0.5, size = 1.5) +
geom_point(aes(color = best_mod), size = 0.3) +
facet_wrap(~name, scales = "free_x") +
scale_x_continuous() +
labs(y = "roc auc", x = "", color = "Best Model")
```
## Column 2
### Top 20 models by the roc_auc metric
```{r}
print(Procjena_tuning %>% show_best(metric = 'roc_auc', n = 20), n = 20)
```
# Performance - random forest {data-navmenu="PROCJENA"}
## Column 1
### Metrics on the test and training sets {data-height=200}
```{r}
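# Test-set metrics, computed from the predictions stored in the final fit
tab1 <- Procjena_fit %>% collect_predictions() %>%
  rf_metrike(truth = Procjena, estimate = .pred_class, .pred_1:.pred_3)
# Training-set metrics
tab2 <- Procjena_work %>%
  rf_metrike(truth = Procjena, estimate = .pred_class, .pred_1:.pred_3)
# Join by metric name: .x columns come from tab1 (test), .y columns from tab2 (training)
tab1 %>% inner_join(tab2, by = ".metric") %>%
  select(.metric, .estimator = .estimator.x, trening = .estimate.y, test = .estimate.x)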
```
### Confusion matrix
```{r}
Procjena_fit %>% collect_predictions() %>%
  conf_mat(truth = Procjena, estimate = .pred_class) %>%
  autoplot(type = "heatmap") +
  scale_fill_gradient(low = "#87DEE7", high = "#FFFFCC")
```
## Column 2
### ROC curve
```{r}
Procjena_fit %>% collect_predictions() %>% roc_curve(Procjena, .pred_1:.pred_3) %>% autoplot()
```
### Gain curve
```{r}
Procjena_fit %>% collect_predictions() %>% gain_curve(Procjena, .pred_1:.pred_3) %>% autoplot()
```
# Predictor importance - random forest {data-navmenu="PROCJENA"}
## Column 1
### Gini
```{r}
Procjena_fit %>% extract_fit_parsnip() %>% vip()
```
### Permutation
```{r}
Procjena_permfit %>% extract_fit_parsnip() %>% vip()
```
## Column 2
### Boruta
```{r}
plot(Procjena_Boruta, cex.axis=.7, las=2, xlab="",
colCode = c("green", "orange", "#f6546a", "#2acaea"))
```
### Boruta (history)
```{r}
plotImpHistory(Procjena_Boruta, colCode = c("green", "orange", "#f6546a", "#2acaea"))
```
# PDP-M-ALE plot - random forest {data-navmenu="PROCJENA"}
## Column {.tabset .tabset-fade}
### PDP plot
```{r fig.width=12}
plot(pdp_procjena_all, geom = "profiles")
```
### M plot
```{r fig.width=12}
plot(cond_procjena_all, geom = "profiles")
```
### ALE plot
```{r fig.width=12}
plot(acc_procjena_all, geom = "profiles")
```
### All
```{r fig.width=12}
# Readable facet labels for the three class-specific profiles
oznake <- c("class 1", "class 2", "class 3")
names(oznake) <- c("workflow.1", "workflow.2", "workflow.3")
ggplot(krivulje_procjena, aes(x = `_x_`, y = `_yhat_`, color = vrsta)) +
  geom_line() +
  facet_grid(cols = vars(`_vname_`), rows = vars(`_label_`), scales = "free_x",
             labeller = labeller(`_label_` = oznake)) +
  xlab("points") + ylab("predicted probability")
```
# Break-Down SHAP - random forest {data-navmenu="PROCJENA"}
## Column 1 {.tabset .tabset-fade}
### BD 2
```{r}
plot(ob2_procjena)
```
### BD 45
```{r}
plot(ob45_procjena)
```
### BD 88
```{r}
plot(ob88_procjena)
```
## Column 2 {.tabset .tabset-fade}
### SHAP 2
```{r}
plot(ob2_procjena_shap)
```
### SHAP 45
```{r}
plot(ob45_procjena_shap)
```
### SHAP 88
```{r}
plot(ob88_procjena_shap)
```
# Hyperparameters - decision tree {data-navmenu="PROCJENA"}
## Column 1
### Tested hyperparameters
```{r}
Procjena_tuning_tree %>%
collect_metrics() %>%
filter(.metric == "roc_auc") %>%
pivot_longer(cols = cost_complexity:min_n) %>%
mutate(best_mod = mean == max(mean)) %>%
ggplot(aes(x = value, y = mean)) +
#geom_line(alpha = 0.5, size = 1.5) +
geom_point(aes(color = best_mod), size = 0.3) +
facet_wrap(~name, scales = "free_x") +
scale_x_continuous() +
labs(y = "roc auc", x = "", color = "Best Model")
```
## Column 2
### Top 20 models by the roc_auc metric
```{r}
print(Procjena_tuning_tree %>% show_best(metric = 'roc_auc', n = 20), n = 20)
```
# Performance - decision tree {data-navmenu="PROCJENA"}
## Column 1
### Metrics on the test and training sets {data-height=200}
```{r}
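# Test-set metrics, computed from the predictions stored in the final fit
tab1 <- Procjena_fit_tree %>% collect_predictions() %>%
  rf_metrike(truth = Procjena, estimate = .pred_class, .pred_1:.pred_3)
# Training-set metrics
tab2 <- Procjena_work_tree %>%
  rf_metrike(truth = Procjena, estimate = .pred_class, .pred_1:.pred_3)
# Join by metric name: .x columns come from tab1 (test), .y columns from tab2 (training)
tab1 %>% inner_join(tab2, by = ".metric") %>%
  select(.metric, .estimator = .estimator.x, trening = .estimate.y, test = .estimate.x)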
```
### Confusion matrix
```{r}
Procjena_fit_tree %>% collect_predictions() %>%
  conf_mat(truth = Procjena, estimate = .pred_class) %>%
  autoplot(type = "heatmap") +
  scale_fill_gradient(low = "#87DEE7", high = "#FFFFCC")
```
## Column 2
### ROC curve
```{r}
Procjena_fit_tree %>% collect_predictions() %>% roc_curve(Procjena, .pred_1:.pred_3) %>% autoplot()
```
### Gain curve
```{r}
Procjena_fit_tree %>% collect_predictions() %>% gain_curve(Procjena, .pred_1:.pred_3) %>% autoplot()
```
# Predictor importance - decision tree {data-navmenu="PROCJENA"}
## Column 1 {data-width=300}
### Predictor importance
```{r}
Procjena_fit_tree %>% extract_fit_parsnip() %>% vip()
```
## Column 2 {data-width=700 .tabset .tabset-fade}
### Tree plot
```{r fig.width=16, fig.height=8}
Procjena_fit_tree %>%
extract_fit_engine() %>%
rpart.plot(roundint = FALSE, digits = 3, tweak = 1.1)
```
### Tree text
```{r}
Procjena_fit_tree %>%
extract_fit_engine()
```
### Tree details
```{r}
info3 <- summary(Procjena_fit_tree %>%
extract_fit_engine())$splits
info3
```