Adviseme Pilots

2018-10-19

Pilots

so far very small (10 items, 40 students)
aim for 300 students next year

How are we going to evaluate this project?

reliability
validity
usability

Reliability

consistency of results
are we measuring anything?

Bayesian networks

updates of the student model

based on the task models
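
A minimal sketch of what one update of the student model could look like, reduced to a single skill node. The function name update_mastery and the slip and guess parameters are assumptions for illustration (they stand in for what a task model would supply); the notes do not specify the actual network structure or its parameters.

```python
def update_mastery(prior, correct, slip=0.1, guess=0.2):
    """Bayes update of P(skill mastered) after one observed response.

    prior   -- current P(mastered) in the student model
    correct -- True if the student answered the task correctly
    slip    -- hypothetical P(wrong answer | mastered), from the task model
    guess   -- hypothetical P(right answer | not mastered), from the task model
    """
    # Likelihood of the observation under each state of the skill node
    p_obs_mastered = (1 - slip) if correct else slip
    p_obs_not_mastered = guess if correct else (1 - guess)

    # Bayes' rule: posterior is proportional to likelihood times prior
    numerator = p_obs_mastered * prior
    evidence = numerator + p_obs_not_mastered * (1 - prior)
    return numerator / evidence


# Example: start undecided, then observe one correct and one incorrect answer
belief = 0.5
belief = update_mastery(belief, correct=True)
belief = update_mastery(belief, correct=False)
```

A full Bayesian network would link several skill nodes and propagate evidence between them; this only shows the single-node case.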

split half reliability

if the items in the test measure the same thing,
there should be a correlation between the results on the two halves of the test
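
A minimal sketch of a split-half check, assuming item scores are stored as one list per student and the test is split into odd- and even-numbered items. The Spearman-Brown step-up is a standard correction for the halved test length; it is an addition here, not something stated in the notes. The function names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length score lists.

    Raises ZeroDivisionError if either list has zero variance.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def split_half_reliability(scores):
    """scores: list of per-student lists of item scores (e.g. 0/1)."""
    # Total score per student on each half of the test
    first_half = [sum(row[0::2]) for row in scores]
    second_half = [sum(row[1::2]) for row in scores]
    r = pearson(first_half, second_half)
    # Spearman-Brown correction: estimate reliability of the full-length test
    return 2 * r / (1 + r)


# Example with made-up scores for three students on four items
example_scores = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0]]
reliability = split_half_reliability(example_scores)
```

With only 10 items and 40 students the estimate will be noisy, and it depends on how the items are split.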

Student model

Task models

Validity

comparison with expert judgements

"replicate current practice" (Sangwin, this morning)
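
One possible way to quantify the comparison with expert judgements is a chance-corrected agreement measure such as Cohen's kappa. This is an assumption about how the comparison might be done, not a method named in the talk; the function name and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(system_labels, expert_labels):
    """Chance-corrected agreement between two equal-length label lists.

    Raises ZeroDivisionError if expected agreement is exactly 1
    (both raters always use the same single label).
    """
    n = len(system_labels)
    # Proportion of cases where system and expert give the same label
    observed = sum(s == e for s, e in zip(system_labels, expert_labels)) / n
    # Agreement expected by chance, from each rater's label frequencies
    system_counts = Counter(system_labels)
    expert_counts = Counter(expert_labels)
    expected = sum(system_counts[k] * expert_counts[k] for k in system_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Example with made-up diagnoses for four students
system = ["mastered", "mastered", "not", "not"]
expert = ["mastered", "mastered", "not", "mastered"]
kappa = cohens_kappa(system, expert)
```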

Validity

diagnostic relevance
what to report?