Thursday, April 28, 2011

Updated Experiment Info--Pilot Testing

I'm interested in testing how device size, specifically screen & input size, affects a search & selection task. The results could have important implications for which tasks we choose to perform on which devices.

[Hypotheses]
1. Small-screen users will use more category heuristics than large-screen users.
2. Small-screen users will be more satisfied with their selection because they had fewer choices.
3. Large-screen users will perform better, as rated by independent raters.
4. Small-screen users will be more likely than large-screen users to feel that they needed more time.

[Methods]
See Google doc for pre & post task survey as well as task instructions.

[Materials]
I am using an iPhone 4G as my small-screen device and a 27" monitor as my large-screen device.
I am using the Safari web browser for both conditions.

[Measurements]

1. The number of heuristics that the subject used to refine the search. (HOW besides watching?)
2. How confident the user is about his/her selections
3. How well the selections perform, as rated by independent raters*
4. Whether the user feels like he needed more time

*I intend to ask Mechanical Turk workers to rate the dress and shirt selections. I will give them the same prompts that the subjects received, show them two selections at a time, and ask which one seems more appropriate (or is the better choice).
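A minimal sketch of how the two-at-a-time rating task could be set up, assuming each rater sees one small-screen selection next to one large-screen selection. The function names, the item labels, and the choice to randomize left/right placement are all assumptions for illustration; none of this is part of the actual study materials.

```python
import itertools
import random


def build_rating_pairs(small_screen_items, large_screen_items, seed=0):
    """Pair every small-screen selection with every large-screen selection.

    Left/right placement is randomized so raters cannot infer the
    condition from position (an assumed design choice).
    """
    rng = random.Random(seed)
    pairs = []
    for small, large in itertools.product(small_screen_items, large_screen_items):
        shown = [("small", small), ("large", large)]
        rng.shuffle(shown)  # randomize which side each selection appears on
        pairs.append({"left": shown[0], "right": shown[1]})
    rng.shuffle(pairs)  # randomize presentation order across pairs
    return pairs


def tally_preferences(pairs, choices):
    """Count how often raters preferred each condition.

    choices[i] is "left" or "right", the rater's pick for pairs[i].
    """
    counts = {"small": 0, "large": 0}
    for pair, choice in zip(pairs, choices):
        condition, _item = pair[choice]
        counts[condition] += 1
    return counts


# Hypothetical placeholder labels, not real selections from the pilot.
pairs = build_rating_pairs(["small_pick_1", "small_pick_2"],
                           ["large_pick_1", "large_pick_2"])
```

Each rater response then reduces to a single "left" or "right" per pair, which `tally_preferences` converts into a count per condition.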

[Calculations & Results]

I don't have meaningful results yet because I haven't run anything on Mechanical Turk. I intend to use chi-square tests once I have ratings from my independent raters.

I will also use chi-square tests to compare confidence, category heuristics, and time between the two conditions.
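A minimal sketch of the kind of chi-square test described above, assuming the data can be arranged as a 2x2 table of counts (for example, condition by whether the subject felt they needed more time). The function name and the counts in the example are hypothetical placeholders, not pilot data. It uses only the standard library and applies no continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test of independence for the 2x2 table

        [[a, b],
         [c, d]]

    Returns (statistic, p_value) for 1 degree of freedom,
    without a continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts for illustration only:
# rows = small screen, large screen; columns = needed more time, did not.
statistic, p_value = chi_square_2x2(20, 10, 10, 20)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

With a pilot-sized sample the expected counts in each cell may fall below 5, in which case the chi-square approximation is unreliable and an exact test may be the safer choice.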

My pilot results:
Average small-screen confidence: 2.5
Average large-screen confidence: 3.6

[Further Questions]
How can I track the ways in which the subject refined his/her search besides watching?
How should I have mechanical turkers rate the selections? (ask ‘more appropriate’, ‘better’, ‘Jamie/Matt will like more’?)
Do I have them rate between two? (Strict ordering does not give me as much information)
Should I have both conditions fill out the pre & post task on the big monitor? Or do all of it on the small device?
Should I use a laptop/monitor instead of phone/monitor to help control for processing & network speed?



1 comment:

  1. Hi Melissa. Big topic change!

    Measurements/hypotheses 2 and 3 look nice and clear cut.

    I have questions about 1 and 4.

    I'm actually not sure what you mean by, "the number of heuristics that the subject used to refine the search." I assumed this had a precise definition I didn't know, but then the fact that you don't know how to measure it made me wonder. Or is your question ("How can I track the ways in which the subject refined his/her search besides watching?") more about logistics than about identifying heuristics? I can't tell.

    On measurement 4, what if finding shirts/dresses takes everyone less than 10 minutes? Alternatively, what if everyone needs much more time than that? Then there might well be differences, but they wouldn't necessarily be captured here.

    On your question about whether you should use a laptop/monitor to control for processing and network speed---this seems like a pretty big sacrifice to make. You could take some measures to control the speed on the desktop. Or, you could use tethering, though you might need a different mobile device (I don't actually know), which would complicate everything.

    The Macy's page for dresses is terrifying, by the way. Several of those women look like they'd be very upset if I chose the wrong dress.
