Crowd-Powered Systems

Spring 2015 :: ECE 69500 :: Purdue University

This is an archived site from Spring 2015.

Warm-ups

These assignments will help you learn enough web development to create your project, while also ensuring that every group member knows a common set of tools.

Warmup #0: self-introduction in HTML

Due Sun 1/18

Write a brief self-introduction.

The body of your HTML should contain just one <div> tag, which should be 500 pixels wide and 250 pixels high. This will allow us to have a single page with everyone's self-introduction. Here is a template to start with.

Update:  To turn in, send email to aq@purdue.edu with a URL to the publicly accessible page, as well as a ZIP file with the relevant files. This was announced in class on 1/15. There will be no penalty for those who sent only one or the other.

Update:  See all self-introductions

Warmup #1: post a HIT via AMT web site

Due Sun 1/25

Think of an idea for a task, post it to AMT, and keep track of the time and cost.

Think of a small, real task that you could ask workers to do. It should be something specific to you and potentially useful.  We want to avoid posting tasks that are completely useless. That said, it's okay if you only post a small part of the overall job (i.e., a few tasks).  Think of this as a practice run.

Here are some ideas to get you thinking: (Some of these are easier than others.)

Post your task on Mechanical Turk using the web interface.  Keep track of the time you spend creating the interface.  The Requester Best Practices Guide may also be helpful.  Aim to pay $8-9 per hour.  Always include a feedback box in the task so that workers can tell you if something was unclear. Label it as "optional".  You can be reimbursed up to $5.00 for this assignment.  After receiving the results, check their accuracy yourself (if it makes sense for your task).

Send me an email with the following:

  1. What did you ask the workers to do?
  2. Give a screenshot of your task interface.
  3. Do you trust the results? Why?
  4. How long did it take to create the user interface?
  5. How long did it take to get your results (from when you submitted the HITs)?
  6. How long did it take before the first HIT was accepted?
  7. Give a table with the assignment ID, worker ID, accept time, and submit time of each assignment.
  8. What was the average/min/max time per assignment?
  9. Give the amount of time spent by each worker.
  10. What was your total cost?
  11. What was the hourly rate on each assignment? … for each worker? … overall?
  12. How long would it have taken you to do the same work yourself?
  13. Were there any tasks that you thought of using, but were impractical? Describe.
  14. Attach a ZIP file containing the HTML template you used and the input file.
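
Questions 8, 9, and 11 come down to simple arithmetic on the accept and submit times. Here is a minimal sketch in Python, assuming a hypothetical reward of $0.25 per assignment and made-up IDs and timestamps (none of these values come from AMT). It ignores AMT fees, so treat the rates as what workers earned rather than what you paid.

```python
from collections import defaultdict
from datetime import datetime

REWARD = 0.25  # dollars paid per assignment (hypothetical)
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# (assignment ID, worker ID, accept time, submit time); made-up sample rows
ROWS = [
    ("A1", "W1", "2015-01-20 10:00:00", "2015-01-20 10:02:00"),
    ("A2", "W1", "2015-01-20 10:05:00", "2015-01-20 10:06:00"),
    ("A3", "W2", "2015-01-20 10:10:00", "2015-01-20 10:13:00"),
]


def seconds_spent(accept, submit):
    """Seconds between accepting and submitting one assignment."""
    delta = datetime.strptime(submit, TIME_FORMAT) - datetime.strptime(accept, TIME_FORMAT)
    return delta.total_seconds()


def hourly_rate(dollars, seconds):
    """Convert pay for a given number of seconds into dollars per hour."""
    return dollars * 3600.0 / seconds


durations = [seconds_spent(accept, submit) for _, _, accept, submit in ROWS]
average = sum(durations) / len(durations)

# Total time and pay per worker, for the per-worker hourly rate.
seconds_by_worker = defaultdict(float)
pay_by_worker = defaultdict(float)
for _, worker, accept, submit in ROWS:
    seconds_by_worker[worker] += seconds_spent(accept, submit)
    pay_by_worker[worker] += REWARD

worker_rates = {
    worker: hourly_rate(pay_by_worker[worker], seconds_by_worker[worker])
    for worker in seconds_by_worker
}
overall_rate = hourly_rate(REWARD * len(ROWS), sum(durations))

print("avg/min/max seconds:", average, min(durations), max(durations))
for (assignment, _, _, _), secs in zip(ROWS, durations):
    print(assignment, "hourly rate:", hourly_rate(REWARD, secs))
for worker, rate in sorted(worker_rates.items()):
    print(worker, "seconds:", seconds_by_worker[worker], "hourly rate:", rate)
print("overall hourly rate:", overall_rate)
```

With the sample rows, the fastest assignment (60 seconds) pays $15.00 per hour while the slowest (180 seconds) pays $5.00 per hour, which is why the per-assignment, per-worker, and overall rates are worth reporting separately.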

For this assignment, you do not need to submit a URL.

You have two options for paying for HITs:

  1. Pay for them using your personal credit card and request reimbursement from Erica Cox in the EE business office (EE 132). You will need to give her documentation.
  2. I put credit on your account using my personal credit card, you give me your documentation, and I request reimbursement from the business office.

Note: In general, IRB approval is needed for any “human subjects research”. That would include anything you might eventually include in published research results. Approval is not needed for an exercise such as this, as long as the sole purpose is your training. See the Purdue IRB web site—especially their Determination of Human Subjects Research Worksheet—for more information about this.

Warmup #2: design and implement task UI

Due Sun 2/1
In this assignment, you will create a more feature-rich task interface and serve it from a web application. At this step, the focus is on the design and implementation, especially on the client side (web browser). You will be scratching the surface on several of the key technologies you need to know. (In warm-up #3, you will go deeper on some of them.)

Learning goals
Steps
  1. Choose a task. Feel free to re-use the problem you chose for warm-up #1, pick from the examples in that assignment, or choose something of your own. However, it must meet the requirements below.
  2. Design and implement your task UI. You will probably want to sketch your idea on paper before you start implementing. Then, using HTML and CSS, create a UI form. You must incorporate at least two of the following:
  3. The jQuery library will most likely be a big help here, although you are free to use something else or even build from scratch.
  4. Create a very simple Python+Flask web application to serve your task UI. This shouldn't require more than about 10-15 source lines of Python code (excluding comments/whitespace), and most of that will be boilerplate which you can find in the Flask tutorial. For now, your application should serve a random instance of the task each time it is loaded. For example, if you were having workers tag photographs, your application would insert a random image URL into your task template.
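
To give a sense of scale for step 4, here is a minimal sketch of such an application for the photo-tagging example. The image URLs, the form field name, and the inline template are all placeholders invented for illustration; a real application would more likely keep its template in a templates/ folder and use render_template().

```python
import random

from flask import Flask, render_template_string

app = Flask(__name__)

# Hypothetical task inputs; replace with the inputs for your own task.
IMAGE_URLS = [
    "http://example.com/photos/1.jpg",
    "http://example.com/photos/2.jpg",
    "http://example.com/photos/3.jpg",
]

# Inline template so that the sketch fits in one file.
TASK_TEMPLATE = """
<!DOCTYPE html>
<title>Tag this photo</title>
<img src="{{ image_url }}" alt="photo to tag">
<form>
  <label>Tags: <input type="text" name="tags"></label>
</form>
"""


@app.route("/")
def task():
    # Insert a randomly chosen instance of the task on every page load.
    return render_template_string(TASK_TEMPLATE, image_url=random.choice(IMAGE_URLS))
```

The sketch only defines the application. To serve it, start Flask's development server in whichever way the Flask tutorial for your version recommends.
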
Requirements – summary of the above

Non-requirements – you do not need to do any of these

New to Python and/or JavaScript?… Now is a great time to start learning. Fortunately, both are much easier to learn than C, C++, C#, or Java, so if you are confident with one of those, you should be fine. To help you get started quickly, I recommend starting with a short tutorial that covers all of the main aspects of the language. The resources page has a short tutorial for each of these languages. I strongly recommend that you type each example (not copy-paste) and test that it works. Feel free to send me questions (even easy ones) by email.

Warmup #3: implement server backend + add UI tracking

Due Sun 2/8

You are now ready to start collecting results. We will use an SQLite database to collect the results. In addition to the user's inputs, you will also capture some basic information about the worker's interaction with your interface.

  1. Extend your interface from warm-up #2 with a submit button. Your form should use the POST method.
  2. Add a new route to accept the data. The new route should have the same URL as your task UI, but should accept its input as POST. The data should be stored in the database, along with the IP address, host name, and submit time.
  3. Extend your task route so that it passes the time the page was sent (UTC, according to the server) to the browser.
  4. Extend your interface further so that it tracks the page load time and clicks (see below) in memory. These should be stored as JSON in an <input type="hidden" value="…"> element.
  5. Just before the form is submitted, the task UI should store the submit time in a <input type="hidden" …> element so that it can be passed to the server, as well.
  6. Extend the server a bit more so that it stores the tracking data. To keep things simple, you may store the tracking data as JSON (as is) in the database, if you like.
  7. Add one more route to your web application for a results page which displays all of the results received so far.
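
The server side of these steps could be sketched as follows. This is one possible layout under stated assumptions, not a reference solution: the database file name, table schema, and field names (answer, page_sent_time, tracking_data_input) are made up for illustration, the client-side tracking script is left out, and the host name lookup is omitted for brevity.

```python
import json
import sqlite3
from datetime import datetime, timezone

from flask import Flask, jsonify, render_template_string, request

app = Flask(__name__)
DB_PATH = "results.db"  # hypothetical file name

# The tracking field starts empty; browser-side JavaScript (not shown) fills it in.
TASK_TEMPLATE = """
<form method="post">
  <input type="text" name="answer">
  <input type="hidden" name="page_sent_time" value="{{ page_sent_time }}">
  <input type="hidden" name="tracking_data_input" value="">
  <input type="submit" value="Submit">
</form>
"""


def get_db():
    """Open the database, creating the (hypothetical) results table if needed."""
    db = sqlite3.connect(DB_PATH)
    db.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "answer TEXT, ip TEXT, submit_time TEXT, page_sent_time TEXT, tracking_json TEXT)"
    )
    return db


def utc_now():
    return datetime.now(timezone.utc).isoformat()


@app.route("/", methods=["GET"])
def task():
    # Pass the page sent time so that the browser can send it back on submit.
    return render_template_string(TASK_TEMPLATE, page_sent_time=utc_now())


@app.route("/", methods=["POST"])
def submit():
    tracking_json = request.form.get("tracking_data_input") or "{}"
    try:
        json.loads(tracking_json)  # only checks that the text is valid JSON
    except ValueError:
        return "Bad tracking data", 400
    db = get_db()
    with db:  # commits the insert on success
        db.execute(
            "INSERT INTO results VALUES (?, ?, ?, ?, ?)",
            (
                request.form.get("answer", ""),
                request.remote_addr,
                utc_now(),
                request.form.get("page_sent_time", ""),
                tracking_json,  # stored as JSON text, as is
            ),
        )
    db.close()
    return "Thank you!"


@app.route("/results")
def results():
    db = get_db()
    rows = db.execute(
        "SELECT answer, ip, submit_time, page_sent_time, tracking_json FROM results"
    ).fetchall()
    db.close()
    return jsonify(results=[list(row) for row in rows])
```

The results route returns JSON only to keep the sketch short; an HTML table rendered from a template would work just as well.
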
Requirements – summary of the above

Non-requirements – you do not need to do any of these:

Q & A

What's the difference between GET and POST?
There are two key differences: (a) With GET, parameters are included in the URL, whereas POST requests pass their parameters in the body of the HTTP request. (b) GET requests are expected to have no side effects, whereas POST requests may be used for submitting orders, creating accounts, and so forth. This article explains this further.
How do I specify that a function should be used for GET or POST (but not both)?
In your @app.route('…') decorator, add a methods=[…] keyword argument with a list of the methods to be accepted. This article gives example code.
How do I pass data from the server (Python) to the browser (JavaScript)?
One way is to use your Flask/Jinja template to insert the data as JSON into a <script> tag. This post has some example code. As an alternative for data that you simply want to pass back to the server when the form is submitted, you could skip the <script> tag and just insert it into value="…" attribute of an <input type="hidden" name="…" value="…"> element.
How do I pass data from the browser (JavaScript) to the server (Python)?
Convert the data to JSON using JSON.stringify(…) and store it in the value="…" attribute of an <input type="hidden" name="…" value="…"> element. For example, if your data was called tracking_data and your hidden input had name="tracking_data_input" (i.e., <input type="hidden" name="tracking_data_input" value="">), you could store the data with jQuery("input[name=tracking_data_input]").val(JSON.stringify(tracking_data));. Then, in your Python+Flask code, you would use either request.args (for GET requests) or request.form (for POST requests) to get the JSON text, and then json.loads(…) to convert it to a Python object.
How do I detect when the user has clicked the mouse (in JavaScript)?
Create an event handler using the jQuery .on(…) method. For example, to capture every click to a button, you would use $("input[type=button]").on("click", function(evt) { /* do stuff */ });
Why would I need to worry about passing data between the browser and server?
For this warm-up, you will need to pass the page sent time to the browser, so that it can pass it back to the server when the form is submitted. Also, the tracking data will need to be passed from the browser to the server when the form is submitted.
How do I get the user agent (browser type)?
Use request.user_agent.string from your Python code. This is oddly hard to find in the Flask/Werkzeug documentation but very easy to find by searching Google for [flask user agent].
How do I get the user's IP address?
Google [flask ip address] and click the first result.
How do I use an SQLite database with Flask?
This article gives you all of the building blocks you need.

Note: The Q&A section may continue to be updated as questions come up.

Warmup #4: peer testing/feedback + improve UI

Due Sun 2/15

Let's give each other feedback on our tasks and then make some improvements. Below are links to everyone's tasks from warm-up #3.

Do at least one instance of each task and send at least 3 items of constructive, actionable feedback to the creator with cc to aq@purdue.edu by Friday night with subject line "warm-up #4 feedback [ECE 695 Crowd-Powered Systems]".

Respond to each item of feedback by addressing it or deciding not to. Send a summary to aq@purdue.edu with subject line "warm-up #4 response".

Warmup #5: server push

Due Sun 3/1

Server push is a style of web programming that involves sending messages or events from the server to browsers that are listening. It is used in a wide range of social and collaboration applications, including crowdsourcing and human computation. For our purposes, it can be especially useful when you want to support real-time collaboration between workers who are participating at the same time. It is also useful for creating a web-based status monitor with which you can watch the progress of your tasks in real-time.

For this warm-up, you will create a web application that uses server push. The preferred platform is Tornado. Tornado is essentially an alternative to Flask, but it works asynchronously. In essence, that means it is better suited to web applications that spend a lot of time waiting for something (e.g., browsers waiting for another message to be sent). See the second section of the Tornado user's guide for more about that. If you are nervous about learning something new, give Tornado a try anyway. I expect you will find it similar enough to Flask that it is easy to learn. It will make other aspects of this easier than with Flask because Tornado was specifically designed for this type of application.

See the tools page for more on the technology.

The simple_tornado_chat example may help you understand how this works. (added 2/23/2015)
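
As a rough illustration of the idea, here is a minimal sketch of server push in Tornado using WebSockets. WebSockets are an assumption here, since they are only one of several ways to do server push and may not be what simple_tornado_chat uses. The /push route, the port, and the message format are likewise made up for illustration.

```python
import asyncio
import json

import tornado.web
import tornado.websocket

# Every browser that currently has a WebSocket open.
clients = set()


def broadcast(message):
    """Push one JSON-encoded message to every connected client."""
    text = json.dumps(message)
    for client in list(clients):
        client.write_message(text)


class PushHandler(tornado.websocket.WebSocketHandler):
    def open(self):
        # Called when a browser connects; remember it so we can push to it.
        clients.add(self)

    def on_message(self, text):
        # Relay anything one browser sends to all browsers (including the sender).
        broadcast({"event": "message", "text": text})

    def on_close(self):
        clients.discard(self)


def make_app():
    return tornado.web.Application([(r"/push", PushHandler)])


async def main():
    make_app().listen(8888)  # hypothetical port
    await asyncio.Event().wait()  # keep serving until the process is stopped
```

Calling asyncio.run(main()) starts the server. A browser would then connect with the standard JavaScript WebSocket API and receive each pushed message through its onmessage handler.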

Requirements

You can create an application that fits your interest, as long as it meets the following requirements:

Ideas

This doesn't have to be fancy. It can be based on your existing task UI or something new, and possibly much simpler. The primary goal is to learn the technology.

Turn-in   added 2/23/2015

Warmup #6: coding with AMT

Due Sun 3/15

This assignment is similar to warm-ups #2-4, except that everything is to be done programmatically, including posting HITs, receiving results, and analyzing costs/times. Also, you will post the HITs to the worker sandbox and enter the results yourself. The goal is to get comfortable with the tools needed to work with AMT programmatically.

Steps

It is suggested that you complete each step before moving on to the next.
  1. Create a task web application that asks workers to classify or tag at least 20 inputs in batches of 4 inputs per HIT. Some example topics are listed below.
    • The web application runs on crowd.ecn.purdue.edu (aka hci.ecn.purdue.edu) on your assigned port.
    • The inputs (e.g., words to be classified) must be stored in your Python code, not embedded in your template.
    • A query parameter selects which 4 inputs should be included in the batch, without creating any insecure direct object references. One way to do this would be to assign a random 6-letter code to each batch of 4 inputs (e.g., http://…/?batch=spremf to categorize the words "kiwi", "run", "wear", and "care"). Another way to do this would be to use the hitId (described below).
    • A few other technical requirements are listed below under Requirements.
  2. Augment your application to respond to the parameters that AMT will append to the URL: assignmentId, hitId, workerId, turkSubmitTo.
    • assignmentId must be passed when the form is submitted. To do this, use JavaScript or server-side code to insert it into a <input type="hidden" name="assignmentId" value="…"> element. This is the unique ID for the assignment. When the worker is just previewing the HIT, assignmentId is set to “ASSIGNMENT_ID_NOT_AVAILABLE”.
    • When assignmentId is set to “ASSIGNMENT_ID_NOT_AVAILABLE”, all of your input controls should be disabled and there should be a prominent notice at the top.
    • Task interface is served via GET at "/##/" where "##" is the last two digits of your assigned port number. For example, if you were assigned port 8089, then you would serve your task at "/89/".
    • When assignmentId is empty ("") or nonexistent, set your form to instead submit via POST to the same URL, which then displays whatever was submitted. (A table or even just JSON is fine for this.)
    • hitId and workerId may be ignored. These are sometimes used for logging, or for customizing the contents of the HIT.
    • turkSubmitTo must be used to form the action="…" attribute of the <form action="…">. Specifically, the action attribute will be turkSubmitTo + "/mturk/externalSubmit".
    • See the AWS documentation for ExternalQuestion for more information.
  3. Create a set of scripts for working with AMT.
    • post.py - Post HITs for all of the inputs to the worker sandbox (i.e., service_type="sandbox") and print the URL for the HITType.
    • cancel.py - Cancel all of your HITs.
    • report.py - Print the following to the console:
      • all results, including input and answer
      • cost, total possible
      • cost, total of all assignments submitted so far
      • time, average/min/max seconds per assignment
      • time, average spent for each worker
      • hourly rate, for each assignment
      • hourly rate, for each worker
      • hourly rate, overall (across all assignments)
      • time, uptake by worker (submission time of assignment - creation time for the first assignment created by each worker)
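
Steps 1 and 2 above can be sketched roughly as follows. This is a sketch under stated assumptions rather than a reference solution: the word list is shortened to 8 inputs (the assignment asks for at least 20), the route assumes port 8089, and the template and field names are invented for illustration. It uses the random 6-letter batch codes suggested above.

```python
import random
import string

from flask import Flask, abort, jsonify, render_template_string, request

app = Flask(__name__)

PREVIEW_ID = "ASSIGNMENT_ID_NOT_AVAILABLE"

# Inputs live in Python code, not in the template (shortened sample list).
WORDS = ["kiwi", "run", "wear", "care", "blue", "quickly", "under", "she"]


def make_batches(inputs, size=4):
    """Assign a random 6-letter code to each batch so URLs expose no guessable index."""
    batches = {}
    for start in range(0, len(inputs), size):
        code = "".join(random.choice(string.ascii_lowercase) for _ in range(6))
        while code in batches:  # extremely unlikely, but keep codes unique
            code = "".join(random.choice(string.ascii_lowercase) for _ in range(6))
        batches[code] = inputs[start:start + size]
    return batches


BATCHES = make_batches(WORDS)

TEMPLATE = """
{% if preview %}<p><strong>Preview only: accept the HIT to begin.</strong></p>{% endif %}
<form method="post" action="{{ action }}">
  <input type="hidden" name="assignmentId" value="{{ assignment_id }}">
  {% for word in words %}
  <label>{{ word }}
    <input type="text" name="tag_{{ loop.index }}" {% if preview %}disabled{% endif %}>
  </label>
  {% endfor %}
  <input type="submit" value="Submit" {% if preview %}disabled{% endif %}>
</form>
"""


@app.route("/89/", methods=["GET"])
def task():
    words = BATCHES.get(request.args.get("batch", ""))
    if words is None:
        abort(404)  # unknown batch code
    assignment_id = request.args.get("assignmentId", "")
    preview = assignment_id == PREVIEW_ID
    if assignment_id and not preview:
        # A worker has accepted the HIT: submit the form to AMT.
        action = request.args.get("turkSubmitTo", "") + "/mturk/externalSubmit"
    else:
        # No assignment (or preview): an empty action posts back to this URL.
        action = ""
    return render_template_string(
        TEMPLATE, words=words, action=action,
        assignment_id=assignment_id, preview=preview,
    )


@app.route("/89/", methods=["POST"])
def echo():
    # Outside AMT, simply display whatever was submitted.
    return jsonify(request.form.to_dict())
```

Because the codes here are generated when the application starts, they change on every restart. An application with HITs already posted would need to persist the codes (or key batches on hitId instead) so that existing HIT URLs keep working.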

Example topics

You can choose one of these topics, or invent a classification/tagging task of your own.
  1. Classify English words by part of speech, e.g., adjective, adverb, …, preposition, and/or pronoun (using checkboxes)
  2. Classify names as male and/or female, or unsure (using checkboxes)
  3. Classify the tone of a sentence as positive, negative, neutral, or unsure.

Requirements

Non-requirements

You do NOT need to do any of the following:

Turn-in

  1. Run your application persistently (e.g., via nohup, tmux, or screen).
  2. Send the following to aq@purdue.edu:
    1. HTTPS URL for your task UI
    2. ZIP file with your code. (Remove your AWS keys before sending!)
    3. Transcript of a console session demonstrating your ./post.py and ./report.py.
Warm-up assignments may be modified up to 1 week prior to the due date.