CrowdForge
Django framework for crowdsourcing complex tasks using MTurk
CrowdForge: Crowdsourcing Complex Work (released for non-commercial use creativecommons.org/licenses/by-nc/3.0/)
Getting CrowdForge to run locally:
- Prerequisites: python, django, boto (tested with python-2.7, django-1.2.5 and boto-1.8d). Known issues with boto-1.9b.
- NOTE: There's a bug in boto-1.8d that you need to patch. Change connection.py:466 to say
value = self.connection.get_utf8_value(value)
(instead of)
value = self.get_utf8_value(value)
- Download: go to https://github.com/borismus/CrowdForge and get the tarball.
- Extract: let's say to ~/crowdforge
- Configure: tweak settings.py
  - specify your AWS keys, found at https://aws-portal.amazon.com/gp/aws/developer/account/index.html?ie=UTF8&action=access-key
  - specify the absolute path to your templates directory
- Create database: run ./manage.py syncdb and create a new admin account when prompted
- Run server: run ./manage.py runserver
- Test!
- Set up cron jobs!
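For the Configure step above, a minimal sketch of the relevant settings.py lines might look like the following. TEMPLATE_DIRS is the standard Django setting for template locations; the two AWS setting names and the templates_dir helper are assumptions for illustration, so check the settings.py that ships with CrowdForge for the names it actually reads.

```python
import os


def templates_dir(project_root):
    """Return an absolute path to <project_root>/templates, expanding '~'."""
    root = os.path.abspath(os.path.expanduser(project_root))
    return os.path.join(root, "templates")


# Assumed setting names; use whatever names CrowdForge's settings.py defines.
AWS_ACCESS_KEY_ID = "YOUR_ACCESS_KEY_ID"
AWS_SECRET_ACCESS_KEY = "YOUR_SECRET_ACCESS_KEY"

# Django expects absolute paths here, not paths relative to the project.
TEMPLATE_DIRS = (templates_dir("~/crowdforge"),)
```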
Testing CrowdForge
- Point your browser to localhost:8000/admin (or whatever your production server URL is) and log in.
- Add a new Problem instance. Give it a name and these parameters:
  - flow: SimpleFlow
  - partition: create article
  - map: collect a fact
  - reduce: write a paragraph
- Run ./manage.py poll manually, and a HIT should be created (check the MTurk server)
  - expected output:
- Do a sample HIT from the new HIT group
- Run ./manage.py poll again, and a Result should be created locally (look in db or check http://localhost:8000/turk/problem/YOUR_PROBLEM_ID)
  - expected output: results retrieved [<Result: Result for "Create an Article Ou... #141KMADQXJEYF9EQWEG9UA7S6J5JEW">]
Setting up Cron Jobs
- Basically, you want to run ./manage.py poll periodically
- So just tweak your crontab: crontab -e
- Create a line that says something like "*/15 * * * * /path/to/manage.py poll >> /path/to/crowdforge.log"
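If you deploy to several machines, the crontab entry can be generated rather than typed by hand. The helper below is a hypothetical convenience, not part of CrowdForge; it only reproduces the entry format shown above, with the polling interval as a parameter.

```python
def cron_line(manage_py, log_path, every_minutes=15):
    """Build a crontab entry that runs `manage.py poll` every N minutes.

    Hypothetical helper: it only formats the line shown in this README.
    """
    if not 1 <= every_minutes <= 59:
        raise ValueError("every_minutes must be between 1 and 59")
    return "*/%d * * * * %s poll >> %s" % (every_minutes, manage_py, log_path)


print(cron_line("/path/to/manage.py", "/path/to/crowdforge.log"))
```

Paste the printed line into the editor opened by crontab -e. For cron to run it directly, manage.py needs to be executable; otherwise prefix the path with your python interpreter.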
Getting CrowdForge deployed on a production server:
- Get CrowdForge running locally (as per instructions above)
- Get Django and CrowdForge running on a public-facing server
- Tweak settings.py:
  - specify the external URL of your server
  - switch from sandbox to production
  - change the database engine to MySQL or PostgreSQL
- Resync database
- Do a test
- Set up cron jobs
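For the database engine change, a sketch of a PostgreSQL configuration in the DATABASES format introduced in Django 1.2 is shown below. The database name, user, password and host are placeholders, and the settings.py shipped with CrowdForge may use the older single-database settings instead, so adapt this to what you find there.

```python
# Placeholder values throughout; replace them with your own credentials.
DATABASES = {
    "default": {
        # Use "django.db.backends.mysql" here for MySQL instead.
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "crowdforge",
        "USER": "crowdforge",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "",  # empty string means the backend's default port
    }
}
```

After changing the engine, run ./manage.py syncdb again so the tables are created in the new database.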
Advanced: Make your own Hit Types
- Open management console (localhost:8000/admin or whatever)
- Add a new Hit Type.
- Specify title, description and body. These fields can all be parametrized, depending on your flow.
Advanced: Make your own flow
- Open crowdforge/flows.py
- Create your own subclass of Flow. See SimpleFlow for an example.
- Don't forget to register your new flow using
register('MyNewFlow', MyNewFlow)