mech_turk/ArchitectureOverview

This page gives an overview of the Mechanical Turk tools and provides context for the Mechanical Turk packages (mech_turk, cv_mech_turk2, mech_turk_ros_msgs, mech_turk_ros). The actions discussed here map to the commands and APIs in these packages.

The API and command line users will have these main questions:

How do I get the data to the server?
How do I get the results back?
How do I process the results?

The user is expected to know the specific data required for the task (e.g. images) or the specific format of its work units. This is task-specific (e.g. for the attributes task).

Overview

In this figure, the actors and components are shown in red, the responsibilities are shown in blue and the most common actions are shown in black.

The system has 3 key human players: Worker, Manager and User. The Worker is in charge of doing the work. The Manager is in charge of setting up funding, tasks and annotation sessions, and of giving Users access to them. The User is in charge of getting the data, vetting the results and using the annotation results further.

The worker interacts with the UI component for the actual work and the wiki to understand the instructions. Amazon allows the workers to find tasks and handles payments.

The Manager is in charge of defining the task, making payments to Amazon and creating annotation sessions. A session is a placeholder for tasks, with a cap on the total number of tasks.

The user submits data to Mechanical Turk, performs QA and gets the results for further processing. Most of the user's interactions happen through the mech_turk API and command line tools. Grading is a notable exception: it is done on the server via the web interface.

Money flow

Money is paid for correctly performed work. The Administrator/Manager puts money into the Amazon account, sets up the access identifiers on the django_crowd server and creates sessions for the Users. The Users consume money by submitting tasks to sessions and then approving the results.

Example user workflow

Get session ID from the Manager (e.g. SESSION1).
Submit images to the session:

rosrun cv_mech_turk2 submit_images.py --session=SESSION1 myimages/*.jpg

Wait for annotation to finish
Download all results

rosrun cv_mech_turk2 session_results.py --session=SESSION1 --filter=none

Go to the server and perform manual grading from the session dashboard
Download only good results

rosrun cv_mech_turk2 session_results.py --session=SESSION1 --filter=good
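
The command-line steps above can also be scripted. The following is a minimal sketch, assuming rosrun is on the PATH and that the two scripts accept exactly the flags shown above. The helper names (submit_command, results_command, run_workflow) are hypothetical and are not part of any mech_turk package.

```python
import glob
import subprocess


def submit_command(session, image_paths):
    """Build the argv that submits images to a session (flags as shown above)."""
    return ["rosrun", "cv_mech_turk2", "submit_images.py",
            "--session=%s" % session] + list(image_paths)


def results_command(session, result_filter):
    """Build the argv that downloads results; result_filter is 'none' or 'good'."""
    return ["rosrun", "cv_mech_turk2", "session_results.py",
            "--session=%s" % session, "--filter=%s" % result_filter]


def run_workflow(session, image_glob):
    """Submit images, then download the graded results.

    Waiting for annotation and manual grading on the session dashboard
    happen between the two calls and are not automated here.
    """
    images = sorted(glob.glob(image_glob))
    subprocess.check_call(submit_command(session, images))
    # ... wait for annotation to finish, then grade on the server ...
    subprocess.check_call(results_command(session, "good"))
```

Calling run_workflow("SESSION1", "myimages/*.jpg") would mirror the steps listed above. In practice the wait and the grading step make it more natural to run the two halves separately.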

In this example, the user was working with the annotation server. The connection to the server was handled by the *connection API*. The server name was assumed to be "default" and was expanded into the full server name, username and password as described in *Authentication and server names*.

The user submitted images through the image-specific command *cv_mech_turk2.submit_images*. It knows the specific format of the tasks (an image plus some parameters) and accepts image names along with a number of convenience arguments.

For tasks without such a command, it is necessary to submit raw data to the server. In that case we would use mech_turk.submit_work_units.

To get the results from the server, we used "cv_mech_turk2.session_results". Again, it knows about the task specifics. It will download the images and the annotations back. It will also convert the annotations to match the image size.

For other tasks, we can access raw submissions through "mech_turk.get_raw_session_results" in the same way.
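
One plausible reading of "convert the annotations to match the image size" is a rescaling of annotation coordinates from the size at which the image was annotated to the size of the original image. The sketch below illustrates that idea only. The function name and the data layout (a list of (x, y) pixel points) are assumptions, not the actual cv_mech_turk2 implementation.

```python
def rescale_points(points, annotated_size, original_size):
    """Map (x, y) points from the annotated image size to the original size.

    Both sizes are (width, height) tuples in pixels.
    """
    scale_x = float(original_size[0]) / annotated_size[0]
    scale_y = float(original_size[1]) / annotated_size[1]
    return [(x * scale_x, y * scale_y) for (x, y) in points]
```

For example, a point at (10, 20) on a 100x200 annotated image corresponds to (20, 40) on a 200x400 original.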

Interaction components

The following picture shows where the main interaction components fit within the big picture presented above.

ROS API

ROS integration defines ROS messages for image annotations and 2 ways of annotating images: streaming and action.

ROS messages are defined in mech_turk_ros_msgs: ImageReference, BoundingBox, ObjectAnnotation and ExternalAnnotation. ExternalAnnotation contains an image reference and a list of objects. Each object has a unique ID, a name, a bounding box and an optional outline.
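
The relationship between these messages can be pictured with plain Python classes. This is only an illustration of the structure described above: the field names and types are assumptions, and the actual definitions in mech_turk_ros_msgs may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ImageReference:
    uid: str  # assumed field: some identifier of the annotated image


@dataclass
class BoundingBox:
    # assumed layout: top-left corner plus width and height, in pixels
    x: float
    y: float
    width: float
    height: float


@dataclass
class ObjectAnnotation:
    object_id: str
    name: str
    bbox: BoundingBox
    # the outline is optional; assumed here to be a list of (x, y) points
    outline: Optional[List[Tuple[float, float]]] = None


@dataclass
class ExternalAnnotation:
    reference: ImageReference
    objects: List[ObjectAnnotation] = field(default_factory=list)
```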

Streaming annotation forwards all images on a topic to the annotation server. This mode is implemented in the snapper node. The annotations are published on a topic by the annotation server. To cross the firewall boundary, the link_topic node polls the annotation server and registers the remote publisher in the local core.

Action-based annotation is implemented in AnnotateImageAction. The action takes an image and produces an ExternalAnnotation. The action server allows multiple concurrent goals to be active. A goal succeeds when its annotation is received from the server. The action submits the images directly to the server, but relies on link_topic and annotation streaming to receive the annotations from the server.
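
The goal bookkeeping described above can be pictured as a table of pending goals that is drained as annotations arrive on the stream. The sketch below uses no ROS or actionlib code. The class and method names are hypothetical, and it assumes each incoming annotation carries an image identifier that matches the one recorded when the goal was accepted.

```python
class PendingGoals(object):
    """Track concurrent annotation goals until their annotations arrive."""

    def __init__(self):
        self._pending = {}  # image_id -> goal handle

    def add(self, image_id, goal):
        """Record a goal after its image has been submitted to the server."""
        self._pending[image_id] = goal

    def on_annotation(self, image_id, annotation):
        """Match a streamed annotation to its goal.

        Returns (goal, annotation) so the caller can mark the goal as
        succeeded, or None if no goal is waiting for this image.
        """
        goal = self._pending.pop(image_id, None)
        if goal is None:
            return None
        return (goal, annotation)

    def active_count(self):
        """Number of goals still waiting for an annotation."""
        return len(self._pending)
```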

Command line tools

Command line tools are currently the main way to interact with the server:

submit_images.py (cv_mech_turk2) - sends images to the server
session_results.py (cv_mech_turk2) - downloads annotated images from the server
convert_outlines_to_masks.py (cv_mech_turk2) - converts outlines in XML files to mask images.
submit_work_units.py (mech_turk) - sends raw data to the server
get_raw_session_results.py (mech_turk) - retrieves raw session results

Authentication and server names

Much of the server API requires an authenticated user. To avoid typing passwords all the time, the mech_turk package uses a config file ("~/.ros/.mech_turk/auth.txt") to store the user credentials. The file has a very simple format:

server_alias:{
server: server_name[:port],
user:   user_name,
pwd:    user_password
}
...

e.g.

test: {
server: vm6.willowgarage.com,
user: mt_tester,
pwd: some-not-important-password
}
default: {
server: mech_turk.willowgarage.com,
user: mt_worker,
pwd: some-not-important-password
}

After a server alias is declared in auth.txt, it can be used in all Mechanical Turk commands via "--server=ALIAS". For convenience, "--server=default" is assumed when the option is omitted.
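
To make the file format concrete, here is a minimal sketch of a parser for it. This is not the parser used by the mech_turk package. The function names are hypothetical, and the sketch assumes that values contain no commas or braces.

```python
import os
import re

# One block looks like:  alias:{ key: value, key: value, ... }
BLOCK_RE = re.compile(r"([\w.-]+)\s*:\s*\{(.*?)\}", re.DOTALL)


def parse_auth(text):
    """Return {alias: {"server": ..., "user": ..., "pwd": ...}}."""
    servers = {}
    for alias, body in BLOCK_RE.findall(text):
        entry = {}
        for item in body.split(","):
            if ":" not in item:
                continue
            # Split on the first colon only, so "server: host:8080" keeps its port.
            key, value = item.split(":", 1)
            entry[key.strip()] = value.strip()
        servers[alias] = entry
    return servers


def load_auth(path="~/.ros/.mech_turk/auth.txt"):
    """Read and parse the credentials file from its default location."""
    with open(os.path.expanduser(path)) as auth_file:
        return parse_auth(auth_file.read())
```

With the example file above, parse_auth would yield one entry per alias ("test" and "default"), and a command given "--server=test" would look up the "test" entry.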

Connection API

Connection to the server can be performed in 2 modes: XML-RPC (authentication required) and HTTP (open access). XML-RPC is required for sensitive operations (e.g. creating a session), while HTTP can be used for accessing some data (e.g. images) and submitting to a session.

The connection is created through *mech_turk.rpc_connection.connect* and *mech_turk.http_connection.connect*.

The HTTP communication object is a convenience wrapper around urllib2 to avoid duplication of the request handling code.

The RPC communication object is a wrapper around the xmlrpc library with support for authentication. It returns an RPC proxy that is properly authenticated against the server.
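
The idea of an authenticated RPC proxy can be sketched with the Python 3 standard library (the page itself refers to the older urllib2 and xmlrpc modules). This is not the mech_turk.rpc_connection implementation. The helper names, the use of HTTP basic authentication and the "/RPC2" endpoint path are all assumptions made for illustration.

```python
import xmlrpc.client
from urllib.parse import quote


def rpc_url(server, user, pwd, path="/RPC2", scheme="http"):
    """Build an XML-RPC URL with basic-auth credentials embedded.

    The "/RPC2" path is an assumed endpoint, not the real server path.
    Credentials are percent-encoded so characters like "@" stay unambiguous.
    """
    return "%s://%s:%s@%s%s" % (
        scheme, quote(user, safe=""), quote(pwd, safe=""), server, path)


def connect_rpc(server, user, pwd, path="/RPC2"):
    """Return a proxy that sends the credentials with every request.

    No network traffic happens until a method is called on the proxy.
    """
    return xmlrpc.client.ServerProxy(rpc_url(server, user, pwd, path))
```

The server, user and pwd arguments correspond to the fields of one alias entry in auth.txt.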