WriTracker: WEncoder

Analyze the recorded handwriting

WEncoder lets you review the handwriting recorded by WRecorder and indicate what was written: mark the character boundaries, errors, and self-corrections.

WEncoder will automatically encode several aspects of each trial:

  • It identifies strokes (a stroke is a continuous pen movement, without the pen being lifted from the paper).
  • It groups the strokes into characters (for example, the uppercase letter “P” typically consists of two strokes: a straight line and half a circle).
  • It identifies situations in which the participants corrected themselves.

WEncoder lets you review these decisions. Hopefully, in most cases they will be correct, so you only have to click OK to accept them. If they are incorrect, you can modify them. You can also indicate how the participant spelled the target word.

Figure 1. WEncoder main window

Main features:

  • Easy loading of sessions saved by WRecorder – there’s no need for any transformation.
  • Review strokes: you can remove strokes that should be ignored, split a stroke into two (e.g., in case of connected writing), and merge several strokes into one.
  • Group strokes into characters: WEncoder automatically groups several strokes into a single character. You can review and modify this automatic grouping.
  • Mark self-corrections: if the participant returned to a previously-written character and corrected it, you can flag the correction stroke. Later you can easily exclude the correction strokes (e.g., so you can view how the target was written before being corrected), and you can easily focus on these corrections in the statistical analyses.
  • Spelling: examine whether the target was spelled correctly, and mark the precise location of spelling errors.
  • Simple output format: WEncoder saves the stroke-level, character-level, and trial-level information, each as a CSV file, so they can be easily imported into any statistical software.
  • Runs on Windows and macOS.

Using WEncoder

General description of the application

WEncoder lets you load one WRecorder session at a time and review the trials one by one. For each trial, you can edit how the trajectory is divided into strokes, how strokes are grouped into characters, and which strokes are self-corrections. You also review the trial’s spelling and mark spelling errors.

Below, the term raw-data folder refers to the directory that contains the raw handwriting data, as saved by WRecorder. WEncoder reads this data. The encoded trials (strokes, characters, spelling errors, etc.) are saved in a separate directory, which we call the encoded-data folder.

Loading a session

When you start WEncoder, you will be prompted to choose two directories, one after another:

  1. The raw-data folder, i.e., the directory that contains the handwriting data saved by WRecorder.
  2. The encoded-data folder, in which WEncoder will save the encoded trials (strokes, characters, spelling errors, etc.).

WEncoder lets you encode the trials one by one. You can stop the program and continue later. To continue, choose the same raw-data and encoded-data folders again. WEncoder will detect that you have already started encoding this session, and will ask whether you want to restart (deleting all your previous work), continue encoding from the last encoded trial, or quit.

Encoding strokes and characters

For each trial, you see a window that shows precisely what the participant wrote (Figure 1). In this window, each stroke is indicated by a series of dots in the same color. Strokes belonging to the same character will have different shades of the same basic color (red or cyan), and the different characters are shown in two alternating colors. The strokes are numbered – e.g., the numbering “3.2” means the second stroke in the third character. The numbering is placed near the end of the stroke.

Note that a stroke is not shown as a line, but as a series of dots. This way, you can get an impression of the writing speed. When the participant writes, WRecorder samples the stylus position at a fixed rate – once every 5 milliseconds. In WEncoder, each dot represents one sample of the participant’s handwriting, so large spacing between adjacent dots means that the participant moved the stylus quickly.
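The relation between dot spacing and writing speed can be made concrete with a short sketch. It assumes only the 5-millisecond sampling interval stated above; the function name and the sample coordinates are hypothetical, and the result is in coordinate units per second.

```python
import math

SAMPLE_INTERVAL_S = 0.005  # one sample every 5 ms, as stated above


def speeds(points, dt=SAMPLE_INTERVAL_S):
    """Return the pen speed between each pair of consecutive samples.

    points: list of (x, y) tuples in sampling order.
    The result is in coordinate units per second.
    """
    return [
        math.hypot(x2 - x1, y2 - y1) / dt
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]


# Hypothetical samples: the second gap is twice as wide as the first,
# so the second speed is twice the first.
samples = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
print(speeds(samples))
```

Because the time between dots is constant, doubling the distance between two adjacent dots doubles the estimated speed.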

The characters and strokes are numbered, starting from 1. WEncoder shows the character and stroke number near the end of each stroke (Figure 1). For example, the number 3.1 means the first stroke in the third character. If a character extends another character (see below what “extends” means), the number of the extended character is written in parentheses: “3.1(E2)” means that this character (#3) extends character #2.
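To illustrate the label notation, here is a minimal sketch that breaks a label such as “3.1(E2)” into its parts. The labels appear only on screen, so this parser is purely illustrative; the function name and the returned dictionary keys are hypothetical.

```python
import re

# <character>.<stroke>, optionally followed by (E<base character>)
LABEL_RE = re.compile(r"^(\d+)\.(\d+)(?:\(E(\d+)\))?$")


def parse_label(label):
    """Split a stroke label into character, stroke, and extended character."""
    match = LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not a stroke label: {label!r}")
    char_num, stroke_num, extends = match.groups()
    return {
        "char": int(char_num),
        "stroke": int(stroke_num),
        # None when the character does not extend another character
        "extends": int(extends) if extends else None,
    }


print(parse_label("3.2"))
print(parse_label("3.1(E2)"))
```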

WEncoder automatically splits the handwriting into strokes and characters. You can manually override this default encoding, using the following buttons:

Merge characters (keyboard shortcut: M): Use this feature to merge several characters into a single character. After pressing the button, click the characters you want to merge, then hit ENTER to confirm or ESC to abort.

Split character (keyboard shortcut: C): Use this feature when a character contains multiple strokes and you want to separate them into 2 different characters. After pressing the button, choose the character you want to split by clicking it. The specific stroke you click determines how the character will be split. Hit ENTER to confirm or ESC to abort.

Note that the “split character” feature only affects the grouping of strokes to characters – it does not affect the stroke boundaries. To split a single stroke into two characters, use the “split stroke” feature:

Split stroke (keyboard shortcut: S): This lets you split a stroke into 2 separate strokes, which will be put in 2 different characters. This feature is useful if WEncoder incorrectly identified 2 characters as a single stroke – e.g., because the pen was not lifted from the paper between the characters.

Figure 2. Splitting a stroke into two strokes.

After pressing the button, choose the stroke to split by clicking it and then hitting the ENTER key. The split stroke window will appear (Figure 2). Click the dot at which the stroke should be split in two, and hit ENTER to confirm or ESC to cancel.

Delete stroke (keyboard shortcut: D): Sometimes you might want to completely delete a stroke – e.g., if the pen touched the paper accidentally, or if the participant wrote something irrelevant. After clicking the button, choose the stroke to delete by clicking it and hitting ENTER to confirm (ESC will abort). A deleted stroke is not really deleted – it is changed into a space, as if the pen was moving above the paper.

Split trial (keyboard shortcut: T): In some cases, user errors may cause WRecorder to save two trials as a single one. This may happen, for example, if the participant wrote 2 words but the experimenter operating WRecorder did not advance to the next trial between them. The “split trial” feature lets you fix such errors and indicate that the sequence of characters should be treated as two separate trials.

After clicking this button, choose the last character of the first trial, then hit ENTER to confirm the separation into 2 trials (or ESC to abort).

Reset current trial (keyboard shortcut: R): Clicking this button will undo all changes made to the strokes and characters of the current trial (including “split trial”), and will restore WEncoder’s default encoding of strokes and characters. Self-corrections and responses (see below) will also be reset.

Coding corrections

A “correction” refers to a situation in which the participant fixes a character they already wrote. Self-corrections can be made immediately after writing a character (e.g., writing an incorrect letter or digit and then fixing it), or they can be made later (i.e., returning to a previously written letter or digit and fixing it).

In WEncoder, the way to treat such corrections is to indicate that a particular character (which consists of the correcting strokes) extends another character. For example, suppose a participant intended to write Q, but instead wrote a circle (i.e. an O) and only then added the extra line. The circle and the line should be coded as two separate characters: the circle character is the base, and the line character extends it.

To mark one character as extending another, click the extend char button (keyboard shortcut: X), choose the two characters, and hit ENTER to confirm or ESC to cancel. The second character will be marked as extending the first one. To “disconnect” an extending character from its base, click the extend char button, select the character in question (this will mark both the base and its extender), and hit ENTER.

If the participant made several corrections, you can mark several characters as extending by repeatedly clicking extend char.

WEncoder marks extending characters by plotting them using different colors (red / purple) from standard characters.

If you uncheck the “show extending characters” checkbox, the extending characters will be hidden. They are not deleted or changed into space – they are just hidden from the display.

Coding spelling errors

WEncoder asks that you code the trial’s spelling and spelling errors. To indicate what the participant wrote, click the Enter response button and type what the participant wrote. The number of characters in the response that you write here must match the number of characters in the participant’s actual response (extending characters don’t count here, as they correct the base character’s letter).

Automatic encoding of trials

WEncoder automatically encodes each trial before letting you modify it manually.

Grouping strokes into characters: To group strokes into characters, WEncoder examines the horizontal overlap between strokes. If the degree of overlap exceeds a certain threshold, WEncoder assigns both strokes to a single character. The value of this minimum-overlap threshold can be configured in the settings screen. The threshold is defined as a percentage: the amount of horizontal overlap between the two strokes, divided by their joint horizontal extent.
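The overlap measure described above can be sketched as follows. This is an illustration of the definition, not WEncoder’s actual code: the function names are hypothetical, the ratio is expressed as a fraction rather than a percentage, and the default threshold of 0.5 is an arbitrary example value, not WEncoder’s default.

```python
def horizontal_overlap_ratio(a, b):
    """Overlap of two strokes divided by their joint horizontal extent.

    a, b: (x_min, x_max) horizontal extents of the two strokes.
    Returns a fraction between 0 and 1.
    """
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    joint_extent = max(a[1], b[1]) - min(a[0], b[0])
    if joint_extent <= 0:
        return 0.0  # degenerate case: both strokes have zero width
    return max(overlap, 0.0) / joint_extent


def same_character(a, b, min_overlap=0.5):
    """True if the overlap exceeds the (hypothetical) threshold."""
    return horizontal_overlap_ratio(a, b) > min_overlap


# Hypothetical extents: overlap of 5 units over a joint extent of 15
print(horizontal_overlap_ratio((0, 10), (5, 15)))
# One stroke lies entirely within the other: overlap 6 over extent 10
print(same_character((0, 10), (2, 8)))
```

Strokes that do not overlap at all yield a ratio of 0 and are therefore never grouped by this rule.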

There is no automatic coding of spelling errors or of correction (extending) characters.

Completing a trial

To complete a trial, you must declare whether it was OK or erroneous. In both cases, the trial will count as completed and your entire coding will be saved. The definition of a trial as OK / erroneous affects only the flagging of the trial’s status, which is written in the trials.csv file.

To indicate that the trial is correct, click the Accept as OK button (keyboard shortcut: A). To indicate that the trial has an error, choose the error type (in the errors dropdown) and click the Error button (keyboard shortcut: O).

The list of available error types can be configured via the settings screen.

Navigating between trials

Skip current trial (keyboard shortcut: K) – do not encode the current trial, continue to the next trial. The skipped trial will not be saved to the encoded-data folder.

Previous trial (keyboard shortcut: P) – return to the previous trial.

Go to specific trial (keyboard shortcut: G) – jump to a particular trial by specifying its trial number. Trial numbers are written in the trials.csv file, and appear on WEncoder’s window title bar.

If you return to an already-encoded trial (using the “previous” or “go to” buttons) and encode it again, the new encoding will override the previous one.

The Settings screen

The settings screen allows configuring the following parameters:

Typing in the participant’s response is…: here you can indicate whether entering the response is mandatory for all trials, only for error-free trials, or completely optional. You can always enter a response; this setting only controls whether WEncoder will ensure that you do so.

Minimal overlap between 2 strokes in the same character: if the horizontal overlap between two strokes, computed as a percentage of their overall horizontal span, exceeds this value, WEncoder’s automatic coding will assign the 2 strokes to the same character.

Error codes: A comma-separated list of values that will appear in the error-code dropdown in WEncoder’s main window.

Dot size: changes the trajectory dot size in WEncoder’s display.

Result files

The encoded-data folder contains the following result files:

trials.csv: contains information about the trials written in the session. The file contains one row per trial, with the following columns:

  • trial_id: the trial’s serial number
  • sub_trial_num: If the trial was split using the “split trial” feature, the trial’s split parts will be numbered here from 1 onwards.
  • target_id: the unique ID of the target of this trial (copied from the target_ID column in the targets file)
  • target: the target of this trial (copied from the target_value column in the targets file)
  • response: what the participant wrote (as entered by you in WEncoder).
  • time_in_session: the duration (in seconds) from the time when the session started until the time when the trial started.
  • rc: the trial’s coding – either OK or an error code.
  • has_corrections: 1 if the trial contains correction (extending) characters, 0 if not.
  • traj_file_name: the name of the CSV file that contains this trial’s trajectory information.
  • time_in_day: the time of day at which the trial started (HH:MM:SS, 24-hour format).
  • date: the date on which the trial started (yyyy-mm-dd).
  • sound_file_length: the duration (in seconds) of the played sound file. 0 if no sound file was played.

See here an example trials.csv file.
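As a minimal sketch of how trials.csv might be read outside a statistics package, the following uses Python’s standard csv module. It assumes the file has a header row with the column names listed above. The sample data, the error code “SpellingError”, and the function names are hypothetical, and only some of the columns are included for brevity.

```python
import csv
import io

# Hypothetical sample standing in for a real trials.csv file
SAMPLE_TRIALS = """trial_id,sub_trial_num,target,response,rc,has_corrections
1,1,cat,cat,OK,0
2,1,dog,dog,OK,1
3,1,bird,brid,SpellingError,0
"""


def load_rows(fp):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(fp))


def summarize(trials):
    """Count the trials coded as OK and the trials containing corrections."""
    return {
        "n_trials": len(trials),
        "n_ok": sum(1 for t in trials if t["rc"] == "OK"),
        "n_with_corrections": sum(
            1 for t in trials if t["has_corrections"] == "1"
        ),
    }


trials = load_rows(io.StringIO(SAMPLE_TRIALS))
summary = summarize(trials)
print(summary)
```

To read a real file, replace the io.StringIO object with open(path, newline="").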

characters.csv: this CSV file contains multiple rows per trial – one row for each character. It has the following columns:

  • trial_id: the trial’s serial number
  • sub_trial_num: If the trial was split using the “split trial” feature, the trial’s split parts will be numbered here from 1 onwards.
  • target_id: the unique ID of the target of this trial (copied from the target_ID column in the targets file)
  • target: the target of this trial (copied from the target_value column in the targets file)
  • char_num: the character number
  • char: the character (as entered by you when indicating the response)
  • x, y, width, height: the character position and size. Technically, this is the size of a rectangular “bounding box” that surrounds 95% of the character’s pixels in the horizontal axis and 95% of the pixels in the vertical axis.
  • pre_char_distance, post_char_distance: the distance between adjacent characters (more precisely: between their bounding boxes). Distance=0 indicates that the bounding boxes touch each other; a negative distance indicates that they overlap.
  • pre_char_delay, post_char_delay: the delay (in milliseconds) between the end of one character and the beginning of the next character
  • extends: if this character is a correction and extends another character, this is the char_num of the base character.

See here an example characters.csv file.
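The sign convention of the distance columns can be illustrated with a small sketch. It assumes that x is the left edge of the bounding box, that the distance is measured horizontally, and that writing runs left to right; none of these is confirmed above, and the function name and sample values are hypothetical.

```python
def horizontal_gap(prev_char, next_char):
    """Gap between the bounding boxes of two adjacent characters.

    Each character is a dict with "x" (assumed left edge) and "width".
    Returns 0 when the boxes touch and a negative value when they overlap.
    """
    prev_right_edge = prev_char["x"] + prev_char["width"]
    return next_char["x"] - prev_right_edge


first = {"x": 0.0, "width": 10.0}
touching = {"x": 10.0, "width": 6.0}
overlapping = {"x": 8.0, "width": 6.0}

print(horizontal_gap(first, touching))     # 0.0
print(horizontal_gap(first, overlapping))  # -2.0
```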

strokes.csv: this CSV file contains multiple rows per trial – one row for each stroke. Essentially, this file only defines how strokes are mapped to characters. It has the following columns:

  • trial_id: the trial’s serial number
  • sub_trial_num: If the trial was split using the “split trial” feature, the trial’s split parts will be numbered here from 1 onwards.
  • char_num: the character number (if you manually change this, you’re “moving” the stroke to a different character)
  • stroke: the stroke number within the trial.
  • on_paper: 1 if the pen was on paper during this stroke, 0 if not (a “space” stroke).

See here an example strokes.csv file.
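The stroke-to-character mapping can be rebuilt from strokes.csv with a few lines of code. The sketch below assumes a header row with the column names listed above. The sample rows and the function name are hypothetical, and whether “space” strokes carry a meaningful char_num is an assumption, so they are simply skipped here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for a real strokes.csv file
SAMPLE_STROKES = """trial_id,sub_trial_num,char_num,stroke,on_paper
1,1,1,1,1
1,1,1,2,0
1,1,1,3,1
1,1,2,4,1
"""


def strokes_per_character(rows):
    """Map (trial_id, sub_trial_num, char_num) to its on-paper stroke numbers."""
    grouped = defaultdict(list)
    for row in rows:
        if row["on_paper"] != "1":
            continue  # skip "space" strokes, where the pen was off the paper
        key = (
            int(row["trial_id"]),
            int(row["sub_trial_num"]),
            int(row["char_num"]),
        )
        grouped[key].append(int(row["stroke"]))
    return dict(grouped)


stroke_rows = list(csv.DictReader(io.StringIO(SAMPLE_STROKES)))
mapping = strokes_per_character(stroke_rows)
print(mapping)
```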

Trajectory files: These CSV files contain the full handwriting trajectory. There is one trajectory file per trial (or sub-trial, if the trial has several sub-trials). The trajectory file name is listed in the traj_file_name column in the trials.csv file.

In the trajectory files, each row represents one sample of the pen position and pressure at one time point during the trial (similar to WRecorder’s output). The file has the following columns:

  • char_num: the character number in the current trial.
  • stroke: the serial number of the stroke.
  • pen_down: 1 if the pen was touching the tablet (actual writing occurred), 0 if the pen was hovering above the tablet.
  • x, y: the pen coordinates. WEncoder keeps the coordinates that were recorded by WRecorder.
  • pressure: the pen’s pressure against the tablet, as a value between 0 and 100 (0 = the pen did not touch the paper, 100 = strongest pressure).
  • time: the time relative to the trial’s start time.
  • correction: 1 if the current stroke represents a self-correction, 0 otherwise.

See here an example trajectory file.
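One use of the correction column is to recover how the target looked before it was corrected. The following sketch filters out the samples that belong to self-correction strokes. It assumes a header row with the column names listed above; the sample rows and the function name are hypothetical.

```python
import csv
import io

# Hypothetical sample standing in for a real trajectory file
SAMPLE_TRAJECTORY = """char_num,stroke,pen_down,x,y,pressure,time,correction
1,1,1,0,0,40,0.000,0
1,1,1,1,0,45,0.005,0
2,2,1,5,0,50,0.100,0
3,3,1,1,1,50,0.300,1
"""


def without_corrections(rows):
    """Keep only the samples that are not part of a self-correction stroke."""
    return [row for row in rows if row["correction"] != "1"]


traj_rows = list(csv.DictReader(io.StringIO(SAMPLE_TRAJECTORY)))
original_writing = without_corrections(traj_rows)

# The remaining samples describe the writing before it was corrected
print(len(traj_rows), len(original_writing))
print(sorted({row["char_num"] for row in original_writing}))
```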