Allows you to remove unneeded columns or rows (Simple Select)
Modify data types (eg. change string to integer to use in calculations)
Rename, resequence & describe columns
Sort by field name (Ascending / alphabetical)
Add prefix to field names
Splits data stream into 2 streams based on a conditional expression.
Records that satisfy condition flow out of the true side, the rest flow out of the false side.
IsNull / IsNotNull
Smaller than, Greater than or equal to.
Replaces problematic numeric values (nulls) for specified columns with another value such as the median / specified value.
Can either replace values or create a new column with new values
Can create a column next to original showing wether it has been imputed (1 = yes, 0 = no)
Allows you to preform calculations / operations to create new data or update existing fields.
PADLEFT: Add a "x" to the
of any number less than certain length
MOD: Gets remainder (n-currentfield, d-60 / divide by)
Automatically set field type for each string field to the smallest possible size & type that will accommodate the data in each column
Eg. State - reduced string length - 2 spaces (CO)
Multi Field Formula
Applies same expression to all selected columns (same formulas)
" to refer to all the columns you want to change
Lets you extract a specified portion of the records in a data stream.
(First # of records, Last # of records, 1 in every #
Assigns a unique identifier to each record.
Ascending: Low to high (John - Samantha)
Random % Sample
Will output a specified number or percent of records obtained via a random sample of the input data.
Takes first record with unique # (When you have multiple records with the same ID/name)
In & Out Tools
"Include Sub Directories" will return all files within named folder
Returns a list of files that are contained in a directory & relevant attribute files
File specification can be changed to only return specific file types.
Provides a complete view of your data
Placed after tools
Brings data into workflow by reading from a file / connecting to database
Allows you to create a stream of data inside a workflow, without needing separate database or file.
Send the contents of a data stream to a file / database.
Moves values held in multiple horizontal fields into a single column.
Name: The name of the data field
Value: The value of the corresponding data field.
Key Field: any fields that you want to keep in the dataset and do not want to transpose
Creates one new column for each categorical value held in a single existing column, pivoting the data from a vertical layout to a more horizontal layout
Example: instead of having two rows, one with the name of the president, one with the name of the VP, both labelled one. it is condensed to one row.
Performs a host of Summary calculations, including: summing, min/max, grouping, counting, string concatenating, math, spatial object processing, and much more.
A summarize tool only shows results. If you wanted original + results use a join tool.
Returns a simple count of the number of records passing through a data stream.
Combines two data streams based on common fields or record position.
In the joined output, each row will contain the data from both inputs
The left and right outputs will contain the records that do not match from the left and right inputs.
A join tool can be used with a union tool produce outer joins.
Append the fields from a source input to every record in a target input. Each record of the target input will be duplicated for every record in the source input.
Combines multiple data streams based on common field names or positions. In the output, each column will contain data from each input and records will be "stacked" vertically.
Leverages the powerful pattern matching abilities of regular expression (regex) syntax for parsing, matching or replacing string.
Boolean value of true when if specified string matches regex, & false if it does not. (looks for full string instead of partial matches)
Returns each of the groups defined by regex. Partial matches will be considered but only the first match will be retuned.
Groups can rearranged or replaced with strings . In replacement window, $ followed by a number represents the value contained within that group of the expression.
Places each separate instance of the expression found in the string into its own column or row.
Text To Columns
Splits text from one field into multiple columns
Can ignore delimiters within brackets
Can also split to rows instead of columns
Skip empty fields - makes it so that consecutive delimiters are treated as a single delimiter