Please enable JavaScript.
Coggle requires JavaScript to display documents.
Alteryx Book Chap. 3 - Unisex Baby Name (3.4 Multi-Row Formula (the multi…
Alteryx Book Chap. 3 - Unisex Baby Name
3.2 Imputation
imputation tool replaces specific value in numeric fields. most commonly used to handle Null values in formulas
an
action
tool can be connected to the
lightning bolt anchor
to modify how this tool works in apps and macros
input
: any data stream with a numeric field
output
: data stream with at least one inputed field
tool will only modify an individual value and can't be used to impute range of values, such as all negatives
Imputation configuration window
fields to impute
: list of numerical fields from incoming data stream that one can choose to impute
incoming value to replace
: allows us to choose to replace nulls or a specific numeric value that exists in the incoming dataset
replace w value
: allows us to replace the value selected in Incoming value to replace with either the average (mean) median (middle) mode (most common) or a custom value that we select
included imputed value indicator field
: creates field for each of the fields to impute with 1s and 0s, where 1 indicates the data was imputed and 0 that it wasnt
output imputed values as a separate field
: creates a new field for each of the fields to impute instead of replacing the values in the original field
3.3 Multi-Field Formula
multi-field formula tool provides the ability to create a formula that will be reused across multiple field
/ / is the comment character
an
action
tool can be connected to the
lightning bolt anchor
to modify how this tool works in apps and macros
input
: data stream with at least two fields of same types
output
: original data stream with fields modified or original data stream with a new field for every selected field from the input
multi-field formula configuration window
select...fields
: populates the list with the fields that match what we have selected in the drop-down. The calculation is performed on each of the fields selected from the generated list.
copy output and add...as a...
: when selected, creates a new field as the output instead of replacing the ones that we are performing the calculation on. Decimal, in the example, will be appended to the field name as a Suffix or Prefix
change output type to...size...
: when selected, we can change the metadata of the output that we create
the last 2 sections are similar to the formula window that we are familiar with, except that since we are working across multiple fields, we have the ability to use the
CurrentField
in our calculations
formulas are arbitrarily complex and can slow down the data stream. Care must be taken to make sure the output field has a field type compatible with the result created
3.4 Multi-Row Formula
the multi-row formula tool provides the ability to create a formula that will reference other records
/ / is the comment character
an
action
tool can be connected to the
lightning bolt anchor
to modify how this tool works in apps and macros
application questions can be connected to the top
black question anchor
for "use those answers" in this tool
output
: the original data stream with a new field
multi-row formula configuration
update existing field
allows us to replace the entries in the field selected in the list below
create new field
is the other option for the output, which allows us to create a new field instead of updating the current one. With this selection, we need to name the field and set the metadata of the field
num rows
changes the # of rows before and after the current row we have access to in the formula
values for rows that don't exist
allows us to define the behavior for records that don't actually exist
group by
allows us to reset the calculation based on changes in values in the fields we select
the bottom two sections are similar to the formula boxes we are used to seeing except that we have the added ability to refer to field values in different rows
values updated by moving down the records to modify the current record based on the result of the formula in the previous record
formulas can be arbitrarilty complex, can slow down the data stream. Care has to be taken to ensure the output field created has a file type that is compatible with the result being created.
3.5 Text Input
an
action
tool can connect to the
lightning bolt anchor
to modify how this tool works in apps and macros
the
text input configuration
window allows us to create a data source that is internal to the module
import
allows us to import a dataset
copy
allows us to copy the cells we highlighted in the table
paste
allows us to paste values into the table
delete
allows us to delete everything, or the row or column we have selected
insert
allows us to create a row or column in the dataset
3.6 Gender Swapped
add
browse, select, run
workflow
modify
select tool
so
female and male
are converted into
double
convert
year
to
double
since we know we need to work with 2 10 year periods
add
imputation
tool: converts all of null fields in
female
and
male
fields to 0s
create filter that limits data set to
years before 1890
or
years after 2003
to isolate 2 date ranges 1880 to 1889 and 2004 to 2013, add filter to end of data stream
/ / : line is a comment: tells Alteryx to ignore that specific line from the calculation
/ / [Year] < 1890 | | [Year] > 2003 / /
or [Year] < 1890 OR [Year] > 2003
create a
flag
for first and last 10 years for dataset. creates comparison between the two time frames. create a string field that will have
first 10 years
or
last 10 years
in the field
change change field type to string w conditional formula to test if record is in first 10 years
now that we id data with
year bucket
field, we can summarize data to find total # of babies given each name during 10 years we want to analyze
3 more items...
3.7 Data Cleansing
data cleansing tool
fixes common data quality issues using a variety of parameters
an
action
tool can connect to the
lightning bolt anchor
to modify how this tool works in apps and macros. tool has capabilities to remove nulls, eliminate extra white space, clear numbers from string entry from data stream
data cleansing configuration
select fields to cleanse
using configuration options below (use multiple data cleansing tools in your workflow
select an option to replace
nulls with blanks (for string data) or replace nulls with 0 (for numeric data)
select an option to remove
letters, numbers, punctuation, multiple forms of whitespace
select an option
to change the capitalization of string data types
3.8 Generate Row
generate rows
tool creates new rows of data at record level. creates seq of #, transactions, dates
generate rows tool follows
a process to generate rows of data: consists of initial expression, a loop expression that will build subsequent rows, based on a condition that will ultimately build rows until the condition is false, when it will terminate the loop
an
actiontool can connect to the lightning bolt anchor
to modify how this tool works in apps and macros
an input connection in this tool is optional
generate rows configuration
create new filed
configured as an input. If this method is chosen, specify new Field Name and appropriate type and size
update existing fields
: assesses rows coming in and adds new records
specify initialization expression
to start creation of new rows - can place value here, create expression, or open expression editor
specify condition expression
where condition is true/false. if true, additional rows will be generated according to loop expression, until condition is false, at which generation of rows will terminate
specify loop expression
will generate subsequent rows until condition is false
3.9 Multi-Field Binning
multi-field binning
is a macro that replicates some of the functionality of the title tool but has the added feature of being able to tile or bin on multiple fields at the same time
tool built for predictive toolset and will only accept numeric fields for binning. fields are sorted ascending by default
action tool can connect to the lightning bolt anchor
to modify how this tool works in apps and macros
Multi field binning configuration
select fields for binning
is used to select numeric fields for binning
tile method
chooses the tile methods for binning: 1. equal records or 2. equal intervals
the numeric up/down selector
can be used to specify how many bins to create
3.10 Select Records
the
select records
tool returns records and ranges of records that are specified, including discontinuous ranges of records
an
action
tool can connect to the
lightning bolt anchor
to modify how this tool works in apps and macros
select records configuration
enter records or range of records to return
1 single digit like "3" returns only row 3
a minus sign in front of a number like "-2" returns everything from row 1 through row 2
a range of numbers like "17-20" returns rows 17, 18 ,19, 20
a plus sign after a number returns everything from that # row through the last row
3.11 What about me?
necessary steps
import the data
clean up the fields
don't need to convert data types bc they are both strings. If they weren't it would cause an issue in joining the data
use
join
tool to filter data to appropriate records. keep name field coming out of right
text input
add
text input
with column called
name
, and name as the only record
input a name
change
text input
name to Andrea
limit the records by that name
create calculated fields that show us the percent breakdown
add
multi-field formula
tool after J output from
join
with configuratino to create % total for Male and Female babies each year
export the data