Re-Apply Manual Edit on Change/Re-Export Dataset
SourceData
Identical Dataset
SourceData from the previous cycle is exactly the same as SourceData for the current cycle
Disjoint Dataset
Totally different data from the previous cycle. The dataset arrives in batches, so nothing from the previous dataset appears in the incoming dataset
Part of the data is the same as in the previous dataset, while the rest is different
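A minimal sketch of how the three SourceData cases could be told apart, assuming each cycle is held as a dict mapping a match key to its row. The function name `classify_cycle` and the label "partial overlap" are hypothetical, and overlap is judged on match keys only.

```python
def classify_cycle(previous, current):
    """Compare two cycles of SourceData, each a dict of match key -> row.

    Assumption: rows are matched by key; "partial overlap" is a
    hypothetical label for the case where only some keys are shared.
    """
    if previous == current:
        return "identical"
    shared_keys = previous.keys() & current.keys()
    if not shared_keys:
        return "disjoint"
    return "partial overlap"
```

For example, `classify_cycle({1: {"name": "a"}}, {2: {"name": "b"}})` falls into the disjoint case because no match key is shared.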
Manual Edit
ManualEdits Operations
Insert - When a new row/column is added to the SourceData
Update - When the value of a specific cell is changed to another value
Delete - When a column/row is deleted
ManualEdits Consistency
Consistent - ManualEdits for a particular cell are the same throughout all cycles
Inconsistent - ManualEdits for a particular cell differ in some cycles or throughout most cycles
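One possible way to check consistency, sketched under the assumption that the edits of each cycle are stored as a dict mapping a cell, written as a `(key, column)` tuple, to its edited value. The name `edit_consistency` and this storage layout are hypothetical. A cell edited in only some cycles counts as consistent here as long as the edited values agree.

```python
def edit_consistency(edits_per_cycle):
    """Label each edited cell as consistent or inconsistent across cycles.

    edits_per_cycle: list of dicts, one per cycle, mapping
    (key, column) -> edited value (hypothetical layout).
    """
    values_seen = {}
    for cycle_edits in edits_per_cycle:
        for cell, value in cycle_edits.items():
            values_seen.setdefault(cell, set()).add(value)
    return {
        cell: "consistent" if len(values) == 1 else "inconsistent"
        for cell, values in values_seen.items()
    }
```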
EditsKB
Insert in EditsKB
When working source data > incoming source data
Update in EditsKB
When a field value in working source data is not equal to the value in incoming source data for the matching key
Delete in EditsKB
When working source data < incoming source data
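The three EditsKB rules above can be sketched as a row-level diff. This assumes both datasets are dicts of match key to row, and reads "working > incoming" as "the working data holds a key the incoming data lacks" (and the reverse for "<"). That reading, the function name `build_edits_kb`, and the `op` / `row` / `fields` entry layout are assumptions, not part of the original map. Column-level inserts and deletes are left out.

```python
def build_edits_kb(incoming, working):
    """Derive an EditsKB by diffing incoming source data against working data.

    incoming, working: dict of match key -> row (dict of field -> value).
    Returns a dict of match key -> recorded edit (hypothetical layout).
    """
    edits_kb = {}
    for key, row in working.items():
        if key not in incoming:
            # Key exists only in working data: the editor inserted it.
            edits_kb[key] = {"op": "insert", "row": dict(row)}
        else:
            # Same match key: record only the fields whose values differ.
            changed = {f: v for f, v in row.items() if incoming[key].get(f) != v}
            if changed:
                edits_kb[key] = {"op": "update", "fields": changed}
    for key in incoming:
        if key not in working:
            # Key exists only in incoming data: the editor deleted it.
            edits_kb[key] = {"op": "delete"}
    return edits_kb
```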
Apply Edits
When K.sourceB < K.sourceA and K.sourceA == K.EditsKB
Edit is an insertion
Insert the record into sourceB
When K.sourceB > K.sourceA and K.sourceB == K.EditsKB
Edit is a deletion
Delete the record from sourceB
When K.sourceB == K.sourceA and K.sourceB == K.EditsKB
Edit is an update
Update the value to the value recorded in EditsKB
Assumption
SourceB is the incoming data before edits, and SourceA is the data after edits are applied
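A minimal sketch of the Apply Edits rules under the assumption above. K is taken to be the match key, a key present in sourceA but absent from sourceB stands for "K.sourceB < K.sourceA" (and the reverse for ">"), and EditsKB is modelled as a dict of match key to the recorded field values. These readings and the name `apply_edits` are hypothetical.

```python
def apply_edits(source_b, source_a, edits_kb):
    """Re-apply recorded manual edits to incoming data.

    source_b: incoming data before edits, dict of match key -> row.
    source_a: data after edits (working data), same layout.
    edits_kb: dict of match key -> recorded field values (may be empty
              for insertions and deletions). Layout is an assumption.
    Returns an edited copy; source_b itself is left unchanged.
    """
    result = {key: dict(row) for key, row in source_b.items()}
    for key, recorded in edits_kb.items():
        in_b, in_a = key in source_b, key in source_a
        if in_a and not in_b:
            # Edit is an insertion: bring the record over from sourceA.
            result[key] = dict(source_a[key])
        elif in_b and not in_a:
            # Edit is a deletion: drop the record from sourceB.
            del result[key]
        elif in_b and in_a:
            # Edit is an update: overwrite with the values recorded in EditsKB.
            result[key].update(recorded)
    return result
```

Keys that do not appear in EditsKB are passed through untouched, so unedited incoming records keep their incoming values.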
Definition
SourceData - Dataset extracted from an integration system, or data consolidated from several departments and received through email
F(sourcedataB) is the source data before edits
F(sourcedataA) is the source data after edits. This is also the working data
WorkingData - SourceData after manual edits are applied
EditsKB - Knowledge base that holds the change representation of manual edits
Editor - Data scientist/domain expert who manually assesses and edits the dataset