CBMS Deduplicator

on 10 September 2015

Transcript of CBMS Deduplicator

CBMS Deduplicator
The CBMS Deduplicator is a simple tool developed to help you and LGUs pinpoint duplicate household (HH) cases in your data
Steps in using the CBMS Deduplicator
1. Install the file "vcredist_x86.exe"
The installer was developed by the CBMS INCT and is shared in your training kits/USBs
2. Install the deduplicator tool "CBMSDeduplicator.exe"
3. Once installed, launch the program by clicking the shortcut icon on the desktop or by going to Start > All Programs > CBMS Deduplicator
4. Select the file that will be checked for duplicates by doing the following:
4.1. Click File from the main menu, then click Open
4.2. Find the path where you saved your extracted .can file. In this example, it is saved in C:\CBMSDatabase\Tanauan City
4.3. Select the file 'main.csv' and click Open
Partnership for Economic Policy (PEP) Asia
Community-based Monitoring System (CBMS) International Network
DLSU Angelo King Institute for Economic and Business Studies
10th Floor Angelo King International Center
Estrada corner Arellano Avenue, Malate, Manila, Philippines 1004
Tel. No.: (632) 5262067 or 2305100 loc. 2461
Fax Number: (632) 5262067
Website: http://pep-net.org
Email: cbms.network@gmail.com
Facebook Page: http://facebook.com/CBMSPhilippines
Facebook Interactive Group: https://www.facebook.com/groups/CBMSNetwork/
Thank you!!!
5. The CBMS Deduplicator will display information on the duplicates in two lists:
5a. List of HH cases with identical ID variables
This list shows the duplicates detected in terms of brgy, purok, hcn. These cases usually have a different respondent's name and other contents.
5b. List of HH cases with identical ID and data contents for most variables
This list contains matched households with identical ID and data contents for most variables. It shows the main ids and highlights the differences between the records.
List of HH cases with identical ID variables
The first list shows the mainids (in the dup.ids column) with duplicate brgy, purok, hcn. The duplicated brgy, purok, hcn can be viewed by opening the 'main.csv' file.

Description of the columns:
main.id - the first main.id in the 'main.csv' file detected to have a duplicate brgy, purok, hcn
main.row - the row where the first duplicate can be found
dup.ids - the mainids of the records with the duplicated brgy, purok, hcn
dup.rows - the rows where the duplicates are in the 'main.csv' file
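The idea behind the first list can be sketched in Python. This is only an illustration of grouping records by brgy, purok, hcn, not the Deduplicator's actual code. The sample data, the "main.id" column header, and the row numbering (data records counted from 1) are assumptions, and the tool's own row numbers may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for main.csv. The real file has many more
# columns, and the "main.id" header name is an assumption.
SAMPLE = """main.id,brgy,purok,hcn,respondent
101,1,2,15,Cruz
102,1,2,16,Santos
103,1,2,15,Reyes
104,3,1,15,Garcia
"""

def find_id_duplicates(csv_text):
    """Group records by (brgy, purok, hcn) and report groups with more than one record."""
    groups = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row numbers here count data records from 1 (an assumption).
    for row_number, record in enumerate(reader, start=1):
        key = (record["brgy"], record["purok"], record["hcn"])
        groups[key].append((row_number, record["main.id"]))

    report = []
    for members in groups.values():
        if len(members) > 1:
            (first_row, first_id), rest = members[0], members[1:]
            report.append({
                "main.id": first_id,                      # first record with this ID combination
                "main.row": first_row,                    # where that first record sits
                "dup.ids": [mid for _, mid in rest],      # later records sharing the combination
                "dup.rows": [row for row, _ in rest],     # where those later records sit
            })
    return report

print(find_id_duplicates(SAMPLE))
```

In the sample, records 101 and 103 share the same brgy, purok, hcn but have different respondents, which is the situation the first list is meant to surface.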
To check if they are really duplicates:
i. Open the 'main.csv' file
ii. Find (CTRL + F) the first mainid to have a duplicate brgy, purok, hcn
iii. Or check the row based on the list displayed in the Deduplicator

Note: Add 2 to the row number shown in the Deduplicator. In the example, the row is 285, so 285 + 2 = 287th row in the main.csv file
iv. Highlight all the observations with duplicates and check the brgy, purok, hcn, respondent and other contents
v. By doing this, you will notice that they are indeed duplicates and need a new hcn to remove the duplication
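The row adjustment in the note can be written as a one-line helper. This is a minimal sketch; the function and constant names are hypothetical, and the offset of 2 is taken directly from the note without assuming why it is needed.

```python
ROW_OFFSET = 2  # from the note: Deduplicator row + 2 = row in main.csv

def spreadsheet_row(dedup_row):
    """Convert a row number shown in the Deduplicator to the row in main.csv."""
    return dedup_row + ROW_OFFSET

print(spreadsheet_row(285))  # 287, as in the example
```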
6. Assign new hcns by going to the Deduplicator and double-clicking the for_editing column. The HCN window will be displayed.
7. Enter [] if the hcn will be retained, or enter a new hcn if it will be edited
Guidelines in assigning new hcn:
i. If data collection is ongoing, ensure that the new hcn to be assigned will not be used by the enumerators
ii. Ensure that the new hcn is not yet used in the main.csv file
iii. Check the last hcn used in the main.csv file by sorting the hcn column from smallest to largest:
In the main.csv file, highlight the hcn column
Click Data in the menu and click the icon with AZ
Choose Expand the selection
Click Sort
Browse to the end of the file, where you can find the last hcn used
Use this as the basis for assigning new HCNs. If data collection is ongoing, assign a higher hcn so that it will not overlap
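The same check (find the last hcn used, then pick higher numbers) can be sketched in Python instead of sorting by hand. This assumes that hcn values are whole numbers and that the column header is "hcn". The `reserve` parameter is a hypothetical way to leave a gap for enumerators when data collection is ongoing, and the sample data is made up.

```python
import csv
import io

# Hypothetical sample standing in for main.csv.
SAMPLE = """main.id,brgy,purok,hcn
101,1,2,15
102,1,2,16
103,1,2,15
104,3,1,40
"""

def next_unused_hcns(csv_text, count, reserve=0):
    """Return `count` hcn values above the highest hcn found in the file.

    `reserve` skips extra numbers after the last hcn used, for cases where
    enumerators may still be assigning hcns (an assumed convention).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    used = {int(record["hcn"]) for record in reader}
    start = max(used, default=0) + 1 + reserve
    return list(range(start, start + count))

print(next_unused_hcns(SAMPLE, count=2))               # [41, 42]
print(next_unused_hcns(SAMPLE, count=2, reserve=100))  # [141, 142]
```

Starting above the maximum follows guidelines ii and iii; it does not replace confirming with the field team which numbers are still reserved.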
8. Enter the new hcns in the HCN window then click Save
9. The newly entered/assigned hcn, which will replace the old one, will be shown in the for_editing column
10. Repeat the steps for the remaining duplicates in the first list
11. To save, click File > Save HCN Script list
12. Use this as filename: <LGUname>_forreplacement.txt
13. Send this file to the CBMS Network and wait for an email confirming that the changes have been implemented in your LGU's .can file
List of HH cases with identical ID and data contents for most variables
1. Double-click the first row from the second list
2. The duplicates window will pop up showing the two records. The discrepancies between them will be highlighted in red
The figure shows the similarities between the two records
3. Figure out which of the duplicates will be deleted by comparing the date and time the data were sent, checking the contents of the variables, asking the field editor/enumerators, etc.
4. Highlight the column to be deleted and click Save
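The comparison shown in the duplicates window can be illustrated with a short Python sketch that lists only the variables where two records differ. The variable names (including "date_sent") and values are hypothetical, and this is not the Deduplicator's own logic.

```python
def differing_variables(record_a, record_b):
    """Return the variables whose values differ between two household records."""
    names = sorted(set(record_a) | set(record_b))
    return {
        name: (record_a.get(name), record_b.get(name))
        for name in names
        if record_a.get(name) != record_b.get(name)
    }

# Two hypothetical records that match on everything except the send date.
first = {"brgy": "1", "purok": "2", "hcn": "15",
         "respondent": "Cruz", "date_sent": "2015-08-01"}
second = {"brgy": "1", "purok": "2", "hcn": "15",
          "respondent": "Cruz", "date_sent": "2015-08-03"}

print(differing_variables(first, second))
```

A short list of differences, such as only the send date, suggests the two records are the same household sent twice. Which one to delete is still a judgment call, as described in step 3.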
5. Click File > Save duplicates list
6. Use the filename: <LGUname>_fordeletion.txt
7. Send this file to cbms.network@gmail.com for implementation in the CBMS Portal and wait for an email confirming that the changes have been implemented in the LGU's data
8. The clean .can file with no duplicates can now be exported/downloaded and processed in StatSim