NTARI Data Prep Checklist
- Calvin Secrest

- Jul 2
Purpose: Ensure all datasets submitted to NTARI’s Quantum Community Detection platform are clean, ethical, and ready for quantum and classical analysis.

1. Data Source Verification
Is your dataset from a public or open-source origin?
If the data involves human subjects, is it fully anonymized?
Have you confirmed the dataset complies with NTARI's Data Ethics Policy?
2. File Format
Dataset is saved in CSV or GraphML format.
File encoding is UTF-8.
File is under 100MB (larger datasets require special handling).
Filename uses a clear naming convention:
projectname_networktype_date.csv (e.g., ntari-forum_edgelist_2025-05.csv)
3. Graph Structure
Your data should represent a network graph (nodes and edges). Confirm:
File includes source and target columns (node relationships).
Optional columns (e.g., weight, type) are labeled clearly.
There are no missing values in key columns.
Nodes are consistently named (no mix of full names/usernames/random IDs).
No self-loops (e.g., user1 → user1) unless they are intentional.
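The structural checks above can be automated before submission. The following is a minimal sketch using only the Python standard library; the function name check_edge_list is hypothetical and not part of any NTARI tool, and it assumes a CSV edge list with a header row and no multi-line fields.

```python
import csv
import io

# Column names taken from the checklist; everything else here is illustrative.
REQUIRED_COLUMNS = ("source", "target")


def check_edge_list(handle):
    """Return a list of human-readable problems found in a CSV edge list."""
    problems = []
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing required column(s): {', '.join(missing)}"]
    # Data rows start on line 2 because line 1 is the header
    # (assumes no field spans multiple lines).
    for line_no, row in enumerate(reader, start=2):
        source = (row["source"] or "").strip()
        target = (row["target"] or "").strip()
        if not source or not target:
            problems.append(f"line {line_no}: empty source or target")
        elif source == target:
            # Self-loops are reported, not rejected: they may be intentional.
            problems.append(f"line {line_no}: self-loop on {source!r}")
    return problems


sample = io.StringIO("source,target,weight\nAlice,Bob,3\nBob,,1\nCarol,Carol,2\n")
for problem in check_edge_list(sample):
    print(problem)
```

For this three-row sample, the sketch reports an empty target on line 3 and a self-loop on line 4. Whether a reported self-loop is acceptable remains a judgment call for the submitter.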
Sample CSV format:
source,target,weight
Alice,Bob,3
Bob,Carol,1
Carol,Alice,2
4. Data Cleaning
Removed duplicate rows or redundant edges.
Trimmed whitespace from node names.
Converted all node labels to consistent case (e.g., all lowercase).
Ensured numerical values (e.g., weights) are in the correct format.
Checked that special characters don’t break parsing.
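The cleaning steps above could be scripted along these lines. This is a sketch, not NTARI's own pipeline. The function name clean_edges is hypothetical, and it makes three assumptions that may not fit every dataset: edges are directed, the first occurrence of a duplicate edge is kept, and a missing weight defaults to 1.0.

```python
import csv
import io


def clean_edges(handle):
    """Yield cleaned (source, target, weight) tuples from a CSV edge list."""
    seen = set()
    for row in csv.DictReader(handle):
        # Trim whitespace and normalize case so " Alice " and "alice" match.
        source = row["source"].strip().lower()
        target = row["target"].strip().lower()
        # float() raises ValueError on a malformed weight, which surfaces
        # formatting problems early. Missing weights default to 1.0 (assumption).
        weight = float((row.get("weight") or "1").strip())
        key = (source, target)  # directed edge; (a, b) differs from (b, a)
        if key in seen:
            continue  # drop redundant edges, keeping the first occurrence
        seen.add(key)
        yield source, target, weight


raw = io.StringIO("source,target,weight\n Alice ,Bob,3\nalice,bob,3\nBob,Carol,1\n")
for edge in clean_edges(raw):
    print(edge)
```

Reading the file through the csv module rather than splitting on commas also handles quoted fields, which covers the most common way special characters break parsing.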
5. Metadata Submission
Include brief metadata (as a README.md file or in a Q-Zoo group message) with:
Dataset title
Source (URL, research paper, or origin)
Description (what the dataset represents)
Date range of the data
Known limitations (e.g., sampling bias, incomplete fields)
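One way to keep the metadata fields consistent across submissions is to generate the README.md from a small helper. The layout below is an assumption, not a format prescribed by NTARI. The function name render_metadata is hypothetical, and every value in the usage example is a placeholder to be replaced with real details.

```python
def render_metadata(title, source, description, date_range, limitations):
    """Return README.md text covering the five metadata fields listed above."""
    lines = [
        f"# {title}",
        "",
        f"Source: {source}",
        f"Description: {description}",
        f"Date range: {date_range}",
        "",
        "Known limitations:",
    ]
    lines.extend(f"- {item}" for item in limitations)
    return "\n".join(lines) + "\n"


# Placeholder values only; none of these describe a real dataset.
print(render_metadata(
    title="<dataset title>",
    source="<URL, research paper, or origin>",
    description="<what the dataset represents>",
    date_range="<start date> to <end date>",
    limitations=["<e.g., sampling bias>", "<e.g., incomplete fields>"],
))
```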
6. Final Checks
Tested the file upload through the Q-Zoo Portal (https://ntari.org/group-page/q-zoo-quantum-network-theory) or the internal tool.
Confirmed file opens in a spreadsheet editor (e.g., Excel, LibreOffice).
Uploaded the file to the shared NTARI Drive or submitted via the web form.
Notified the #data-insights channel in Slack with your submission summary.
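Before uploading, the file format requirements from section 2 (UTF-8 encoding, size under 100MB, naming convention) can also be checked locally. The sketch below is illustrative only. The function name check_file and the regular expression are assumptions inferred from the single example filename in this checklist; the pattern also accepts a .graphml extension and a full YYYY-MM-DD date, and it treats 100MB as 100 * 1024 * 1024 bytes.

```python
import re
from pathlib import Path

MAX_BYTES = 100 * 1024 * 1024  # assumption: "100MB" interpreted as 100 MiB

# Assumed pattern: projectname_networktype_date with a .csv or .graphml suffix.
NAME_PATTERN = re.compile(
    r"^[A-Za-z0-9-]+_[A-Za-z0-9-]+_\d{4}-\d{2}(-\d{2})?\.(csv|graphml)$"
)


def check_file(path):
    """Return a list of file-format problems for the file at `path`."""
    path = Path(path)
    problems = []
    if not NAME_PATTERN.match(path.name):
        problems.append("filename does not match projectname_networktype_date")
    if path.stat().st_size >= MAX_BYTES:
        problems.append("file is 100MB or larger")
    try:
        # Decoding the whole file line by line fails fast on invalid UTF-8.
        with path.open(encoding="utf-8") as handle:
            for _ in handle:
                pass
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
    return problems
```

Calling check_file("ntari-forum_edgelist_2025-05.csv") on a small, valid UTF-8 file would return an empty list. Any returned messages point to items in section 2 that need another look.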
Questions?
Reach out on the Q-Zoo group page using #data-insights, or email tech@ntari.org.
Thank you for helping us build quantum-powered tools for collective intelligence!



