Webinar #20 – Organizing data in spreadsheets

Friday, September 24, 2021, at 10am PDT / 11am MDT / 12pm CDT / 1pm EDT
1-hour presentation followed by 30 minutes of discussion

Summary of this webinar:
Spreadsheets are widely used software tools for data entry, storage, analysis, and visualization. Focusing on the data entry and storage aspects, this presentation will offer practical recommendations for organizing spreadsheet data to reduce errors and ease later analyses. The basic principles are: be consistent, write dates like YYYY-MM-DD, do not leave any cells empty, put just one thing in a cell, organize the data as a single rectangle (with subjects as rows and variables as columns, and with a single header row), create a data dictionary, do not include calculations in the raw data files, do not use font color or highlighting as data, choose good names for things, make backups, use data validation to avoid data entry errors, and save the data in plaintext files.

Reference: Broman KW, Woo KH (2018) Data organization in spreadsheets. The American Statistician 72:2–10 (https://doi.org/gdz6cm)
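
A few of the principles in the summary above (a single rectangle with one header row, no empty cells, dates written as YYYY-MM-DD) can be checked automatically once the data are saved as a plaintext CSV file. The following is a minimal Python sketch, not part of the webinar material; the function name `check_table`, the column names, and the sample rows are hypothetical and chosen only for illustration. It assumes missing values are entered with an explicit code such as NA rather than left blank.

```python
import csv
import io
import re
from datetime import datetime

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

def check_table(csv_text, date_columns=()):
    """Return a list of (row_number, column, problem) tuples for a CSV table."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # the single header row
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        # A rectangular table has the same number of cells in every row.
        if len(row) != len(header):
            problems.append((row_number, None, "wrong number of cells"))
            continue
        for column, value in zip(header, row):
            if value.strip() == "":
                problems.append((row_number, column, "empty cell"))
            elif column in date_columns and not is_iso_date(value):
                problems.append((row_number, column, "date not YYYY-MM-DD"))
    return problems

def is_iso_date(value):
    """True only for a real calendar date written exactly as YYYY-MM-DD."""
    if not ISO_DATE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True

# Hypothetical example data: row 3 has a US-style date and a blank cell.
sample = """id,date,weight_g
m001,2021-09-24,21.3
m002,9/24/2021,
m003,2021-09-25,NA
"""

for problem in check_table(sample, date_columns=("date",)):
    print(problem)
```

Run on the sample, the check reports the US-style date and the blank cell in row 3, while the explicit NA in row 4 passes. A check like this complements, rather than replaces, data validation set up in the spreadsheet itself at entry time.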

Presented by:
Karl Broman, PhD
Professor
Department of Biostatistics & Medical Informatics
University of Wisconsin-Madison

Webinar flyer (pdf)

Link to course material: https://github.com/OSGA-OPAR/quant-genetics-webinars/tree/master/2021-09-24

This webinar series is sponsored by the NIDA Center of Excellence in Omics, Systems Genetics, and the Addictome (P30 DA044223).