Large open health datasets present unique opportunities for studies that can make valuable contributions to health and medicine. These studies need to be well designed, implemented, and reported. In recent years, there has been an alarming increase in analyses whose findings are unreliable or lack novelty. This editorial provides guidance to authors on conducting and reporting high-quality secondary analyses using these datasets.