Secondary data sources on the web

In class today someone asked about some of the best resources on the web for secondary data. Here are a few that come to mind.

Teaching Good Research Practice

I attended a webinar on how to teach students to document empirical research, presented by Richard Ball and Norm Medeiros of Haverford College and hosted by the Inter-university Consortium for Political and Social Research (ICPSR). Their idea counters current norms, policies and practices in teaching empirical research by having students submit all of their statistical analyses with their final project. The submission should include all the documentation a third party needs to replicate every statistical result, what Ball and Medeiros call “a soup-to-nuts approach”. The approach in turn enhances professional norms and practices through a trickle-up effect, helps students actually understand what they are doing, and lets students know they are being held accountable. The webinar used an example from an economics course, but it is easy to imagine the potential for social work education and research.

The slides are available on their YouTube channel. They’re worth checking out as we rethink how to use this approach in our classrooms and research.

Dataverse up and running

In the spirit of open access to data, I am pleased to report that we now have two datasets uploaded to the Dataverse network. The process of preparing and uploading the data was rather straightforward, apart from a glitch in the system that required technical assistance. The Dataverse IT team has been incredibly responsive.

You can see the link on the SDRG website, or go directly to Dataverse with the links below.

Data visualization

Social work dissertation scholarship in Canada

Moving forward, I want to explore the types of datasets that are made available in Dataverse. Just this week I saw that the Urban Institute Home Mortgage Disclosure Act data is there (fascinating). I often talk about this data in my social development class and was pleased to see it on Dataverse. I think there are countless datasets available there for teaching and research.

I also continue to ponder what types of data could be uploaded and shared there. In theory, I believe that the replication data used in the analysis of each paper I’ve published should be made available to anyone. However, I suspect that the community agencies that I work with would have reservations about sharing this data. Further, what to do about secondary public use microdata from Statistics Canada?

Amsterdam Manifesto on data citation and sharing

Below you can find the Amsterdam Manifesto on data citation and sharing. These principles align well with the philosophy of replication and reproducible research that we’ve discussed so much at the brownbag.

For those who expressed concern about data sharing in social work, HERE is a post by @carlystrasser, who tackles many of the arguments against data sharing, including “my data is embarrassingly bad”.
Unfortunately, she does not discuss sensitive data topics with vulnerable populations.

The Amsterdam Manifesto on Data Citation Principles
We wish to promote best practices in data citation to facilitate access to data sets and to enable attribution and reward for those who publish data. Through formal data citation, the contributions to science by those that share their data will be recognized and potentially rewarded. To that end, we propose that:
1. Data should be considered citable products of research.
2. Such data should be held in persistent public repositories.
3. If a publication is based on data not included with the article, those data should be cited in the publication.
4. A data citation in a publication should resemble a bibliographic citation and be located in the publication’s reference list.
5. Such a data citation should include a unique persistent identifier (a DataCite DOI recommended, or other persistent identifiers already in use within the community).
6. The identifier should resolve to a page that either provides direct access to the data or information concerning its accessibility. Ideally, that landing page should be machine-actionable to promote interoperability of the data.
7. If the data are available in different versions, the identifier should provide a method to access the previous or related versions.
8. Data citation should facilitate attribution of credit to all contributors.

This Manifesto was created during the Beyond the PDF 2 Conference in Amsterdam, 20 March 2013.
Original authors are Mercè Crosas, Todd Carpenter, David Shotton and Christine Borgman.

See more on the Amsterdam Manifesto HERE
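
For anyone who wants to see what principles 4, 5 and 7 could look like in practice, here is a minimal Python sketch that assembles a reference-list style data citation with a persistent identifier. It is only an illustration: the `DataCitation` class, its field names, the citation layout, and the example dataset and DOI are all my own hypothetical choices, not something prescribed by the Manifesto or by Dataverse.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Hypothetical container for the parts of a data citation."""
    authors: list[str]
    year: int
    title: str
    repository: str
    doi: str            # persistent identifier (principle 5)
    version: str = ""   # optional version label (principle 7)

    def as_reference(self) -> str:
        """Build a bibliographic-style entry for a reference list (principle 4)."""
        names = "; ".join(self.authors)
        version = f", {self.version}" if self.version else ""
        return (
            f"{names} ({self.year}). {self.title} [Data set]. "
            f"{self.repository}{version}. https://doi.org/{self.doi}"
        )


# Made-up example values: not a real dataset and not a real DOI.
example = DataCitation(
    authors=["Researcher, A.", "Colleague, B."],
    year=2013,
    title="Example community survey replication data",
    repository="Dataverse",
    doi="10.5072/EXAMPLE/ABC123",
    version="V1",
)
print(example.as_reference())
```

The exact layout will depend on the citation style a journal requires; the point is simply that the dataset gets its own entry, with an identifier that resolves to a landing page.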

Data overload

I’ve just finished the International Society for Child Indicators conference in Seoul. One thing is clear: it’s a good time to be a child well-being researcher. Never before has there been so much high-quality data made available by various international groups. And this trend will only increase in the future. This reality makes it possible to investigate many policy-relevant questions across the globe without leaving campus.
But this trend raises several questions and concerns. Foremost is the real risk that data availability will drive the research enterprise rather than carefully constructed, theory-developing research questions.

Some of the new databases I learned about from the conference are presented below.

* International Survey of Children’s Well-being aka ChildrensWorlds project
* Save the Children data project in progress
* World Family Map Project
* Poverty and Social Exclusion Survey
* European Union Statistics on Income and Living Conditions
* Multiple Indicator Cluster Surveys from UNICEF
* Child Trends Databank
* Health Behaviour in School-aged Children
* Children’s Chances
