I would hazard a guess that most social scientists don’t understand or know the “behind the scenes” workings of the statistical tests they are employing or why they do what they do.  That should be evident from the widespread misinterpretation of p-values alone.  And my personal suspicion is that the primary reason Bayesian approaches have not been adopted more readily is that they are “harder” - you can’t just order up a test in SPSS anymore and copy/paste a table somewhere.  I suspect a lot of programming is the same way; if so, breaking down that barrier will ultimately be done by providing computer science education at the graduate level so that programming is seen as a “standard” skill set.

 

At the end of the day, most bread-and-butter social scientists just have ideas and want to provide support for their ideas so they can write about them.  Many don’t much care for the “how” of doing that.  You can find a lot of examples here: http://neoacademic.com/2011/11/16/computing-intraclass-correlations-icc-as-estimates-of-interrater-reliability-in-spss/

 

I have answered SO MANY QUESTIONS about ICC on that page, but the majority of people asking them are more concerned with “how do I create the number I need to put in my methods section to get this published” rather than actually caring at all about what that number represents or why it’s important.
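For what it’s worth, here’s a rough sketch of what that number actually is - ICC(1,1) from a one-way ANOVA - in plain Python. The ratings matrix is made up purely to show the arithmetic:

```python
# ICC(1,1): one-way random-effects intraclass correlation.
# ratings[i][j] = rating of target i by rater j (hypothetical data).
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
]

n = len(ratings)      # number of targets
k = len(ratings[0])   # number of raters per target

grand_mean = sum(sum(row) for row in ratings) / (n * k)
row_means = [sum(row) / k for row in ratings]

# Between-targets and within-target mean squares from a one-way ANOVA.
ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
ms_within = sum(
    (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
) / (n * (k - 1))

icc_1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
print(round(icc_1, 3))  # -> 0.88 for this toy data
```

That ratio - between-target variance over total variance - is the thing the number in the methods section represents.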

 

Perhaps that’s a bit pessimistic, but still…

 

And there is definitely some value to “if you build it, they will come” in the sense that if people have identified a research methods problem that they can’t easily solve, handing them software that makes it a push-button affair will lead to a high adoption rate of that software.  Just look at Andrew Hayes’ PROCESS macro – suddenly, mediation tests are everywhere, and I think a lot of that is because 1) people wanted to do mediation tests before but 2) writing out formulas in R or Mplus is hard.  If you only need to write x=predictor,y=outcome,m=mediator, that is within the coding skill level of a lot more researchers.
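To make the point concrete, the core of a simple mediation test is just two regressions and a product. Here’s a hedged sketch in plain Python - nothing to do with PROCESS itself, just the classic a*b indirect effect on simulated data:

```python
import random

random.seed(1)

# Simulated data with a true indirect path x -> m -> y (toy example).
n = 500
x = [random.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + random.gauss(0, 1) for xi in x]                          # a = 0.5
y = [0.6 * mi + 0.2 * xi + random.gauss(0, 1) for xi, mi in zip(x, m)]   # b = 0.6

def center(v):
    mu = sum(v) / len(v)
    return [vi - mu for vi in v]

xc, mc, yc = center(x), center(m), center(y)
sxx = sum(a * a for a in xc)
smm = sum(a * a for a in mc)
sxm = sum(a * b for a, b in zip(xc, mc))
sxy = sum(a * b for a, b in zip(xc, yc))
smy = sum(a * b for a, b in zip(mc, yc))

# Path a: regress m on x (simple regression slope).
a_path = sxm / sxx

# Path b: coefficient of m when y is regressed on x and m
# (two-predictor normal equations, solved by Cramer's rule).
det = sxx * smm - sxm * sxm
b_path = (sxx * smy - sxm * sxy) / det

indirect = a_path * b_path  # the a*b indirect effect PROCESS reports
print(round(a_path, 2), round(b_path, 2), round(indirect, 2))
```

PROCESS adds the inference machinery (bootstrap confidence intervals and so on), but the estimate itself is this small.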

 

-Richard

 

---

Richard N. Landers, Ph.D.

Associate Professor, Industrial/Organizational Psychology

Associate Editor, Computers in Human Behavior

Associate Editor, Simulation & Gaming

Old Dominion University | Mills Godwin Building 346E, Norfolk VA 23529

Website: http://rlanders.net | Blog: http://neoacademic.com

Tw: @rnlanders | Ph: 757-683-4212 | Fx: 757-683-5087

 

From: Programming as Social Science (PaSS) [mailto:[log in to unmask]] On Behalf Of Juan C. Correa
Sent: Thursday, March 2, 2017 12:11 PM
To: [log in to unmask]
Subject: Re: programming & interpretive social science

 

Well, John...

 

There are a lot of "hidden" processes in almost all the analyses that can be done with Python or any other software, including SPSS. Think, for instance, of the case of "factor analysis" for validating psychological tests. This is a standard analysis followed by many researchers. Yet it is already known that many of them ignore the mathematical details of the procedure, and many omit important outputs that must be included in the results (I know at least a couple of papers that say exactly this). Obviously, this is a sad situation that might be prevented with proper training in matrix algebra, which fosters good reflection... In my experience, a good strategy that we social scientists can follow is to ask for the assistance of mathematicians or computer scientists when we don't understand the specifics of the techniques... The same applies to programming languages that include specific algorithms for conducting non-traditional analyses of our data...
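To illustrate the kind of matrix algebra that usually stays hidden: factor analysis begins with an eigen-decomposition of the correlation matrix. A small Python sketch (the correlation matrix is invented, and this is only the first step of a full factor analysis):

```python
import numpy as np

# A made-up correlation matrix for three test items (illustrative only).
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# The "hidden" matrix algebra: eigen-decomposition of R.
# np.linalg.eigh returns eigenvalues in ascending order, so reverse them.
eigenvalues, _ = np.linalg.eigh(R)
eigenvalues = eigenvalues[::-1]
print(eigenvalues.round(2))

# Kaiser's rule (retain eigenvalues > 1) suggests how many factors to keep.
print(int((eigenvalues > 1).sum()))
```

Seeing that the eigenvalues must sum to the number of items, and that the rest of the output (loadings, communalities) falls out of the same decomposition, is exactly the kind of understanding that gets skipped.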

 

Juan C. Correa
https://sites.google.com/site/jcorrean/Home

On 02/03/17 11:25, John D. Boy wrote:

Phillip, thanks for your thoughts (and really, no apologies needed!). You bring up a few issues I hadn’t considered yet, particularly with your thoughts on reflexivity/methodology. That’s a crucial point, I think, and one that’s not at all obvious to folks that don’t know what social research consists of. That’s one of the difficulties with planning this talk – it’s so hard for me to identify what it is about what we do that is surprising and non-trivial from the point of view of the mainstream audience at this event. Your “bridge to nowhere” example is a very welcome one, and one that helps establish the relevance of our presentation.

 

Juan, thanks also for elaborating your thoughts on this. I think I agree, though I think you might be giving the tools a little too much agency. Psychologists, sociologists and others already have a set of scholarly habits, and while new research instruments might re-mold those habits, often tools will be used or repurposed in ways that fit our practice. Of course, we do come under the thrall of the “magic” of new tools. That magic can be a bit double-edged. On the one hand it can lead to new and exciting pursuits, but it can also entice us to do a bunch of things just because they are technically possible, not because they make sense.

 

For instance, imagine I’ve just scraped some text and stored it in a matrix. Initially I was just going to output it and then use my established toolkit (say, content analysis). But now, because I have it stored in a matrix, and because I can easily call a library with a text classifier, I might be tempted to do just that. But my expertise is not in computational linguistics or machine learning, so I don’t entirely know what the classifier does, even if I read the code or look up the references. Still, I might try it out, but ultimately I have to reflect on whether the method or the results do justice to the problem I’m researching. At that point I may decide that yes, the automated text classifier is suitable, or no, I should stick with my interpretive (manual) approach. In other words, I have to consider the problem Phil raises, that “the things that work for the ‘hard sciences’ don’t necessarily transpose to the social sciences.” So it’s a dialogical process, not a linear one, as you suggest.
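To make that concrete, here is roughly what such a library classifier does behind the scenes - a minimal multinomial Naive Bayes in plain Python. The training texts and the “bridge” example are invented purely for illustration:

```python
from collections import Counter
from math import log

# Toy labeled examples (invented): the kind of thing a text-classifier
# library does behind the scenes.
train = [
    ("the council approved the new bridge budget", "politics"),
    ("residents protest the bridge to nowhere", "politics"),
    ("the match ended in a dramatic penalty shootout", "sports"),
    ("fans celebrated the championship win", "sports"),
]

# Count word frequencies per class and build the vocabulary.
word_counts = {}
class_counts = Counter()
vocab = set()
for text, label in train:
    words = text.split()
    class_counts[label] += 1
    word_counts.setdefault(label, Counter()).update(words)
    vocab.update(words)

def classify(text):
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        # log P(class) + sum of log P(word | class), Laplace-smoothed.
        score = log(class_counts[label] / len(train))
        for w in text.split():
            score += log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("a protest over the bridge"))  # -> politics (on this toy data)
```

Knowing that the “classification” is just smoothed word-frequency arithmetic is precisely what lets me judge whether it does justice to my material.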

 

John

 

http://uva.nl/profile/j.d.boy

 

From: Programming as Social Science (PaSS) [mailto:[log in to unmask]] On Behalf Of Juan C. Correa
Sent: donderdag 2 maart 2017 15:44
To: [log in to unmask]
Subject: Re: programming & interpretive social science

 

Hi John and Phil

 

I agree with Phil's argument: "the things that work for the 'hard sciences' don't necessarily easily transpose to the social sciences". I insist, however, that using programming languages and tools like Python, R, Java, LaTeX, etc. will become more and more frequent in our "soft sciences". Why? Because they are powerful tools that make it easier for us to handle scientific tasks that would otherwise be tedious. Most of my colleagues resist these tools until they see their benefits. The most recent anecdote involved using RStudio for analyzing, visualizing and writing a scientific paper with the apa6 package for LaTeX, which is available for Sweave documents in RStudio (note that this means discarding the usual combination of SPSS, Excel and Word for analyzing data, plotting graphics and writing the paper). Once they saw all the magic working, they changed their minds (recognizing the need to learn)... These programming languages represent a significant change in the tools we use to collect and analyze data, visualize relationships among variables and write our papers. Perhaps we (social scientists) are not well trained to use these tools. Perhaps we need more training and education to understand them well. But this depends more on our openness and less on how they operate.

 

Juan C. Correa
https://sites.google.com/site/jcorrean/Home

On 02/03/17 08:22, Phillip Brooker wrote:

Hi John

 

I've never presented at anything like PyData either, but am very much on board with the idea of your talk (would love to hear more about it!). I think a key thing in framing the talk will be to have some really clear demonstrations of where and why social/human science requires different things from Python. One thing might be our central commitment to methodology (i.e. the social sciences are typically very reflexive in terms of how we think about methods) - in this regard, just using third-party software and algorithms often isn't good enough for our purposes; we need to be able to see and account for (and possibly also tweak) how these things work in order to do good social science research with them. And this sort of feeds into what we need out of tools like Python - take matplotlib: even though it's a pretty standard tool across a range of disciplines, most will just draw on it as a graphing library without getting 'under the hood' and taking advantage of it being open source. The fact that it's open source is interesting to us, however, in that that's what gives us the access we need to be reflexive about our methods (as long as we can read the code, that is). Even for something as established as the idea of a graph to visualise data, we probably want to be critical inasmuch as the algorithms that govern the visualisation thereby also govern our analyses of it! So this seems to me to be an aspect of programming where we have a very different interest than 'hard science' researchers, and it leads us to do different things (which, I guess, we might call "having a more critical engagement with programming tools", in a typical sociological parlance?).
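As a tiny illustration of what 'reading the code' looks like in practice - using the standard library's statistics module here rather than matplotlib itself, just to keep the example self-contained:

```python
import inspect
import statistics

# Open source means the implementation is one call away: for any
# pure-Python function we can pull up the actual code and read it.
source = inspect.getsource(statistics.mean)
print(source.splitlines()[0])  # the function's own 'def' line

# The same works for third-party libraries, e.g. (assuming matplotlib
# is installed): inspect.getsource(matplotlib.pyplot.hist)
```

That one-call access to the implementation is the practical basis of the reflexivity I'm describing.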

 

Contrast this with the grid computing project that the National Centre for e-Social Science (at Manchester) was involved in some years back (maybe 2007/2008 if I remember right?) - on a technical level they built a fully operational grid computing system that was, in principle, available for use by any social science researcher. It's the kind of thing that had been around for a good few years in the 'hard sciences' already (I think), and it was assumed that making this technical infrastructure available would lead (computational) social scientists to start asking different questions - "if you build it, they will come" kind of thing. But, the problem was that nobody was really interested in research questions where the kinds of processing power offered by grid computing would be a necessity! It didn't really address any of the problems social scientists were having in terms of getting to grips with different (i.e. digital) datasets, and nothing really came of it. Again, it kind of shows, in a roundabout way (sorry!) that the things that work for the 'hard sciences' don't necessarily easily transpose to the social sciences.

 

Anyway, apologies for the rambling thoughts John, would be very keen to hear your thoughts on the conference if/when you do end up going, please keep in touch!

 

Phil

 


From: Programming as Social Science (PaSS) <[log in to unmask]> on behalf of Juan C. Correa <[log in to unmask]>
Sent: 01 March 2017 17:39
To: [log in to unmask]
Subject: Re: programming & interpretive social science

Hi John

Your talk looks interesting. Yet, you have said something that I think is crucial: "The kind of knowledge generated in the human sciences is different, so the ways in which it is produced differ as well."

As a psychologist, I am convinced that Python and other computational tools are prompting us to reconsider the traditional ways we social scientists collect relevant data...

---------------------

Juan C. Correa
https://sites.google.com/site/jcorrean

From: Programming as Social Science (PaSS) <[log in to unmask]> on behalf of John D. Boy <[log in to unmask]>

Sent: Wednesday, March 1, 2017 12:18:50 PM
To: [log in to unmask]
Subject: programming & interpretive social science

 

I was very happy to find out about this list yesterday. I think Phillip and Jonathan have really identified an area where a lot of individual efforts and researchers are in search of coherence and community. Thanks!

Since joining, I've been toying around with a few things I'd like to send to the list -- there's so much to discuss! For practical reasons (because it's due in a few days), I thought I'd start by sending a draft of an abstract I intend to submit to PyData Amsterdam. I've never presented in that kind of venue, and I am curious whether this list might have any advice on how to frame my talk, which is really just intended to give the PyData audience insight into how some colleagues and I (tech-savvy sociologists who are more at home on the interpretive than on the quantitative side of the discipline) work with the scientific Python ecosystem to do our work.

Also, I will hang around in the IRC channel #pass on the OFTC network for the next couple of days, in case anybody wants to join. Might be a nice additional venue to further these kinds of discussions.

Anyway, below is what I have so far. I'm curious what you think.

Kind regards,
John


A Python Stack for the Human Sciences: Approaches and Applications

A thriving multitude of academics use Python in their research work. While it is diverse and dynamic, the Scientific Python community is often dominated by physicists, biologists and, to a lesser degree, computational social scientists. The visibility of these disciplines and their applications means that Python is widely recognized as a powerful tool to create positive knowledge in the hard sciences and in data science. But Python has also been adopted by people in the human sciences, including digital humanists and interpretive social scientists. (We count ourselves among the latter group.) The kind of knowledge generated in the human sciences is different, so the ways in which it is produced differ as well. Scholarly habits differ, and so do the demands on the research process (e.g., with regard to reproducibility, the tradeoffs made between reliability and validity, or the role of theory and theory development in our scholarly pursuits). Scholars in the human sciences wield Python as a tool in ways that are unique and not often seen by others in the community. This talk provides insight into some approaches and applications of interpretive social scientists who use Python to study social, spatial and cultural patterns to support critical inquiry into everyday life.


--
Sent from my mobile device. Mobile key: 0x2ed71638