JISC-REPOSITORIES@JISCMAIL.AC.UK



JISC-REPOSITORIES  February 2010


Subject:

Re: Whether Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research

From:

Stevan Harnad <[log in to unmask]>


Date:

Mon, 8 Feb 2010 11:26:32 -0500


     ** APOLOGIES FOR CROSS-POSTING **

What follows below is -- I think readers will agree -- a conscientious  
and attentive series of responses to questions raised by Phil Davis  
about our paper testing whether the OA citation Advantage is just a  
side-effect of author self-selection (Gargouri et al, currently under  
refereeing) -- responses for which we did further analyses of our data  
(not included in the draft under refereeing).

Gargouri, Y., Hajjem, C., Lariviere, V., Gingras, Y., Brody, T., Carr,  
L. and Harnad, S. (2010) Self-Selected or Mandated, Open Access  
Increases Citation Impact for Higher Quality Research. (Submitted) http://eprints.ecs.soton.ac.uk/18346/

We are happy to have performed these further analyses, and we are very  
much in favor of this sort of open discussion and feedback on pre- 
refereeing preprints of papers that have been submitted and are  
undergoing peer review. They can only improve the quality of the  
eventual published version of articles.

However, having carefully responded to Phil's welcome questions,  
below, we will, at the end of this posting, ask Phil to respond in  
kind to a question that we raised about his own paper (Davis et al  
2008) a year and a half ago...

RESPONSES TO DAVIS'S QUESTIONS ABOUT OUR PAPER:

On 8-Jan-10, at 10:06 AM, Philip Davis wrote:

 > PD:
 > Stevan,
 > Granted, you may be more interested in what the referees of the paper
 > have to say than my comments; I'm interested in whether this paper is
 > good science, whether the methodology is sound and whether you interpret
 > your results properly.

We are very appreciative of your concern and hope you will agree that  
we have not been interested only in what the referees might have to  
say. (We also hope you will now in turn be equally responsive to a  
longstanding question about your own paper on this same topic.)

 > PD:
 > For instance, it is not clear whether your Odds Ratios are interpreted
 > correctly.  Based on Figure 4, OA article are MORE LIKELY to receive
 > zero citations than 1-5 citations (or conversely, LESS LIKELY to receive
 > 1-5 citations than zero citations).
 > You write: "For example, we can say for the first model that for a one
 > unit increase in OA, the odds of receiving 1-5 citations (versus zero
 > citations) increased by a factor of 0.957. Figure 4.. (p.9)

You are interpreting the figure incorrectly. It is the higher citation  
count that is in each case more likely, as co-author Yassine Gargouri  
pointed out to you in a subsequent response, to which you replied:

 > PD:
 > Yassine, Thank you for your response.  I find your odds ratio
 > methodology unnecessarily complex and unintuitive but now
 > understand your explanation, thank you.

Our article supports its conclusions with several different,
convergent analyses. The logistic regression analysis with the odds
ratios is one of them, and its results are fully corroborated by the
other, simpler analyses we also reported, as well as by the
supplementary analyses we append here now.
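
A minimal sketch of how an odds ratio of this kind reads, using hypothetical counts rather than the data in the paper (the function name and the numbers are illustrative assumptions, and a 2x2 table stands in here for the fitted logistic regression). A value above 1 means the higher citation category is the more likely one for OA articles; a value below 1 means the reverse.

```python
def odds_ratio(oa_high, oa_low, non_oa_high, non_oa_low):
    """Odds of the higher citation category (vs. the lower one) for OA
    articles, divided by the same odds for non-OA articles."""
    oa_odds = oa_high / oa_low
    non_oa_odds = non_oa_high / non_oa_low
    return oa_odds / non_oa_odds

# Hypothetical counts, for illustration only:
# 60 of 100 OA articles and 50 of 100 non-OA articles fall in the
# higher citation category.
ratio = odds_ratio(oa_high=60, oa_low=40, non_oa_high=50, non_oa_low=50)
print(round(ratio, 2))  # 1.5
```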

 > PD:
 > Similarly in Figure 4 (if I understand the axes correctly), CERN articles
 > are more than twice as likely to be in the 20+ citation category
 > than in the 1-5 citation category, a fact that may distort further
 > interpretation of your data as it may be that institutional effects may
 > explain your Mandated OA effect.  See comments by Patrick Gaule and Ludo
 > Waltman on the review http://j.mp/8LK57u

Here is the analysis underlying Figure 4, re-done without CERN, and  
then again re-done without either CERN or Southampton. As will be  
seen, the outcome pattern, as well as its statistical significance,  
are the same whether or not we exclude these institutions.

SUPPLEMENTARY FIGURE S1:
http://eprints.ecs.soton.ac.uk/18346/7/Supp1_CERN%2DSOTON.pdf

On 11-Jan-10, at 12:37 PM, Philip Davis wrote:

 > PD:
 > Changing how you report your citation ratios, from the ratio of log
 > citations to the log of citation ratios is a very substantial change to
 > your paper and I am surprised that you point out this reporting error at
 > this point.

As noted in Yassine's reply to Phil, that formula was incorrectly  
stated in our text, once; in all the actual computations, results,  
figures and tables, however, the correct formula was used.
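
The difference between the two formulas can be sketched as follows. This is only an illustration with made-up citation counts; the base-10 logarithm and the function names are assumptions, not taken from the paper.

```python
import math

def log_of_ratio(a, b):
    """Log of the citation ratio: log10(a / b)."""
    return math.log10(a / b)

def ratio_of_logs(a, b):
    """Ratio of the log citations: log10(a) / log10(b)."""
    return math.log10(a) / math.log10(b)

# The same 2:1 ratio gives the same log of the ratio at any scale...
print(round(log_of_ratio(2, 1), 3))    # 0.301
print(round(log_of_ratio(20, 10), 3))  # 0.301

# ...whereas the ratio of logs depends on the scale, and is undefined
# when b = 1, because log10(1) = 0.
print(round(ratio_of_logs(20, 10), 3))  # 1.301
```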

 > PD:
 > While it normalizes the distribution of the ratios, it is not without
 > problems, such as:
 >
 > 1. Small citation differences have very large leverage in your
 > calculations.  Example, A=2 and B=1, log (A/B)=0.3

The log of the citation ratio was used only in displaying the means  
(Figure 2), presented for visual inspection. The paired-sample t-tests  
of significance (Table 2) were based on the raw citation counts, not  
on log ratios, hence had no leverage in our calculations or their  
interpretations. (The paired-sample t-tests were also based only on  
2004-2006, because for 2002-2003 not all the institutional mandates  
were yet in effect.)

Moreover, both the paired-sample t-test results (2004-2006) and the
pattern of means (2002-2006) converged with the results of the (more
complicated) logistic regression analyses and subdivisions into
citation ranges.
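
For readers who want to see why raw counts raise no zero or leverage problem in a paired-sample t-test, here is a minimal sketch using only the Python standard library. The data are hypothetical, and pairing each OA article with the mean count of its control articles is an assumption about the setup, not a description of the exact computation behind Table 2.

```python
import math
from statistics import mean, stdev

def paired_t(oa_counts, control_counts):
    """Return (t statistic, degrees of freedom) for paired raw counts."""
    diffs = [oa - ctrl for oa, ctrl in zip(oa_counts, control_counts)]
    n = len(diffs)
    # Standard error of the mean difference (sample standard deviation).
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical pairs: an OA article's citation count and the mean count
# of its same-issue controls. Zero counts need no special handling.
oa = [5, 3, 8, 0, 12]
controls = [4, 3, 5, 1, 7]
t, df = paired_t(oa, controls)
print(f"t = {t:.3f}, df = {df}")
```

The test works on differences of raw counts, so no ratio is formed and no logarithm is taken.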

 > PD:
 > 2. Similarly, any ratio with zero in the denominator must be thrown out
 > of your dataset.  The paper does not inform the reader on how much data
 > was ignored in your ratio analysis and we have no information on the
 > potential bias this may have on your results.

As noted, the log ratios were only used in presenting the means, not  
in the significance testing, nor in the logistic regressions.

However, we are happy to provide the additional information Phil  
requests, in order to help readers eyeball the means. Here are the  
means from Figure 2, recalculated by adding 1 to all citation counts.  
This restores all log ratios with zeroes in the numerator (sic); the  
probability of a zero in the denominator is vanishingly small, as it  
would require that all 10 same-issue control articles have no citations!

The pattern is again much the same. (And, as noted, the significance  
tests are based on the raw citation counts, which were not affected by  
the log transformations that exclude numerator citation counts of zero.)
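
The add-one recalculation can be sketched as below. The function name, the base-10 logarithm and the counts are illustrative assumptions; the only thing taken from the text is that 1 is added to the citation counts before the log ratio is formed.

```python
import math

def log_ratio_plus_one(oa_cites, control_cites):
    """log10 of (OA count + 1) over (mean control count + 1)."""
    control_mean = sum(control_cites) / len(control_cites)
    return math.log10((oa_cites + 1) / (control_mean + 1))

# Hypothetical OA article with zero citations and 10 same-issue controls.
# Without the +1, log10(0 / 1.0) would be undefined and the article
# would drop out of the mean.
controls = [0, 1, 2, 0, 3, 1, 0, 0, 2, 1]  # mean = 1.0
print(round(log_ratio_plus_one(0, controls), 3))  # -0.301
```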

SUPPLEMENTARY FIGURE S2:
http://eprints.ecs.soton.ac.uk/18346/12/Supp2_Cites%2B1.pdf

This exercise suggested a further heuristic analysis that we had not  
thought of doing in the paper, even though the results had clearly  
suggested that the OA advantage is not evenly distributed across the  
full range of article quality and citeability: The higher quality,  
more citeable articles gain more of the citation advantage from OA.

In the following supplementary figure (S3), for exploratory and  
illustrative purposes only, we re-calculate the means in the paper's  
Figure 2 separately for OA articles in the citation range 0-4 and for  
OA articles in the citation range 5+.

SUPPLEMENTARY FIGURE S3:
http://eprints.ecs.soton.ac.uk/18346/17/Supp3_CiteRanges.pdf

The overall OA advantage is clearly concentrated on articles in the  
higher citation range. There is even what looks like an OA  
DISadvantage for articles in the lower citation range. This may be  
mostly an artifact (from restricting the OA articles to 0-4 citations  
and not restricting the non-OA articles), although it may also be  
partly due to the fact that when unciteable articles are made OA, only  
one direction of outcome is possible, in the comparison with citation  
means for non-OA articles in the same journal and year: OA/non-OA  
citation ratios will always be unflattering for zero-citation OA  
articles. (This can be statistically controlled for, if we go on to   
investigate the distribution of the OA effect across citation brackets  
directly.)
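
The artifact can be illustrated with a deliberately simple, hypothetical example in which OA and non-OA articles have exactly the same citation counts. Restricting only the OA side to the 0-4 range is enough to produce an apparent disadvantage.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical citation counts, identical for OA and non-OA articles,
# so there is no true OA effect in either direction.
counts = [0, 1, 2, 3, 4, 6, 10, 20]

# Restrict only the OA articles to the 0-4 range, as in the lower
# bracket; the non-OA comparison set stays unrestricted.
oa_low_range = [c for c in counts if c <= 4]
non_oa_all = counts

ratio = mean(oa_low_range) / mean(non_oa_all)
print(round(ratio, 3))  # 0.348
```

The ratio falls well below 1 purely because of the one-sided restriction.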

 > PD:
 > Have you attempted to analyze your citation data as continuous variables
 > rather than ratios or categories?

We will be doing this in our next study, which extends the time base  
to 2002-2008. Meanwhile, a preview is possible from plotting the mean  
number of OA and non-OA articles for each citation count. Note that  
zero citations is the biggest category for both OA and non-OA  
articles, and that the proportion of articles at each citation level  
decreases faster for non-OA articles than for OA articles; this is  
another way of visualizing the OA advantage. At citation counts of 30  
or more, the difference is quite striking, although of course there  
are few articles with so many citations:

SUPPLEMENTARY FIGURE S4:
http://eprints.ecs.soton.ac.uk/18346/22/Supp4_IndivCites.pdf
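
A tabulation of this general kind can be sketched as follows. The function name and the counts are hypothetical, and this is not the script used to produce the figure.

```python
from collections import Counter

def proportion_by_count(citation_counts):
    """Map each citation count to the share of articles having it."""
    tally = Counter(citation_counts)
    total = len(citation_counts)
    return {count: tally[count] / total for count in sorted(tally)}

# Hypothetical samples; zero citations is the largest category in both.
oa = [0, 0, 0, 1, 1, 2, 3, 5, 8, 30]
non_oa = [0, 0, 0, 0, 0, 1, 1, 1, 2, 3]

for label, sample in (("OA", oa), ("non-OA", non_oa)):
    print(label, proportion_by_count(sample))
```

Plotting the two resulting distributions side by side shows how quickly each one falls off as the citation count rises.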

--------

REQUEST FOR RESPONSE TO QUESTION ABOUT DAVIS ET AL'S (2008) Paper:

Davis, PN, Lewenstein, BV, Simon, DH, Booth, JG, & Connolly, MJL (2008)
Open access publishing, article downloads, and citations: randomised
controlled trial. British Medical Journal 337: a568
http://www.bmj.com/cgi/content/full/337/jul31_1/a568

Critique of Davis et al's paper:
"Davis et al's 1-year Study of Self-Selection Bias: No Self-Archiving
Control, No OA Effect, No Conclusion"
http://www.bmj.com/cgi/eletters/337/jul31_1/a568#199775

Davis et al had taken a 1-year sample of biological journal articles  
and randomly made a subset of them OA, to control for author self- 
selection. (This is comparable to our mandated control for author self- 
selection.) They reported that after a year, they found no significant  
OA Advantage for the randomized OA for citations (although they did  
find an OA Advantage for downloads) and concluded that this showed  
that the OA citation Advantage is just an artifact of author self- 
selection, now eliminated by the randomization.

What Davis et al failed to do, however, was to demonstrate, in the  
same sample and time-span, that author self-selection generates the OA  
citation Advantage. Without doing that, all they have shown is that in  
their sample and time-span, they found no significant OA citation  
Advantage. This is no great surprise, because their sample was small  
and their time-span was short, whereas many of the other studies
that have reported finding an OA Advantage were based on much larger  
samples and much longer time spans.

The question raised was about controlling for self-selected OA. If one  
tests for the OA Advantage, whether self-selected or randomized, there  
is a great deal of variability, across articles and disciplines,  
especially for the first year or so after publication. In order to  
have a statistically reliable measure of OA effects, the sample has to  
be big enough, both in number of articles and in the time allowed for  
any citation advantage to build up to become detectable and  
statistically reliable.

Davis et al need to do with their randomization methodology what we  
have done with our mandating methodology, namely, to demonstrate the  
presence of a self-selected OA Advantage in the same journals and  
years. Then they can compare that with randomized OA in those same  
journals and years, and if there is a significant OA Advantage for  
self-selected OA and no OA Advantage for randomized OA then they will  
have evidence that some or all of the OA Advantage is just a side- 
effect of self-selection. Otherwise, all they have shown is that with  
their journals, sample size and time-span, there is no detectable OA  
Advantage at all.

What Davis et al replied in their Authors' Response was instead this:

http://www.bmj.com/cgi/eletters/337/jul31_1/a568#200109

 > PD:
 > "Professor Harnad comments that we should have implemented
 > a self-selection control in our study. Although this is an
 > excellent idea, it was not possible for us to do so because,
 > at the time of our randomization, the publisher did not permit
 > author-sponsored open access publishing in our experimental
 > journals. Nonetheless, self-archiving, the type of open access
 > Prof. Harnad often refers to, is accounted for in our regression
 > model (see Tables 2 and 3)... Table 2  Linear regression output
 > reporting independent variable effects on PDF downloads for six
 > months after publication Self-archived: 6% of variance p = .361
 > (i.e., not statistically significant)... Table 3  Negative
 > binomial regression output reporting independent variable effects
 > on citations to articles aged 9 to 12 months Self-archived:
 > Incidence Rate 0.9 p = .716 (i.e., not statistically significant)...

This is not an adequate response. If a control condition was needed in  
order to make an outcome meaningful, it is not sufficient to reply
that "the publisher and sample allowed us to do the experimental  
condition but not the control condition."

Nor is it an adequate response to reiterate that there was no  
significant self-selected self-archiving effect in the sample (as the  
regression analysis showed). That is in fact bad news for the  
hypothesis being tested.

Nor is it an adequate response to say, as Phil did in a later posting,  
that even after another half year or more had gone by, there was
still no significant OA Advantage. (That is just the sound of one hand  
clapping again, this time louder.)

The only way to draw meaningful conclusions from Davis et al's  
methodology is to demonstrate the self-selected self-archiving  
citation advantage, for the same journals and time-span, and then to  
show that randomization wipes it out.

Until then, our own results, which do demonstrate the self-selected  
self-archiving citation advantage for the same journals and time-span,  
show that mandating the self-archiving does not wipe it out.

Meanwhile, Davis et al's finding that although their randomized OA did  
not generate a citation increase, it did generate a download increase,  
suggests that with a larger sample and time-span there may well be  
scope for a citation advantage as well: Our own prior work and that of  
others has shown that higher earlier download counts tend to lead to  
higher later citation counts.

Bollen, J., Van de Sompel, H., Hagberg, A. and Chute, R. (2009)
A principal component analysis of 39 scientific impact measures
arXiv.org, arXiv:0902.2183v1 [cs.CY], 12 Feb. 2009, in PLoS ONE 4(6):  
e6022, http://dx.doi.org/10.1371/journal.pone.0006022

Brody, T., Harnad, S. and Carr, L. (2006) Earlier Web Usage Statistics
as Predictors of Later Citation Impact. Journal of the American
Society for Information Science and Technology (JASIST) 57(8)
1060-1072.
http://eprints.ecs.soton.ac.uk/10713/

Lokker, C., McKibbon, K. A., McKinlay, R.J., Wilczynski, N. L. and  
Haynes, R. B. (2008)  Prediction of citation counts for clinical  
articles at two years using data available within three weeks of  
publication: retrospective cohort study
BMJ, 2008;336:655-657 http://www.bmj.com/cgi/content/abstract/336/7645/655

Moed, H. F. (2005) Statistical Relationships Between Downloads and  
Citations at the Level of Individual Documents Within a Single Journal  
(abstract only)
Journal of the American Society for Information Science and  
Technology, 56(10): 1088- 1097

O'Leary, D. E. (2008)  The relationship between citations and number  
of downloads
Decision Support Systems. 45(4): 972-980 http://dx.doi.org/10.1016/j.dss.2008.03.008

Watson, A. B. (2009) Comparing citations and downloads for individual  
articles
Journal of Vision, 9(4): 1-4
http://journalofvision.org/9/4/i/
