
JISC-REPOSITORIES Archives

JISC-REPOSITORIES@JISCMAIL.AC.UK


JISC-REPOSITORIES  November 2006

Subject:

Self-Archiving Impact Advantage: Quality Advantage or Quality Bias?

From:

Stevan Harnad <[log in to unmask]>

Reply-To:

Stevan Harnad <[log in to unmask]>

Date:

Mon, 20 Nov 2006 21:49:42 +0000


    Self-Archiving Impact Advantage: Quality Advantage or Quality Bias?

                 Stevan Harnad

    SUMMARY: In astrophysics, Kurtz found that articles that were
    self-archived by their authors in Arxiv were downloaded and cited
    twice as much as those that were not. He traced this enhanced citation
    impact to two factors: (1) Early Access (EA): The self-archived
    preprint was accessible earlier than the publisher's version (which
    is accessible to all research-active astrophysicists as soon as
    it is published, thanks to Kurtz's ADS system). (Hajjem, however,
    found that in other fields, which self-archive only published
    postprints and do have accessibility/affordability problems with
    the publisher's version, self-archived articles still have enhanced
    citation impact.) Kurtz's second factor was: (2) Quality Bias (QB),
    a selective tendency for higher quality articles to be preferentially
    self-archived by their authors, as inferred from the fact that the
    proportion of self-archived articles turns out to be higher among
    the more highly cited articles.  (The very same finding is of course
    equally interpretable as (3) Quality Advantage (QA), a tendency for
    higher quality articles to benefit more than lower quality articles
    from being self-archived.) In condensed-matter physics, Moed has
    confirmed that the impact advantage occurs early (within 1-3 years of
    publication). After article-age is adjusted to reflect the date of
    deposit rather than the date of publication, the enhanced impact of
    self-archived articles is again interpretable as QB, with articles by
    more highly cited authors (based only on their non-archived articles)
    tending to be self-archived more.  (But since the citation counts
    for authors and for their articles are correlated, one would expect
    much the same outcome from QA too.) The only way to test QA vs. QB
    is to compare the impact of self-selected self-archiving with
    mandated self-archiving (and no self-archiving).  (The outcome is
    likely to be that both QA and QB contribute, along with EA, to the
    impact advantage.)

Michael Kurtz's papers have confirmed that in astronomy/astrophysics
(astro), articles that have been self-archived -- let's call this
"Arxived" to mark it as the special case of depositing in the central
Physics Arxiv -- are cited (and downloaded) twice as much as non-Arxived
articles. Let's call this the "Arxiv Advantage" (AA).
http://arxiv.org/

    Henneken, E. A., Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant,
    C., Thompson, D., and Murray, S. S. (2006) Effect of E-printing
    on Citation Rates in Astronomy and Physics. Journal of Electronic
    Publishing, Vol. 9, No. 2
    http://arxiv.org/abs/cs/0604061

    Henneken, E. A., Kurtz, M. J., Warner, S., Ginsparg, P., Eichhorn, G.,
    Accomazzi, A., Grant, C. S., Thompson, D., Bohlen, E. and Murray, S.
    S. (2006) E-prints and Journal Articles in Astronomy: a Productive
    Co-existence (submitted to Learned Publishing)
    http://arxiv.org/abs/cs/0609126

    Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant, C. S., Demleitner,
    M., Murray, S. S. (2005) The Effect of Use and Access on Citations.
    Information Processing and Management, 41 (6): 1395-1402
    http://cfa-www.harvard.edu/~kurtz/kurtz-effect.pdf

Kurtz analyzed AA and found that it consisted of at least 2 components:

(1) EARLY ACCESS (EA): There is no detectable AA for old articles in
astro: AA occurs while an article is young (1-3 years). Hence astro
articles that were made accessible as preprints before publication show
more AA: This is the Early Access effect (EA). But EA alone does not
explain why AA effects (i.e., enhanced citation counts) persist
cumulatively and even keep growing. If EA were merely a phase-advancing
of otherwise un-enhanced citation counts, then simply re-calculating an
article's age so as to begin at preprint deposit time instead of
publication time would eliminate all AA effects -- which it does not.
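
The age re-calculation described above can be illustrated with a minimal
sketch. The dates and citation events are invented for illustration, and
the function name is hypothetical; this is not Kurtz's actual procedure.

```python
from datetime import date, timedelta

def cumulative_citations(citation_dates, start, age_days):
    """Count citations received up to age_days after the chosen start date."""
    cutoff = start + timedelta(days=age_days)
    return sum(1 for d in citation_dates if d <= cutoff)

# Hypothetical Arxived article: preprint deposited 180 days before publication.
deposit = date(2004, 1, 1)
published = date(2004, 6, 29)
citation_dates = [date(2004, 3, 1), date(2004, 9, 1),
                  date(2005, 2, 1), date(2005, 5, 1)]

# Age reckoned from publication: the preprint's head start is counted in.
at_one_year_from_publication = cumulative_citations(citation_dates, published, 365)
# Age reckoned from deposit: the head start is removed.
at_one_year_from_deposit = cumulative_citations(citation_dates, deposit, 365)
```

If AA were nothing but a head start, the deposit-based count for an
Arxived article would match the count for a comparable non-Arxived
article of the same age. The finding reported above is that it does not.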

(2) QUALITY BIAS (QB): (Kurtz called the second component
"Self-Selection Bias" for quality, but I call it self-selection Quality
Bias, QB): If we compare articles within roughly the same
citation/quality bracket (i.e., articles having the same number of
citations), the proportion of Arxived articles becomes higher in the
higher citation brackets, especially the top 200 papers. Kurtz
interprets this as resulting from authors preferentially Arxiving
their higher-quality preprints (Quality Bias).

Of course the very same outcome is just as readily interpretable as
resulting from Quality Advantage (QA) (rather than Quality Bias (QB)):
i.e., that the Arxiving benefits better papers more. (Making a
low-quality paper more accessible by Arxiving it does not guarantee more
citations, whereas making a high-quality paper more accessible is more
likely to do so, perhaps roughly in proportion to its higher quality,
allowing it to be used and cited more according to its merit,
unconstrained by its accessibility/affordability.)
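
The bracket comparison behind this inference can be sketched as follows.
The article data, the bracket boundaries and the function name are all
assumptions made for illustration; they are not Kurtz's data or method.

```python
from collections import defaultdict

def arxived_share_by_bracket(articles, bracket_lower_bounds):
    """Return the proportion of Arxived articles in each citation bracket.

    articles: iterable of (citation_count, is_arxived) pairs.
    bracket_lower_bounds: ascending lower bounds; the first must be 0.
    """
    totals = defaultdict(int)
    arxived = defaultdict(int)
    for citations, is_arxived in articles:
        # An article belongs to the highest bracket whose bound it reaches.
        lower = max(b for b in bracket_lower_bounds if b <= citations)
        totals[lower] += 1
        if is_arxived:
            arxived[lower] += 1
    return {lower: arxived[lower] / totals[lower] for lower in sorted(totals)}

# Invented sample: (citation count, was the article Arxived?)
sample = [(2, False), (3, False), (5, True), (1, False),
          (12, True), (15, False), (18, True), (11, False),
          (40, True), (55, True), (60, False), (70, True)]

shares = arxived_share_by_bracket(sample, [0, 10, 30])
```

A rising share across brackets is exactly the pattern at issue, and the
table alone cannot say whether better papers were Arxived more (QB) or
Arxived papers were cited more according to their merit (QA).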

There is no way, on the basis of existing data, to decide between QA and
QB. The only way to measure their relative contributions would be to
control the self-selection factor: randomly imposing Arxiving on half of
an equivalent sample of articles of the same age (from preprinting age
to 2-3 years postpublication, reckoning age from deposit date, to
control also for age/EA effects), and comparing also with self-selected
Arxiving.
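
A minimal sketch of such a controlled comparison follows. The function
names, the citation counts and the simple subtraction used to separate QA
from QB are assumptions for illustration only; a real study would also
need matched ages and adequate sample sizes.

```python
import random
from statistics import mean

def assign_randomly(article_ids, seed=0):
    """Split a sample in half at random: (imposed Arxiving, no Arxiving)."""
    rng = random.Random(seed)
    shuffled = list(article_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])

def split_advantage(imposed, self_selected, not_archived):
    """Rough split of the impact advantage, given citation counts per group.

    Assumed reading: imposed minus non-archived estimates QA (no author
    choice involved); self-selected minus imposed estimates QB.
    """
    qa = mean(imposed) - mean(not_archived)
    qb = mean(self_selected) - mean(imposed)
    return qa, qb

imposed_ids, control_ids = assign_randomly(range(10))

# Invented citation counts for articles of equal age (from deposit date).
qa, qb = split_advantage(imposed=[6, 8, 10],
                         self_selected=[10, 12, 14],
                         not_archived=[4, 6, 8])
```

With these invented numbers the split attributes 2 citations to QA and 4
to QB. Real data could of course give any mix, including both.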

We are trying an approximation to this method, using articles deposited
in Institutional Repositories of institutions that mandate
self-archiving (and comparing their citation counts with those of
articles from the same journal/issue that have not been self-archived),
but the sample is still small and possibly unrepresentative, with many
gaps and other potential liabilities. So a reliable estimate of the
relative size of QA and QB still awaits future research, when
self-archiving mandates will have become more widely adopted.

Henk Moed's data on Arxiving in Condensed Matter physics (cond-mat)
replicates Kurtz's findings in astro (and Davis/Fromerth's, in math):

    Moed, H. F. (2006, preprint) The effect of 'Open Access' upon citation
    impact: An analysis of ArXiv's Condensed Matter Section
    http://arxiv.org/abs/cs.DL/0611060

    Davis, P. M. and Fromerth, M. J. (2007) Does the arXiv lead to
    higher citations and reduced publisher downloads for mathematics
    articles? Scientometrics, accepted for publication.
    http://arxiv.org/abs/cs.DL/0603056
    See critiques:
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/subject.html#5221
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/5440.html

Moed too has shown that in cond-mat the AA effect (which he calls CID,
"Citation Impact Differential") occurs early (1-3 years) rather than
late (4-6 years), and that there is more Arxiving by higher-quality
authors (based on higher citation counts for their non-Arxived
articles) than by lower-quality authors. But this too is just as readily
interpretable as the result of QB or QA (or both): We would of course
expect a high correlation between an author's individual articles'
citation counts and the author's average citation count, whether the
author's citation count is based on Arxived or non-Arxived articles.
These are not independent variables.

(Less easily interpretable -- but compatible with either QA or QB
interpretations -- is Moed's finding of a smaller AA for the "more
productive" authors. Moed's explanations in terms of co-authorships
between more productive and less productive authors, senior and junior,
seem a little complicated.)

The basic question is this: Once the AA has been adjusted for the
"head-start" component of the EA (by comparing articles of equal age --
the age of Arxived articles being based on the date of deposit of the
preprint rather than the date of publication of the postprint), how big
is that adjusted AA, at each article age? For that is the AA without any
head-start. Kurtz never thought the EA component was merely a head
start, however, for the AA persists and keeps growing, and is present in
cumulative citation counts for articles at every age since Arxiving
began. This non-EA AA is either QB or QA or both. (It also has an
element of Competitive Advantage, CA, which would disappear once
everything was self-archived, but let's ignore that for now.)

    Harnad, S. (2005) OA Impact Advantage = EA + (AA) + (QB) + QA +
    (CA) + UA. Preprint.
    http://eprints.ecs.soton.ac.uk/12085/

Moed's analysis, like Kurtz's, cannot decide between QB and QA. The fact
that most of the AA comes in an article's first 3 years rather than its
second 3 years simply shows that both astro and cond-mat are
fast-developing fields. The fact that highly-cited articles (Kurtz) and
articles by highly-cited authors (Moed) are more likely to be Arxived
certainly does not settle the question of cause and effect: It is just
as likely that better articles benefit more from Arxiving (QA) as that
better authors/articles tend to Arxive/be-Arxived more (QB).

Nor is Arxiv the only test of the self-archiving Open Access Advantage.
(Let's call this OAA, generalizing from the mere Arxiving Advantage,
AA): We have found an OAA with much the same profile as the AA in 10
further fields, for articles of all ages (from 1 year old to 10 years
old), and as far as we know, with the exception of Economics, these are
not fields with a preprinting culture (i.e., they don't self-archive
prepublication preprints but only postpublication postprints). Hence
the consistent pattern of OAA across all fields and across articles of
all ages is very unlikely to have been just a head-start (EA) effect.

    Hajjem, C., Harnad, S. and Gingras, Y. (2005) Ten-Year
    Cross-Disciplinary Comparison of the Growth of Open Access and How
    it Increases Research Citation Impact. IEEE Data Engineering Bulletin
    28(4) pp. 39-47.
    http://eprints.ecs.soton.ac.uk/11688/

Is the OAA, then, QB or QA (or both)? There is no way to determine this
unless the causality is controlled by randomly imposing the
self-archiving on a subset of a sufficiently large and representative
random sample of articles of all ages (but especially newborn ones) and
comparing the effect across time.

In the meantime, here are some factors worth taking into account:

(1) Both astro and cond-mat are fields where it has been repeatedly
claimed that the accessibility/affordability problem for published
postprints is either nonexistent (astro) or less pronounced than in
other fields. Hence the only scope for an OAA in astro and cond-mat is
at the prepublication preprint stage.

(2) In many other fields, however, not only is there no prepublication
preprint self-archiving at all, but there is a much larger
accessibility/affordability barrier for potential users of the published
article. Hence there is far more scope for OAA and especially QA (and
CA): Access is a necessary (though not a sufficient) causal precondition
for impact (usage and citation).

It is hence a mistake to overgeneralize the phys/math AA findings to OAA
in general. We need to wait till we have actual data before we can draw
confident conclusions about the degree to which the AA or the OAA are a
result of QB or QA or both (and/or other factors, such as CA).

For the time being, I find the hypothesis of a causal QA (plus CA)
effect, successfully sought by authors because they are desirous of
reaching more users, far more plausible and likely than the hypothesis
of an a-causal QB effect in which the best authors are self-archiving
merely out of superstition or vanity! (And I suspect the truth is a
combination of both QA/CA and QB.)

Stevan Harnad
