Scientific Authorship

09 Jun 2016

Responsible Conduct of Research Course, Session 6: Authorship & Peer Review

Written as a make-up assignment for missing the session on 08 June 2016.

Write an essay (1-3 pages) describing how you would handle ethical questions related to authorship or peer review if you were a faculty member running your own research group.


Authorship is a critical aspect of the academic scientific process. Peer-reviewed publications are the currency of the academic world and as such have a large impact on an individual’s career. There are many ethical considerations in choosing the authors of a manuscript and their order, and navigating them largely falls to the principal investigator. These include when authorship decisions should be discussed, what level of contribution warrants authorship, how ordering should be determined (particularly who should be the first author), and when co-first authors are warranted and how they should be ordered. In my limited experience I have observed a large amount of variation between labs in the way each of these issues has been handled. Here I will present my current thinking on how I would manage such decisions if I had my own lab.

When should authorship be discussed?

Some of the most difficult situations of my scientific career have revolved around authorship, particularly who would be the first author, though perhaps that’s an indication that I’ve been relatively fortunate thus far. In general I think it’s best if each person enters into their contribution to the paper with a clear understanding of the benefit to them, and therefore that the first author in particular be assigned from the outset. There are situations when this breaks down, particularly when an individual leaves the lab before a manuscript has been published, and I’ll touch on that a bit in the “who should be first author?” section. I strongly disagree with letting the authorship “shake out” as the work progresses. That might be the most efficient way to motivate extra work from competing individuals in a lab, but I think it takes an unnecessary psychological toll.

Broadly speaking I don’t pay particular attention to the positions of authors 2 to n-1, and don’t think the precise order for those individuals needs to be determined beforehand.

What level of contribution warrants authorship?

At one of my first responsible conduct of research training courses I remember Susan Baserga, in the Molecular Biophysics and Biochemistry department at Yale, saying that in her lab co-authorship is earned by contributing a figure to a paper. While I think this definition roughly works well, there are many gray situations. One example is when a senior individual is training the first author on a technique and the figure data are generated only through significant hand-holding. This is opposed to the first author spending months or years adapting and modifying the methods to their system. In my mind the former clearly warrants co-authorship for the instructor and the latter perhaps not, and within that spectrum I would lean farther towards including the instructor as a co-author.

Another complication to the figure rule is the contribution of data for an analysis, particularly when the analysis constitutes a significant effort. This is often true of sequencing projects where the data generation can be fairly routine. In such cases I feel co-authorship should be offered to the data generators if the protocol for generating the data had to be significantly optimized or modified, or if the size of the data warranted special consideration (e.g. thousands of processed samples).

A recent NEJM editorial by Longo and Drazen decried as “research parasites” those who use publicly available data, particularly from clinicians, without involving the data generators and offering co-authorship. The backlash from this article demonstrates support for the position that such usage sans co-authorship is not only reasonable but should be encouraged, which I very much agree with. However, I do think that the contribution of a high-quality dataset to the community is undervalued; that is, lots of citations are of significantly less value than lots of publications. Such datasets are fairly uncommon, however, so perhaps special considerations should be made in those cases.

I think the greatest amount of variation occurs with the contribution of ideas, whether they come from thesis committee members or lab members. In the Susan Baserga session I mentioned earlier I also recall her saying that “ideas are cheap”. Such contributions are often relegated to the Acknowledgements section and I wish they were better tracked. An extremely smart postdoc in Scott Strobel’s lab (where I completed my PhD) generously discussed technical details of experiments with individuals, which got him acknowledged on nearly every paper that came out of the lab. It would have been nice if that time could have been leveraged when he went on the job market. My position is to agree with Susan on this one – authorship should revolve around the data, but it would be great to give credit for ideas above and beyond the typical sort of lab meeting-type suggestions.

How should ordering be determined and particularly who should be the first author?

As I described above, I think the only authors that really matter are the first and last. In cases where an individual leaves before the manuscript is completed I agree with Scott Strobel’s position. In his system first authorship goes to the person who completes the work and prepares the manuscript for publication. At first I thought this was rather harsh to those who had left, but having now experienced a few arduous review processes I agree with him. If, however, an individual is only stepping in to complete the additional experiments requested by reviewers, that warrants co-authorship rather than first authorship. This position is tied to the wet lab; for more computationally focused manuscripts, where the work can be completed remotely, the original authorship should clearly be retained.

When are co-first authors warranted and how should they be ordered?

I think co-first authorship should be reserved for situations where two independent projects merge for a publication. That being said, at this point I have more co-first-authored publications than sole first-author ones, which is perhaps why I feel so strongly about making sure such positions are clear ahead of time.

As for the ordering, Scott Strobel’s view was that if it isn’t alphabetical he wonders if it’s really an equal contribution. I think chronological ordering could also make sense, e.g. the co-first author who worked on the project earlier is listed first. This ordering is important because many citation managers miss the co-first authors in both the in-line citation (e.g. producing “Smith et al., 2015” rather than “Smith and Doe et al., 2015”) and the bibliography.


I expect my thinking on this matter will change over the years, particularly as the average number of authors continues to increase. More careful tracking of contributions through ORCIDs may make the process more quantifiable, which I think is always, literally always, a good thing. I also wonder about the effect of alternative forms of science communication on authorship, and whether comments on bioRxiv, R packages on CRAN, papers reviewed on Publons or even followers on Twitter may dilute a hiring committee’s or dean’s focus on the h-index. I expect that as the forms of science communication diversify so too will CVs.

