Students vs. professionals in software engineering experiments
Experiments are an essential component of any engineering discipline. When experiments involve people as subjects, it is crucial that the subjects are representative of the population of interest.
Academic researchers have easy access to students, but find it difficult to recruit professional developers as subjects.
If the intent is to generalize the results of an experiment to the population of students, then using students as subjects sounds reasonable.
If the intent is to generalize the results of an experiment to the population of professional software developers, then using students as subjects is questionable.
What is it about students that makes them likely to be very poor subjects to use in experiments designed to learn about the behavior and performance of professional software developers?
The difference between students and professionals is practice and experience. Professionals have spent many thousands of hours writing code and attending meetings discussing the development of software; they have many more experiences of the activities that occur during software development.
The hours of practice reading and writing code give professional developers a fluency that enables them to concentrate on the problem being solved, not on technical coding details. Yes, there are students who have this level of fluency, but most have not spent the many hours of practice needed to achieve it.
Experience gives professional developers insight into what is unlikely to work and what may work. Without experience, students have no way of evaluating the first idea that pops into their head, or a situation presented to them in an experiment.
People working in industry are well aware of the difference between students and professional developers. Every year a fresh batch of graduates starts work in industry. The difference between a new graduate and one with a few years’ experience is apparent for all to see. And no, Masters and PhD students are often not much better and in some cases worse (their prolonged sojourn in academia means that they have had more opportunity to pick up impractical habits).
It’s no wonder that people in industry laugh when they hear about the results from experiments based on student subjects.
Just because somebody has “software development” in their job title does not automatically make them an appropriate subject for an experiment targeting professional developers. There are plenty of managers with people skills and minimal technical skills (sub-student level in some cases).
In the software-related experiments I have run, subjects were asked how many lines of code they had read/written. The low values started at 25,000 lines. The intent was for the results of the experiments to be generalized to the population of people who regularly wrote code.
Psychology journals are filled with experimental papers that used students as subjects. The intent is to generalize the results to the general population. It has been argued that students are not representative of the general population in that they have spent more time reading, writing and reasoning than most people. These subjects have been labeled as WEIRD (Western, Educated, Industrialized, Rich and Democratic).
I spend a lot of time reading software engineering papers. If a paper involves human subjects, the first thing I do is find out whether the subjects were students (usual) or professional developers (not common). Authors sometimes put effort into dressing up their student subjects as having professional experience (perhaps some of them have spent a year or two in industry, but talking to the authors often reveals that the professional experience was tutoring other students); others say almost nothing about the identity of the subjects. Papers describing experiments using professional developers trumpet this fact in the abstract and throughout the paper.
I usually delete any paper using student subjects; some of the better ones are kept in a subdirectory called students.
Software engineering researchers are currently going through another bout of hand-wringing over the use of student subjects. One paper makes the point that a student-based experiment is a good way of validating an experiment that will later involve professional developers. This is a good point, but it ignores the problem that researchers rarely move on to using professional subjects; many researchers only ever intend to run student-based experiments. Also, they publish the results from the student-based experiment, which are at best misleading (but academics get credit for publishing papers, not for the content of the papers).
Researchers are complaining that reviewers are rejecting their papers on student-based experiments. I’m pleased to hear that reviewers are rejecting these papers.
Good points. I think it means that software engineering researchers need access to software professionals for such experiments to have real meaning. Perhaps some larger companies would be able to work with the researchers. But those kinds of relationships seem to work best if the researcher has some personal connection to the company.
But students are indeed plentiful and cheap!
@Matt Doar
Many researchers (or at least the ones I talk to) seem to have a mental block on approaching people in industry. I don’t see many people from industry wanting to go to academic workshops and the like (which are incomprehensible without some familiarity with the field). Which means researchers are going to have to visit industry.
How to make contact with people in industry? One viable option, at least in large cities, is attending public technical meetings, e.g., meetup.com has all kinds of events listed.
The problem with student subjects is the huge variation in abilities, which adds noise to the data, and the large learning effect that occurs during the experiment (which skews the results).
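The cost of that extra variation can be sketched with the textbook standard error of a difference between two group means. This is only an illustration: the task-completion times and spreads below are made-up numbers, not measurements from any experiment, and the function names are hypothetical. It assumes two equal-sized groups sharing a common standard deviation.

```python
import math


def standard_error_of_difference(sd: float, n_per_group: int) -> float:
    """Standard error of the difference between two group means,
    assuming equal group sizes and a common standard deviation."""
    return math.sqrt(2.0 * sd ** 2 / n_per_group)


def subjects_needed(sd: float, target_se: float) -> int:
    """Subjects per group needed to bring the standard error of the
    difference down to target_se, under the same assumptions."""
    return math.ceil(2.0 * sd ** 2 / target_se ** 2)


# Hypothetical spreads in task-completion time (minutes); illustrative only.
low_spread_sd = 5.0    # assumed: a fairly uniform subject pool
high_spread_sd = 15.0  # assumed: a pool with widely varying ability

for label, sd in [("low spread", low_spread_sd), ("high spread", high_spread_sd)]:
    se = standard_error_of_difference(sd, n_per_group=20)
    n = subjects_needed(sd, target_se=2.0)
    print(f"{label}: SE with 20 per group = {se:.2f}; "
          f"subjects per group for SE of 2.0 = {n}")
```

Under these assumptions, tripling the spread triples the standard error for a fixed number of subjects, and roughly nine times as many subjects are needed to get the same precision. The learning effect is a separate problem that this sketch does not model.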
I’ve been working in industry all my career, though I’ve tried to pay attention to, and to some degree stay in touch with, the academic world (including being an ACM member). There are others I’ve met in industry with similar interests, though I must admit we seem to be rather rare.
When I was in Toronto it was extremely difficult just to get a bunch of like-minded industry people into the same room at the same time, and nigh on impossible to attract any academics to the same meetings, unless perhaps they were more interested in transitioning into industry themselves. New trends and new tools and such will attract some industry people as they try to learn and keep up with their career development, for at least a short while until they go back to their 9-5 existence. Here in this small BC city where I now live there is a fair concentration of computing-related industry, though their focus is primarily only on UI-related things (web developers, game designers, etc.), so I’ve been less interactive with them.
Those few of us I mentioned first would go to lectures and symposiums at the local universities, but unfortunately those of us in more stressful 9-5 positions, or who were at offices in less central locations, would often be unable to attend. More often than not I’ve met with complete puzzlement from colleagues and a flat-out “NO” from management when asking to attend academic conferences. Perhaps my ability to interact with the academic world had more to do with mostly working for myself than anything else, though I do know that many colleagues would express some interest once they learned what I was interested in.
There’s one big yearly event in the Toronto area where there has been a much more direct attempt to bring together industry and researchers, and that’s IBM’s CASCON conference. It happens to be next week. Unfortunately I’ve been unable to attend since moving out to BC.
As for the use of students as subjects in software engineering studies, well, these days I would argue they are pure noise and nothing but, with barely the ability to be used to test the most basic mechanics of a study; though when I was a student I may have tried to argue otherwise. WEIRD indeed, though not really “western” (at least in heritage) any more. To evaluate student experience vs. professional experience I would also say a student’s lines-of-code-read metric needs to be adjusted by dividing by at least 1000; though on the other hand students often focus on writing much “better” (more elegant, and sometimes more efficient) code than many industry professionals seem to do, at least in my limited experience.
@Greg A. Woods
Academics rarely attend events outside their narrow field (which is another problem). They recognize the need to spend time learning the background.
Academic work contains a few gold nuggets and lots of useless nonsense; it takes time to learn to recognize the nonsense.
The information content at industrial events is just as low.
Events are really about networking.