Training using a clicker is very popular, and is gaining ground amongst animal trainers, but here’s what may come as a surprise to you:
When scientists compare training with a clicker to training with only treats as rewards (or reinforcers, to be more precise) outside the laboratory, the results are inconclusive.
One study found the clicker led to faster learning, one that it led to slower learning, and four studies found no difference between the two treatments.
“What! That’s preposterous! I know it works!” some of the most enthusiastic clicker proponents out there may say, and simply ignore those types of publications.
“I knew it!” the anti-clicker crowd may say, and use those articles as ammunition.
For the rest, those types of inconclusive or contradictory data may sow a seed of doubt, a small nagging voice of uncertainty.
And since doubt will likely interfere with your training success, the purpose of this blog post is to dispel it.
Personally, I think those studies have some serious methodological problems that might explain the results. Maybe, as I walk you through my line of thought, you’ll agree with me.
For simplicity’s sake, the term “click” in this post could include any stimulus that’s been consistently paired with a treat, such as the sound of a clicker, a whistle, a spoken word, or a flashing laser light. “Treat” could refer to any unconditioned stimulus that has innately reinforcing properties: for instance food, play, or attention.
So, how does the established clicker work in animal training?
A seemingly trivial question, but let’s look at a training trial for a simple behaviour.
You say “sit”, the clicker-trained dog sits, you click and give him a treat.
What just happened? What was the intended function of the click?
a) The click predicted treats (=it was a reinforcer)
b) The click pinpointed the exact moment when the criterion was met (=it was a marker signal)
c) The click spanned the interval between the behaviour and the consequence (=it was a bridge)
a) Reinforcer. The power of the click as a reinforcer depends on the degree of classical conditioning: the number of pairings (click-treat), the quality of the treats, the uniqueness of the combination of the sound and the chosen treat, and so on. It is not all or nothing: it’s everything in between too.
b) Marker. This depends on the other information available to the animal in the current setting. Can it get “marking” information from anything else occurring simultaneously? For instance, when the animal touches a target and gets a click at the same time, those two stimuli need to be processed by the animal’s brain – and sometimes maybe the touch will be more salient to the animal, sometimes the click. Animals are hard wired to attend to stimuli; they will orient towards them, approach them, sniff and investigate. Touching is part of the behaviour and tactile feedback will likely be very noticeable by the animal. In addition, nose-touching involves an olfactory component for many animal species. Also, how much the animal will attend to the clicking sound will likely be a function of previous exposure to that sound: is it reinforcing?
c) Bridge. Animals learn more slowly, or fail to learn the task at all, when the consequences are delayed even by a very short interval. The click predicts the upcoming consequence, so it can lessen the effects of delayed primary reinforcement.
Wow, three-in-one: a marker, a reinforcer, and a bridge! If the click can do all that, no wonder it’s such a popular tool. So, what’s the problem?
Let’s start with laboratory studies of learning.
We expect using a clicker to work
There’s been a lot of work, mainly on rats and pigeons in the laboratory environment, on the effect of conditioned reinforcers (the click being one example) on the acquisition of behaviour. Conditioned reinforcers have been shown to speed up learning, lead to better retention of learned behaviour, produce a euphoric response due to the activation of the SEEKING system, as well as improve the resistance to extinction. In the laboratory setting, there’s not much disagreement about the effects of conditioned reinforcers on brain chemistry and behaviour, as far as I know.
The problems begin when we move this to the applied setting. The real world, outside the laboratory.
But before we do that: what do clicker trainers say about their hands-on experiences?
Clicker trainers say using a clicker works
Many clicker trainers speak highly and enthusiastically of the benefits of training with a clicker. They see engagement and focus, they experience speedy acquisition of new behaviours, and they find that animals remember what they learned through clicker training years later. They say that it encourages the animal to think for himself, becoming a fully active thinking participant. They say it makes the animal happy and confident.
In a review including 25 sources, a recent study quoted one trainer:
“Clicker training will turn your dog into a learning junkie – a dog who is eager to offer behaviours and to experiment to get you to reward”.
Those anecdotal reports aren’t very scientific, though. Critics might argue that clicker trainers are biased, ignorant or misinformed. That if you’ve never not used a clicker when training animals, you’re not qualified to make that comparison. I, for one, wouldn’t qualify. I read, write and teach more than I do actual training, and when I do train, I use the clicker a lot. Probably more than I would have to.
But many trainers have trained both using clickers and not using them, so I was interested in their experience.
I hang out in a few training groups on Facebook, so I asked: “Those of you who’ve trained both using and not using clickers (or other event markers), which of the following statements do you agree the most with?”
And then they could choose whether clickers always sped up the animals’ learning, sometimes, never, or if they didn’t see a difference.
I know, not the greatest survey. I tend to shoot from the hip, as it were.
Within about 24 hours, I had 73 replies. Within this diverse group of trainers, with experiences from training pets, farm animals, as well as zoo animals: this was the outcome.
So, when scientists started setting up experimental studies, we expected them to confirm what these basic scientific studies, and observations from clicker trainers, have suggested.
Only the studies published to date haven’t confirmed it.
Current data says clickers don’t work
So, the efficacy of the clicker hasn’t been corroborated by scientific studies carried out in applied settings. In the real world, outside the laboratory, five out of six studies found no difference, or that learning actually slowed down. Only one found that using a clicker when training increased learning speed.
So, were the original laboratory data wrong, or are clicker trainers delusional?
Or – dare we ask: are those 6 studies asking the right questions? Looking at the hypothesis from the right angle?
I think not. After browsing those articles with a critical eye, I find a few major problems with them.
Problem 1 – insufficient conditioning
Firstly, in all those studies, they condition the clicker a maximum of 20 times (click-treat, click-treat, click-treat) and then compare the “clicker group” to the unconditioned control group. In fact, in several of the studies, there’s no classical conditioning at all before the operant learning begins, if I read them right.
In other words, the animals immediately start learning the task, one group getting “click-treat” and the other just “treat” as the consequence.
So the animal in the clicker group has to sort out both that behaviour has an effect (operant learning), and that the clicking sound predicts the delivery of the treat (classical conditioning).
And the scientists start collecting data from trial one. So, if anything, we’d expect the animals to start at the same point, perhaps the clicker trained animals even being at a slight disadvantage because they need to untangle more information than the animals trained with treats only.
In other words, if the power of the click takes longer to establish than the max 20 pairings that any of these studies allowed, the researchers are looking for the effect in the wrong training interval. They shouldn’t start measuring at trial one but perhaps at trial 150. Or 300. Or whenever we’d expect the reinforcing effect of the clicker to be well established.
Let me explain what I mean. In Sweden, we’re just changing our money. We have a new 500-kronor bill, featuring one of the best singers ever to have walked this earth, Birgit Nilsson.
Digression: I once sang in a choir that had the following punisher written into its bylaws: if we ever were conceited enough to think that we’d delivered a first-class performance, we were to listen to Birgit Nilsson singing Isolde’s death aria by Wagner from the 1966 Bayreuth live recording, conducted by Karl Böhm. This happened on one memorable occasion: I still remember how the exuberant, happy and proud feelings that we felt when exiting the stage crumbled and deflated as we listened to Birgit’s formidable, impossible and effortless aria. Did I mention it was recorded live? I know, that was weird, and frustrating if you understand learning theory. Rather than celebrate an outstanding performance, we sank it.
Where was I?
Ah, money. So, in theory both these bills are equally valuable right now. We’re in the half a year or so when they’re both available. Now imagine finding one of them on the street. You glance down, and there’s a 500-kronor bill there. About 55 US dollars. 52 Euro.
I wouldn’t get all too excited if I were to find the Birgit Nilsson one – and not because I was punished by listening to her ass-kicking performance on a weird and wonderful evening 15 years ago.
What’s the problem then?
That note is too new. It’s unfamiliar. To be frank, I probably wouldn’t even know if it was fake or the real thing. Whereas the other one, with Karl XI, from 2001, is instantly recognizable – I can see a fraction of it and recognize it.
What am I getting at? That conditioned reinforcers acquire their reinforcing properties gradually. It takes familiarity before they can fully evoke a conditioned response.
So, when researchers teach a naïve dog that the sound of the clicker means a treat is imminent, and spend only a maximum of 20 trials conditioning the clicker, that is the equivalent of me finding a Birgit Nilsson note, rather than the Karl XI note.
I’d pick it up hesitantly, frowning, going “I wonder if this is… maybe?”
I wouldn’t immediately recognize the value of it.
Back to animals. In a recent review where clicker trainers were interviewed, one said:
“to actually really be a solid clicker training dog would…probably take two months.”
Suggestion: if you’re planning to investigate the efficacy of the clicker, do the pairing procedure (click-treat) at least 150 times for the clicker group before you start the actual operant learning experiment. Feed the control group the same number of times, so that you’re not introducing a bias in how familiar the animals get with the situation. This doesn’t mean that you should do the pairing procedure that ridiculously many times in ordinary training. There, fewer than 20 pairings is typically appropriate, and the conditioning process will continue as you start operant training. 150 is a semi-random number that I chose with the sole purpose of ensuring that complete and utter conditioning has occurred for all individuals in the study. Choosing a lower number risks leaving some individuals insufficiently conditioned. I’m not saying that’s what you need for training from a practical perspective, just that when you’re doing research you need to eliminate potential confounds (such as insufficient conditioning before operant learning starts).
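For the numerically inclined, here’s a minimal sketch of why 20 pairings versus 150 could matter. It uses the Rescorla-Wagner learning rule, a standard textbook model of classical conditioning; that’s my choice of illustration, not something the clicker studies used. The learning rate of 0.05 and the asymptote of 1.0 are hypothetical values picked purely for the example, and real animals may well condition faster or slower than this.

```python
def associative_strength(pairings, learning_rate=0.05, asymptote=1.0):
    """Rescorla-Wagner sketch: strength of the click-treat association.

    learning_rate and asymptote are hypothetical illustration values,
    not estimates taken from any of the studies discussed in this post.
    """
    strength = 0.0
    for _ in range(pairings):
        # Each click-treat pairing closes a fixed fraction of the
        # remaining gap between current strength and the asymptote.
        strength += learning_rate * (asymptote - strength)
    return strength


if __name__ == "__main__":
    for n in (20, 150):
        print(f"{n:>3} pairings -> strength {associative_strength(n):.2f}")
```

With these assumed values, the click has reached only about two thirds of its eventual strength after 20 pairings, while at 150 it is essentially at the ceiling. Change the learning rate and the picture shifts, which is exactly why an experiment shouldn’t take sufficient conditioning for granted.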
OK. So these studies probably didn’t study the reinforcing effect of using a clicker, because it wasn’t properly conditioned, or they chose the wrong interval.
And that still leaves out the marking effect or the bridging effect.
Problem 2 – the operant behaviour contains marker elements
One of the potential effects of the click is that it’s a marker signal, as explained above.
And in all of the scientific studies that have tried to study the efficacy of the clicker in an applied setting, the researchers have used a targeting behaviour as the behaviour under investigation. In other words, a behaviour containing marker elements.
In all the studies, all the animals were taught to touch a cone, or a lever, with their nose. They had visual, tactile and olfactory stimulation helping them to orient towards the stimuli, resulting in the desired behaviour occurring. And since the clicker hadn’t been sufficiently conditioned, it was probably a lot less salient than the other, abundant, information available.
Additionally, some of the researchers pointed at the target (the cone or lever), or used luring (showing a food treat next to the target) to get the animals to attend to the stimulus.
In my Facebook survey, I got some comments from some very experienced trainers.
“It absolutely depends on the behavior for me. If I’m teaching something where I need a clicker, which for me is high precision shaping, picking out a response in a series of responses or other contexts where I can’t deliver reinforcers quickly enough, ie animal in the air or far away, then yes, those behaviors are typically learned faster with a distinct marker signal.
When the behavior I want to teach is very distinct to the animal (a bird landing on the hand, butt touching the floor) and I can deliver reinforcement quickly, standing in front of the animal in close proximity, I very rarely use markers.”
Notice she refers to the clicker as the marker!
In other words, many skillful animal trainers don’t bother using clickers when target training, because they see no obvious added advantage to doing that. They use it for other purposes!
So, those scientific studies have investigated a behaviour where we wouldn’t expect there to be any tangible difference between the groups.
Suggestion: if you’re planning to study the efficacy of clickers, choose a response where you expect clickers to make a difference compared to the treat-only condition, such as high precision shaping or picking out one behaviour out of a series of potential responses. Avoid responses that include obvious external stimuli that can lead the animals to the correct response. Don’t use prompting or luring.
OK, so the studies didn’t condition the clicker enough, and they taught responses that clicker trainers wouldn’t even bother using a clicker to teach.
In other words, they didn’t examine the reinforcing properties of the clicker, nor the marking properties. How about the bridging properties?
Problem 3: treats appear immediately
Given the choice of response trained, the researchers could present the treat immediately when the correct behaviour occurred.
Specifically, there was no need for a bridging stimulus spanning the time from the correct behaviour to when the consequence occurred.
So, from this perspective, we expect the click-treat condition to be no different from the treat-only condition. In both treatments, the animals would immediately get reinforcement for doing the correct behaviour.
In other words, these studies did not test the bridging properties of the clicker.
Suggestion: in your study, choose to train a response where it’s impossible to deliver the treat to the animal’s mouth exactly as the desired behaviour occurs. Rather, choose a behaviour with some distance between the animal and the treat dispenser, so that there’s a few seconds’ interval between the behaviour and the consequence.
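To get a feel for how much a few seconds of delay could cost, here’s a rough sketch that assumes the value of a reinforcer falls off hyperbolically with delay, a common model in the behavioural literature. None of the clicker studies measured this, and the discounting rate `k` of 1.0 per second is a hypothetical value chosen only for illustration.

```python
def discounted_value(amount, delay_s, k=1.0):
    """Hyperbolic discounting sketch: value = amount / (1 + k * delay).

    k (per second) is a hypothetical illustration value; delay_s is the
    time in seconds between the behaviour and the treat.
    """
    if delay_s < 0:
        raise ValueError("delay_s must be non-negative")
    return amount / (1.0 + k * delay_s)


if __name__ == "__main__":
    for delay in (0.0, 0.5, 2.0, 5.0):
        value = discounted_value(1.0, delay)
        print(f"delay {delay} s -> relative value {value:.2f}")
```

Under these assumptions, a treat that arrives two seconds late is worth about a third of an immediate one. A response with that kind of built-in delay is where a bridging stimulus would have room to show an effect, if it has one.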
The clicker is a tool, and performs best in specific contexts. Sometimes it’s unnecessary.
If we want to understand how effective training using a clicker is, we need to start by examining it in the context where we expect it to have the largest impact, compared to training using treats only. That involves looking at reinforcing effects, marking effects, bridging effects, or any combination of these to tease out which has the principal effect on learning speed.
The studies published until now have examined none of these, unfortunately.
Additionally, we might want to know what clicker training does for the trainer. As Eva Bertilsson (carpemomentum) said when she responded to my survey:
“My training often proceeds faster when using a marker. That is, I can get behaviour faster. My feeling is that it’s about me and my capability rather than “the animal’s learning” – if that could be somehow teased apart.”
A very valid point. How does the trainer’s behaviour change when using a clicker, compared to not using one? Both when it comes to the philosophy of training (and believe me, there’s more to “clicker training” than just using a clicker), and when it comes to the mechanics of delivering timely reinforcers and choosing criteria.
After all, it takes two to tango – the animal’s learning is just one piece of the puzzle.
I’m not concerned with the lack of resounding support for using a clicker when training in the current academic literature. The absence of evidence is not evidence of absence, and I hope that future research will let us know just how effective clicker training can be.
Blandina, A.G. (n.d.). To click or not to click: Positive reinforcement methods on the acquisition of behavior. Unpublished honours thesis.
Chiandetti et al. (2016). Can clicker training facilitate conditioning in dogs?
McCall & Burgin (2002). Equine utilization of secondary reinforcement during response extinction and acquisition.