Poisoning – or counter conditioning?

Yes or no, true or false?

“If you combine negative reinforcement with positive reinforcement, you poison the learning process.”

Do you agree?

Do you poison the cue? The environment? Yourself? – if you add a positive reinforcer after a negatively reinforced response (aka combined reinforcement)?

Some of the readers of this blog may say “yes”.

Some may say “no”.

Some may say “huh?”

After all, that question doesn’t make sense if you’re unfamiliar with those terms. Stay tuned, I’ll explain them in a minute.

I think the right answer is “it depends”.

Is combined reinforcement poisonous?

Before we start untangling the potential pitfalls of combined reinforcement – why is this an important question?

Helping horses and their people.

This question is important because cross-over trainers (I’m thinking horse people here) may find it hard to transition from negative reinforcement to positive reinforcement without using combined reinforcement.

In the horse training world, the paradigm shift from negative reinforcement-based training to positive reinforcement training is a huge leap for both the traditionally schooled trainer – and the horse.

I’ve been told that many horse people try making that shift, fail and give up – reverting back to the familiar, old-school, negative reinforcement-based training.

In my book, the failure to incorporate positive reinforcement in any training regime would lead to potential unnecessary suffering, and I can’t help thinking that a more gradual shift may perhaps be less difficult than an all-or-nothing approach in some cases. Maybe the difficulty in shifting from negative to positive reinforcement could be addressed by transitioning through combined reinforcement?

Making huge changes is difficult, and many horse people fail to make the switch from traditional negative reinforcement-based training to using mostly positive reinforcement. Perhaps a gradual transition, involving combined reinforcement, leads to more horse trainers successfully learning to use positive reinforcement in their training? The gradual transition would involve adding positive reinforcers, diminishing the aversives used, and learning to recognize subtle body language indicating discomfort.

In my opinion, combined reinforcement has the potential of either increasing or decreasing suffering, depending on how it’s carried out – and what the alternatives are.

I may be completely off track here. After all, I’m not a horse trainer, I haven’t lived through that transition personally. So, this blog post is hypothetical – but I’m counting on you, dear reader, to let me know whether you agree with me – or not.

But first, some definitions – just to make sure we’re on the same page.

Negative reinforcement – escape or avoidance

Negative reinforcement is when the animal learns to escape or avoid something unpleasant by performing a specific behaviour.

Escape or avoidance are technical terms in this context.

Let’s take a doggie example, just so that the dog people feel that reading on is relevant – after all, the same learning principles apply to all animal species, not just horses.

Say you’re walking your dog. He pulls ahead, and the leash gets taut, which is uncomfortable. So he stops pulling, which reduces the pressure on the leash. He just performed an escape response. Not ‘escape’ as in running for his life but as in performing a behaviour that terminates an ongoing unpleasant stimulus.

In other words, the dog who successfully performs an escape response has control over the end of the unpleasant stimulation. He has figured out how to end it.

Later he may learn what serves as a warning signal announcing that he’s about to experience the tightening of the leash – whether it’s a certain behaviour from the handler, or something that happens in the environment, such as seeing another dog. So he returns closer to the handler when he perceives the warning signal, and has thus avoided experiencing the increased leash pressure.

‘Avoidance’ is a technical term here. It means that the aversive stimulus is not applied. A horse who’s standing still as you’re saddling him is showing avoidance behaviour if he’s learned that shying away is met with unpleasantness. So, even though he’s just standing still, technically he’s showing avoidance behaviour.

So, for the animal, successful avoidance means controlling the start of the unpleasant stimulation. He has figured out how to avoid its onset.

An escape response terminates an ongoing aversive stimulus, whereas an avoidance response prevents it from occurring, often in response to a warning signal.

This concept of control is extremely important when it comes to empowering animals and maintaining high welfare. Empowered animals are resilient; they aren’t as emotionally affected by aversive events. And vice versa, the welfare of animals who lack control over aversive events is potentially challenged.

It may sound as though I’m promoting negative reinforcement, as if it’s unproblematic. “It gives the animal control; the animal gets empowered and resilient”.

It’s not that simple.

Negative reinforcement is often carried out badly, and since it involves aversive stimuli, it has the same potential unwanted, unhealthy and dangerous side effects as punishment.

What dangerous side effects? I listed 20 problems with punishment here.

So, negative reinforcement potentially carries some serious fall-out, and there are more effective ways of getting behaviour, such as using positive reinforcement.

Positive reinforcement – the preferred approach.

Positive reinforcement is when the animal learns to get access to some type of resource, or something pleasant, by performing a specific behaviour. For instance, the dog gets treats, praise, petting or eye contact when he stays close to the handler. He then spends more time close to the handler to get continued access to those resources.

This approach to teaching and training animals is a lot less problematic than negative reinforcement, and generally results in more enthusiastic learners and better performance.

Also, positive reinforcement is generally more forgiving if you mess up as a trainer. Not to say that positive reinforcement training is without potential pitfalls, though.

The difficulty of transitioning

So which choices are there?

Let’s say you’re a traditionally schooled horse trainer, skilled at using negative reinforcement… would you wake up one day and drop those highly developed skills, and switch to something you’ve never tried before – positive reinforcement?

As far as I’ve understood, and to their immense credit, many horse people actually do try this. They leave familiar ground and try something new, perhaps without an experienced coach mentoring them – and then their horse gets overexcited and starts mugging them, and many of those brave trainers quit.

Sound familiar?

Not being a horse person, I didn’t realize this problem existed until I started hearing it from my students, but it’s a story that keeps coming back.

What is the problem, then? Why is transitioning so difficult?

One part of the issue may be that the would-be cross-over trainer is changing too many things at once.

When teaching using positive reinforcement, we often shape behaviour. We don’t expect behaviour to change in leaps, but rather in baby steps. We change it by increments, making small adjustments to existing behaviour.

We avoid setting the animal up to fail. Success should be easy, at every step of the way.

We avoid frustration.

And yet – we expect cross-over horse trainers to make that huge leap, without shaping the transition from negative to positive reinforcement!

Setting them up to fail, as it were.

What I’m suggesting is that we help them start where they’re at, and consider “teaching-positive-reinforcement-to-a-transitioning-horse-trainer” as a shaping procedure. It has the added benefit of being a non-confrontative approach that might make the trainer who’s still on the fence about trying positive reinforcement more likely to listen to and implement suggestions for change.

A rough shaping plan may look like this.

  1. For instance, if the trainer is using negative reinforcement, we first help them do it better. We teach them to read the subtle changes in body language indicating the animal’s discomfort. We help them improve their timing and understand the importance of applying as little pressure as possible. We teach them the distinction between escape and avoidance.
  2. If the trainer is using negative reinforcement skillfully (with such mild aversives that there’s no fear reaction), we introduce them to using combined reinforcement, adding a positive reinforcer to a negatively reinforced behaviour. For instance, if the horse yields to slight pressure without showing signs of discomfort, he also gets a treat. It’s important that the treat-giving serves to reduce any slight discomfort through counterconditioning, and doesn’t induce conflict by following a strong aversive that induces fear (as this leads to the risk of poisoning, see below).
  3. If the cross-over trainer has gained some skills using positive reinforcement, the use of negative reinforcement can be faded as that new toolbox grows. Any poisoned cues may be retrained using new cues.

This rough shaping outline may sound easy to implement, but I realize it’s not. I’m not a horse trainer, so I wouldn’t know exactly how to go about it either: which exercises would help teach the required skills and build momentum through this shaping process?

Beyond the difficulty in identifying the right practical exercises, I see three main reasons why it might fail.

  • Some would-be cross-over trainers feel guilty about having used negative reinforcement and want to stop immediately. They see no value in improving their use of this technique, and don’t want to spend any time on step 1 – or 2. But going straight to step 3 is simply too difficult.
  • They may be unprepared for the previously well-mannered horse getting excited about food, becoming conflicted and aroused, or starting to mug them. This makes some would-be cross-over horse trainers give up their attempts at using positive reinforcement altogether.
  • They may have heard that combined reinforcement (negative and positive reinforcement) may poison the training process, and so they won’t try it. This might also make many would-be cross-over horse trainers give up, because step 2 in the shaping protocol is omitted, and going from step 1 to step 3 is, again, too difficult.

So, returning to the first question I asked: Is there some truth to the risk of poisoning?

Well. Sometimes yes, sometimes no.

Hence this blog post – because that “sometimes, no” would bring back step 2 into the shaping process, and ultimately help more horse people start using positive reinforcement.

Poisoned cues

The potential problem is that the cue asking the animal for the response may become poisoned (as may the environment, including the trainer). Once poisoned, the cue is both a warning that if the response isn’t shown, unpleasant things will happen, and a promise that once the correct response is shown, there’ll be goodies.

The poisoned cue is ambiguous, since what follows may sometimes, but not always, involve unpleasantness. And so the animal reacts emotionally (fearfully, showing avoidance) to the poisoned cue.

So how likely is a cue followed by combined reinforcement to become poisoned?

I’d say it depends on a couple of factors, and poisoning can perhaps be avoided by considering the following:

  • How unpleasant is “unpleasant”? Does the animal show an emotional response to it indicating fear or discomfort? (ideally, no – the unpleasant stimulus should be very mild)
  • Has the animal learned to escape and avoid the unpleasantness? (ideally, yes – he should have learned to control it)
  • How good are the goodies? Does the animal get an emotional response to them? (ideally, yes – he should be enthusiastic about them).

I think cues get poisoned when:

  • The unpleasant stimulus is painful and/or evokes a fear response.
  • The animal hasn’t learned to escape or avoid the unpleasantness.
  • The positive reinforcer isn’t potent enough to countercondition the unpleasantness.

One oft-discussed master’s thesis examined the effects of combined reinforcement in comparison to solely positive reinforcement and found that the cue (and the environment) did indeed get poisoned. In that study, the unpleasant stimulus that poisoned the cue involved pulling the dog into the desired position by tugging the leash.

In the study, the dog was wearing a harness and was pulled into position if he didn’t respond to the cue. Later, he learned to avoid pulling by walking to the desired position on cue. In other words, he learned an avoidance response, but he never learned an escape response. Also, the tugging was aversive enough to lead to emotional responses.

And so in that study, the cue became poisoned, and there was a big difference between behaviours trained with combined reinforcement compared to behaviours trained solely with positive reinforcement.

What would have been the outcome if the intensity of the tugging wasn’t as aversive? If the animal learned to escape it – perform a behaviour to terminate it – learned to control the aversive?

Might we see counterconditioning rather than poisoning if the procedure were carried out differently? That’s question one.

Question two is – what’s the alternative? In that study, there was never a comparison with behaviours trained solely with negative reinforcement.

What are the options?

For many would-be cross-over horse trainers, the options aren’t to use combined reinforcement or positive reinforcement. Remember, the trainer who hasn’t transitioned isn’t skilled at using positive reinforcement – yet.

The options for many traditional horse trainers would perhaps be to make that transition using combined reinforcement – or to revert to the old ways and use negative reinforcement.

Combined reinforcement: good or bad? It depends.

So, the option for the would-be cross-over horse trainer isn’t between combined and positive reinforcement, it’s between combined and negative reinforcement. Again, combined reinforcement has the potential of either increasing or decreasing suffering, depending on how it’s carried out – and what the alternatives are.

If the alternative is positive reinforcement, then combined reinforcement may be the least preferred option.

If the alternative is negative reinforcement, then combined reinforcement may be the best choice, serving two different purposes.

One, teaching the cross-over trainer to incorporate positive reinforcement into the training, helping make the transition.

Two, counterconditioning the training situation so that it’s less aversive to the horse. After all, the horse will learn that uncomfortable situations are invariably followed by nice things happening – those discomforts become predictors of good stuff and therefore less scary.

Negative reinforcement has a bad rap, and many cross-over horse trainers abandon this training technique as soon as they can when they’ve successfully learned to use positive reinforcement.

That’s great. Once you’ve built the skills of using positive reinforcement, negative reinforcement may be faded out of most training situations.

But if you haven’t yet made that transition, or you’re struggling? Combined reinforcement may be the intermediate step that will help you successfully learn to incorporate positive reinforcement in your training.

Combined reinforcement may reduce the risk that you’ll give up.

… what’s the verdict, horse people? Am I making sense, or is this too impractical to do in real life? Have you found a way of shaping the transition that might help the would-be cross-over horse trainer who’s reading this? Please share with us in the comments section!


24 Replies to “Poisoning – or counter conditioning?”

  1. I really enjoyed reading Carolina’s text and all comments. I am right now at the ISES-conference in Rome for a few days of equine research. I will bring with me to the discussion both your very wise and research based “from outside the horse world” thoughts on horse training and all comments on how to apply the learning principles in practice.
    Thank you all
    /Jenny Yngvesson

    1. Jenny – sounds really interesting! I’d love to hear about it – and not only me, I’m thinking..! What did you learn – which were the discussion topics?! 🙂

  2. Whatever training anyone does, with *any animal* they should first have a reasonable knowledge of both learning theory and that animal’s behavioural repertoire (ethogram). Without this, however you train, you are likely to have real problems.

    Where neg R is used, so long as this is mild, brief and non-escalating, it can be part of the training method, though PR is better, really useful and horse-friendly, when combined with shaping.

    1. In the best of worlds, people should have that knowledge reservoir, I agree! Not so sure that they do, though… They make the best of what they know – and as you say, often end up having real problems.

  3. I love this article, thank you. Have you written any books on the subject? I would be very interested if you have. I don’t particularly enjoy online courses, I prefer to read and learn.

    1. Thanks for the comment! No, no books unfortunately. I understand your frustration with online courses – they’re sometimes not so effective…

  4. The majority of horse owners, myself included, have horses that were initially trained using R-.
    The horses were trained to accept a halter and lead rope, follow a handler on a lead, pick up feet and stand tied up. We cannot ever undo this training, the equipment itself tells the horse you must do this. So in effect we are already using both R+ and R- whether we like it or not!
    I made the transition to R+training in precisely the way you describe, though at the beginning R+ had not been my goal. I refined my R- skills and trained myself to observe the smallest response from my horses to my requests, and also to acutely observe their general overall demeanor. I also learned through R- to break things down in to smaller chunks and then put the small pieces together.
    I would not use or advocate using R- for training any animal, but often we are unable to erase past history and the fact that we are still maintaining R- taught behaviours is unavoidable. I think it is important not to “throw the baby out with the bathwater”, but to encourage transitioning trainers to seek the most ethical, least intrusive methods.

    1. Great – you’re pointing out something really important – just like the trainers, the horses have a history too!

  5. I’ve always used mild aversives followed by treats in my horse training, and my horses have always seemed eager to come work with me, despite being out 24/7 at grass. I always have treats and give grooming/massage sessions regardless of if I’m going to be riding or training that day. They are always interested in what I have to offer.

    When it comes to riding I never understood how it would be possible without teaching the horse to yield to pressure or seek the bridle (reach into or follow the contact, bit or not); it’s like a physical body language. It’s important for the horse to stay balanced and aligned under the rider and engage his haunches and core, as the rider follows the horse’s movement with their own body and influences it. I never thought about how non-equestrians, or even some riders who haven’t had formal riding education, may not understand this language and see it only as unnecessary prodding, pushing, pulling (which, if it looks like that, is bad riding, or hopefully just a bad moment requiring escalation of the aversive).

    Helen is right, it is about communication, at least that is how it is supposed to be. What are your thoughts? Is it possible to have the physical body language between horse and rider and have it be positive reinforcement, or is it always considered combined at best? I have seen riders click and stop their horse to give a treat while riding and giving traditional cues like leg pressure and rein contact. Riding is a real-time event that requires instant communication and understanding from rider to horse. There is already so much feedback between the bodies of the horse and rider just from being on the back of the moving horse. Even riding bitless with a cordeo still involves yielding to a pressure at the base of the neck, and of course the rider’s weight shifts and legs. Riding can never be pressure-cue free and hands-off just on principle I think.

    1. You’re making a great point – riders do add body weight and changing pressure when seated on the horse. Does this necessarily mean that negative reinforcement is involved? I’m not so sure – it depends on how it was taught. “move away from pressure and it will stop” or shaping using positive reinforcement and then adding a tactile cue..?

      Many of my students are adding “get on” and “get off” behaviours where the horse can indicate whether he wants the rider to mount or dismount. It would be interesting to see how those behaviours are affected by the type of training the horse has gone through..!

  6. It is an interesting topic! From the dog side, I see people using positive rewards and positive punishment all the time. Does it poison the cue and everything else? O, ya! But some of these trainers are very adept at using positive punishment, so they get the results they want. I think when these trainers try to move away from the “negatives”, they revert back because they are not well educated in positive reinforcement problem solving. They flat don’t know how to fix “problems” with positive reinforcement. So, when a problem arises, they revert to what they know. They have no coach, as you point out, when they need it to help them problem solve. And so, they sometimes conclude positive reinforcement “does not work” and they revert to their old way of training – tragically.

  7. For me the difference in enthusiasm, body language, confidence, engagement in training between behaviours taught with positive reinforcement and those previously taught with negative reinforcement was glaringly obvious. This has prompted me to almost completely abandon using negative reinforcement altogether. I am lucky enough that my horse lives with a permanent herd, outside with lots of space, natural enrichment and constant access to forage. I think that this, having great mentors, support and getting started with relaxation as our focus could be some of the reasons why we have managed to avoid over arousal. Being a qualified Equine massage therapist, healthy posture is also important to me, and we have trained lots of exercises with target training. Since crossing over to training with positive reinforcement and giving my mare more choice, her ability to problem solve and raise criteria is wonderful. Training with positive reinforcement changed everything for us.

    1. Thanks for sharing! I think you’re really pinpointing the potential problems of using negative reinforcement inexpertly: fear, avoidance, shutting down.

      Great that you found some strategies to avoid over-arousal, too! Positive reinforcement doesn’t have to involve food..! 🙂

  8. This topic is indeed fascinating. I just went through 2 negative-positive-negative-positive training years. My young horse had been trained with negative reinforcement by a horse trainer for a year. At 4 he was shut down, aversive, and at times uncooperative. I decided to take off all tack and began a year of teaching him only with positive reinforcement. He went through initial stages of excitement to ultimately being self confident and very cooperative. I gave him choices and he loved it!! I then introduced back regular training with negative reinforcement. Interestingly I needed very little pressure and training went along very smoothly. I now use both training methods, one for play and one for work. I balance them out like a yin and yang and my horse has been so easy to train. The balance of both has also strengthened our bond. He enjoys the work and also enjoys our playtime. I am going to use both methods to train my next young horse.

    1. Thanks for sharing your story! Indeed I think that very light pressure may be a non-issue to the horse – it’s when that pressure escalates that we see the fallout of using negative reinforcement..! So, making sure to conscientiously minimize the use of aversives is crucial.

  9. A great post and very well written. I, like Helen, love the classical work and I would agree that for most people using what I call tactile communication, so Helen’s information not compulsion, is where I would love the horse world to get to. I have traveled a bit deeper down the R+ rabbit hole over 20 plus years of clicker training and I now shape and capture much of the lateral and inhand work at liberty and then do a cue transfer process to reintroduce (as most of my horses are crossover or rescue) the ‘normal’ cues, so that they now understand and can function in a more normal world. I do feel it is very important, like Ken Ramirez says, to train for life so they do need to understand normal pressure cues, but I have taught them how to find the release in what I think is a nicer way. Starting out and shaping people like you said in the blog to become more positive is also the way to get more folks on board and making sure that setting them up for success re where to start and how to set up the environment for success is critical. Thanks for writing this!

    1. Thank you! That’s an important point you’re making – “train for life”. Unlike dogs and cats, many horses do pass through several owners’ hands and it makes sense to teach them to respond to the cues that they might expect from the next owner.

  10. Very interesting article! Actually I believe that combined training is not only the best way to start positive reinforcement training for horse trainers but, if they are training horses for riding, then I believe that it’s also the most welfare friendly way to continue to train too, although always following your guidance above about the use of pressure.

    I’ve been clicker training my horses for over a decade now and think of myself as a primarily positive reinforcement trainer but I do, and probably always will, use negative reinforcement as information, for mostly practical reasons, and I believe my horses gratefully accept it as that – as a means of communication not that I’m trying to make them ‘do’ something, simply that I’m trying to explain more clearly what I’m after.

    The problem with trying to train with only positive reinforcement with a horse that you are going to ride is, as I see it as I like to think a fairly well educated rider, one of biomechanics. If I’m going to ride my horse I believe that it is in the horse’s best interests to learn how to carry himself, and me, in an altered balance from the one he’s in to carry himself around the field grazing. He needs to overcome his natural asymmetry, so he’s not overloading one lateral or diagonal pair of legs, and shift more weight onto his back legs and engage his abs and lift his back, so he’s not overloading his front legs, to make carrying me easier and less damaging for himself.

    This is taught using lateral work, teaching him to place one hind foot under his centre of gravity in movement individually, and then both together within a movement. It involves teaching fairly precise lateral bending of his body. Trying to do this by getting the horse to target with different body parts is complicated but probably just about doable, but translating that to when you’re riding, when your own balance and seat is so critical too ……???

    The biomechanically ideal and least complicated way to train the horse from the saddle is to train him to move away from your leg, yield to pressure on his mouth or nose and to follow your seat and weight aids. That’s why classical horsemanship has been done this way for many centuries. I actively study classical dressage and often argue with mostly negative reinforcement trainers that the old masters in history would have leapt on clicker training with huge enthusiasm, as a way to get results with far less effort and far more enthusiasm and much less resistance from the horse. Historically training horses biomechanically correctly for riding has always been primarily about working through resistances and, as a clicker trainer, I don’t find that to be the case at all. My horses are always trying their hearts out, it’s always my lack of skill or my weight not being in the right place that causes any problems that we come across, and I’ve proved that to myself time and time again!!!

    So, after many years of experimenting with my horses, personally I do believe that the kindest, most sustainable way for me to train a horse for riding, to make life the most comfortable for him that it can be and to keep him sound and happy into advanced old age, is combination training, teaching him to yield to, and move away from, pressure, but using light, non escalating pressure, as information, never as compulsion. I still want a volunteer not a conscript, who enjoys our work together as much as I do, and achieves physical, mental and emotional balance through it.

    1. Thanks for taking the time to comment, Helen! You raise some very valid points that aren’t all that obvious to non-riders like me: the need for optimal physiology and movement in the ridden horse.

      I like the meme “information, not compulsion”. That shifts the whole perspective, doesn’t it?
