Features March 2002 Issue

Understanding Reward Based Dog Training

One dog’s pleasure is another dog’s pain; rewards are an individual thing.

The trim, middle-aged lady strode briskly down the rubber mat in the training center, her black Labrador Retriever bouncing happily at her side. She came to a smooth halt, and Skip sat promptly next to her, in perfect heel position. “Yes!” I thought to myself, and then winced as Carla reached down and enthusiastically patted Skip on the head. Skip jumped up and backed away from his human.

“Carla,” I said softly. “You just punished him for sitting straight.” Carla’s face fell. “Darn it!” she exclaimed. “Why can’t I remember that!”

Hearing a phrase that gives this water-loving Golden Retriever permission to jump into the lake is both a cue (“Go get it!”) and a high-value reward.

Wait a minute . . . since when is patting a dog considered punishment? Ever since Skip let us know by ducking his head and backing away from Carla’s hand that he didn’t enjoy being petted. All the other Labs that Carla had owned and trained throughout her life had adored being touched as a reward. Carla petted her dog for being good without even thinking about it – it was a well-conditioned response. Unfortunately, since Skip didn’t like being touched, every time she did it to him, she was actually punishing him, decreasing the likelihood that he would perform that perfect sit again!

A dog’s decisions in life, and his resulting behaviors, are based on whether a particular behavior yields something he likes (a reward) or something he doesn’t like (a punishment). Training is simply a matter of manipulating the rewards and punishments in a thoughtful manner . . . But you have to know your dog – be thoroughly aware of his likes and dislikes – and conscious of your own behavior to make “training” work for you.

Rewards and punishments
In the 1950s, behavioral scientist B.F. Skinner developed a number of principles that are applicable to all living things with a central nervous system. He found that animals are likely to repeat behaviors that are enjoyable/rewarding to them, and not likely to repeat behaviors that result in something unpleasant (punishment). Neutral stimuli – things that don’t matter to the animal – don’t have an impact on behavior one way or the other.

Skinner demonstrated that humans can use these simple principles to modify an animal’s behavior. Rewards are the most reliable way to deliberately increase an animal’s offered behaviors; conversely, punishment decreases those behaviors. (See “The Four Principles of Operant Conditioning,” at the end of the story.) We use these behavioral principles in dog training with great success.

However, as with Skip, the practical application of “rewards” and “punishments” varies from dog to dog, even though the definition doesn’t. A reward is anything a particular dog likes. A punishment is anything that dog doesn’t like.

We frequently use food treats as our reward in training because we can almost always find some food that a dog will value highly enough that it can serve as an irresistible reward, but food is not the only reward available to us. Remember, a reward is anything a dog likes. It could be a pat on the head (but not for dogs like Skip, who don’t like to be touched), verbal praise, a game of tug o’ war, a chase after a stick or tennis ball, a walk on leash, a car ride, permission to jump up on the sofa, the cue to run an agility course, the release from a “wait” to run out into the yard, permission to go jump in the lake, or the signal to round up a flock of sheep.

When the average inexperienced dog handler hears the word “punishment,” he generally thinks of overt forms of physical punishment, such as smacking, pinching, or kicking the dog, or jerking on the leash. I do not recommend or use physical punishment, as it endangers the handler, damages the relationship with his dog, and can destroy the dog’s enthusiasm for training. Fortunately, physical punishment is not the only way to eliminate an unwanted behavior.

Remember, behaviorists define the word “punishment” as anything that causes an animal to decrease a certain behavior. So, in the case of Skip, the Lab who didn’t like being touched, a pat on the head after he performed a straight sit was enough to make him stop performing those straight sits.

“Positive trainers” – people who have made a commitment to train without the use of pain, fear, force, or intimidation – often use certain forms of “punishment” (in the behavioral sense) to accomplish their training goals. For example, when a dog who craves physical contact and attention jumps all over the trainer, she will turn her back on him and step away, removing both her attention (eye contact and interaction) and the possibility of physical contact with the dog. These are the rewards that the dog is seeking by jumping up. When the dog’s jumping behavior keeps resulting in the loss of something he wants badly, he will stop jumping – especially when this “punishment” is paired with the “reward” of attention, treats, and petting for sitting quietly.

What actually constitutes a punishment or reward to any given dog, then, is an individual matter; in behavioral terms, context is everything.
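For readers who like to see the logic spelled out, here is a minimal sketch in Python of the standard behavioral vocabulary behind the examples above. The function name and the "added"/"removed" and "increases"/"decreases" labels are illustrative assumptions; only the four category names are standard terms. Note that the label depends on what the dog does afterward, not on what the handler intended.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the operant conditioning category for one consequence.

    stimulus_change: "added" or "removed" (what the handler did)
    behavior_change: "increases" or "decreases" (what the dog does afterward)
    """
    names = {
        ("added", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("added", "decreases"): "positive punishment",
        ("removed", "decreases"): "negative punishment",
    }
    key = (stimulus_change, behavior_change)
    if key not in names:
        raise ValueError(f"unrecognized combination: {key!r}")
    return names[key]


if __name__ == "__main__":
    # Carla adds a pat on the head; Skip's straight sits decrease.
    print(classify_consequence("added", "decreases"))
    # The trainer removes her attention; jumping up decreases.
    print(classify_consequence("removed", "decreases"))
    # The trainer adds a treat for sitting quietly; sitting increases.
    print(classify_consequence("added", "increases"))
```

The same pat on the head would land in a different category for a dog who enjoys being touched, which is the point of the Skip example.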

Unintentional training
Training, therefore, is the intentional use of rewards and punishments to purposefully manipulate a dog’s behavior. What is sometimes difficult to remember is the fact that dogs are learning all the time, whether or not we are paying attention. People are often mystified as to why their dogs do some of the things they do, or fail to do what the people want them to do.

Author Pat Miller works with her terrier-mix, Josie. The alert and engaged expression on Josie’s face clearly conveys her enthusiasm for working with Pat.

It’s actually pretty simple. Dogs do what works for them; they don’t do things unless they get something out of them.

Dogs do things that we consider “inappropriate behavior” because those things are fun, feel good, or taste good. From a dog’s perspective, behaviors that are unacceptable to us, such as getting in the garbage, chasing cats, or sleeping on the sofa, are just plain fun!

Frustrated owners frequently say to their trainers, “He knows he’s not supposed to do that! I punish him when he does, but he still does it. Why?” Sometimes, the enjoyment the dog gets from the behavior outweighs the owner’s “punishment.” A dog who is highly aroused by the experience of chasing a cat over the backyard fence may not care a bit about getting yelled at for it.

In other cases, the “punishment” may actually be rewarding to the dog. For example, a boisterous Labrador who gets yelled at, hit, or even kicked for jumping up on his owner may not have any clue that the yelling, hitting, and kicking are supposed to be a punishment. To dogs who crave attention and love physical contact with people, this rough treatment is simply an invitation to play an enjoyable (rewarding) game.

Also, dog owners may fail to realize that they often unthinkingly punish a dog for doing the right thing. If you do this frequently enough, you will inadvertently “train” your dog to stop offering the behaviors you want.

Consider the woman whose dog is enjoying a good romp with some canine pals at the dog park. It’s time to leave, so she calls her dog to her. He immediately leaves his play pals and races to her. “Good dog!” she exclaims, and snaps his leash on, taking him from the park. In her view, the verbal praise was ample reward, and leaving the park has no connection to the recall. But here’s how the dog sees it: “Mom called, I came, and the fun’s over. When I come to Mom, a bad thing happens – the fun stops.” He is likely to think twice about coming the next time she calls while he is playing with friends!

Many people have lots of trouble training their dog to come reliably when called. Perhaps they haven’t given enough consideration to what happens to the dog most of the time after he does come. It doesn’t take a canine Einstein to realize that coming when called is a bad idea if something “bad” consistently happens to him immediately afterward – say, he gets stuffed into the basement or locked away from all the guests in the kitchen, or tossed outside in the cold rain.

Training may also break down when the reward isn’t valuable enough to motivate the dog to bother trying to get it. You must program an automatic response to the “come” cue with a high-value reward in the absence of enticing distractions before you try to apply it in the face of dashing squirrels. Few dogs will leave a squirrel hunt in order to come and earn a piece of dry kibble! Many positive trainers use a variety of enticing rewards and mix them up. Then the dog is never sure how big the “payoff” for his good behavior will be; he just knows it will be good.

If you doubt that mixing small rewards (such as verbal praise, a pat, or a piece of dry kibble) with larger rewards (such as pieces of fresh meat, chasing a ball, or being released to run free) is a powerful motivator, consider the slot machine. As long as it pays out a mixture of no rewards, small rewards, and only an occasional jackpot, human gamblers will continue to sit there and pull the handle, long past the time that it makes sense to do so!

Random acts of reinforcement
Having a variety of rewards in your training tool kit gives you greater flexibility and allows you to train your dog without always having a huge supply of treats in your pocket. A good training program moves toward variable reinforcement once the dog is reliably performing a new behavior. Instead of clicking and giving the dog a treat every time he performs the behavior, you occasionally skip a click and praise the dog instead, then ask for the behavior again and click the next one. Gradually increase the variation and length of the reinforcement schedule, remembering that randomness is important.

The verbal praise that her handler meted out for her correct down/stay has not brightened her expression or increased her eagerness to continue the training game. More valuable rewards are in order.

If you simply keep making your dog work harder and harder for a click, he’s likely to quit on you. If you vary the reinforcement schedule, like a Las Vegas slot machine, he can’t predict when you will pay off. Will I get a click this time? This time? This time? Click! Just as people will continue inserting quarters, your dog will keep offering behaviors with enthusiasm, sure that the next one will hit the jackpot.
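The slot-machine idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `mean_ratio` parameter, and the stage values are hypothetical, and each repetition is treated as an independent random draw so that no pattern can be predicted.

```python
import random


def variable_schedule(n_reps, mean_ratio, seed=None):
    """Decide, repetition by repetition, whether to click and treat.

    mean_ratio is the average number of correct responses per click
    (1 means every response is clicked). Each repetition is an
    independent random draw, so the payoff cannot be predicted.
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    rng = random.Random(seed)
    return [rng.random() < 1.0 / mean_ratio for _ in range(n_reps)]


def thinning_plan(mean_ratios, reps_per_stage, seed=0):
    """Build one schedule per stage, lengthening the average gap gradually."""
    return [
        variable_schedule(reps_per_stage, ratio, seed=seed + i)
        for i, ratio in enumerate(mean_ratios)
    ]


if __name__ == "__main__":
    # Hypothetical stages: every response, then roughly 2 in 3, then 1 in 2.
    stages = (1, 1.5, 2)
    for ratio, stage in zip(stages, thinning_plan(stages, 10)):
        marks = " ".join("click" if hit else "praise" for hit in stage)
        print(f"about 1 click per {ratio} responses: {marks}")
```

The unclicked repetitions are labeled "praise" rather than "nothing" because the dog still gets feedback that he is on track; only the size of the payoff varies.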

To maintain his enthusiasm as you gradually lengthen the reinforcement schedule, use other rewards to let him know he’s still on track. I frequently use “Good dog!” as praise after I click and treat, so that my dogs associate the same warm fuzzy feeling of getting a food reward with the verbal praise. Then, when I use the verbal praise even without the click and treat, they still have the same classically conditioned response from the association of praise with food, and it makes them feel good. Thus, “Good dog!” becomes a useful reward even without food.

Other rewards may create more of an interruption in the training game. If you use a toy as a reward, you have to stop and let your dog play with it for a while. This can work really well to amp him up on the enthusiasm scale, especially for a dog who is ball crazy or loves to tug. It doesn’t work well when you want to do a lot of repetitions of a discrete behavior in a row. If you toss the ball every time he responds to your “down” cue, it will take you a long time to do a half-dozen repetitions. It does work well as a reward for an extended behavior, such as heel. A ball-crazy dog can learn to heel with perfect attention for long stretches in anticipation of the ball-chase that happens at the end.

Timing is key
It is important to a successful training program to understand what your dog likes and doesn’t like, and to use those rewards and punishments effectively. In order to be effective, consequences – good or bad – must be delivered in close proximity in time to the behavior you are trying to influence.

Say your dog tips over your kitchen garbage can while you are away at work. If you reprimand him when you get home from work, hours after the garbage raid occurred, it only teaches your dog that you are sometimes unpredictable and dangerous when you come home. No matter how “guilty” he looks when you scold him, he makes no connection between your behavior of yelling at him and his behavior of getting in the garbage hours earlier. What you perceive as a guilt-stricken conscience, manifested in his lowered head, lack of eye contact, and slinking along the baseboards, is actually classic canine body language: an attempt to quell your wrath, whatever the cause.

Behaviorists agree that a reward or punishment must be delivered within three seconds, preferably one second or less, of the behavior you are trying to increase or decrease. This is a pretty small window of time, and underscores the value of using a clicker or other reward marker (or no-reward marker) to mark the instant of desired (or inappropriate) behavior. If you say “Oops!” the instant your dog jumps up and you turn away, you are teaching your dog a no-reward marker, which you can use to communicate to your dog which behavior it was that made the good thing go away (negative punishment). If you Click! or say “Yes!” the instant your dog sits, he will come to understand that the sit earned the reward, even if it takes several seconds for you to get the treat into his mouth, and even if he gets up from the sit before you manage to deliver the treat.

Skipping ahead
Carla and I had a long discussion about how to continue with Skip’s training. We identified two options. Using desensitization, we could teach Skip that having Carla pat him on the head really was a reward, by consistently pairing her touch with an off-the-charts treat reward, using gentle contact at first, then increasing in intensity until he learned to associate vigorous patting with “really good stuff.” Carla made a commitment to doing this for the long term, as she really wanted Skip to enjoy her touch.

We also initiated a short-term approach of modifying Carla’s behavior, agreeing to use positive reinforcement and negative punishment with her. Every time Skip sat and she didn’t reach down to pat him, Carla earned a reward, such as a quarter, a piece of chocolate, or a dog toy. Every time she forgot and reached down to pat him, I stepped out of the training room without a word, for a period of time from 30 seconds to three minutes. It worked beautifully, and in short order, Skip was sitting happily in perfect heel position when Carla halted, without fear of being punished for his good behavior.


Also With This Article
“Who’s Training Whom? Human Training Secrets Revealed”
“The Four Principles of Operant Conditioning”

-By Pat Miller

Pat Miller, WDJ’s Training Editor, is also a freelance author and Certified Pet Dog Trainer in Chattanooga, Tennessee. She is the President of the Board of Directors of the Association of Pet Dog Trainers, and recently published her first book, The Power of Positive Dog Training. For more information, see "Resources."
