[nagdu] Clicker Treats RE: dog corrections

Mon Aug 31 18:24:36 UTC 2009

Hi, Tracy,

It may be that some people use the clicker as you have described.  However, 
Bob Bailey, the Father of Applied Operant Conditioning, who trained many 
thousands of animals of many species, came to the conclusion that behavior 
was best maintained by always following the click (conditioned reinforcer) 
with the treat (primary reinforcer).  Ben has obviously been talking to Bob 
Bailey, and as we clicker trainers always say when there are discussions of 
theory and practice, "Go to people for opinions; go to the animals for 
answers."

The click is always followed by a primary reinforcer, but that does not mean 
that it must be followed by a treat.  Anything that the dog really wants and 
will see as reinforcing will serve as a reinforcer.  Some dogs will happily 
respond for the chance to chase a ball or squeaky toy or to play a game of 
tug.  But food reinforcers are quick and convenient, and as Ben is a 
Labrador, I suspect that food reinforcers might be tops in his book.

Now, that doesn't mean that you always have to click and treat when Ben 
comes promptly out from under the desk, or whatever the desired behavior is. 
In the stage of training where you are trying to establish a new behavior or 
to improve a known behavior, you always want to click and treat, because you 
want to strengthen the behavior by building up a strong reinforcement 
history for that behavior.  You might even want to go back a step in the 
training and click and treat for the first little piece of the behavior you 
are looking for, like Ben lifting his head, starting to move, or even 
yawning, in response to your picking up his harness or unfastening his leash 
or whatever things you do before asking him to come out.

You could click one of these first steps to getting up and coming out, and 
then present the treat for him to take as he gets to his feet and comes out 
from under the desk.  I think this approach would have the added benefit of 
encouraging him to come out from under the desk quickly and happily.  I 
might also, once he comes out and receives his treat, then hold out his 
harness and when he dives into it, click and treat once again.

Once the behavior is strong and consistent, you can, if you really want to, 
begin to fade the click and treat for that particular behavior of coming out 
from under the desk quickly.  The way you would do this is to leave out both 
the click and treat sometimes when you call him to come out from under the 
desk.  I would, however, keep using a click and treat for something further 
down the chain, like getting into his harness.

You can keep reducing the percentage of times that you click and treat his 
coming out from under the desk, and Ben will let you know if you have 
reduced it too far, because the behavior will begin to deteriorate.  Or you 
can begin to raise your criteria by clicking and treating only the trials 
when he comes out a bit faster or more enthusiastically than average.

You can also add more behaviors to the chain before clicking and treating, 
such as Ben's coming into heel position before you start off, or his 
stepping out at the appropriate speed when you say "forward", or whatever 
next cues you might be accustomed to giving him, like "find the door" or 
"outside" or whatever.  The good feelings associated with the click and 
reinforcement at the end of the chain of behaviors will move backward along 
the chain so that the opportunity to do each subsequent behavior becomes a 
reinforcer for the previous behavior.  But the chain of behaviors has to be 
built link by link; each behavior has to be reinforced at first so that it 
will be strong; then each behavior has to be linked to the next in pairs; 
then the whole chain can be reinforced at some arbitrary point, before 
starting up again on the next chain of behaviors.

I have found that I usually choose to end a chain of guide behaviors at the 
point when the guide has successfully stopped at a down curb or street 
crossing.  There are at least a couple of reasons for this; one is that we 
are stopping at that point anyway, so it seems like a good point to click 
and treat the entire string of behaviors which occurred to get us there, 
like going around obstacles, lefts and rights, changes of pace, avoiding 
distractions, etc.; another reason is that stopping at curbs is a behavior 
that I really want to keep on a high level of reinforcement, and, at least 
in my experience, it is one that guide dogs tend to begin to neglect if it 
is not highly reinforced.

Here's something for you to mull over and play around with in connection 
with the idea of behavior chains and the concept that each behavior is 
reinforced by the opportunity to go on to the next behavior and that the 
emotions run backward up the chain of behaviors.  Say you call Ben out from 
under the desk, harness him up, set off on your way, go out of the building, 
turn onto the sidewalk, weave in and out of pedestrian traffic, avoid 
obstacles such as trash bins, parking meters, open cellar doors and vehicles 
parked on the sidewalk, etc.--and of course, you are carrying on the usual 
conversation with Ben during all this time, praising him, cueing him, 
encouraging him, whatever you usually do--and then he fails to stop at the 
curb.  I assume that you would apply a leash correction or some other 
definite punisher at this point to reduce the likelihood that this error 
would be repeated.

Well, the unpleasant or stressful emotion in the dog caused by the 
correction also travels back along the entire chain of events which preceded 
it.  If this string of events occurs frequently, the dog can become 
reluctant to start the entire chain of behaviors.  He may not jump into the 
harness with such enthusiasm, or he might not be so eager and quick to come 
out from under his desk.  I think that we often see this type of behavior 
develop in guide dogs, and we attribute it to stress in the dog caused by 
the rigors of guide life, or to burnout, or to the aging process.

If, instead, we habitually click and reinforce stopping at the curb or 
street crossing, we are positively reinforcing all the behaviors in the 
chain which preceded that stop.  This makes the whole emotional environment 
of the guide work very pleasant and positive.

Here's another thing that I have been observing and wondering about in 
connection with guide work and reinforcement.  It seems to me that many 
guide dogs come to be in a great hurry to get to the destination, to dive 
under that desk, almost as if they are in a hurry to get to the "safe place" 
at the end of the trip.  I'm wondering why this might be so, from the dog's 
point of view.

Perhaps it is that being able to rest and relax is a primary reinforcer 
which comes at the end of the behavior chain that constitutes guide work. 
But the demeanor of many dogs in this situation seems to tell me that it 
might be something else, that they are hurrying to get to the end of the 
route and dive under the desk because once there, they are no longer at risk 
of making a mistake and experiencing punishment.  In other words, the entire 
sequence of behaviors has been associated with punishment or the risk of 
punishment which sometimes occurs along the chain.

My experience has been that punctuating chains of guide behavior with the 
click and treat serves to make the entire guide work process more 
reinforcing to the dog.  They seem to be more able to relax and enjoy the 
process, rather than feeling that they need to hurry to get to the end of 
the route.  This also seems to allow them to perform more precisely--if 
they're not in a hurry to get to the end of the route, they are less likely 
to run that curb, and more likely to stop at the curb, giving you the 
opportunity to click and reinforce the entire chain of behaviors that led to 
that point.  If they are enjoying the process and not focused on the need to 
get to the destination, they have more attention available to notice 
obstacles, uneven or slippery footing, overhanging tree branches or 
awnings, etc., and are therefore more likely to respond to these conditions 
as desired and to earn praise and reinforcement, which again reinforces the 
entire process.

Tracy, I in no way mean to imply that any of my speculative thoughts above 
apply to Ben or to the situation you are asking about related to his coming 
out from under the desk.  The short answer to your question is that it has 
been found to be most effective to always treat when we click.  But that 
doesn't mean that we always have to click a certain behavior.  And it 
doesn't mean that we always have to click before giving a treat.

Using the clicker and positive reinforcement is a method of training new 
behaviors and a way of keeping important behaviors strong and reliable.  The 
purpose of the click is to mark precisely the behavior we want to make 
stronger.  The purpose of the treat after the click is to associate the 
performance of that precise behavior with good things happening and good 
feelings, so that that behavior will be more likely to occur in the same 
situation in the future.

Once a behavior has been "trained" to the desired level, we can fade the 
click and treat from that behavior.  If the behavior deteriorates, we know 
that we have faded the reinforcement too quickly, and we can go back and 
retrain or refresh that behavior.  At the same time, we can incorporate the 
new behavior into behavior chains which are reinforced.  And we will 
probably be using the click and reward to train other new behaviors or to 
improve other behaviors.

I'd love to hear your thoughts on all these musings.

Best,
Ann

----- Original Message ----- 
From: "Tracy Carcione" <carcione at access.net>
To: "NAGDU Mailing List,the National Association of Guide Dog Users" 
<nagdu at nfbnet.org>
Sent: Monday, August 31, 2009 10:27 AM
Subject: Re: [nagdu] Clicker Treats RE: dog corrections

> It is my understanding that, once the dog understands the behavior being
> clicked for, one is supposed to sometimes click with no treat.  For
> instance, Ben loves his cave under my desk at work, and he takes a while
> to come out when I call him.  So I call him, and click and treat when he
> comes out and touches my hand.  So I thought that the next step is to call
> him, and, sometimes, click and treat, and other times just click.  Then I
> can gradually phase out the click.  This is what I thought was sometimes
> called "The Las Vegas method", because the dog never knows when he might
> hit the jackpot, so he will keep playing the game, just in case.
>
> Of course, Ben is very clever, so if I skip the treat more than once in a
> row, he stops coming out again.  Have I said that he's a brat?
>
> Have I got the method right, or am I missing something?
> Tracy
>
>
>> That is very interesting that you do not always use the treats with the
>> clicker.  At GDB, they told us that we must use a treat every time that
>> we click the clicker, even if it was an accident.  We were strongly
>> discouraged from doing clicker training around other dogs.  I was
>> reminded why the other day when I was in a pet shop with a training
>> area.  Someone was doing clicker training with their dog, and Lexia was
>> very interested.  Luckily, I was not working her, as she was there for
>> her bath.  I don't use clicker training too much as sometimes Lexia
>> tries too hard for it and misses the point as she tries too hard to
>> please in order to get the treat, but it certainly works well for
>> teaching new commands.