[nagdu] Clicker Treats RE: dog corrections
Ann Edie
annedie at nycap.rr.com
Mon Aug 31 18:24:36 UTC 2009
Hi, Tracy,
It may be that some people use the clicker as you have described. However,
Bob Bailey, the Father of Applied Operant Conditioning, who trained many
thousands of animals of many species, came to the conclusion that behavior
was best maintained by always following the click (conditioned reinforcer)
with the treat (primary reinforcer). Ben has obviously been talking to Bob
Bailey, and as we clicker trainers always say when there are discussions of
theory and practice, "Go to people for opinions; go to the animals for
answers."
The click is always followed by a primary reinforcer, but that does not mean
that it must be followed by a treat. Anything that the dog really wants and
will see as reinforcing will serve as a reinforcer. Some dogs will happily
respond for the chance to chase a ball or squeaky toy or to play a game of
tug. But food reinforcers are quick and convenient, and as Ben is a
Labrador, I suspect that food reinforcers might be tops in his book.
Now, that doesn't mean that you always have to click and treat when Ben
comes promptly out from under the desk, or whatever the desired behavior is.
In the stage of training where you are trying to establish a new behavior or
to improve a known behavior, you always want to click and treat, because you
want to strengthen the behavior by building up a strong reinforcement
history for that behavior. You might even want to go back a step in the
training and click and treat for the first little piece of the behavior you
are looking for, like Ben lifting his head, starting to move, or even
yawning, in response to your picking up his harness or unfastening his leash
or whatever things you do before asking him to come out.
You could click one of these first steps to getting up and coming out, and
then present the treat for him to take as he gets to his feet and comes out
from under the desk. I think this approach would have the added benefit of
encouraging him to come out from under the desk quickly and happily. I
might also, once he comes out and receives his treat, then hold out his
harness and when he dives into it, click and treat once again.
Once the behavior is strong and consistent, you can, if you really want to,
begin to fade the click and treat for that particular behavior of coming out
from under the desk quickly. The way you would do this is to leave out both
the click and treat sometimes when you call him to come out from under the
desk. I would, however, keep using a click and treat for something further
down the chain, like getting into his harness.
You can keep reducing the percentage of times that you click and treat his
coming out from under the desk, and Ben will let you know if you have
reduced it too far, because the behavior will begin to deteriorate. Or you
can begin to raise your criteria by clicking and treating only the trials
when he comes out a bit faster or more enthusiastically than average.
You can also add more behaviors to the chain before clicking and treating,
such as Ben's coming into heel position before you start off, or his
stepping out at the appropriate speed when you say "forward", or whatever
next cues you might be accustomed to giving him, like "find the door" or
"outside" or whatever. The good feelings associated with the click and
reinforcement at the end of the chain of behaviors will move backward along
the chain so that the opportunity to do each subsequent behavior becomes a
reinforcer for the previous behavior. But the chain of behaviors has to be
built link by link; each behavior has to be reinforced at first so that it
will be strong; then each behavior has to be linked to the next in pairs;
then the whole chain can be reinforced at some arbitrary point, before
starting up again on the next chain of behaviors.
I have found that I usually choose to end a chain of guide behaviors at the
point when the guide has successfully stopped at a down curb or street
crossing. There are at least a couple of reasons for this; one is that we
are stopping at that point anyway, so it seems like a good point to click
and treat the entire string of behaviors which occurred to get us there,
like going around obstacles, lefts and rights, changes of pace, avoiding
distractions, etc.; another reason is that stopping at curbs is a behavior
that I really want to keep on a high level of reinforcement, and, at least
in my experience, it is one that guide dogs tend to begin to neglect if it
is not highly reinforced.
Here's something for you to mull over and play around with in connection
with the idea of behavior chains and the concept that each behavior is
reinforced by the opportunity to go on to the next behavior and that the
emotions run backward up the chain of behaviors. Say you call Ben out from
under the desk, harness him up, set off on your way, go out of the building,
turn onto the sidewalk, weave in and out of pedestrian traffic, avoid
obstacles such as trash bins, parking meters, open cellar doors and vehicles
parked on the sidewalk, etc. (and of course, you are carrying on the usual
conversation with Ben during all this time, praising him, cueing him,
encouraging him, whatever you usually do), and then he fails to stop at the
curb. I assume that you would apply a leash correction or some other
definite punisher at this point to reduce the likelihood that this error
would be repeated.
Well, the unpleasant or stressful emotion in the dog caused by the
correction also travels back along the entire chain of events which preceded
it. If this string of events occurs frequently, the dog can become
reluctant to start the entire chain of behaviors. He may not jump into the
harness with such enthusiasm, or he might not be so eager and quick to come
out from under his desk. I think that we often see this type of behavior
develop in guide dogs, and we attribute it to stress in the dog caused by
the rigors of guide life, or to burnout, or to the aging process.
If, instead, we habitually click and reinforce stopping at the curb or
street crossing, we are positively reinforcing all the behaviors in the
chain which preceded that stop. This makes the whole emotional environment
of the guide work very pleasant and positive.
Here's another thing that I have been observing and wondering about in
connection with guide work and reinforcement. It seems to me that many
guide dogs come to be in a great hurry to get to the destination, to dive
under that desk, almost as if they are in a hurry to get to the "safe place"
at the end of the trip. I'm wondering why this might be so, from the dog's
point of view.
Perhaps it is that being able to rest and relax is a primary reinforcer
which comes at the end of the behavior chain that constitutes guide work.
But the demeanor of many dogs in this situation seems to tell me that it
might be something else, that they are hurrying to get to the end of the
route and dive under the desk because once there, they are no longer at risk
of making a mistake and experiencing punishment. In other words, the entire
sequence of behaviors has been associated with punishment or the risk of
punishment which sometimes occurs along the chain.
My experience has been that punctuating chains of guide behavior with the
click and treat serves to make the entire guide work process more
reinforcing to the dog. They seem to be more able to relax and enjoy the
process, rather than feeling that they need to hurry to get to the end of
the route. This also seems to allow them to perform more precisely--if
they're not in a hurry to get to the end of the route, they are less likely
to run that curb, and more likely to stop at the curb, giving you the
opportunity to click and reinforce the entire chain of behaviors that led to
that point. If they are enjoying the process and not focused on the need to
get to the destination, they have more attention available to notice
obstacles, uneven or slippery footing, overhanging tree branches or
awnings, etc., and are therefore more likely to respond to these conditions
as desired and to earn praise and reinforcement, which again reinforces the
entire process.
Tracy, I in no way mean to imply that any of my speculative thoughts above
apply to Ben or to the situation you are asking about related to his coming
out from under the desk. The short answer to your question is that it has
been found to be most effective to always treat when we click. But that
doesn't mean that we always have to click a certain behavior. And it
doesn't mean that we always have to click before giving a treat.
Using the clicker and positive reinforcement is a method of training new
behaviors and a way of keeping important behaviors strong and reliable. The
purpose of the click is to mark precisely the behavior we want to make
stronger. The purpose of the treat after the click is to associate the
performance of that precise behavior with good things happening and good
feelings, so that that behavior will be more likely to occur in the same
situation in the future.
Once a behavior has been "trained" to the desired level, we can fade the
click and treat from that behavior. If the behavior deteriorates, we know
that we have faded the reinforcement too quickly, and we can go back and
retrain or refresh that behavior. At the same time, we can incorporate the
new behavior into behavior chains which are reinforced. And we will
probably be using the click and reward to train other new behaviors or to
improve other behaviors.
I'd love to hear your thoughts on all these musings.
Best,
Ann
----- Original Message -----
From: "Tracy Carcione" <carcione at access.net>
To: "NAGDU Mailing List,the National Association of Guide Dog Users"
<nagdu at nfbnet.org>
Sent: Monday, August 31, 2009 10:27 AM
Subject: Re: [nagdu] Clicker Treats RE: dog corrections
> It is my understanding that, once the dog understands the behavior being
> clicked for, one is supposed to sometimes click with no treat. For
> instance, Ben loves his cave under my desk at work, and he takes a while
> to come out when I call him. So I call him, and click and treat when he
> comes out and touches my hand. So I thought that the next step is to call
> him, and, sometimes, click and treat, and other times just click. Then I
> can gradually phase out the click. This is what I thought was sometimes
> called "The Las Vegas method", because the dog never knows when he might
> hit the jackpot, so he will keep playing the game, just in case.
>
> Of course, Ben is very clever, so if I skip the treat more than once in a
> row, he stops coming out again. Have I said that he's a brat?
>
> Have I got the method right, or am I missing something?
> Tracy
>
>
>> That is very interesting that you do not always use the treats with the
>> clicker. At GDB, they told us that we must use a treat every time that we
>> click the clicker, even if it was an accident. We were strongly discouraged
>> from doing clicker training around other dogs. I was reminded why the other
>> day when I was in a pet shop with a training area. Someone was doing
>> clicker training with their dog, and Lexia was very interested. Luckily, I
>> was not working her, as she was there for her bath. I don't use clicker
>> training too much as sometimes Lexia tries too hard for it and misses the
>> point as she tries too hard to please in order to get the treat, but it
>> certainly works well for teaching new commands.