Monkey See, Monkey Do

Have you ever thought you’d become a labeling machine?

Probably not. Nevertheless, you are. We all are.

If you have small children you might have noticed how they sometimes become obsessed with getting your attention. Many parents try to extinguish this behavioral pattern, one that they see as childish and nagging. They are wrong.

A child’s request for attention is deeply rooted in his or her learning patterns. While toddlers mostly learn by experiencing the physical world, where feedback is haptic, older youngsters have moved past the physical and into the mental modeling of their surroundings. This means not only understanding and navigating those surroundings, but also building mental models for abstract functions such as socializing, moods, the emotional states of others, and so on. Building and maintaining such models is an ongoing process. Unlike physical actions, where the environment either allows or disallows an action, such models can only be learned correctly with the help of a human tutor.

When the child was a child,
It was the time for these questions:
Why am I me, and why not you?
Why am I here, and why not there?
When did time begin, and where does space end?
Is life under the sun not just a dream?
Is what I see and hear and smell
not just the reflection of a world before the world?

Damiel, Der Himmel über Berlin, Wim Wenders,
from Song of Childhood, Peter Handke

An Important Digression into Developmental Psychology

Parents, teachers, psychologists, and others who are concerned with children’s behavior have long recognized that children will behave or misbehave in order to become the center of attention.

Back in 1962, an American psychologist named Donald Baer conducted research using a puppet and found that children would behave in such a manner as to produce and maintain attention from the puppet. Dreikurs, an Austrian psychiatrist and educator who developed Alfred Adler’s system of individual psychology, noted that obtaining attention is the primary goal of young children. The recently deceased professor of psychology Jacob Gewirtz stated as early as 1956 that “Knowledge of the process underlying young children’s behavior for attention and similar reinforcers might well be a key to understanding their general social interaction with both adults and peers, very little is known either about the dimensionality of their attention-seeking or about its antecedents”. Only in 2018 did a paper suggest an alternative approach, stating that infants point at objects they want their parents to provide information on.

So it would seem that psychology has mostly studied the emotional aspects of attention-seeking behaviors, dancing around one basic question: why? What is the purpose of attention-seeking? Why have young humans developed a mechanism compelling them to look for the attention of their caregivers?

I would like to suggest a complementary explanation.

The Playing Child

A young child trying to understand how to behave in a newly experienced social environment is not born with the ruleset (mental model) that allows him to navigate social scenes. Children develop their own model as both spectators and participants. But how should they know what is considered proper social behavior as opposed to anti-social behavior?

The answer is: they play.

Playing is an activity that allows experimenting without retribution. When a child is at play, he is in the safe zone of trying. Even if he errs, breaks an object, or hits or offends someone, he is still not punished, as this was only a game. Games are therefore a way of fearlessly exploring the social and behavioral space.

However, exploring the space does not by itself create a mental model of proper behavior. For that you need someone to tell you what in that space is considered acceptable behavior and what is not.

When the child seeks your attention, when he asks you to look at him while he plays, he is unknowingly looking for your subtle feedback, based on which he will tune his model. Every facial expression, every tone of voice is actually a guiding signal that labels his action in the game. Your frowns, smiles, frets, and shouts all become labels attached to his current behavior pattern. Such signals are constantly received by the child, updating his model of the world: social relations, proper vs. improper behavior patterns, well-formed vs. ill-formed language constructs, and much more.

You are, whether you like it or not, a sort of labeling machine. Your role, your responsibility, is to constantly tag your child’s events and actions, to be the source of truth (sometimes called ground truth) for his evolving learned models of the world around him.

Deep Networks

Unless you have lived under a rock for the last decade, you have probably heard the term deep learning in one context or another. Deep learning, based on deep neural networks, tries to simulate the structure of the human brain, thereby creating a learning machine whose learning process approximates our mental one.

Deep networks are currently used for a myriad of tasks, from self-driving cars that read the environment and make instant decisions, through automated medical diagnosis, to, in recent years, understanding and conversing in natural language.

Formally speaking, deep learning is a class of machine learning algorithms that uses multiple layers of simulated neurons to progressively extract higher-level features from the raw input.

Wait, what?!

Let’s back it up a little.

The neuron is the basic working unit of the brain, a specialized cell within the nervous system designed to transmit information to other nerve cells, muscle, or gland cells. Most neurons have a cell body, an axon, and dendrites.

Though it may sound intimidating, neurons are extremely simple (well, not that simple, and they never cease to amaze neurologists, who keep finding more features implemented by neurons, but I digress).

At its core a neuron is a summation machine. It has wires coming in (dendrites) and a wire coming out (axon). The electrical current on the incoming wires is weighted per dendrite and summed up. If the sum is larger than some threshold encoded into the neuron body (soma), the neuron sends an electrical pulse down its axon (which then branches out via axon terminals towards other neurons’ dendrites).

An artificial (simulated) neuron is basically the same. It has incoming inputs, which are weighted into a summation machine, and it fires out a value only if the weighted sum is larger than the neuron’s current threshold. The input weights and the threshold value comprise the state of the neuron. When we say a neuron learns we mean it changes its state into a new one (more on that later).
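The artificial neuron described above can be sketched in a few lines of Python. This is only a minimal illustration of the weighted-sum-and-threshold idea, not a model of a biological neuron; the names Neuron, weights, threshold, and fire, as well as the numbers, are hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Neuron:
    """A threshold neuron; its state is the input weights plus the threshold."""
    weights: List[float]
    threshold: float

    def fire(self, inputs: List[float]) -> int:
        # Weight each incoming signal and sum the results.
        total = sum(w * x for w, x in zip(self.weights, inputs))
        # Fire (output 1) only if the weighted sum exceeds the threshold.
        return 1 if total > self.threshold else 0


# Illustrative values only: a neuron that fires when both inputs are active.
neuron = Neuron(weights=[0.6, 0.6], threshold=1.0)
print(neuron.fire([1, 1]))  # weighted sum is 1.2, above the threshold
print(neuron.fire([1, 0]))  # weighted sum is 0.6, below the threshold
```

Changing the weights or the threshold changes what the neuron responds to, which is what "changing its state" amounts to in this sketch.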

Simple? Yes. However, this simple mechanism makes up most of your brain, which is currently reading this complex text while concurrently controlling all of your bodily functions. This amazing feat is not carried out by a single neuron, but rather by billions of them wired together into layers.

Each layer is made up of multiple neurons, connected to yet another layer then another, then … another, till it reaches a final layer (output layer) that controls an action of your body, generates a decision, or abstracts a thought.

Each of the intermediate layers (usually called hidden layers, i.e. the ones between the input and output layers) grasps a part of the resulting output. For example, when you look at an elephant, your brain activates an image-processing network where lower layers (closer to the input layer) may identify edges, while higher layers may identify the shapes of the different parts of an elephant, up to detecting that what we see is an entire elephant (rather than a kangaroo or some other animal). We would then say that the deep network has learned to identify an elephant out of a number of other animals. In stricter terms, the network has learned a model of an elephant.
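To make the idea of layers concrete, here is a toy sketch in Python, under loud assumptions: the weights and thresholds are hand-picked rather than learned, and the task (outputting 1 when exactly one of two inputs is active) stands in for something as rich as recognizing an elephant. The helper names fire and forward are hypothetical. The hidden layer extracts two intermediate features ("at least one input is on" and "both inputs are on"), and the output layer combines them, something a single threshold neuron cannot do on its own.

```python
from typing import List, Tuple

# A neuron is (weights, threshold); a layer is a list of neurons.
NeuronSpec = Tuple[List[float], float]


def fire(weights: List[float], threshold: float, inputs: List[float]) -> int:
    # Weighted sum of the inputs, compared against the threshold.
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def forward(layers: List[List[NeuronSpec]], inputs: List[float]) -> List[int]:
    # Feed the signal through each layer in turn; the output of one
    # layer becomes the input of the next.
    signal = inputs
    for layer in layers:
        signal = [fire(w, t, signal) for w, t in layer]
    return signal


# Hand-picked (not learned) states, for illustration only.
hidden = [([1.0, 1.0], 0.5),   # feature 0: at least one input is on
          ([1.0, 1.0], 1.5)]   # feature 1: both inputs are on
output = [([1.0, -1.0], 0.5)]  # on if feature 0 is on and feature 1 is off
network = [hidden, output]

for pair in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pair, forward(network, pair))
```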


In reality, deep networks can be more complex, with output layers, and even hidden layers, connected back to previous ones, forming a sort of cyclic flow of electricity throughout the network till it stabilizes and spits out an output value.

But how do deep networks learn?

The Learning Process

We have already discussed how networks work, based on the state of their neurons. But how do neurons arrive at states that allow them, for example, to identify an elephant out of many other animals? The answer is: they learn by example.

Training a network to identify the type of animal it is looking at is a process in which the network is shown images of many animals and told the name of each (think of the input layer as if it were the retina of a camera, where each pixel is fed into its respective first-layer neuron). By telling it the name of each animal we mean that each image carries a label with the name of the animal in the picture.

The network then adjusts its neurons’ states in such a way as to output the given answer (label) for each new image. So, if it is shown a lion and told (labeled) that this is a lion, the network adjusts its neurons’ states till the output layer says: lion. This process is repeated for each new labeled image, till the network has learned all animal types.

Should the learning process be successful, the network is able to identify animals in images it has never seen during the learning process. We then say the network has generalized the concept of animals. For example, if shown a new image of an elephant, it should say: elephant, even though it has never seen that specific image during the learning process.
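Learning from labeled examples can be sketched with a single threshold neuron. The example below uses the classic perceptron learning rule, which is far simpler than the procedures used to train real deep networks, and it replaces images with two made-up measurements per animal. The function names predict and train, the features, and the numbers are all assumptions for illustration. Each time the neuron's answer disagrees with the label, its weights and threshold are nudged towards the labeled answer.

```python
from typing import List, Tuple

Example = Tuple[List[float], int]  # (features, label): 1 = elephant, 0 = not


def predict(weights: List[float], threshold: float, features: List[float]) -> int:
    total = sum(w * x for w, x in zip(weights, features))
    return 1 if total > threshold else 0


def train(examples: List[Example], max_epochs: int = 100) -> Tuple[List[float], float]:
    weights = [0.0] * len(examples[0][0])
    threshold = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            # error is +1 (should have fired), -1 (should not have), or 0.
            error = label - predict(weights, threshold, features)
            if error != 0:
                mistakes += 1
                # Nudge the state towards the labeled answer.
                weights = [w + error * x for w, x in zip(weights, features)]
                threshold -= error
        if mistakes == 0:  # every labeled example is now answered correctly
            break
    return weights, threshold


# Made-up features: [trunk length in metres, ear width in metres].
labeled = [
    ([1.8, 1.5], 1),  # labeled "elephant"
    ([0.0, 0.1], 0),  # labeled "not an elephant"
    ([1.5, 1.2], 1),
    ([0.0, 0.2], 0),
]
weights, threshold = train(labeled)

# Animals that were never part of the training set.
unseen_elephant = [2.0, 1.7]
unseen_zebra = [0.0, 0.15]
print(predict(weights, threshold, unseen_elephant))
print(predict(weights, threshold, unseen_zebra))
```

With this toy data the neuron settles after two passes and answers correctly for both unseen animals, which is generalization in miniature.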

But you know this process. You have done it as a child numerous times every waking moment.

An elephant swallowed by a boa constrictor

Once, when I was six years old, I saw a magnificent picture in a book called “True Stories”, about the primeval forest. It was a picture of a boa constrictor swallowing a wild beast…  showed my masterpiece to the grown-ups, and asked them whether the drawing frightened them. They answered me: “Why should anyone be frightened by a hat?”
My drawing was not a picture of a hat. It was a picture of a boa constrictor digesting an elephant. Then, I drew the inside of the boa constrictor, so that the grown-ups could see it clearly. They always need to have things explained.

The Little Prince, by Antoine de Saint-Exupéry

How Does a Child Learn

When a child comes into this world he has no concept of animal or elephant. It is only after he has seen an elephant, whether in real life or in pictures, and has been told that the object he is viewing is an elephant, that the child learns to identify it as such.

At first, he would make mistakes, wrongly identifying somewhat similar animals as elephants, and get told he was wrong. Slowly, via a process of trial and error, he would create his mental model of what an elephant looks like. But he cannot do it without someone labeling the images for him. He cannot tell an animal from a plant with large leaves resembling elephant ears till someone tells him it is not an elephant but a plant. Every child needs a tutor, a labeling oracle that knows the true labels (names) of the objects he is experiencing.

That labeler is you.

Most parents perform the role of labeler instinctively, never giving a second thought to their role as the source of truth for the child’s learning process. However, that is what they are, what you are. Every picture book you read to your toddler, every object you name for him, is a label. Every mistake your child makes (e.g. “Hey mom, here’s an elephant” when looking at an elephant-like plant) that you correct is a label telling him he has misidentified something. Proper identifications and wrong ones alike serve as learning examples, allowing the child to generalize what an elephant looks like and what it does not, updating and setting the inner state of his deep (biological) neural network.

The Parent as a Witness

Remember the nagging child demanding your attention? He is not doing this out of spite, but rather to learn. When your child tells you “look, Dad”, he is actually telling you: “please label this for me”. “This” may be an assertion he is making (here’s an elephant), an action he is taking (trying to drink the tub water after a shower), or an abstract thought (the sun is diving into the ocean). In each case, your response is a label guiding his learning process.

If you tell him not to drink the tub water, you are labeling it as non-drinkable. When you say “yes, this is an elephant”, you are labeling that animal on TV as such, and when you tell him the sun does not actually dive into the ocean, you are updating his model of the relationship between the earth and the sun. By witnessing his actions, thoughts, and utterances you provide invaluable labels without which the child cannot build and update his neural models of the world.

But what if he is misguided?

A Fair Witness

We all know that funny uncle who finds it amusing to play pranks on your five-year-old. While you are trying to tell the child that the animal on the screen is an elephant, the funny uncle tells him it’s a toaster (and is extremely gratified by having just pulled off the perfect prank). The funny uncle is a mis-labeler.

“You know how Fair Witnesses behave.”

“Well … no, I don’t. I’ve never had any dealings with Fair Witnesses.”

“So? Perhaps you weren’t aware of it. Anne!”

Anne was seated on the springboard; she turned her head. Jubal called out,

“That new house on the far hilltop - can you see what color they’ve painted it?”

Anne looked in the direction in which Jubal was pointing and answered,

“It’s white on this side.” She did not inquire why Jubal had asked, nor make any comment.

Jubal went on to Jill in normal tones, “You see? Anne is so thoroughly indoctrinated that it doesn’t even occur to her to infer that the other side is probably white, too.”

Stranger in a Strange Land, Robert A. Heinlein

The world is not a perfect learning environment, and guys like the funny uncle are a dime a dozen. How then would your child know who is a fair witness providing trustworthy labels and who is not? The answer is: he trusts you.

Children intuitively trust their parents (till proven otherwise). Your labels as a parent carry much more weight in the child’s learning process than those of the funny uncle or the kid next door. This is why children tend to demand your attention. As you are the source of truth (till they hit puberty, at which point you become the source of all evil), your child repeatedly asks you for a clear set of labels enabling his learning process.

For what prompted the subject to form an ego ideal, on whose behalf his conscience acts as watchman, arose from the critical influence of his parents, to whom were added, as time went on, those who trained and taught him and the innumerable and indefinable host of all the other people in his environment, his fellow-men, and public opinion.

Sigmund Freud, 1914

This repeated claim on your time is a request for reinforcement of the correct labels over the wrong ones. Such reiteration strengthens the neurons’ states (i.e. the input weights and output thresholds) that reflect a truthful model of the world (as much as any subjective person can reflect the true state of affairs) over the noise generated by unfair witnesses.
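The idea that a trusted labeler outweighs a noisy one can be illustrated with a small Python sketch. Everything here is an assumption made for the sake of illustration: the trust values are invented, and the function name trusted_label is hypothetical. The sketch says nothing about how a brain actually weighs its sources. Each label counts in proportion to the trust placed in whoever supplied it, and the label with the highest total wins.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def trusted_label(observations: List[Tuple[str, str]],
                  trust: Dict[str, float]) -> str:
    """Return the label with the highest trust-weighted support."""
    scores: Dict[str, float] = defaultdict(float)
    for labeler, label in observations:
        # Unknown labelers contribute nothing in this sketch.
        scores[label] += trust.get(labeler, 0.0)
    return max(scores, key=scores.get)


# Invented trust values: the parent counts five times as much as the uncle.
trust = {"parent": 1.0, "funny_uncle": 0.2}

# The uncle repeats his prank three times; the parent labels once.
observations = [
    ("funny_uncle", "toaster"),
    ("funny_uncle", "toaster"),
    ("funny_uncle", "toaster"),
    ("parent", "elephant"),
]
print(trusted_label(observations, trust))
```

In this toy setting a single label from the parent outweighs three from the uncle, and every repeated parental label widens the gap further.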

And then there’s attention.

Attention Modeling

(not to be confused with attention-seeking behavior)

When you teach a youngster how to safely cross the road you usually direct his attention towards the cars driving by. However, as one crosses a road there are many other things happening around him. Birds are chirping, the wind is blowing, maybe there is a drizzle, people are standing on the other end of the zebra crossing, the clouds are moving in the sky, not to mention emotional states and random thoughts. It’s a sensory pandemonium.

When you direct your youngster’s attention to the cars you are providing him with an attention focus. You are telling him: disregard all other data and focus only on the cars driving by. The learning process now becomes a filtered one, filtered by attention. This process of cognitive attention allows the child to learn the road-crossing model with a minimal set of data points, concentrating on the most important ones.

Likewise, in the context of neural networks, attention is a technique that mimics cognitive attention. The effect enhances the important parts of the input data and fades out the rest — the thought being that the network should devote more computing power to that small but important part of the data.
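One common way to express this enhancing and fading is to turn relevance scores into weights that sum to one, using the softmax function, and then to weigh each input by its share. The sketch below does this for the road-crossing scene. The relevance scores are hand-picked for illustration, whereas in a real network they are computed from the data by learned parameters, and the helper names softmax and attend are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    # Subtract the maximum for numerical stability; the result is unchanged.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(relevance: List[float], values: List[float]) -> Tuple[List[float], float]:
    # Turn relevance scores into attention weights that sum to 1, then
    # blend the input values according to those weights.
    weights = softmax(relevance)
    blended = sum(w * v for w, v in zip(weights, values))
    return weights, blended


# Hand-picked numbers for the road-crossing scene.
signals = ["cars", "birds", "wind", "clouds"]
relevance = [4.0, 0.5, 0.2, 0.1]  # how much each signal matters for crossing
values = [1.0, 0.3, 0.2, 0.1]     # some measurement carried by each signal

weights, blended = attend(relevance, values)
for name, weight in zip(signals, weights):
    print(f"{name}: {weight:.3f}")
```

Nearly all of the weight lands on the cars, while the birds, wind, and clouds are faded out rather than discarded outright.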

Recent work in natural language understanding has shown that deep learning algorithms aimed at understanding sentences do much better when they know which parts of the sentence or sentences they should attend to. The same goes for self-driving cars, which need to process massive amounts of data and make split-second decisions as to where to point the wheel, when to apply gas, and, occasionally, when to brake.

It is only natural then that your child would look to his labeler to draw his attention towards the important data signals over the noisy ones, be those the subject and object in a sentence or the cars flying by. The child has to develop a filtering mechanism that passes through only the important labeled data, so as not to overwhelm his deep biological neural network into uselessness.

To do that, he needs your undivided attention.

Concluding Remarks

Attention-seeking behavior is deeply rooted in early and later stage child development. While psychologists have suggested a great number of explanations for this phenomenon ever since the late 1950s, I think it would be worthwhile applying newly found data regarding brain structures, machine learning, and artificial intelligence to expand our understanding of the reasons and driving forces compelling our children to seek our attention.