
Critical Watching: Drilling Down into the Video Heatmap

Late last fall we released powerful heatmap filtering for the Ponder reading experience. We are now proud to announce a similar upgrade for Ponder Video.

Ponder Video Activity Bar

Teachers have been seeing hundreds and hundreds of student responses on longer videos, and it became obvious that we needed to make it possible to separate out participation by group, student, sentiment, and theme, just the way we do on text documents.

We love our tick marks and the quick overview they give you, so you’ll find them in their usual place along the yellow video timeline. As before, there’s one tick mark per response at a particular timestamp in the video, and the colors match the type of sentiment. Multiple responses at the same timestamp still stack on top of each other, so you can spot the points of focus by both tight clustering and the height of the bars.
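One way to picture how those stacked tick marks are derived is as a simple grouping of responses by timestamp. This is just an illustrative sketch with made-up data, not Ponder's actual implementation:

```python
from collections import defaultdict

# Hypothetical response records: (timestamp_seconds, sentiment_color)
responses = [
    (12, "green"), (12, "pink"), (12, "green"),
    (45, "yellow"),
    (46, "pink"), (46, "pink"),
]

# Group responses by timestamp; each group becomes one stack of tick marks.
stacks = defaultdict(list)
for ts, color in responses:
    stacks[ts].append(color)

# The height of each stack is the number of responses at that timestamp,
# so tight clusters and tall bars both signal points of focus.
for ts in sorted(stacks):
    print(ts, len(stacks[ts]), stacks[ts])
```

Here the three responses at second 12 would render as one three-tick stack, while the adjacent stacks at 45 and 46 would read as a tight cluster.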

Ponder Video Interface





Bookmarking stars appear below the timeline and your familiar zoom in/out UI helps with navigating longer videos. More on using the interface for critical watching on our support site.

Ponder Video Interface









But now here’s where it gets fancy. Notice anything different?

Now, above the timeline, underneath the video, you will find a set of filter drop-downs corresponding to the activity on the video.

Ponder Video Filter Menus



The first drop-down allows you to filter the responses by group so that, for example, a teacher can view one section or period of a course at a time. The number in parentheses indicates the number of responses created by each group.

Ponder Video Group Filter Menu




Want to see just your responses, or those from a particular student? The second drop-down shows each responder, sorted by the number of responses they created, which is indicated in parentheses next to each username.

Ponder Video User Filter




The third drop-down shows the mix of sentiments used in the responses, sorted by frequency (indicated in parentheses), and allows you to filter for them.

Ponder Video Sentiment Filter





And the fourth drop-down shows the themes used in responses on the video, sorted by frequency (indicated in parentheses):

Ponder Video Theme Filter




As you can see, much of the Ponder power you are familiar with when navigating ideas across documents is now available for minute dissection of a single video.
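The counts shown in parentheses across the four drop-downs are all the same kind of frequency aggregation. A minimal sketch of that idea, using made-up field names rather than Ponder's actual data model:

```python
from collections import Counter

# Hypothetical annotation records; field names are invented for illustration.
annotations = [
    {"group": "Period 1", "user": "ana", "sentiment": "Interesting", "theme": "power"},
    {"group": "Period 1", "user": "ben", "sentiment": "Why?", "theme": "power"},
    {"group": "Period 2", "user": "ana", "sentiment": "Interesting", "theme": "ethics"},
]

# One frequency count per drop-down, most frequent first, rendered
# as "value (count)" labels like the ones shown in the menus.
for field in ("group", "user", "sentiment", "theme"):
    counts = Counter(a[field] for a in annotations)
    labels = [f"{value} ({n})" for value, n in counts.most_common()]
    print(field, labels)
```

Each drop-down is then just this list of labels, with the selected values used to filter the tick marks on the timeline.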

And don’t forget, these capabilities are all available for custom integration on your platform through the Ponder API.

New modes of interaction for Flip: Annotating streaming video!

At the end of every Spring Semester, the extended ITP community gathers round for a solid week (Monday-Friday, 9:40AM-6PM) of thesis presentations. It’s part gauntlet and part send-off for the graduating class.

This year, with the help of Shawn Van Every (the original architect, builder and maintenance man of the ITP Thesis Streaming Service), I had the opportunity to continue my run of informal experiments in video annotation by trying it out on a handful of thesis presentations.

For the third year running, Thesis Week has been accompanied by the Shep chatroom, created by ITP alum Steve Klise. Shep immediately turned itself into a great place for backchannel commentary on the presentations…or not. I’ve always felt it would be great to see aggregations of Shep activity attached to the timecode of the presentation videos. Unfortunately, Shep conversations aren’t logged. I also wondered if turning Shep into some kind of “official record” of audience reactions to thesis presentations would be something of a killjoy.

With the Ponder-Thesis experiment, I wasn’t exactly sure where “annotating thesis presentations” would fall on the work-fun spectrum.

It might be seen as some sort of “semi-official record.” That would make it “work,” and perhaps a bit intimidating, like volunteering to closed-caption TV programs.

But annotating with Ponder requires more thinking than closed-captioning, which presumably will soon be replaced by voice recognition software if it hasn’t been already. So maybe it can be fun and engaging in the same sort of “challenging crossword puzzle” way that Ponder for text can be.

Either way, the end goal was clear: I was interested in some sort of readout of audience reactions, cognitively (Did people understand the presentation?), analytically (Was it credible, persuasive?), and emotionally (Was it funny, scary, enraging, inspiring?).


We were able to get Ponder-Thesis up and running by Day 3 (Wednesday) of Thesis Week.

I sent out a simple announcement via email inviting people to help test this new interface for annotating thesis presentations and build an annotated group record of what happened.

Unlike previous test sessions, there was no real opportunity to ask questions.

Results: One Size Does Not Fit All

Annotating live events is a completely different kettle of fish than annotating static video.

I had made several changes to the static video interface in preparation for thesis, but in retrospect, they weren’t drastic enough.

All of the things I’ve mentioned before that make video annotation harder work and generally less enjoyable than text annotation are amplified ten-fold when you add the live element, because now slowing down to stop and reflect isn’t just onerous; it’s not an option.

As a result, the aspect of Ponder that makes it feel like a “fun puzzle” (figuring out which sentiment tag to apply) can’t be done because there simply isn’t time.

It was challenging even for me (who is extremely familiar with the sentiment tags) to figure out at what point to attach my annotation, which tag to apply *AND* write something coherent, all quickly enough so that I’d be ready in time for the next pearl of wisdom or outrageous claim in need of a response.

There were also hints of wanting to replicate the casual feel of the Shep chatroom. People wanted to say “Hi, I’m here” when they arrived.

Going forward, I would tear down the 2-column “Mine v. Theirs” design in favor of a single-column chat-room style conversation space, but I will go into more detail on next steps after reviewing the data that came out of thesis.

Donna Miller Watts presenting: Fictioning, or the Confession of the Librarian


The Data

  • 36 presentations were annotated. However, 50% of the responses were made on just 6 of them.
  • 46 unique users made annotations. (Or at the very least 46 unique browser cookies made annotations.)
  • 266 annotations in total, 71 of which were { clap! }.
  • 30 unique sentiment tags were applied.
    • ???, Syntax?, Who knew?
    • How?, Why?, e.g?, Or…, Truly?
    • Yup, Nope
    • Interesting, Good point, Fair point, Too simple
    • { ! }, Ooh, Awesome, Nice, Right?
    • Spot on!, Well said, Brave!
    • { shudder }, { sigh }, Woe, Uh-oh, Doh!
    • HA, { chuckle }, { clap! }
  • At peak, there were ~19-20 people in Ponder-Thesis simultaneously.

Broken down by type, there were 39 Cognitive annotations (having questions or wanting more information), 69 Analytical annotations, and 158 Emotional annotations, although almost half (71) of those were { clap! }.

Over half of the non-clap! responses had written elaborations as well (113).
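The type-level numbers reported here are internally consistent; as a quick arithmetic check using only the figures quoted above (no raw data):

```python
# Reported Thesis Week counts, by annotation type
by_type = {"cognitive": 39, "analytical": 69, "emotional": 158}
claps = 71  # { clap! } responses, counted as emotional

# The three types sum to the 266 total annotations reported.
total = sum(by_type.values())
print(total)  # 266

# { clap! } made up almost half of the emotional category.
clap_share = claps / by_type["emotional"]
print(round(clap_share, 2))  # 0.45
```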

  • Natasha Dzurny had the most applause at 10.
  • Sergio Majluf had the most responses at 26.
  • Kang-ting had the most emotional responses at 18.
  • Talya Stein Rochlin had the most emotional responses if you exclude applause at 14.
  • Sergio Majluf racked up the most eloquence points with 3 “Well saids!”
  • Talya Stein Rochlin had the most written comments with 15 and the most laughs at 3.

Below are roll-ups of responses across the 36 presenters categorized by type.

  • Cognitive: Yellow
  • Analytical: Green
  • Emotional: Pink

Below is a forest-for-the-trees view of all responses. Same color-coding applies.

Forest-for-the-Trees view of responses.


Interaction Issues

I made a few design changes for thesis and fixed a number of interaction issues within the first few hours of testing:

  • Reduced the overall number of response tags and made them more straightforward. e.g., Huh., which has always been the short form of “Food for thought…”, became Interesting!
  • Replaced the 3rd-person version of the tags (xxx is pondering) with the 1st-person version (Interesting!), because after the last test session I felt a long list of 3rd-person responses read a bit wooden.
  • Added a { clap! } tag for applauding.
  • Made the “nametag” field more discoverable by linking it to clicking on the roster (the list of people watching). Giving yourself a username should probably be an overlay that covers the entire page so people don’t have a chance to miss it.
  • As responses came in, they filled up the “Theirs” column below the video. Once there were more comments than would fit in the viewport, you wouldn’t see new ones unless you explicitly scrolled. For static video, we deliberately chose not to auto-scroll the list of responses in time with the video because we thought it would be too distracting. For streaming, however, auto-scroll would have been one less thing to do while trying to keep pace with the video and think about how to comment.

Other issues didn’t become apparent until after it was all over…

  • People didn’t see how to “change camera view.” The default view was a pretty tight shot of the presenter. Really the view you want is a wider shot that includes the presentation materials being projected on the wall behind the speaker.
  • The last test session helped me realize that for video the elaboration text field should stay open between submits. But really, it should probably start off open as it’s too small to be something you “discover” on your own while the video is going.
  • The star button which is meant to allow you to mark points of interest in the video without having to select a tag was never used. I’m not sure how useful it is without the ability to write something.


The obvious first step is to go in and squish the remaining interaction issues enumerated above. But there are more systemic problems that need to be addressed.


  • People wanted to say “hey” when they logged on. The “live” nature of the experience means social lubrication is more important than when annotating text or video on your own. ITP Thesis is also a special case because the people annotating not only know the presenters personally but are likely sitting in the same room (as opposed to watching online). One person said they were less likely to speak their mind on Ponder if they had something critical to say about a presentation.
  • There is also general trepidation over attaching a comment to the wrong point in the timecode. One person who is also familiar with the Ponder text response interface described the problem as “I don’t know what I’m annotating. With text, I know exactly what I’m selecting. With video, I’m doing it blind.”

Solution: Chatroom Layout

Replace the 2-column design with a unified “chatroom” window that encourages more casual chatter. If the timecoding feels incidental (at this point in the presentation, someone happened to say such-and-such), then you worry less about attaching your annotation to the precisely correct moment.

Problem: Too many tags.

The sentiment tags got in the way of commenting. There were simply too many to choose from. People knew what they wanted to write, but the step of selecting a tag slowed them down. This was true of static video for those watching STEM instructional videos as well.

Solution: Reduce and Reorder

  • Slim down the tag choices, in effect trading fidelity of data (we’ll lose a lot of nuance in what we can aggregate) for lowering the bar for engagement. There should probably be something like 3, but alas I can’t figure out how to chuck out 2 of the following 5.
    • Question?!
    • Interesting
    • Are you sure about that?
    • HAHA
    • { clap ! }
  • Reorder the workflow so that you can write first and then assign tags after, not dissimilar to how people use hashtags on Twitter.

This rearrangement of steps would turn the live video workflow into the exact inverse of the text annotation experience, which is tag first, then elaborate, for very specific reasons that have more or less worked out as theorized.


The modest amount of data we gathered from this year’s thesis presentations was enough to validate the original motivation behind the experiment: collect audience reactions in a way that can yield meaningful analysis of what happened. However, there remains a lot of trial and error to be done to figure out the right social dynamics and interaction affordances to improve engagement. There are clear “next steps” to try, and that is all you can ever ask for.

The only tricky part is finding another venue for testing out the next iteration. If you have video (live or static) and warm bodies (students, friends or people you’ve bribed) and are interested in collaborating with us, get in touch! We are always on the lookout for new use cases and scenarios.

New modes of interaction for Flip videos Part 3

This past semester I’ve been experimenting with new modes of interaction for video. I’ve written about 2 previous test sessions here and here.

Annotating video is hard. Video is sound and imagery moving through time. It’s an immersive and, some might say, brain-short-circuiting medium. Watching 3 videos simultaneously may be the norm today. However, if you’re truly engaged in watching video content, in particular content that is chock full of new and complex ideas, it’s hard to do much else.

Watching video content makes our brains go bonkers.

“’Every possible visual area is just going nuts,’ she adds. What does this mean? It shows that the human brain is anything but inactive when it’s watching television. Instead, a multitude of different cortexes and lobes are lighting up and working with each other…”

“She” is Joy Hirsch, Director of fMRI Research at Columbia University, cited by the National Cable & Telecommunications Association, which interprets her results to mean watching TV is good for our brains, like Sudoku. I’m not sure about that, but it’s reasonable to conclude that consuming video content occupies quite a lot of our brain.

Of course no one is saying reading doesn’t engage the brain. However, one key difference between text and video matters most when it comes to annotation: with text, we control the pace, slowing down and speeding up constantly as we scale difficult passages or breeze through easy ones.

Video runs away from us on its own schedule whether or not we can keep up. Sure we can pause and play, fast-forward and slow down, but our ability to regulate video playback can only be clunky when compared to the dexterity with which we can control the pace of reading.

In fact the way researchers describe brain activity while watching tv sounds a lot like trying to keep up with a speeding train. All areas of the brain light up just to keep up with the action.

So what does that mean for those of us building video annotation tools?

Video annotation has all the same cognitive challenges of text annotation, but it comes with additional physiological hurdles as well.

STEM v. The Humanities

I’ve been working off the assumption that responding to STEM material is fundamentally different from The Humanities. For STEM subjects, the range of relevant responses is much more limited. It essentially amounts to different flavors of “I’m confused.” and “I’m not confused.”

I’m confused because:

  • e.g. I need to see more examples to understand this.
  • Syntax! I don’t know the meaning of this word.
  • How? I need this broken down step-by-step.
  • Why? I want to know why this is so.
  • Scale. I need a point of comparison to understand the significance of this.

I get this because:

  • Apt! Thank you. This is a great example.
  • Got it! This was a really clear explanation.

Humor is a commonly wielded weapon in the arsenal of good teaching so being able to chuckle in response to the material is relevant as well.

But as is often the case when trying to define heuristics, it’s more complicated than simply STEM versus not-STEM.

Perhaps a more helpful demarcation of territory would be to speak in terms of the manner and tone of the content (text or video) and more or less ignore subject matter altogether. In other words: The way in which I respond to material depends on how the material is talking to me.

For example, the manner and tone with which the speaker addresses the viewer varies dramatically depending on whether the video is a:

  •  “How-to” tutorial,
  • Expository Lecture
  • Editorializing Opinion
  • Edu-tainment

The tutorial giver is explaining how to get from A to Z by following the intervening steps B through Y. First you do this, then you do that.

The lecturer is a combination of explanatory and provocative. This is how you do this, but here’s some food for thought to get you thinking about why that’s so.

The editorializing opinion-giver is trying to persuade you of a particular viewpoint.

Edu-tainment is well, exactly that. Delivering interesting information in an entertaining format.

And of course, the boundaries between these categories are sometimes blurry. For example, is this Richard Feynman lecture an Expository Lecture or an Editorializing Opinion?

I would argue it falls somewhere in the middle. He’s offering a world view, not just statements of fact. You might say that the best lecturers are always operating in this gray area between fact and opinion.

The Test Session

So in our 3rd test session, unlike the previous 2, I chose 3 very different types of video content to test.

Documentary on The Stanford Prison Guard Experiment (Category: Edu-tainment)

A 10-minute segment of the Biden v. Ryan 2012 Vice Presidential Debate re: Medicare starting at ~32:00. (Category: Editorializing Opinion)

Dan Shiffman’s Introduction to Inheritance from Nature of Code (Category: Expository Lecture)

You can try annotating these videos on Ponder yourself:

  1. Dan Shiffman’s Introduction to Inheritance from Nature of Code.
  2. Biden v. Ryan Vice-Presidential Debate.
  3. The Stanford Prison Experiment documentary.

The Set-up

There were 5 test subjects, watching 3 different videos embedded in the Ponder video annotation interface in the same room, each on their own laptop with headphones. That means unlike previous test sessions, each person was able to control the video on their own.

Each video was ~10 minutes long. The prompt was to watch and annotate with the intention of summarizing the salient points of the video.

2 students watched Dan Shiffman’s Nature of Code (NOC) video. 2 students watched the documentary on the Stanford Prison Experiment. And 1 student watched the debate.

The Results

The Stanford Prison Experiment documentary had the most annotations (15 per user, versus 12 for NOC and 5 for the debate) and the most varied use of annotation tags (22 distinct tags, versus 5 for NOC and 4 for the debate).

Unsurprisingly the prison documentary provoked a lot of emotional reactions (50% of the responses were emotional – 12 different kinds compared to 0 emotional reactions to the debate).

Again unsurprisingly, the most common response to the NOC lecture was “{ chuckle },” accounting for 12 of the 25 responses. There was only 1 point of confusion, a matter of unfamiliarity with syntax: “What is extends?”

This was a pattern I noted in the previous sessions: in many STEM subjects, everything makes perfect sense during the “lecture.” The problem is that, oftentimes, as soon as you try to do it on your own, confusion sets in.

I don’t think there’s any way around this problem other than to bake “problem sets” into video lectures and allow the points of confusion to bubble up through “active trying” rather than “passive listening.”

Intro to Inheritance – NOC

Biden v. Ryan Vice-Presidential Debate

Stanford Prison Experiment

Less is More?

There are 2 annotation modes in Ponder. One displays a small set of annotation tags (9) in a Hollywood Squares arrangement. The second displays a much larger set of tags. Again, the documentary watchers were the only ones to dive into the second set of more nuanced tags.

Less v. More

However, neither student watching the documentary made use of the text elaboration field (they didn’t see it until the end) where you can write a response in addition to applying a tag whereas the Nature of Code and Biden-Ryan debate watchers did. This made me wonder how having the elaboration field as an option changes the rate and character of the responses.

Everyone reported pausing the video more than they normally would in order to annotate. Much of the pausing and starting simply had to do with the clunkiness of applying your annotation to the right moment in time on the timeline.

It’s all in the prompt.

As with any assignment, designing an effective prompt is half the battle.

When I tested without software, the prompt I used was: Raise your hand if something’s confusing. Raise your hand if something is especially clear.

This time, the prompt was: Annotate to summarize.

In retrospect, summarization is a lot harder than simply noting when you’re confused versus when you’re interested.

Summarization is a forest-for-the-trees kind of exercise. You can’t really know moment-to-moment as you watch a video what the salient points are going to be. You need to consume the whole thing, reflect on it, perhaps re-watch parts or all of it and construct a coherent narrative out of what you took in.

By contrast, noting what’s confusing and what’s interesting is decision-making you can do “in real-time” as you watch.

When I asked people for a summarization of their video, no one was prepared to give one (in spite of the exercise), and I understand why.

However, one of the subjects who watched the Stanford Prison Experiment documentary was able to pinpoint the exact sentence uttered by one of the interviewees that he felt summed up the whole thing.

Is Social Always Better?

All 3 tests I’ve conducted were done together, sitting in a classroom. At Ponder, we’ve been discussing the idea of working with schools to set up structured flip study periods. It would be interesting to study the effect of socialization on flip. Do students pay closer attention to the material in a study hall environment versus studying alone at home?

The version of Ponder video we used for the test session shows other users’ activity on the same video in real-time. As you watch and annotate, you see other people’s annotations popping up on the timeline.

For the 2 people watching the Stanford documentary, that sense of watching with someone else was fun and engaging. They both reported being spurred on to explore the annotation tags when they saw the other person using a new one. (e.g. “Appreciates perspicacity? Where’s that one?”)

By contrast, for the 2 people trying to digest Shiffman’s lecture, the real-time feedback was distracting.

I assigned an annotation exercise to another test subject to be done on her own time. The set-up was less social both in the sense that she was not sitting in a room with other people watching and annotating videos and she was also not annotating the video with anyone else via the software.

I gave the same prompt. Interestingly, from the way she described it, she approached the task much like a personal note-taking exercise. She also watched Shiffman’s Nature of Code video. For her, assigning predefined annotation tags got in the way of note-taking.

Interaction Learnings

  • The big challenge with video (and audio) is that they are a black box content-wise. As a result, the mechanism that works so well for text (simply tagging an excerpt of text with a predefined response tag) does less well on video where the artifact (an annotation tag attached to timecode) is not so compelling. So I increased emphasis on the elaboration field, keeping it open at all times to encourage people to write more.
  • On the other hand, the forest-for-the-trees view offered on the video timeline is, I think, more interesting to look at than the underline heatmap visualization for text, so I’ll be looking for ways to build on that.



  • There was unanimous desire to be able to drag the timecode tick marks after they had already submitted a response. We implemented that right away.
  • There was also universal desire to be able to attach a response to a span of time (as opposed to a single moment in time). The interaction for this is tricky, so we’ve punted this feature for now.
  • One user requested an interaction feature we had implemented but removed after light testing because we weren’t sure whether it would prove more confusing than convenient: automatically pausing the video whenever your mouse gestures indicated you were about to create an annotation, then restarting it as soon as you finished submitting. I’m still not sure what to do about this, but it supports the idea that the difficulty of pacing video consumption makes annotating and responding to video more onerous than doing the same with text.


  1. Annotating video is hard to do so any interaction affordance to make it easier helps.
  2. Dense material (e.g. Shiffman’s lecture) is more challenging to annotate. Primary sources (e.g. the debate) are also challenging to annotate. The more carefully produced and pre-digested the material (e.g. the documentary), the easier it is to annotate.
  3. With video, we should be encouraging more writing (text elaborations of response tags) to give people more of a view into the content.
  4. Real-time interaction with other users is not always desirable. Users should be given a way to turn it on/off for different situations.
  5. There may be a benefit to setting up “study halls” (virtual or physical) for consuming flip content, but this is mere intuition right now and needs to be tested further.

Last but not least, thank you to everyone at ITP who participated in these informal test sessions this semester, and to Shawn Van Every and Dan Shiffman for your interest and support.

Feynman Lectures on Ponder

Flip More of Your Class With Ponder

Typically, the flipped class involves a video of a lecture, often with a screencast illustrating the content of the lecture. There are some great tools out there to embed assessments into these videos. Yet, important aspects of the classroom experience are not replicated. Students control the pace of the lecture, but to what extent is their interaction with the material scaffolded? What are students thinking about? What would you like them to be thinking about? Are they making connections between ideas? Can they infer what’s left out of the lecture? In short, what questions are they asking as they watch? Or are flip video lectures like most video content, consigned to remaining a passive watching experience?

We believe, if you can’t already tell, that a great deal of learning happens from questioning and asking questions of learning material. So, we’ve decided that students shouldn’t have to save their questions for the next day.

With Ponder, students can ask questions as they watch video by annotating the video with micro-responses, a scaffolded approach to critical thinking adapted from micro-reading responses which were originally developed for text. Instead of answering basic assessment questions, students engage in a conversation with the content aided by a set of nuanced “sentiment” tags that guide them through the kinds of questions you’d like them to be asking as they watch.

Their responses are timestamped and shared with their classmates. They can even respond to their classmates’ responses, generating a live discussion of nuanced expressions of understanding, evaluation, and emotion. Requiring students to process the content of the video, consider the sentiment they’d like to react with, and engage their classmates’ ideas offers an essential scaffold to their learning.

We’ve been conducting research at NYU-ITP on flip-video interactions. You can read about it here and here.


In other words, with Ponder, you can flip more than your teaching. You can flip students’ learning, too.

Implementing Flip: Why higher-order literacy is not just about text

Within the field of “instructional” EdTech, Ponder is often described as a “literacy” tool, which, while accurate, encompasses a much broader spread of pedagogical challenges. We usually describe our focus as “higher-order” literacy: the ability to extract meaning and think critically about information sources.

A couple months ago we began our pilots of Ponder Video, bringing our patent-pending experience to the medium more often associated with the flipped classroom. From our experience with text over the past two and a half years, we knew this would be an iterative process, and as expected we are learning a lot from the pilots and the experimentation – see part 1 and part 2 of our interface studies.

During this process, some people have asked if Ponder Video is, in startup terminology, a “pivot”; a change of strategy and focus of our organization. The question: do we still consider Ponder a literacy tool?

After a bit of reflection the answer is a resounding YES! And the process of reflection helped us gain a deeper appreciation for what “literacy” actually means. This is not a change of strategy, it is an expansion of Ponder to match the true breadth of literacy.

Literacy > Text

The term literacy most often refers to the ability to decode words and sentences. That is, of course, the first level of literacy, but there is a shifting focus in many of the new pedagogical and assessment debates, from the Common Core to the SAT: a shift away from memorizing facts and vocabulary towards students developing a higher-order literacy. Still, higher-order literacy is a vague concept, and at Ponder we are always searching for ways of articulating our vision more clearly.

One line I like, from a now-deprecated page of ProLiteracy.org with no byline, does a really nice job of concisely capturing the significance of a broad definition of literacy: “literacy is necessary for an individual to understand information that is out of context, whether written or verbal.”

The definition is so simple, you might miss its significance. So let me repeat it:

“literacy is necessary for an individual to understand information that is out of context, whether written or verbal”

I like it because “understand information” goes beyond mere sentence decoding, and “out of context” unassumingly captures the purpose of literacy – to communicate beyond our immediate surroundings. The “or verbal” I would interpret broadly to include the many forms that information comes in today – audio, video and graphical representations.

The 21st century, at least so far and for the foreseeable future, is the interconnected century, the communication century, the manipulated statistics century, the Photoshopped century, perhaps the misinformation/disinformation century, and I would posit that if there is one “21st century skill” that we can all agree on, it is literacy, in the broad sense:

Understanding information out of context.

A text or video is inherently out of context, so a student at home is not only one step removed from the creator of the content, but also removed from the classroom. So a question immediately jumps to mind:

Are your students ready to learn out of context?

The answer to this question varies dramatically, and is not easily delineated by grade level; defining that readiness to provide an appropriate scaffold requires care, and is something we have worked to understand empirically through student activity in Ponder.

The National Center for Education Statistics, part of the US Department of Education, has put a lot of effort into defining and measuring this skill, and has twice performed a survey it calls the National Assessment of Adult Literacy (NAAL), providing a useful jumping off point for thinking about your students.

This is not like one of those surveys you read about. It is a uniquely thorough survey that consists of a background questionnaire and screening process, followed by an interview.

The NAAL is made up entirely of open-ended, short-answer responses – not multiple choice – and focuses on the participant’s ability to apply what they have read to accomplish “everyday literacy goals”. You read something, then answer a question that depends on something you have to have extracted from the reading.

As you might imagine, this is not a quick process.

Administering the NAAL takes 89 minutes per person, and in 2003 it was administered to 18,000 adults sampled to represent the US population. That’s some 26,700 person-hours, or more than three years of nonstop interviewing.

This thoroughness is important given that they are trying to measure a broad definition of literacy.

The NAAL breaks literacy into three literacy skill types:

  • Prose
  • Document
  • Quantitative

You can read the details on their site, but given that it turns out American adults have roughly comparable prose and document literacy scores, I would lump them together under a general heading of “reading.” Examples of quantitative literacy tasks are reading math word problems, balancing a checkbook or calculating interest on a loan.

They delineate four literacy levels:

  • Below basic
  • Basic
  • Intermediate
  • Proficient

Again, they go into a lot of detail mapping scores onto these names, but I think what’s most useful are the “key abilities” that distinguish each level in their definitions.

My interest in higher-order literacy immediately draws my eye to the key distinction between “Basic” and “Intermediate.” An intermediate skill level means the individual is capable of:

“reading and understanding moderately dense, less commonplace prose texts as well as summarizing, making simple inferences, determining cause and effect, and recognizing the author’s purpose”
NAAL Overview of Literacy Levels

That list of skills captures the starting point of what we think of as higher-order literacy. (If you’re curious, the highest level of literacy, modestly labeled “Proficient,” seems to mostly be distinguished by the ability to do this sort of analysis across multiple documents.)

For me, the NAAL provides a useful framework for breaking down the literacy problems that instructional techniques (and technologies) are trying to address.

Ponder supports teachers who are trying to move their students from a level of basic literacy to being able to make inferences, determine cause and effect, and recognize the author’s purpose.

…but our goal is to go an important step beyond even the NAAL’s definition of literacy.

Because what is the point really of making inferences and identifying cause and effect if ultimately you are unable to probe with your own questions and evaluate with your own conclusions?

In the end, the endgame of literacy is the ability to “think for yourself.”

Flip is a great way to practice literacy. But you need literate students to flip.

The flipped classroom model is typically used for students to dig into and prepare for class discussion, and obviously presumes a basic student literacy level. But passively consuming a video or skimming a text isn’t enough to drive discussion back in class. As we all know from our own student days, technically meeting the requirements of having “done the reading” does not comprehension make.

Flipping, more so than traditional classroom lectures, requires students to be able to dig beneath the surface of the content, question its credibility, ask clarifying questions and make their own inferences.

Such are the makings of a classic Chicken and Egg conundrum. Flipping requires students to have the skills they are still trying to learn and master through…flipping.

I don’t think anyone has claimed to have answered this question yet, and neither have we, but the first step is realizing what you don’t know, and we do claim to have done that! We will continue to share the learnings from our video research as we iterate on Ponder Video, and welcome more ideas and discussion from teachers everywhere.

Population by Prose Literacy Level (Courtesy NAAL)

Curious about the numbers? The NAAL has been run twice – once in 1993, and a second time in 2003, and there wasn’t a big change in the scores in those ten years, except a slight increase in quantitative literacy. However, we have a pretty serious higher-order literacy problem. Between 34% and 43% of adult Americans lack the higher order literacy skills to be classified as “intermediate” or above by the NAAL.

Logging Tutorial

New modes of interaction for Flip videos Pt. 2

This semester in addition to teaching, I am a SIR (Something-in-Residence) at ITP, NYU-Tisch’s art/design and technology program.

My mission for the next 3 months is to experiment with new modes of interaction for video: both for flipped-classroom video and for live events. (Two very different fish!)

User Study No. 2

I recently wrote about my first User Study with students from Dan Shiffman’s Nature of Code class. A couple of weeks ago, 3 students from Shawn Van Every’s class, Always On, Always Connected, volunteered to watch 4 videos. In this class, students design and build new applications which leverage the built-in sensor and media capabilities of mobile devices.

The Setup

Again, there were no computers involved. We screened the videos movie-style in a classroom. Instead of asking for 2 modes of interaction (Yes that was super clear! versus Help!), I asked for a single mode of feedback: “Raise your hand when you’ve come across something you want to return to.”

The 4 videos introduce students to developing for the Android OS. It’s important to note that simply by virtue of working in the Android development environment (as opposed to Processing) the material covered in these videos is much more difficult than what was covered in the Nature of Code videos the previous week. However, the students’ level of programming experience is about the same.

What happened…

Video 1: Introduction to logging

Zero hands. But from the looks on people’s faces, it was clear it was not because everyone was ready to move on. A few things to note:

  1. For screencasts, watching a video movie-style is extra not-helpful as it is difficult to read the text.
  2. The video begins promptly after a 20s intro. With instructional videos, the emphasis is always on brevity (the shorter the better!) In this case however, I wonder if 20s isn’t enough to allow you to settle down and settle into the business of trying to wrap your head around new and alien concepts. I’m sure #1 made it harder to focus as well.
  3. Reading code you didn’t write is always challenging and I think even more so on video where we’re much more tuned into action (what’s changing) and much less likely to notice or parse what’s static.
  4. Unlike before, when I asked after-the-fact, “Where in the video do you want to go back to?” the students were unable to respond. Instead the unanimous response was, “Let’s watch the entire video again.” This is where collecting passive data about the number of times any given second of a video is watched by the same person would be helpful.
  5. In general, questions had to do with backstory. The individual steps of how to log messages to the console were clear. What was missed was what and why. First, what am I looking at? And second, why would I ever want to log messages to the console? I say “missed” and not “missing” because the answers to those questions were in the videos. But for whatever reason, they were not fully absorbed.
  6. Last but not least, I have to imagine that this watching as a group and raising your hands business feels forced if not downright embarrassing.

Hopefully, a software interface for doing the same thing will elicit more free-flowing responses from students as it will provide them with a way to ask questions “in private.”
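For reference, the mechanics the students did absorb – logging messages to the console – can be sketched in plain Java with java.util.logging. (The videos use Android’s Log class, which takes a tag and a message in much the same way; the class and method names below are my own illustration, not from the tutorial.)

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogDemo {
    // In Android this would be Log.d("MainActivity", msg);
    // java.util.logging is the closest plain-Java analogue,
    // with the logger name standing in for the tag.
    private static final Logger LOG = Logger.getLogger("MainActivity");

    // Build and log a message describing some program state,
    // e.g. how many times a button has been clicked. This is the
    // "why" of logging: inspecting state without a debugger.
    static String logClicks(int clickCount) {
        String msg = "button clicked " + clickCount + " time(s)";
        LOG.log(Level.INFO, msg);
        return msg;
    }

    public static void main(String[] args) {
        logClicks(1);
        logClicks(2);
    }
}
```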

Videos 2-4

Everyone was more settled in after watching the logging video. Each subsequent video built on the momentum accumulated in the previous one. Starting with Video 2, I got some hand-raising, and when we returned to those points in the video, people were very specific about what was confusing them.

“Why did you do that there?” or its converse, “Why didn’t you do that there?” was a common type of question, as was “What is the significance of that syntax?”

Another way to look at it is: There were never any issues with the “How.” The steps were clearly communicated and video is a great medium for explaining how. The questions almost always had to do with “Why?”, which makes me wonder if this is the particular challenge of video as a medium for instruction.

Does learning “Why?” require a conversation initiated by you asking your particular formulation of “Why?”

Video 2: Toasts

Video 3: Lifecycle of an Android App Part 1

  • @2.00: What is the significance of @-id? Why aren’t you using the strings file for this?
  • @6:50: Why did you change arg0 to clickedView?

Other syntax questions included:

  • What’s the difference between protected and public methods?
  • What do extends and implements mean?
  • What’s super()?
  • What’s @Override?

All of these syntax questions pointed to a much larger / deeper rabbit hole having to do with object-oriented programming and encapsulation, a quick way to derail any getting started tutorial.
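For readers with the same questions, here is a minimal plain-Java sketch showing all of those keywords in one place. (The Widget/Button/Clickable names are mine, not from the videos.)

```java
// An interface declares methods a class promises to provide.
interface Clickable {
    String onClick();
}

class Widget {
    protected String name; // protected: visible to subclasses, not to all callers

    public Widget(String name) { // public: callable from anywhere
        this.name = name;
    }

    public String describe() {
        return "Widget " + name;
    }
}

// extends: inherit fields and methods from a class.
// implements: fulfill an interface's promises.
public class Button extends Widget implements Clickable {
    public Button(String name) {
        super(name); // super(): invoke the parent class's constructor
    }

    @Override // compiler-checked marker that we replace a parent method
    public String describe() {
        return "Button " + name; // name is visible here because it is protected
    }

    @Override
    public String onClick() {
        return describe() + " clicked";
    }

    public static void main(String[] args) {
        System.out.println(new Button("ok").onClick()); // Button ok clicked
    }
}
```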

In general though, I don’t think these syntax questions prevented anyone from understanding the point of the videos, which were about creating pop-up messages (Video 2) and the lifecycle of Android apps (Videos 3 and 4): when do they pause versus stop, re-initialize versus preserve state.

Video 4: Lifecycle of an Android App Part 2

In Video 4, there were a lot of nodding heads during a section that shows the difference between pausing/resuming and killing/creating an Android app. The demonstration starts around @2:20 and goes for a full minute until @3:23. Epic in online video terms. It’s followed by a walk-through of what’s happening in the code and then reinforced again through a perusal of Android documentation @3:50 where an otherwise impenetrable event-flow diagram is now much more understandable.

It’s also important to note that both the demo which starts at @2:20 and the documentation overview @3:50 are preceded by “downtime” of trying, failing and debugging starting the Android emulator and navigating in the browser to find the flow diagram.

In general there’s a lot of showing in these videos. Each concept that’s explained is also demonstrated. This segment however was particularly lengthy and in accordance with the law of “It-Takes-A-While-To-Understand-Even-What-We’re-Talking-About” (#2 above), my completely unverified interpretation of events is that the length of the demonstration (far from being boring) helped everyone sink their teeth into the material.

What’s Next and Learnings for Design

As we head into testing actual software interfaces, this 2nd session gave me more concrete takeaways for workflow.

  1. You lay down “bookmarks” on the timeline as you watch to mark points you would like to return to.
  2. You lay down “bookmarks” on the timeline after you’ve watched the video to signal to the instructor what you would like to review in class the next time you meet.
  3. You can expand upon “bookmarks” with an actual question.
  4. You select from a set of question tags that hopefully help you formulate your question. (More to come on this.)

While it’s important to break down the videos into digestible morsels, it’s also important to make it easy to watch a few in a row as the first one is always going to be the most painful one to settle into. There are ways to encourage serial watching with user interface design. (e.g. playlist module, next video button, preview and auto-play the next video.)  But perhaps something can be done on the content side as well by ending each video on a so-called “cliffhanger.”


Magnitude! Direction!

New modes of interaction for Flip videos Pt. 1

This semester in addition to teaching, I am a SIR (Something-in-Residence, no joke) at ITP, NYU-Tisch’s art/design and technology program.

My mission for the next 3 months is to experiment with new modes of interaction for video: both for the flipped classroom and live events. (Two very different fish!)

Magnitude! Direction!

2 weeks ago, I conducted the first in a series of informal user studies with 3 students from Dan Shiffman’s Nature of Code class, which introduces techniques for modeling natural systems in the Java-based programming environment Processing.

User Study No. 1

Last year, Dan flipped his class, creating a series of ~10 minute videos ranging from how to use random() to move objects around a screen to modeling the movement of ant colonies.

The Setup

We viewed the first 2 videos from Chapter 1 which introduce the concept of a vector and the PVector class in Processing.

There were no computers involved. We watched them together on a big screen, just as you would a movie, except I sat facing the students.

The only “interaction” I asked of them was to:

  1. Raise their right hand when Dan said something extra clear.
  2. Raise both hands to indicate “Help!”

What happened…

On the whole, the “interaction” was a non-event. There were a few tentative raisings of the right hand and zero raisings of both hands.

However, after each video, when I asked at what points in the video things were not perfectly clear, there was no hesitation in the replies.

Points of confusion fell mostly into 1 of 2 categories:

  1. I simply need more background information.
  2. Maybe if I rewatch that part it will help.

Although I will say that in my subjective opinion, number 2 was said in a tentative and theoretical fashion, which is interesting because “the ability to re-watch” is a much-vaunted benefit of flip.

It must be said though, that we’re early in the semester and the concepts being introduced in these videos are simple relative to what’s to come.

Learnings for design…

Some early thoughts I had coming out of this first session were:

  • Unlike reading, video is too overwhelming for you to reflect on how you’re reacting while you’re trying to take it in. (Just compare reading a novel to watching a movie.)
  • That being said, so long as the video is short enough, people have a pretty good idea of where they lost their way, even after the fact.
  • Still, there needs to be a pay-off for bothering to register where those points are on the video. For example, “If I take the time to mark the points in the video where I needed more information, the instructor will review them in class.”
  • It’s still unclear to me how a “social” element might change both expectations and behavior. If you could see other people registering their points of confusion and asking questions and you could simply pile on and say “I have this question too” (many discussion forums have this feature), the whole dynamic could change.
  • This may explain the popularity of inserting slides with multiple-choice questions every few minutes in MOOC videos. The slides serve as an explicit break, giving the viewer space to reflect on what they’ve just watched.

I wonder though if they need to be questions. Genuinely thought-provoking questions in multiple choice form are hard to come by.

In a flip class where there is such a thing as in-class face time, a simple checklist of concepts might suffice.

Here were the concepts that were covered. Cross off the ones you feel good about. Star the ones you’d like to spend more time on. I think it’s key that you are asked to do both actions, meaning there shouldn’t be a default that allows you to proceed without explicitly crossing off or starring a concept.

Concepts for Video 1.1:

  • A vector is an arrow ★ @2:00 : Why is it an arrow and not a line?
  • A vector has magnitude (length).
  • A vector has direction (theta) ★ @3:30: What’s theta?
  • A vector is a way to store a pair of numbers: x and y.
  • A vector is the hypotenuse of a right triangle (Pythagorean Theorem).
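The concepts above fit in a few lines of code. A minimal sketch in plain Java, with the Vec class standing in for Processing’s PVector:

```java
// A stand-in for Processing's PVector, illustrating the Video 1.1 concepts.
public class Vec {
    double x, y; // a vector is a way to store a pair of numbers

    Vec(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Magnitude: the length of the arrow, i.e. the hypotenuse of a
    // right triangle with legs x and y (Pythagorean theorem).
    double mag() {
        return Math.sqrt(x * x + y * y);
    }

    // Direction: the angle theta the arrow makes with the x-axis.
    double heading() {
        return Math.atan2(y, x);
    }

    public static void main(String[] args) {
        Vec v = new Vec(3, 4);
        System.out.println(v.mag());     // 5.0
        System.out.println(v.heading()); // theta in radians
    }
}
```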

Concepts for Video 1.2:

  • PVector is Processing’s vector class.
  • How to construct a new PVector.
  • Replacing floats x,y with a PVector for location.
  • Replacing float xspeed, yspeed with a PVector for velocity.
  • Using the add() method to add 2 PVectors.
  • Adding the velocity vector to the location vector.
  • Using the x and y attributes of a PVector to check edges.

The tricky thing with Video 1.2 is that the only point of confusion was @10:27, when Dan says, almost as an aside, that you add the velocity vector to the location vector, which you can think of as a vector from the origin. I don’t think this kind of problem area would surface in a list of concepts to cross out or star.
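That aside is easier to see in code. Here is a sketch of the Video 1.2 motion loop in plain Java rather than Processing (the Mover class and the 100×100 canvas size are my own assumptions):

```java
// A plain-Java sketch of the Video 1.2 motion loop: location and
// velocity are both vectors, and each frame we add velocity to
// location, then check the edges.
public class Mover {
    double x, y;   // location: a vector from the origin
    double vx, vy; // velocity
    final double width = 100, height = 100; // assumed canvas size

    Mover(double x, double y, double vx, double vy) {
        this.x = x; this.y = y; this.vx = vx; this.vy = vy;
    }

    void update() {
        // location.add(velocity): move one velocity step per frame
        x += vx;
        y += vy;
        // check edges: wrap around when we leave the canvas
        if (x > width)  x = 0;
        if (x < 0)      x = width;
        if (y > height) y = 0;
        if (y < 0)      y = height;
    }

    public static void main(String[] args) {
        Mover m = new Mover(98, 50, 3, 0);
        m.update(); // 98 + 3 = 101 is past the right edge, so x wraps to 0
        System.out.println(m.x + ", " + m.y);
    }
}
```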

Next Session

In session 2, I will be conducting another user study with students from Shawn Van Every’s class “Always On, Always Connected” (an exploration of the technologies and designs that keep us online 24/7).

My plan is to try asking the students to raise their hand simply to register: I want somebody to review what happened at this point in the video.

We’ll see how it goes!