National Council of Teachers of English (NCTE) Journal Features Ponder

We are honored to be featured in Dr. Kristen Hawley Turner and Dr. Troy Hicks’ piece in the NCTE’s November 2015 English Journal (Vol. 105, No. 2) entitled “Connected Reading Is the Heart of Research.”

The piece introduces a framework for teaching adolescents to read, but it also advocates a more thoughtful approach to the application of technology in the English classroom through an introspective exploration of what it means to be digitally literate.


“We must advocate for digital literacy, not just technology, in a way that reconceptualizes our discipline.”

A sample Ponder lesson walk-through in the piece explains how the collaborative annotation experience supports shared inquiry during research for an essay, and how Ponder’s interface speeds the teacher’s review of student activity, allowing them to better lead class discussion.

Both former K-12 English teachers, Dr. Turner is a professor of Curriculum and Teaching at Fordham University where she directs the Fordham Digital Literacies Collaborative, and Dr. Hicks is a professor of English Language & Literature at Central Michigan University where he directs CMU’s Chippewa River Writing Project.

Also check out their 2015 book Connected Reading: Teaching Adolescent Readers in a Digital World.

Hassle Factor: University Web Sites

Applying Behavioral Economics to College Success

Last week we attended a workshop run by the consulting firm ideas42 on behalf of the Robin Hood Foundation. The goal of the day-long event was to provide teams with an overview of behavioral economics, the study of the psychological, social, cognitive and emotional factors that go into decision-making.

Students lose their way in school for a broad range of reasons, many of which are non-academic, and the hope behind Robin Hood’s prize is that technological solutions can help them scale already proven strategies for improving matriculation rates.

The day began with an overview of the day to come by ideas42’s Josh Wright:

  • Psychological “Scarcity” – how even unrelated stress and anxiety levy a cognitive tax.
  • Hassle Factors – how seemingly insignificant logistical challenges can discourage and therefore effectively prevent task completion.
  • Limited Attention – Information overload.
  • Self-control – how to shore up self-control through social bonds, incentives and tricks you play on your own psyche.
  • Prospective Memory – not only remembering to do something, but following through to actually do it.
  • Social Norms – the human tendency to choose “normal” over “right”.

On the whole the sessions were lively, peppered with informal experiments, anecdotes and studies that illustrated key points through examples rather than jargon and formal definitions. Every session provoked incisive questions from the audience. For our part, we walked away with much more specific ideas for the design and implementation of our solutions as well as a host of questions for the folks at Robin Hood and CUNY, mostly around how the program will be introduced to the students.


NYT: The Mental Strain of Making Do With Less

“The Mental Strain of Making Do With Less,” Mullainathan, NYT

Sendhil Mullainathan’s well-argued presentation on the Psychology of Scarcity made abundantly clear how poverty in one area of life (financial) creates poverty in another (academic performance). Study after study showed how even subtle reminders of financial stresses degraded cognitive performance. (You can get a synopsis from his New York Times piece on the same topic.)

  • The obvious next question to ask then was: Can you prime in the opposite direction? Can you help people forget their stresses and perform better than they would otherwise? The answer? Nothing conclusive so far, although an interesting study found that activating Asian female students’ positive (ethnic) vs negative (female) stereotypes affected their quantitative performance which suggests it may be possible here too.
  • Given that the nature of our relationship with the students will be long-term, another question we had was: Does the effect of priming wear off over time with repeated exposure, positive or negative?
  • Another issue this brought up for us is whether the mere fact that students are participating in this program reminds them of their “remediation” status, thereby undermining our efforts to bolster their performance. As we understand to date, only remedial students will be using our technologies. Are we missing an opportunity to build technologies that help remedial students feel a part of the CUNY community as a whole?

Filling out Gigantic Forms

William Congdon talked about hassle factors. We could all relate to the hassles of coordinating calendars that span different aspects of life (work, school, childcare, family, commuting). In particular, ideas42 is working on improving the onerous process of applying for financial aid. Two approaches that came up repeatedly were 1) defaults to reduce the cognitive load of making decisions; and 2) pre-filled forms to remove the hassle factor of having to “look up” information the university already has. So we’re wondering:

  • Will participation be mandatory or will students be asked to decide? If the latter, will their participation be assumed with the option to opt out or will they be asked to opt-in?
  • Will students be pre-registered, or will they need to go sign up? Can we piggy-back on their CUNY accounts?
  • Will we have API access to student schedules and class syllabi, or will we need to ask the student to provide that information to us separately?

Information Overload

William also covered Limited Attention, a familiar topic in modern day life. One interesting tidbit from this session: It’s generally assumed that students’ preferred mode of communication is SMS. However, like Twitter, email before it and perhaps Yo! to come, what happens when every system and organization shifts to using text messages? More specifically, how will our communications with students interact with / collide with CUNY’s existing student support program START?

Hey, remind me to…

Matt Darling presented on Prospective Memory, the art of following through on future commitments. Memory, or the lack of it, is clearly the first problem to overcome. But assuming you are able to implement some kind of reminder system, how do you actually make those reminders count? Hassle factors and self-control (see below) come into play for sure. But Matt pointed to the power of “being specific” as one simple technique that doesn’t rely on the student to be more disciplined.

It made us reconsider how we’re thinking about designing our reminders. When you send them, how often you send them and the language you use in them of course remain important factors. But the real design problem lies in what exactly you are reminding the student to do and how you want them to respond to the reminder.

Specifically for us, we’re working on ways to make tasks more concrete and bite-sized (aka, doable), tasks students can easily imagine completing successfully in a limited amount of time.

Creating Community

Allie Rosenbloom reminded us of the now famous marshmallow test. Self-control or willpower is a tricky issue in light of Sendhil’s ideas about scarcity. In an environment of scarcity, there simply isn’t a lot of self-control to go around. Social supports and personal incentives (e.g. betting against yourself) were two approaches discussed. The challenge we see ahead is how to create social supports through our technology when the students who will be using our service may or may not be in the same classes or even on the same campus due to the structure of the Randomized Controlled Trial (RCT). We are encouraged though that the student population will be big enough that we can build community around shared interests and career aspirations, if not coursework. Allie’s talk also supported the idea that “getting specific” with tasks would be a boost to performance because, as we all know, focusing on “exercising today” is a lot easier than thinking about the 20 days of exercise you committed to for the month.

What is everyone else doing?

Finally, social norms come into play – the emphasis of the studies noted here had to do with public service announcements that are intended to discourage problematic behaviors but end up reinforcing them. Examples included posters designed to discourage binge drinking that make the reader who doesn’t drink feel abnormal, since everyone else must clearly be drinking, and messages that provoke petrified tree theft rather than discourage it.

  • Most relevant here is that commuting community college students (who make up the majority) often feel isolated, and don’t have a good sense of how other students are handling the challenge of college. Social norms seem most relevant to us in terms of Ponder providing an atmosphere where students feel connected with their classmates, are aware that their classmates are working hard, and engage with one another through their college and career interests. We wondered if we could coordinate with existing CUNY support services to compensate for the somewhat disparate nature of the randomized participating students, and provide an in-person, face-to-face experience.

We’re thinking about how we can use the data we have about student progress to reshape students’ sense of “what’s normal” when it comes to school. Our goal would be to not only show students how others like them are succeeding in school, but to also paint a realistic picture of how much time and effort it takes to succeed at school. At the very least we can prevent students from feeling discouraged because it takes them ‘too long’ to study; or because they feel uniquely selfish in spending so much time on school in light of their other obligations.

At the beginning of the day David Crook from CUNY voiced his enthusiasm for the teams and the prize; having had an opportunity to digest all of the above, we can’t help thinking it would be great to have a second webinar to drill into the data with this new perspective.

All in all I think I’ve demonstrated here that the workshop provided much food for thought and advanced our thinking greatly towards our prototype for January. We also finally got an opportunity to meet and learn from the other teams in the Challenge. Thank you Robin Hood and ideas42 for organizing!


The Guardian: Readers absorb less on Kindles than on paper, study finds

E-reading is bad for reading. Now what?

Researchers continue to show that people retain less, comprehend less and do both less well, when reading digital texts compared to reading paper texts. Moreover, what little we retain from e-text we take less seriously. There is concern that e-reading is an additional threat to the already embattled humanities and, worse, that reading ON-line deteriorates your ability to read OFF-line.

(Hello! Eyes here!)

Indeed, our intuition as readers can guide us in this question:

  • Devices inherently present more distractions.
  • There are tactile, physical components to reading offline that are clearly missing online.

And those are just the cognitive deficits. There is a whole host of IT problems that make the cognitive ones seem trivial.

If you measure online-texts against what paper-texts are good at (freedom from distraction, physical cues to provide context and focus your attention), it is no revelation that online texts will lose every time.

So, if you are thinking of trading a paperback or a Xerox in your classroom for a screen, don’t do it?

Unless you have a really good reason. Like, if my students can look up words while they read they’re much more likely to keep at reading hard texts. If my students have a smart way to track their reading across lots of different documents, they have a much easier time seeing the connection between texts and as a result write better papers.

These are things computers are good at. So if we start with what computers are good at and we measure paper texts against online texts, there should also be no surprise that online texts will win (provided the software delivers on its promise).

Research is starting to support some of these claims. A study published recently by researchers at National Chengchi University demonstrated improved comprehension of a text with the help of a scaffolded annotation tool. Research at San Francisco State showed similarly exciting educational outcome improvements.

So here’s a different rule of thumb to consider, one informed by the research I cite above and reinforced by countless conversations with teachers:

If you’re considering moving to e-texts, don’t, unless it does nothing short of transforming your classroom in ways that paper can’t and has something to do with learning, not functionality.

E.g. This tool will help me push my students to re-read passages they didn’t fully understand, which in turn will get them to be more proactive about asking questions in class. As compared to: This tool makes it possible for my students to see each other’s comments as they read. (The latter is a description of software functionality, and a rather high-level one at that, which may or may not be implemented in a way that has pedagogical value.)

Sometimes an instructor knowing specifically why and how they’re going to use a certain tool makes all the difference in efficacy, so two teachers using the same software can experience drastically different results.

Other times, getting specific with what you intend to use software for is precisely the “missing information” you need to separate the wheat from the chaff when evaluating tools.

In other aspects of our lives, this would be considered stating the obvious. After all, knowing what I care about in a product is how I evaluate the relevance of other people’s reviews of said product, which is why online review forums typically ask if you found the review helpful, as opposed to if you found it informative.

However, for whatever reason, this is still something we’re learning to do with edtech.

In either case, the logistical wins of going from paper to digital alone are not big enough to offset the logistical problems of managing digital, or the cognitive hit we all take when reading from a screen.


New modes of interaction for Flip: Annotating streaming video!

At the end of every Spring Semester, the extended ITP community gathers round for a solid week (Monday-Friday, 9:40AM-6PM) of thesis presentations. It’s part gauntlet and part send-off for the graduating class.

This year, with the help of Shawn Van Every (the original architect, builder and maintenance man of the ITP Thesis Streaming Service), I had the opportunity to continue my run of informal experiments in video annotation by trying it out on a handful of thesis presentations.

For the third year running, Thesis Week has been accompanied by the Shep chatroom, created by ITP alum Steve Klise. Shep immediately turned itself into a great place for backchannel commentary on the presentations…or not. I’ve always felt it would be great to see aggregations of Shep activity attached to the timecode of the presentation videos. Shep conversations unfortunately aren’t logged. I also wondered if turning Shep into some kind of “official record” of audience reactions to thesis would be something of a killjoy.

With the Ponder-Thesis experiment, I wasn’t exactly sure where “annotating thesis presentations” would fall on the work-fun spectrum.

It might be seen as some sort of “semi-official record.” That would make it “work,” and perhaps a bit intimidating, like volunteering to closed-caption T.V. programs.

But annotating with Ponder requires more thinking than closed-captioning, which presumably will soon be replaced by voice recognition software if it hasn’t been already. So maybe it can be fun and engaging in the same sort of “challenging crossword puzzle” way that Ponder for text can be.

Either way, the end goal was clear. I was interested in some sort of readout of audience reactions: Cognitively (Did people understand the presentation?); Analytically (Was it credible, persuasive?); and Emotionally (Was it funny, scary, enraging, inspiring?).


We were able to get Ponder-Thesis up and running by Day 3 (Wednesday) of Thesis Week.

I sent out a simple announcement via email inviting people to come help test this new interface for annotating thesis presentations and to create a group annotated record of what happened.

Unlike previous test sessions, there was no real opportunity to ask questions.

Results: One Size Does Not Fit All

Annotating live events is a completely different kettle of fish than annotating static video.

I had made several changes to the static video interface in preparation for thesis, but in retrospect, they weren’t drastic enough.

All of the things I’ve mentioned before that make video annotation harder work and generally less enjoyable than text annotation are repeated 10-fold when you add the live element because now slowing down to stop and reflect isn’t just onerous, it’s not an option.

As a result, the aspect of Ponder that makes it feel like a “fun puzzle” (figuring out which sentiment tag to apply) can’t be done because there simply isn’t time.

It was challenging even for me (who is extremely familiar with the sentiment tags) to figure out at what point to attach my annotation, which tag to apply *AND* write something coherent, all quickly enough so that I’d be ready in time for the next pearl of wisdom or outrageous claim in need of a response.

There were also hints of wanting to replicate the casual feel of the Shep chatroom. People wanted to say “Hi, I’m here” when they arrived.

Going forward, I would tear down the 2-column “Mine v. Theirs” design in favor of a single-column chat-room style conversation space, but I will go into more detail on next steps after reviewing the data that came out of thesis.

Donna Miller Watts presenting: Fictioning, or the Confession of the Librarian

The Data

  • 36 presentations were annotated. However, 50% of the responses were made on just 6 of them.
  • 46 unique users made annotations. (Or at the very least 46 unique browser cookies made annotations.)
  • 266 annotations in total, 71 of which were { clap! }.
  • 30 unique sentiment tags were applied.
    • ???, Syntax?, Who knew?
    • How?, Why?, e.g?, Or…, Truly?
    • Yup, Nope
    • Interesting, Good point, Fair point, Too simple
    • { ! }, Ooh, Awesome, Nice, Right?
    • Spot on!, Well said, Brave!
    • { shudder }, { sigh }, Woe, Uh-oh, Doh!
    • HA, { chuckle }, { clap! }
  • At peak, there were ~19-20 people in Ponder-Thesis simultaneously.

Broken down by type, there were 39 Cognitive annotations (having questions or wanting more information), 69 Analytical annotations and 158 Emotional annotations, although almost half (71) of the Emotional ones were the { clap! }.

Over half of the non-clap! responses had written elaborations as well (113).
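
As a sanity check on the breakdown above, here is a minimal sketch that re-derives the totals and shares from the reported counts. The variable names are mine and purely illustrative; only the numbers come from the data listed above.

```python
# Counts taken from the tallies reported above.
counts = {"Cognitive": 39, "Analytical": 69, "Emotional": 158}
claps = 71        # { clap! } annotations, all counted as Emotional
elaborated = 113  # non-clap responses that included written text

total = sum(counts.values())
non_clap = total - claps

# Share of each response type out of all annotations.
for kind, n in counts.items():
    print(f"{kind}: {n} ({n / total:.0%})")

clap_share_of_emotional = claps / counts["Emotional"]
elaborated_share = elaborated / non_clap

print(f"Total annotations: {total}")
print(f"Claps as a share of Emotional: {clap_share_of_emotional:.0%}")
print(f"Non-clap responses with written text: {elaborated_share:.0%}")
```

The three types sum to the 266 annotations reported, claps come to a little under half of the Emotional responses, and a bit more than half of the 195 non-clap responses carried written text.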

  • Natasha Dzurny had the most applause at 10.
  • Sergio Majluf had the most responses at 26.
  • Kang-ting had the most emotional responses at 18.
  • Talya Stein Rochlin had the most emotional responses if you exclude applause at 14.
  • Sergio Majluf racked up the most eloquence points with 3 “Well saids!”
  • Talya Stein Rochlin had the most written comments with 15 and the most laughs at 3.

Below are roll-ups of responses across the 36 presenters categorized by type.

  • Cognitive: Yellow
  • Analytical: Green
  • Emotional: Pink

Below is a forest-for-the-trees view of all responses. Same color-coding applies.

Forest-for-the-Trees view of responses.

Interaction Issues

I made a few design changes for thesis and fixed a number of interaction issues within the first few hours of testing:

  • Reduced the overall number of response tags and made them more straightforward. e.g. Huh. which has always been the short-form of “Food for thought…” became Interesting!
  • Replaced the 3rd-person version of the tags (xxx is pondering) with the 1st-person version: Interesting! because after the last test session, I felt a long list of the 3rd-person responses felt a bit wooden.
  • Added a { clap! } tag for applauding.
  • Made the “nametag” field more discoverable by linking it to clicking on the roster (list of people watching). Probably giving yourself a username should be an overlay that covers the entire page so people don’t have an opportunity to miss it.
  • As responses came in, they filled up the “Theirs” column below the video. Once there were more comments than would fit in your window viewport, you wouldn’t see new comments unless you explicitly scrolled. We explicitly decided not to auto-scroll the list of responses for static video to keep in time with the video because we thought it would be too distracting. For streaming, however, auto-scroll would have been one less thing for you to do while you’re trying to keep pace with the video and thinking about how to comment.

Other issues didn’t become apparent until after it was all over…

  • People didn’t see how to “change camera view.” The default view was a pretty tight shot of the presenter. Really the view you want is a wider shot that includes the presentation materials being projected on the wall behind the speaker.
  • The last test session helped me realize that for video the elaboration text field should stay open between submits. But really, it should probably start off open as it’s too small to be something you “discover” on your own while the video is going.
  • The star button which is meant to allow you to mark points of interest in the video without having to select a tag was never used. I’m not sure how useful it is without the ability to write something.


The obvious first step is to go in and squish the remaining interaction issues enumerated above. But there are more systemic problems that need to be addressed.


  • People wanted to say “hey” when they logged on. The “live” nature of the experience means social lubrication is more important than annotating text or video on your own. ITP Thesis is also a special case because the people annotating not only know the presenters personally but are likely sitting in the same room (as opposed to watching online.) One person said they were less likely to speak their mind on Ponder if they had something critical to say about a presentation.
  • There is also general trepidation over attaching a comment to the wrong point in the timecode. One person who is also familiar with the Ponder text response interface described the problem as “I don’t know what I’m annotating. With text, I know exactly what I’m selecting. With video, I’m doing it blind.”

Solution: Chatroom Layout

Replace the 2-column design with a unified “chatroom” window that encourages more casual chatter. If the timecoding feels incidental (at this point in the presentation, someone happened to say such-and-such) then you worry less about attaching your annotation to the precisely correct moment.

Problem: Too many tags.

The sentiment tags got in the way of commenting. There were simply too many to choose from. People knew what they wanted to write, but the step of selecting a tag slowed them down. This was true of static video for those watching STEM instructional videos as well.

Solution: Reduce and Reorder

  • Slim down the tag choices, in effect trading fidelity of data (we’ll lose a lot of nuance in what we can aggregate) for lowering the bar for engagement. There should probably be something like 3, but alas I can’t figure out how to chuck out 2 of the following 5.
    • Question?!
    • Interesting
    • Are you sure about that?
    • HAHA
    • { clap ! }
  • Reorder the workflow so that you can write first and then assign tags after, not dissimilar to how people use hashtags in Twitter.

This rearrangement of steps would turn the live video workflow into the exact inversion of the text annotation experience, which is tag first then elaborate for very specific reasons that have more or less worked out as theorized.
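
To make the “write first, tag after” idea concrete, here is a rough sketch of how a chat-style message with hashtag-like tags might be turned into a timecoded annotation. None of this is Ponder’s actual code: the `Annotation` class, the `parse_message` function and the short tag names in `KNOWN_TAGS` are all hypothetical stand-ins for the five candidate tags listed above.

```python
import re
from dataclasses import dataclass, field

# Hypothetical short names for the five candidate tags listed above.
KNOWN_TAGS = {"question", "interesting", "sure", "haha", "clap"}

TAG_PATTERN = re.compile(r"#(\w+)")


@dataclass
class Annotation:
    timecode: float  # seconds into the stream when the message was sent
    text: str        # free-form comment with recognized tags removed
    tags: list = field(default_factory=list)


def parse_message(raw: str, timecode: float) -> Annotation:
    """Split a chat-style message into comment text and recognized tags."""
    tags = []

    def strip_known(match):
        name = match.group(1).lower()
        if name in KNOWN_TAGS:
            tags.append(name)
            return ""            # drop recognized tags from the text
        return match.group(0)    # leave unknown hashtags untouched

    text = TAG_PATTERN.sub(strip_known, raw)
    text = " ".join(text.split())  # collapse leftover whitespace
    return Annotation(timecode=timecode, text=text, tags=tags)


note = parse_message("Wait, how does the sensor know? #question", 754.0)
print(note)
```

In this sketch a message with no tags at all (someone saying “hi, I’m here”) is still a valid annotation. It simply has an empty tag list, which fits the more casual chatroom feel described above.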


The modest amount of data we gathered from this year’s thesis presentations was enough to validate the original motivation behind the experiment: Collect audience reactions in a way that can yield meaningful analysis of what happened. However, there remains a lot of trial and error to be done to figure out the right social dynamics and interaction affordances to improve engagement. There are clear “next steps” to try and that is all you can ever ask for.

The only tricky part is finding another venue for testing out the next iteration. If you have video (live or static) and warm bodies (students, friends or people you’ve bribed) and are interested in collaborating with us, get in touch! We are always on the lookout for new use cases and scenarios.

SFSU Study Outcomes Detail

Controlled study shows Ponder engages at a rate 12x over the control discussion forum.

We are fortunate that 7 of our teachers and professors agreed to sit down with us to talk about their experiences teaching and teaching with Ponder.

We have a lot of anecdotal evidence that Ponder has a noticeable impact on student engagement and class discussion. But how big is the impact, and compared to what? Many of our teachers had tried using blogs or discussion forums. But many more had never tried any technology to support student reading. In short, we wanted to turn subjective, observed correlations into objective, reproducible evidence of causation.

A Controlled Study

To measure Ponder’s effect more rigorously, we need a controlled environment where, to the extent possible, the only difference between two groups of students is the specific online discussion-focused tool they are using for class. Only then can we draw any credible conclusions about Ponder’s effectiveness.

xkcd: Correlation

Thanks to xkcd

Last summer, two researchers at San Francisco State University designed a long-term controlled study of Ponder’s impact on the classroom learning environment. Geoff Desa, Ph.D. and Meg Gorzycki, Ed.D. completed a review with the SFSU Institutional Review Board to ensure that the study met their guidelines. For those who are curious, or those who would like to replicate the process, take a look at the IRB Protocol Review Documents.

Everything but the name…

The study involved 4 classes of ~30 students each taking the identical class with the same professor during an intensive, 5-week summer semester. Two of the classes were instructed to use SFSU’s iLearn discussion forums to post and discuss articles relating to the class – the “control” for the experiment – and two of the classes were instructed to use Ponder. In both cases, identical scripts were used by the professor to introduce the tools, with only the name of the tool changed. The “quantity and quality” of their class contributions in these tools were to be incorporated into their class participation grades.

The professor used one script to introduce the tools to all 4 classes. The only difference was the name of the tool.

For those of you who are familiar with both discussion forums and Ponder, you might be wondering how the same script could be used to describe both, since they are functionally quite different from one another. You are right to wonder! Not only are they quite different, but most students were probably familiar with iLearn or something very much like it, and completely unfamiliar with Ponder or anything remotely like it. Still, it was essential to keep the number of differences between two experimental cohorts as small as possible. Consequently, students in the Ponder classes were left to figure out how to use it on their own. Despite this, students encountered few problems getting up and running with Ponder.

Preliminary Results

SFSU Study Outcomes Detail

Over 90% of Ponder students participated, and on average, each of the students in the Ponder group contributed 26.9 responses while spending 329 minutes reading 208 documents – all of which was self-directed reading that fell outside of the assigned materials for the course. By contrast, students in the control group posted only an average of 2.2 times, and a quarter never contributed at all. The classes with Ponder were not only more inclusive, but generated over 12 times the volume of participation. In iLearn, there were a total of 123 posts. In Ponder there were 1747.
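
For anyone who wants to check the “over 12 times” figure, here is a small sketch that recomputes it from the numbers quoted above. The variable names are ours; only the four figures come from the study results reported here.

```python
# Figures quoted in the preliminary results above.
ponder_posts = 1747   # total responses in the Ponder classes
ilearn_posts = 123    # total posts in the iLearn (control) classes
ponder_avg = 26.9     # average responses per Ponder student
ilearn_avg = 2.2      # average posts per control student

# Ratio of per-student averages: the basis for "over 12 times".
per_student_ratio = ponder_avg / ilearn_avg

# Ratio of raw totals, which does not adjust for group size.
total_ratio = ponder_posts / ilearn_posts

print(f"Per-student ratio: {per_student_ratio:.1f}x")
print(f"Total-volume ratio: {total_ratio:.1f}x")
```

The per-student ratio comes out a little above 12, which matches the headline claim. The ratio of raw totals is somewhat higher, so the headline number is the more conservative of the two.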

In addition, preliminary analysis of data show that:

  • The average final grade for Ponder students was B+ as compared to B for the control.
  • There was a positive correlation between Ponder activity and quizzes, class participation, and group projects.
  • There was a negative correlation between iLearn activity and class participation.
  • More students participated in class discussion in the Ponder section with more participation per student.

iLearn Posts

Number of students who made a given number of iLearn forum posts during the study. (e.g. 4 students made 3 forum posts.)

Ponder Posts

Number of students who made a given number of Ponder posts during the study. (e.g. 10 students made 20-25 posts.)

What conclusions can we draw from these numbers?

Ponder is either more engaging and fun to use than iLearn, or it’s simply easier to use so students are more likely to use it. It’s hard to tell because none of the students experienced both so we can’t ask for subjective feedback comparing the two.

Either way, what’s important from the perspective of educational impact is that the students in the Ponder classes produced higher quality work, as demonstrated by the average half-grade improvement from the control.

When the final grades came in, Ponder students averaged a statistically significant half grade higher than the students in the control group.

The study itself has many more components to it, and the qualitative survey data had more students complaining about being confused by Ponder’s interface, unsurprising given the circumstances. The survey results also showed that students from the Ponder group appreciated the online component of the course more, said it made them want to read more and would recommend it to other professors more than the iLearn control group.

So what’s next?

These early results are very encouraging, but what we’re really curious about is forthcoming analysis of Ponder data that will start to paint a picture of how student reading habits change over time with Ponder.

  • Do students read from a broader range of sources by the end of the course? Do they read longer articles? Do they read more deeply around individual topics?
  • How do students affect each other’s reading? Do students discover new areas of interest through their peers? In other words, we don’t just want to know if their topic clouds grow over the course of the semester, but whether the students’ topic clouds increase in overlap.
  • Which students are the best at engaging other students in reading?

In short, what we’re interested in measuring with Ponder is not only student engagement but “intellectual curiosity” as defined by the reading behaviors listed above.
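
One way the “overlap” question could be quantified, purely as an illustration, is to treat each student’s topic cloud as a set of topic labels and compute the average Jaccard similarity across all pairs of students. This is our assumption for the sake of a sketch, not necessarily the measure the forthcoming analysis will use, and the students and topics below are made up.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_pairwise_overlap(clouds: dict) -> float:
    """Average Jaccard similarity over every pair of students."""
    pairs = list(combinations(clouds.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Made-up topic clouds for three hypothetical students.
week1 = {
    "ana": {"economics", "labor"},
    "ben": {"sports"},
    "cy": {"labor", "housing"},
}
week5 = {
    "ana": {"economics", "labor", "housing"},
    "ben": {"sports", "labor"},
    "cy": {"labor", "housing", "economics"},
}

print(f"Week 1 overlap: {mean_pairwise_overlap(week1):.2f}")
print(f"Week 5 overlap: {mean_pairwise_overlap(week5):.2f}")
```

A rising average over the semester would suggest students are discovering topics through each other, while growth in the size of each set alone would only show that individual topic clouds are getting bigger.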

The second phase of the study is running right now, and we are looking to repeat and broaden the data to other levels and institutions. If you would be interested in collaborating with us or the SFSU researchers, get in touch!



New modes of interaction for Flip videos Part 3

This past semester I’ve been experimenting with new modes of interaction for video. I’ve written about 2 previous test sessions here and here.

Annotating video is hard. Video is sound and imagery moving through time. It’s an immersive and some might say brain-short-circuiting medium. Watching 3 videos simultaneously may be the norm today. However, if you’re truly engaged in watching video content, in particular content that is chock full of new and complex ideas, it’s hard to do much else.

Watching video content makes our brains go bonkers.

“’Every possible visual area is just going nuts,’ she adds. What does this mean? It shows that the human brain is anything but inactive when it’s watching television. Instead, a multitude of different cortexes and lobes are lighting up and working with each other…”

“She” is Joy Hirsch, Dir. of fMRI Research at Columbia U, being cited by the National Cable & Telecommunications Association, which interprets her results to mean watching tv is good for our brains, like Sudoku. I’m not sure about that, but it’s reasonable to conclude that consuming video content occupies quite a lot of our brain.

Of course no one is saying reading doesn’t engage the brain. However, one key difference between text and video makes all the difference when it comes to annotation: With reading, we control the pace of reading, slowing down and speeding up constantly as we scale difficult passages or breeze through easy ones.

Video runs away from us on its own schedule whether or not we can keep up. Sure we can pause and play, fast-forward and slow down, but our ability to regulate video playback can only be clunky when compared to the dexterity with which we can control the pace of reading.

In fact the way researchers describe brain activity while watching tv sounds a lot like trying to keep up with a speeding train. All areas of the brain light up just to keep up with the action.

So what does that mean for those of us building video annotation tools?

Video annotation has all the same cognitive challenges as text annotation, but it comes with additional physiological hurdles as well.

STEM v. The Humanities

I’ve been working off the assumption that responding to STEM material is fundamentally different from responding to material in The Humanities. For STEM subjects, the range of relevant responses is much more limited. It essentially amounts to different flavors of “I’m confused.” and “I’m not confused.”

I’m confused because:

  • e.g. I need to see more examples to understand this.
  • Syntax! I don’t know the meaning of this word.
  • How? I need this broken down step-by-step.
  • Why? I want to know why this is so.
  • Scale. I need a point of comparison to understand the significance of this.

I get this because:

  • Apt! Thank you. This is a great example.
  • Got it! This was a really clear explanation.

Humor is a commonly wielded weapon in the arsenal of good teaching so being able to chuckle in response to the material is relevant as well.

But as is often the case when trying to define heuristics, it’s more complicated than simply STEM versus not-STEM.

Perhaps a more helpful demarcation of territory would be to speak in terms of the manner and tone of the content (text or video) and more or less ignore subject matter altogether. In other words: The way in which I respond to material depends on how the material is talking to me.

For example, the manner and tone with which the speaker addresses the viewer varies dramatically depending on whether the video is a:

  • “How-to” Tutorial
  • Expository Lecture
  • Editorializing Opinion
  • Edu-tainment

The tutorial giver is explaining how to get from A to Z by following the intervening steps B through Y. First you do this, then you do that.

The lecturer is a combination of explanatory and provocative. This is how you do this, but here’s some food for thought to get you thinking about why that’s so.

The editorializing opinion-giver is trying to persuade you of a particular viewpoint.

Edu-tainment is, well, exactly that: delivering interesting information in an entertaining format.

And of course, the boundaries between these categories are sometimes blurry. For example, is this Richard Feynman lecture an Expository Lecture or an Editorializing Opinion?

I would argue it falls somewhere in the middle. He’s offering a world view, not just statements of fact. You might say that the best lecturers are always operating in this gray area between fact and opinion.

The Test Session

So in our 3rd test session, unlike the previous 2, I chose 3 very different types of video content to test.

Documentary on The Stanford Prison Experiment (Category: Edu-tainment)

A 10-minute segment of the Biden v. Ryan 2012 Vice Presidential Debate re: Medicare starting at ~32:00. (Category: Editorializing Opinion)

Dan Shiffman’s Introduction to Inheritance from Nature of Code (Category: Expository Lecture)

You can try annotating these videos on Ponder yourself:

  1. Dan Shiffman’s Introduction to Inheritance from Nature of Code.
  2. Biden v. Ryan Vice-Presidential Debate.
  3. The Stanford Prison Experiment documentary.

The Set-up

There were 5 test subjects, watching 3 different videos embedded in the Ponder video annotation interface in the same room, each on their own laptop with headphones. That means unlike previous test sessions, each person was able to control the video on their own.

Each video was ~10 minutes long. The prompt was to watch and annotate with the intention of summarizing the salient points of the video.

2 students watched Dan Shiffman’s Nature of Code (NOC) video. 2 students watched the documentary on the Stanford Prison Experiment. And 1 student watched the debate.

The Results

The Stanford Prison Experiment had the most annotations: 15/user versus 12 for NOC and 5 for the debate, and the most varied use of annotations: 22 total versus 5 for NOC and 4 for the debate.

Unsurprisingly the prison documentary provoked a lot of emotional reactions (50% of the responses were emotional – 12 different kinds compared to 0 emotional reactions to the debate).

Again unsurprisingly, the most common response to the NOC lecture was “{ chuckle },” which accounted for 12 of the 25 responses. There was only 1 point of confusion, a matter of unfamiliarity with syntax: “What is extends?”

This was a pattern I noted in the previous sessions where in many STEM subjects, everything makes perfect sense in the “lecture.” The problem is oftentimes as soon as you try to do it on your own, confusion sets in.

I don’t think there’s any way around this problem other than to bake “problem sets” into video lectures and allow the points of confusion to bubble up through “active trying” rather than “passive listening.”

[Figures: Intro to Inheritance (NOC); Biden v. Ryan Vice-Presidential Debate; Stanford Prison Experiment]

Less is More?

There are 2 annotation modes in Ponder. The first displays a small set of annotation tags (9) in a Hollywood Squares arrangement. The second displays a much larger set of tags. Again the documentary watchers were the only ones to dive into the 2nd set of more nuanced tags.

[Figure: Less v. More]

However, neither student watching the documentary made use of the text elaboration field, where you can write a response in addition to applying a tag (they didn’t see it until the end). The Nature of Code and Biden-Ryan debate watchers did use it. This made me wonder how having the elaboration field as an option changes the rate and character of the responses.

Everyone reported pausing the video more than they normally would in order to annotate. Much of the pausing and starting simply had to do with the clunkiness of applying your annotation to the right moment in time on the timeline.

It’s all in the prompt.

As with any assignment, designing an effective prompt is half the battle.

When I tested without software, the prompt I used was: Raise your hand if something’s confusing. Raise your hand if something is especially clear.

This time, the prompt was: Annotate to summarize.

In retrospect, summarization is a lot harder than simply noting when you’re confused versus when you’re interested.

Summarization is a forest-for-the-trees kind of exercise. You can’t really know moment-to-moment as you watch a video what the salient points are going to be. You need to consume the whole thing, reflect on it, perhaps re-watch parts or all of it and construct a coherent narrative out of what you took in.

By contrast, noting what’s confusing and what’s interesting is decision-making you can do “in real-time” as you watch.

When I asked people for a summarization of their video, no one was prepared to give one (in spite of the exercise) and I understand why.

However, one of the subjects who watched the Stanford Prison Experiment documentary was able to pinpoint the exact sentence uttered by one of the interviewees that he felt summed up the whole thing.

Is Social Always Better?

All 3 tests I’ve conducted were done together, sitting in a classroom. At Ponder, we’ve been discussing the idea of working with schools to set up structured flip study periods. It would be interesting to study the effect of socialization on flip. Do students pay closer attention to the material in a study hall environment versus studying alone at home?

The version of Ponder video we used for the test session shows other users’ activity on the same video in real-time. As you watch and annotate, you see other people’s annotations popping up on the timeline.

For the 2 people watching the Stanford documentary, that sense of watching with someone else was fun and engaging. They both reported being spurred on to explore the annotation tags when they saw the other person using a new one. (e.g. “Appreciates perspicacity? Where’s that one?”)

By contrast, for the 2 people trying to digest Shiffman’s lecture, the real-time feedback was distracting.

I assigned an annotation exercise to another test subject to be done on her own time. The set-up was less social both in the sense that she was not sitting in a room with other people watching and annotating videos and she was also not annotating the video with anyone else via the software.

I gave the same prompt. Interestingly, from the way she described it, she approached the task much like a personal note-taking exercise. She also watched Shiffman’s Nature of Code video. For her, assigning predefined annotation tags got in the way of note-taking.

Interaction Learnings

  • The big challenge with video (and audio) is that they are a black box content-wise. As a result, the mechanism that works so well for text (simply tagging an excerpt of text with a predefined response tag) does less well on video where the artifact (an annotation tag attached to timecode) is not so compelling. So I increased emphasis on the elaboration field, keeping it open at all times to encourage people to write more.
  • On the other hand, the forest-for-the-trees view offered on the video timeline is, I think, more interesting to look at than the underline heatmap visualization for text, so I’ll be looking for ways to build on that.
  • There was unanimous desire to be able to drag the timecode tick marks after they had already submitted a response. We implemented that right away.
  • There was also universal desire to be able to attach a response to a span of time (as opposed to a single moment in time). The interaction for this is tricky, so we’ve punted this feature for now.
  • One user requested an interaction feature we had implemented but removed after light testing because we weren’t sure if it would prove to be more confusing than convenient: automatically stopping the video whenever you made mouse gestures indicating you’re intending on creating an annotation and then restarting the video as soon as you finished submitting. I’m still not sure what to do about this, but it supports the idea that the difficulty of pacing video consumption makes annotating and responding to it more onerous than doing the same with text.


  1. Annotating video is hard to do so any interaction affordance to make it easier helps.
  2. Dense material (e.g. Shiffman’s lecture) is more challenging to annotate. Primary sources (e.g. the debate) are also challenging to annotate. The more carefully produced and pre-digested the material (e.g. the documentary), the easier it is to annotate.
  3. With video, we should be encouraging more writing (text elaborations of response tags) to give people more of a view into the content.
  4. Real-time interaction with other users is not always desirable. Users should be given a way to turn it on/off for different situations.
  5. There may be a benefit to setting up “study halls” (virtual or physical) for consuming flip content, but this is mere intuition right now and needs to be tested further.

Last but not least, thank you to everyone at ITP who participated in these informal test sessions this semester, and to Shawn Van Every and Dan Shiffman for your interest and support.

Actionable Data: Survey Responses from EdSurge Baltimore

Companies love conducting surveys. Most of them are pretty awful to the point of uselessness.

So it was a nice surprise to find a few gems amongst the questions asked in the survey from the latest EdSurge event in Baltimore.

3 that I liked in particular:

1. If administrators were looking to purchase this product for their school, how strongly would you advocate for this product?

The question was not just another: “Rate this product on a scale of 1 to 5,” a purely theoretical exercise that’s meaningless in comparison to the much more concrete task of trying to imagine how far you’d stick your neck out to try a new product.

2. Forget about the current price of the product (if it’s free, forget about that too). If you were given an extra $100 to spend per user (e.g. student, teacher, etc.) how much would you be willing to spend for this product?

A more confusing alternative might have been “How much would you pay for this product?” Instead, teachers are given a hypothetical that gets rid of issues that often confuse the question of pricing (e.g. variations in teachers’ discretionary budgets, how much was granted and how much they have remaining, whether a product could be gotten for free) and focuses on the teacher’s sense of a fair price. This question replaced my previous favorite: “How much would you expect to pay for a tool like this?” which Desmos founder Eli Luberoff shared a few months ago.

Last but not least, I want to mention this question not so much because it’s particularly well-worded, but because I rarely hear it discussed:

How often would you use this product?

Room for Improvement?

Now that I’ve gotten the praise out of the way, here are 4 specific ideas on how the surveys could be even better.

1. Enable companies to respond to feedback through a craigslist-style-mail-relay mechanism which would enable direct communication while keeping teacher identities anonymous.

2. Provide teachers more of an incentive for quality (of feedback) over quantity (of responses) by allowing each company to enter one survey response in a “Most Useful Feedback” raffle.

3. Don’t allow teachers to respond to surveys before the event starts! I had a teacher show up before the start of the Baltimore Summit who told me she had already looked at Ponder online and decided it was a terrible idea, but if I wanted I could try to convince her otherwise. So I spent ten minutes walking her through our product, after which she said “Well, this is pretty great actually. I’m going to recommend it to the teachers at my school – you should put all of what you said to me on to the web site.” Which is fine, since we definitely need to improve the story on our site, and I was glad she was excited. Then she said “I already filled out the long form survey yesterday, so can you give me the tickets?” I gave her the tickets, and as she walked away realized that we would now have whatever her initial reaction was captured in our public survey results, with no mechanism for her to update them or comment on her misunderstandings.

4. Add some sort of tech-savvy-ness question for the teachers to provide some context for their answers. How often do you use tech in your class today? Have you tried other tools similar to this one in purpose and functionality?

An even bigger idea: The EdSurge Census

Here’s more of a project-sized proposal for EdSurge: A survey of educators, schools and infrastructure – The EdSurge Census. A grassroots, lay of the land sort of thing, which could be independent of the summits. I think it would be valuable to ed-tech companies, the schools themselves as well as funders and foundations. In the spirit of the new “50 States Project,” but a bit more data-oriented.

A few obvious questions we’d love answers to readily come to mind:

Devices – What schools have, what they think they’ll have, what they wish they’d had, etc.

eBooks – In our experience most K12 schools seem to be lost at sea when it comes to ebook platforms – they don’t like being locked in, it complicates their device story, they have physical books which are easy to manage, etc.

Adoption – What tools are teachers using today and how often are they using them? Tools should be categorized by function (e.g. Administration and Logistics, Planning and Instruction) subject matter and grade-level so it’s easy to see where the gaps are.

EdSurge could charge companies to fund the data collection and reporting. Or perhaps one of the edu heavyweights would be willing to step up to fund it.

A statistically accurate portrait of the state of ed-tech in schools is unlikely to emerge from such a survey. But so long as the results were positioned honestly as an informal sampling, they would be undeniably useful.

Whatever happens, I’m looking forward to seeing the next incarnation of the EdSurge surveys in Nashville!

Logging Tutorial

New modes of interaction for Flip videos Pt. 2

This semester in addition to teaching, I am a SIR (Something-in-Residence) at ITP, NYU-Tisch’s art/design and technology program.

My mission for the next 3 months is to experiment with new modes of interaction for video: both for the flip video and live events. (Two very different fish!)

User Study No. 2

I recently wrote about my first User Study with students from Dan Shiffman’s Nature of Code class. A couple of weeks ago, 3 students from Shawn Van Every’s class, “Always On, Always Connected,” volunteered to watch 4 videos. In this class, students design and build new applications which leverage the built-in sensor and media capabilities of mobile devices.

The Setup

Again, there were no computers involved. We screened the videos movie-style in a classroom. Instead of asking for 2 modes of interaction (Yes that was super clear! versus Help!), I asked for a single mode of feedback: “Raise your hand when you’ve come across something you want to return to.”

The 4 videos introduce students to developing for the Android OS. It’s important to note that simply by virtue of working in the Android development environment (as opposed to Processing) the material covered in these videos is much more difficult than what was covered in the Nature of Code videos the previous week. However, the students’ level of programming experience is about the same.

What happened…

Video 1: Introduction to logging

Zero hands. But from the looks on people’s faces, it was clear it was not because everyone was ready to move on. A few things to note:

  1. For screencasts, watching a video movie-style is extra not-helpful as it is difficult to read the text.
  2. The video begins promptly after a 20s intro. With instructional videos, the emphasis is always on brevity (the shorter the better!) In this case however, I wonder if 20s isn’t enough to allow you to settle down and settle into the business of trying to wrap your head around new and alien concepts. I’m sure #1 made it harder to focus as well.
  3. Reading code you didn’t write is always challenging and I think even more so on video where we’re much more tuned into action (what’s changing) and much less likely to notice or parse what’s static.
  4. Unlike before, when I asked after-the-fact, “Where in the video do you want to go back to?” the students were unable to respond. Instead the unanimous response was, “Let’s watch the entire video again.” This is where collecting passive data about the number of times any given second of a video is watched by the same person would be helpful.
  5. In general, questions had to do with backstory. The individual steps of how to log messages to the console were clear. What was missed was what and why. First, what am I looking at? And second, why would I ever want to log messages to the console? I say “missed” and not “missing” because the answers to those questions were in the videos. But for whatever reason, they were not fully absorbed.
  6. Last but not least, I have to imagine that this watching as a group and raising your hands business feels forced if not downright embarrassing.
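To make the idea in point 4 concrete, here is a minimal sketch of what collecting that passive data might look like: a counter for every second of a video, incremented each time one viewer plays through it. Everything here is an assumption for illustration. `WatchHeatmap`, `recordPlayback` and `mostRewatchedSecond` are hypothetical names, not anything from Ponder or from a real video player API.

```java
// Hypothetical sketch: counts how many times one viewer has watched each
// second of a video. None of these names come from Ponder.
class WatchHeatmap {
    private final int[] viewsPerSecond;

    WatchHeatmap(int videoLengthSeconds) {
        viewsPerSecond = new int[videoLengthSeconds];
    }

    // Record that the viewer played from startSecond (inclusive) to
    // endSecond (exclusive) without pausing or seeking.
    // Values outside the video are clamped to its bounds.
    void recordPlayback(int startSecond, int endSecond) {
        int from = Math.max(0, startSecond);
        int to = Math.min(viewsPerSecond.length, endSecond);
        for (int s = from; s < to; s++) {
            viewsPerSecond[s]++;
        }
    }

    int viewsAt(int second) {
        return viewsPerSecond[second];
    }

    // The earliest second with the highest view count: a candidate answer to
    // "Where in the video do you want to go back to?"
    // Returns -1 if the video has zero length.
    int mostRewatchedSecond() {
        int best = -1;
        int bestViews = -1;
        for (int s = 0; s < viewsPerSecond.length; s++) {
            if (viewsPerSecond[s] > bestViews) {
                bestViews = viewsPerSecond[s];
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        WatchHeatmap heatmap = new WatchHeatmap(600); // a 10-minute video
        heatmap.recordPlayback(0, 600);   // watched once all the way through
        heatmap.recordPlayback(120, 180); // went back to 2:00-3:00
        heatmap.recordPlayback(120, 150); // and again to 2:00-2:30
        System.out.println("Most rewatched second: " + heatmap.mostRewatchedSecond());
    }
}
```

A per-second array is the simplest thing that could work for short videos. The point is only that rewatched stretches would show up without the student having to say anything.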

Hopefully, a software interface for doing the same thing will elicit more free-flowing responses from students as it will provide them with a way to ask questions “in private.”

Videos 2-4

Everyone was more settled in after watching the logging video. Each subsequent video built on the momentum accumulated in the previous one. With Video 2, I started to get some hand-raising. When we returned to those points in the video, people were very specific about what was confusing them.

“Why did you do that there?” or its converse, “Why didn’t you do that there?” was a common type of question as was “What is the significance of that syntax?”

Another way to look at it is: There were never any issues with the “How.” The steps were clearly communicated and video is a great medium for explaining how. The questions almost always had to do with “Why?”, which makes me wonder if this is the particular challenge of video as a medium for instruction.

Does learning “Why?” require a conversation initiated by you asking your particular formulation of “Why?”

Video 2: Toasts

Video 3: Lifecycle of an Android App Part 1

  • @2:00: What is the significance of @+id? Why aren’t you using the strings file for this?
  • @6:50: Why did you change arg0 to clickedView?

Other syntax questions included:

  • What’s the difference between protected versus public functions?
  • What’s extends and implements?
  • What’s super()?
  • What’s @Override?
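For anyone who shares these questions, here is a small plain-Java sketch that touches each keyword once. It is not taken from the videos: `Clickable`, `Screen` and `LoginScreen` are invented names, and the example deliberately avoids the Android classes so it can run anywhere.

```java
// Illustrative only: these names are made up and do not come from the videos.

// An interface lists methods that a class promises to provide.
interface Clickable {
    String onClick();
}

class Screen {
    // protected: visible to subclasses (and to classes in the same package),
    // but not to the rest of the program.
    protected String title;

    // public: callable from anywhere.
    public Screen(String title) {
        this.title = title;
    }

    public String describe() {
        return "Screen: " + title;
    }
}

// "extends" inherits fields and methods from one parent class.
// "implements" promises to supply every method of an interface.
class LoginScreen extends Screen implements Clickable {

    public LoginScreen() {
        super("Login"); // super(...) runs the parent class's constructor
    }

    // @Override asks the compiler to verify that this method really does
    // replace a method declared in a parent class or interface.
    @Override
    public String describe() {
        return super.describe() + " (please sign in)";
    }

    @Override
    public String onClick() {
        return "clicked " + title; // title is reachable because it is protected
    }
}

class SyntaxDemo {
    public static void main(String[] args) {
        LoginScreen screen = new LoginScreen();
        System.out.println(screen.describe());
        System.out.println(screen.onClick());
    }
}
```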

All of these syntax questions pointed to a much larger / deeper rabbit hole having to do with object-oriented programming and encapsulation, a quick way to derail any getting started tutorial.

In general though, I don’t think these syntax questions prevented you from understanding the point of the videos which were about creating pop-up messages (Video 2) and the lifecycle of Android apps (Videos 3 and 4) (when do they pause versus stop, re-initialize versus preserve state).

Video 4: Lifecycle of an Android App Part 2

In Video 4, there were a lot of nodding heads during a section that shows the difference between pausing/resuming and killing/creating an Android app. The demonstration starts around @2:20 and goes for a full minute until @3:23. Epic in online video terms. It’s followed by a walk-through of what’s happening in the code and then reinforced again through a perusal of Android documentation @3:50 where an otherwise impenetrable event-flow diagram is now much more understandable.

It’s also important to note that both the demo which starts at @2:20 and the documentation overview @3:50 are preceded by “downtime”: trying, failing and debugging while starting the Android emulator, and navigating in the browser to find the flow diagram.

In general there’s a lot of showing in these videos. Each concept that’s explained is also demonstrated. This segment however was particularly lengthy and in accordance with the law of “It-Takes-A-While-To-Understand-Even-What-We’re-Talking-About” (#2 above), my completely unverified interpretation of events is that the length of the demonstration (far from being boring) helped everyone sink their teeth into the material.

What’s Next and Learnings for Design

As we head into testing actual software interfaces, this 2nd session gave me more concrete takeaways for workflow.

  1. You lay down “bookmarks” on the timeline as you watch to mark points you would like to return to.
  2. You lay down “bookmarks” on the timeline after you’ve watched the video to signal to the instructor what you would like to review in class the next time you meet.
  3. You can expand upon “bookmarks” with an actual question.
  4. You select from a set of question tags that hopefully help you formulate your question. (More to come on this.)
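As a thinking aid, the 4 takeaways above could map onto a very small data model. This is a sketch under my own assumptions, not a description of what Ponder implements. `Bookmark`, `VideoTimeline` and all of their fields are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical data model for the workflow above; not Ponder's actual design.
class Bookmark {
    final int second;       // position on the video timeline
    final boolean forClass; // true if flagged for in-class review (takeaway 2)
    String tag;             // optional predefined question tag (takeaway 4)
    String question;        // optional free-text question (takeaway 3)

    Bookmark(int second, boolean forClass) {
        this.second = second;
        this.forClass = forClass;
    }
}

class VideoTimeline {
    private final List<Bookmark> bookmarks = new ArrayList<>();

    // Takeaways 1 and 2: lay down a bookmark while watching or afterwards.
    Bookmark add(int second, boolean forClass) {
        Bookmark b = new Bookmark(second, forClass);
        bookmarks.add(b);
        return b;
    }

    // What the instructor would see: only the bookmarks flagged for class,
    // sorted by where they fall in the video.
    List<Bookmark> reviewList() {
        List<Bookmark> out = new ArrayList<>();
        for (Bookmark b : bookmarks) {
            if (b.forClass) {
                out.add(b);
            }
        }
        out.sort(Comparator.comparingInt((Bookmark b) -> b.second));
        return out;
    }

    public static void main(String[] args) {
        VideoTimeline timeline = new VideoTimeline();
        Bookmark b = timeline.add(410, true);
        b.tag = "Why?";
        b.question = "Why did you change arg0 to clickedView?";
        timeline.add(120, false); // a private "return to this" marker
        timeline.add(50, true);

        for (Bookmark r : timeline.reviewList()) {
            System.out.println(r.second + "s " + r.tag + " " + r.question);
        }
    }
}
```

Keeping the tag and the question optional reflects the workflow: a bare bookmark is the cheapest possible signal, and everything else is elaboration.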

While it’s important to break down the videos into digestible morsels, it’s also important to make it easy to watch a few in a row as the first one is always going to be the most painful one to settle into. There are ways to encourage serial watching with user interface design (e.g. playlist module, next video button, preview and auto-play the next video). But perhaps something can be done on the content side as well by ending each video on a so-called “cliffhanger.”


Magnitude! Direction!

New modes of interaction for Flip videos Pt. 1

This semester in addition to teaching, I am a SIR (Something-in-Residence, no joke) at ITP, NYU-Tisch’s art/design and technology program.

My mission for the next 3 months is to experiment with new modes of interaction for video: both for the flipped classroom and live events. (Two very different fish!)

[Figure: Magnitude! Direction!]

2 weeks ago, I conducted the first in a series of informal user studies with 3 students from Dan Shiffman’s Nature of Code class, which introduces techniques for modeling natural systems in the Java-based programming environment Processing.

User Study No. 1

Last year, Dan flipped his class, creating a series of ~10 minute videos ranging from how to use random() to move objects around a screen to modeling the movement in ant colonies.

The Setup

We viewed the first 2 videos from Chapter 1 which introduce the concept of a vector and the PVector class in Processing.

There were no computers involved. We watched it together on a big screen, just like you would a movie except I sat facing the students.

The only “interaction” I asked of them was to:

  1. Raise their right hand when Dan said something extra clear.
  2. Raise both hands to indicate “Help!”

What happened…

On the whole, the “interaction” was a non-event. There were a few tentative raisings of the right hand and zero raisings of both hands.

However, after each video, when I asked at what points in the video things were not perfectly clear, there was no hesitation in the replies.

Points of confusion fell mostly into 1 of 2 categories:

  1. I simply need more background information.
  2. Maybe if I rewatch that part it will help.

Although I will say that in my subjective opinion, number 2 was said in a tentative and theoretical fashion, which is interesting because “the ability to re-watch” is a much-vaunted benefit of flip.

It must be said though, that we’re early in the semester and the concepts being introduced in these videos are simple relative to what’s to come.

Learnings for design…

Some early thoughts I had coming out of this first session were:

  • Unlike reading, video is too overwhelming for you to be able to reflect on how you’re reacting to the video while you’re trying to take it in. (Just compare reading a novel to watching a movie.)
  • That being said, so long as the video is short enough, people have a pretty good idea of where they lost their way, even after the fact.
  • Still, there needs to be a pay-off for bothering to register where those points are on the video. For example, “If I take the time to mark the points in the video where I needed more information, the instructor will review them in class.”
  • It’s still unclear to me how a “social” element might change both expectations and behavior. If you could see other people registering their points of confusion and asking questions and you could simply pile on and say “I have this question too” (many discussion forums have this feature), the whole dynamic could change.
  • Hence, the popularity of inserting slides with multiple choice questions every few minutes in MOOC videos. The slides serve as an explicit break to give the viewer space to reflect on what they’ve just watched.

I wonder though if they need to be questions. Genuinely thought-provoking questions in multiple choice form are hard to come by.

In a flip class where there is such a thing as in-class face time, a simple checklist of concepts might suffice.

Here were the concepts that were covered. Cross off the ones you feel good about. Star the ones you’d like to spend more time on. I think it’s key that you are asked to do both actions, meaning there shouldn’t be a default that allows you to proceed without explicitly crossing off or starring a concept.

Concepts for Video 1.1:

  • A vector is an arrow ★ @2:00 : Why is it an arrow and not a line?
  • A vector has magnitude (length).
  • A vector has direction (theta) ★ @3:30: What’s theta?
  • A vector is a way to store a pair of numbers: x and y.
  • A vector is the hypotenuse of a right triangle (Pythagorean Theorem).

Concepts for Video 1.2:

  • PVector is Processing’s vector class.
  • How to construct a new PVector.
  • Replacing floats x,y with a PVector for location.
  • Replacing float xspeed, yspeed with a PVector for velocity.
  • Using the add() method to add 2 PVectors.
  • Adding the velocity vector to the location vector.
  • Using the x and y attributes of a PVector to check edges.

The tricky thing with Video 1.2 is that the only point of confusion here was @10:27 when Dan says almost as an aside that you add the velocity vector to the location vector which you can think of as a vector from the origin. I don’t think this kind of problem area would surface in a list of concepts to cross out or star.
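For reference, here is roughly what the concepts from both videos look like in code, including the aside about location being a vector from the origin. To keep it runnable as plain Java outside of Processing, I use a tiny stand-in class. `Vec2` and `BouncingBall` are hypothetical names of my own, and the methods `add`, `mag` and `heading` are named to echo the ones PVector provides. This is a sketch of the ideas, not the code from Dan’s videos.

```java
// Minimal stand-in for Processing's PVector so the sketch runs as plain Java.
// Vec2 is a hypothetical class; in Processing you would use PVector itself.
class Vec2 {
    float x, y; // a vector stores a pair of numbers: x and y

    Vec2(float x, float y) {
        this.x = x;
        this.y = y;
    }

    // Adds another vector to this one, component by component, in place.
    void add(Vec2 other) {
        x += other.x;
        y += other.y;
    }

    // Magnitude (length): the hypotenuse of a right triangle with sides x and y.
    float mag() {
        return (float) Math.sqrt(x * x + y * y);
    }

    // Direction (theta), in radians, measured from the positive x axis.
    float heading() {
        return (float) Math.atan2(y, x);
    }
}

class BouncingBall {
    // Location can be thought of as a vector from the origin (0, 0) to the ball,
    // which is why adding the velocity vector to it makes sense.
    final Vec2 location;
    final Vec2 velocity;
    final float width, height;

    BouncingBall(Vec2 location, Vec2 velocity, float width, float height) {
        this.location = location;
        this.velocity = velocity;
        this.width = width;
        this.height = height;
    }

    void update() {
        location.add(velocity); // move: location = location + velocity

        // Check edges using the x and y attributes; reverse direction on a hit.
        if (location.x > width || location.x < 0) {
            velocity.x *= -1;
        }
        if (location.y > height || location.y < 0) {
            velocity.y *= -1;
        }
    }

    public static void main(String[] args) {
        BouncingBall ball = new BouncingBall(new Vec2(98, 50), new Vec2(5, 0), 100, 100);
        ball.update();
        System.out.println("location x: " + ball.location.x + ", velocity x: " + ball.velocity.x);
        System.out.println("speed: " + ball.velocity.mag());
    }
}
```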

Next Session

In session 2, I will be conducting another user study with students from Shawn Van Every’s class “Always On, Always Connected” (an exploration of the technologies and designs that keep us online 24/7).

My plan is to try asking the students to raise their hand simply to register: I want somebody to review what happened at this point in the video.

We’ll see how it goes!