
eyeData Progress – Midterm Review

What it do, what we do!?
Last week, our team lead, Alex, presented a summary of our project progress so far. He covered some of the more technical aspects of software development (such as version control and site deployment) and the tools we’re using to make that happen (git, Jenkins, etc.).
What went well?
Essentially, we’ve been doing well this semester and are now at the point where we can dive into the code and make some changes happen! Everything is set up with Django and Jenkins so that the website is built nightly and hosted live here.
eyeData Site
The current eyeData homepage!
As such, changes to the project are as simple as git pull, commit, push. We still have the bulk of the work left to do, but I’m sure we’ll figure it out soon enough!
Simple Git Diagram
What was challenging?
The most challenging aspect of this week has been getting the team to work together. Now that we’ve started diving into the code, we’re having each member specialize in a certain aspect of the project: UI, data analysis, graph plotting, etc. As a result, each member is becoming the expert in his/her area, so making larger changes requires closer collaboration than before. This has proved to be a challenge, but with definite goals, we will overcome it!
What’s up next?
We’re planning to use the resources at the Berkman Center to learn about what visualizations are useful for social science research. Not every social scientist is going to need the same type of graphs; we want our product to be intuitive and intelligent, so we need to learn more about what’s actually in use out there in the wild.

We’ll also want to learn what languages/libraries are used most frequently by social scientists. While we’re focused more heavily on data visualization, at some point in the process, we’d like to provide the social scientists with access to the manipulations performed on the data. Knowing what background knowledge they have will be helpful for us.

Take care, and keep a look out for eyeData!

-Luis

Open Access – Team Update

Our team was very excited to participate in the Mid-Term Reviews. Following the presentation by our Team Lead, Wendy W Fok, our Open Access team received a great deal of helpful feedback from Prof Chris Bavitz, Prof Jim Waldo, and Prof Urs Gasser.

Most importantly, we would like to announce the exciting approval of an Open Access Policy by the Berkman Center:

The Berkman Center announces Open Access Policy
With this policy, approved on October 9, 2014, the Berkman Center’s faculty directors and staff join the action of the nine School faculties: granting the University nonexclusive rights on all new scholarly work relating to the purview of research at the Berkman Center. The policy ensures that the “fruits of [Berkman’s] research and scholarship” will be distributed as widely as possible.

Updated on 29 October, 2014 – Wendy W Fok

Big Data – Midterm Review

Midterm review
Last week, we presented our goals and questions for the semester at the DPSI Midterm Review event. It was great to hear about the focus of other groups – some of them are struggling with very different questions from ours, and others are working on surprisingly similar things. We exchanged thoughts and ideas with other members of the community.
A puzzle to solve
To help envision the problems we are dealing with, we shared a few examples of de-identification problems and their implications:
(1) EdX and Completion rate
When researchers began analyzing the completion rate of EdX courses, they noticed that the anonymized dataset showed very different statistics from the original dataset. The completion rate dropped significantly when the data was anonymized. When they dug into this, it became evident that the observations of many of those who actually completed courses had been dropped from the anonymized dataset. This is because the characteristics of a person who signed up for a course once and never went back to the page again were drastically different from those of a person who signed up, watched every lecture, and did every problem set. Because the records of such active participants carried so much identifying information, they were frequently dropped, even though these individuals were much more likely to finish a course. Analysis of the anonymized dataset was therefore useless.
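To make the effect concrete, here is a minimal sketch in Python. It is not the method actually used on the EdX data, which this post does not describe; it assumes a simple suppression rule (drop any record whose activity signature is shared by fewer than k people), and the records, signature names, and the value of k are all made up for illustration.

```python
from collections import Counter


def suppress_rare(records, k):
    """Keep only records whose signature occurs at least k times.

    This is an assumed, simplified de-identification rule, not the
    procedure applied to the real dataset.
    """
    counts = Counter(signature for signature, _ in records)
    return [(sig, done) for sig, done in records if counts[sig] >= k]


def completion_rate(records):
    """Fraction of records marked as having completed the course."""
    return sum(done for _, done in records) / len(records)


# Hypothetical toy data: (activity signature, completed the course?)
records = (
    # 90 people who registered and never came back all look alike.
    [("registered-only", False)] * 90
    # 5 completers happen to share one common activity pattern.
    + [("all-lectures-all-psets", True)] * 5
    # 5 completers each have a unique, highly identifying pattern.
    + [(f"unique-trace-{i}", True) for i in range(5)]
)

anonymized = suppress_rare(records, k=5)

original_rate = completion_rate(records)        # 10 of 100 records
anonymized_rate = completion_rate(anonymized)   # 5 of 95 records
print(original_rate, anonymized_rate)
```

In this toy example the suppression step removes only completers, because they are the ones with distinctive records, so the measured completion rate falls by about half even though no one's behavior changed.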
(2) Google Ads and user behavior
While interning at Google this summer, Olivia, a member of our Big Data group, ran into a peculiar problem. In her role as a data scientist, she was trying to understand whether people who saw Google Ads were more likely to conduct searches on ad-related queries. Since Olivia was an intern and was not allowed to see users’ individual search information, she received a dataset in aggregated form, which summed up interactions by user. When she ran the analysis, she saw some strange results – it seemed that people who saw ads were somehow less likely to perform ad-related queries. Since she found these results suspicious, she raised her concerns, and her supervisor ran the analysis on the original dataset. The results were radically different and, as expected, showed that users who saw ads were much more likely to run ad-related queries. Why did this happen? It seemed that users who watched ads for a few seconds were very different from users who watched ads for a few minutes, but that richness disappeared in aggregated form. You could no longer distinguish between a user who saw many ads for a second each and a user who saw one ad for a minute. This drastically changed the results and rendered the anonymized dataset useless for this purpose.
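The loss of detail described above can be sketched in a few lines of Python. The log format, user names, and durations below are hypothetical; the post does not say what the real dataset looked like or exactly how it was aggregated, so this only illustrates the general idea of summing interactions by user.

```python
from collections import defaultdict

# Hypothetical raw log: (user id, seconds a single ad was on screen).
raw_events = [("user_a", 1)] * 60 + [("user_b", 60)]


def total_seconds_by_user(events):
    """Aggregate view: total ad exposure per user (assumed aggregation)."""
    totals = defaultdict(int)
    for user, seconds in events:
        totals[user] += seconds
    return dict(totals)


def longest_view_by_user(events):
    """Detail that is only recoverable from the raw, per-event log."""
    longest = defaultdict(int)
    for user, seconds in events:
        longest[user] = max(longest[user], seconds)
    return dict(longest)


totals = total_seconds_by_user(raw_events)
longest = longest_view_by_user(raw_events)

print(totals)   # both users have the same total exposure
print(longest)  # but very different viewing behavior
```

Once only the totals are shared, the two users are indistinguishable, so any analysis that depends on how long each ad was watched can no longer be run.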
 
What’s up next
We now officially have a de-identified dataset to work with, along with some of the documentation around how it was de-identified. The coders in the group will begin examining it and playing with the code.
Our policy team continues to work on de-identification laws outside of the education space (FERPA). We are taking a look at HIPAA, which specifies de-identification requirements for medical information, and international laws (especially in privacy-protecting Europe).

#DocShop Meeting 06

Midterm Review 

We got a good idea of people’s familiarity with interactive documentary and gauged the interest in the topic. As was brought up during our previous meetings by group members and the metaLAB team, folks wondered how we capture process and audience responses to media.

On the whole, we definitely made some nice connections and got great feedback… we also might get some lessons learned documentation from Wendy Fok, one of the winners of last year’s Dean’s Design Challenge, on our entry into the i-Lab Cultural Entrepreneurship Challenge.

What we worked on

The group met Tina Pamintuan, Nieman Fellow from CUNY Grad Center, who introduced herself and sat in on the discussion. Dan and other group members shared their experience of the Sam Green show at the ICA, but much of the conversation was about Ragnar Kjartansson’s piece The Visitors and the overall success of troubling some of the ideas of how to show multi-stream work in a museum setting.

Dan continued his conversation with Lara Baladi (MIT OpenDocs Lab) and Dalia Othman (Berkman), and we hope that they can join our next meeting. Perhaps a build out of Lara’s project on Egypt would work well for our kickoff event! We hope they can attend the meeting 11/7.

What went well

Idea jamming and talking about architecting different kinds of interaction between time-based media and audiences/authors. It was really productive to think of the spatialization of documentary as an , from the ground level (in a gallery or museum space), as opposed to combining top-down approaches with narrative seen in mapping, or the ‘choose your own adventure model’ that is viewed in a browser or a device, still being a single stream experience.

What was challenging

Pinning down the terms we wish to use in our proposal and charter. How do we break out of the ‘black box’ in a cinematic or dramaturgical environment? Perhaps it is a black cube?

What do we call this group and event series? An incubator or seminar/workshop series, a school, an interactive doc film festival? Perhaps a new term: a CoLABoratory. This implies that a number of different stakeholders across disciplines are collaboratively designing solutions to the problems of interactive documentary.

What’s up next

A number of group members were at the Illuminus Festival, so it will be productive to talk about what worked there. We will also start work on our narrative, scope and process schedule, budget, and milestones for 2014/early 2015.

Halloween field trip!!! At our last meeting we decided to hold our next session in Lawrence to view the location-based documentary The Path: Fall of the Pemberton Mill, made by Dan Koff in 2010.

eyeData – Our Second Meeting

What we worked on!

Introductions and check-ins!

We had Raman Prasad, our resident expert on Python, mapping, and many other things, volunteer to provide a helping hand in setting everything up! He’s been asked to give us a Python back-end to interface with the DataVerse API.

Each person in the team made a short presentation on what we expect to see from our eyeData project! The presentations went quite well!

Here’s a quick snapshot of our UI Presentation, highlighting some key ideas (inspired by CS50 Courses)

Presentation Snapshot

What went well

The presentations went quite well – each team member contributed great ideas to the project and raised questions the presenter hadn’t yet considered. The process of figuring out how the UI would be laid out went incredibly well 🙂

What was challenging

The scope analysis of this project is the most challenging part so far! Do we want to focus on just survey data and DataVerse, or should we begin considering more general data sets? Another challenging aspect of the work will be the collaboration – how will we divide the work and make sure that each team member can be kept up to date? Our plan is to use GitHub for version control and bug reports!

What’s up next

Up next, we have our Scope Doc! This is a document which will incorporate most of our ideas and list out the tools and resources we plan to use for our project. It will help us get organized before we dive into coding! We plan to have this prepared for our next meeting, during which we will dive into each aspect in detail and begin assigning parts to each team member! Here’s a quick snapshot of what we have so far!

Scope Doc Capture

Go team eyeData! Wait to hear more from us next week!

Best,

Luis Perez (College ’16)

Info Session @ 4pm

Miss the deadline to apply, but still want to join DPSI? Not to worry! Although teams are underway, we’re still accepting applicants. Apply now

We’ll also be having an info session Friday, September 11, at 4pm (with pizza!), at the Berkman Center, located on the second floor of 23 Everett Street, Cambridge.

If you can’t make it, feel free to reach out to us at dpsi (at) cyber (dot) law (dot) harvard (dot) edu.

 

Applications for fall 2014 are live now!

The Berkman Center for Internet & Society is seeking energetic, creative, and passionate students to apply for the Digital Problem Solving Initiative (DPSI). DPSI is a University-wide initiative that brings together a diverse group of learners (students, faculty, fellows, and staff) to work on real-world projects that address problems and opportunities across the university – no experience necessary!

DPSI offers all participants the opportunity to enhance and cultivate competency in various digital literacies as teams engage with research, design, and policy work relating to the Harvard community. As a student, you’ll have the opportunity to work with – and be mentored by – Harvard faculty, fellows, and staff in collaborative teams that will build and shape the increasingly digital environment in which we live, learn, work, and create.

Furthermore, DPSI is about listening to student voices, empowering student-led teams to bring their ideas to fruition. Maybe you want to help libraries connect better with students, or think about how to make online education more accessible. Maybe you want to build a Harvard-specific app. The DPSI community is a collaborative space that makes it possible to deliver tangible results.

To learn more about the application process, check out the DPSI website and apply now! Applications are due on Tuesday, September 9. If you are accepted into the program, you’ll be asked to commit to approximately 6-8 hours a week for the fall semester. In addition, if you’re interested in learning more about the Berkman Center, check out some of our kickoff activities in early September.