All posts by alexw

Wrapping Up

Our prototype is done! You can view it here. There’s only basic functionality currently: you can select one of three datasets and a variable from it, and if the variable fits our parameters (fewer than 15 elements), the site will generate a bar plot. It may seem a bit underwhelming, but a lot of time, effort, and learning went into this, and we’re proud of the result. More importantly, we’ve set up the framework to develop and extend the functionality and types of graphs covered. It took us a lot of time to set up Django, Jenkins, etc., and now that that is all done, it will be much easier for later generations to develop.
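For the curious, here is a minimal sketch of the kind of check described above. It is an illustration under stated assumptions, not the prototype's actual code: it assumes the data arrives as CSV text, that "elements" means distinct values of the variable, and it uses only the Python standard library rather than Pandas. The names `bar_plot_counts` and `MAX_CATEGORIES` and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter

# Assumption: "fewer than 15 elements" means fewer than 15 distinct values.
MAX_CATEGORIES = 15


def bar_plot_counts(csv_text, variable):
    """Return {value: count} for a bar plot, or None if the variable has too many values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[variable] for row in reader)
    if len(counts) >= MAX_CATEGORIES:
        return None  # does not fit the parameters; another graph type would be needed
    return dict(counts)


# Hypothetical sample data, for illustration only.
sample = "party,age\nA,31\nB,45\nA,27\n"
print(bar_plot_counts(sample, "party"))  # {'A': 2, 'B': 1}
```

A variable with 15 or more distinct values returns None here, which is where a density plot or time series would come in.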

 

Regarding possible next steps, perhaps one of the most important features will be to connect with the new Dataverse API so you can actually search for datasets in the repository. Also important will be to write scripts to generate graphs for variables that don’t fit our parameters, such as density plots or time series. Functionality should extend to include maps and scatter plots. There’s also a lot to be done with the GUI and visual presentation. We still need to add axes to the graphs, and polish the UI to look sleek and appealing. And, of course, there are still some bugs to clean up.

 

Overall, our experience with DPSI was very rewarding! All of us got the experience we were looking for, whether it was with data visualizations, data processing, UI, or even just working in a team. We want to give special thanks to Phil Durbin and Raman Prasad from IQSS, who were a massive help in getting us to where we are, as well as our mentors Vito D’Orazio and Mercè Crosas. We couldn’t have done it without you!

Keep on Trucking

Hi everyone,

 

First, props to anyone who has stuck with our adventure thus far. We appreciate the readership!

 

This week, some of the mentors were out of town, so instead of a discussion meeting, we had a work session. We made good progress on understanding how our various parts fit together: the UI has to allow users to select variables, which tells the data manipulation module what to select, which is passed to the visualizing function, which passes the graph back to the UI… It requires a lot of coordination and planning up front, so that’s part of what we focused on this week.
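The hand-off between the three parts can be sketched roughly as follows. This is only an illustration of the flow, not our actual modules: the function names `select_variable`, `build_graph`, and `handle_selection`, the list-of-dicts dataset format, and the sample data are all hypothetical.

```python
def select_variable(dataset, variable):
    """Data manipulation step: pull one variable (column) out of a dataset."""
    return [row[variable] for row in dataset]


def build_graph(values):
    """Visualization step: turn raw values into sorted (label, count) bar data."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    return sorted(counts.items())


def handle_selection(dataset, variable):
    """UI step: take the user's choice and return the graph data to display."""
    values = select_variable(dataset, variable)
    return {"variable": variable, "bars": build_graph(values)}


# Hypothetical dataset, for illustration only.
dataset = [{"region": "north"}, {"region": "south"}, {"region": "north"}]
print(handle_selection(dataset, "region"))
# {'variable': 'region', 'bars': [('north', 2), ('south', 1)]}
```

Keeping each step behind its own function is what lets three people work on the UI, the data, and the visuals separately, as long as the inputs and outputs are agreed on in advance.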

 

What has been challenging is navigating the Django framework and understanding where the various functions that we write belong. Some of us are new to it, so we’re trying to grasp what’s going on when it loads the page and how we fit into it. Shouldn’t be too challenging, though!

 

With the final presentation coming in less than a month, all our efforts are focusing on cleaning up the UI and generating the visualization. It’s gonna be great!

 

Best,

Alex

eyeData – In the Trenches

What We Did

This week, we continued to work on the actual coding. Unfortunately, due to some tight schedules and external factors, we weren’t able to meet. We also have some design considerations to resolve before we can continue to move forward with the coding, described below.

What Went Well

Despite not being able to meet, what did go well was our communication outside of the meeting. Batsheva, Luis, and I continued our dialogue on language and technology choice. As this project will require a lot of collaboration between each of our three parts, it’s reassuring to know that progress can and will be made outside the meeting.

What Was Challenging

The main challenge this week was deciding whether to continue to use the d3.js library or shift to Vincent, a Python library that generates Vega charts, which are in turn rendered with d3.js. Since we are using Pandas, a Python library for data processing, it makes intuitive sense to use Vincent, but it is another package to learn, and getting familiar with it would take a nontrivial amount of time.

What’s Up Next

We continue to make progress with the code! By next week, we hope to be able to select a variable from a dataset and create univariate graphs for that variable. We also should move to using the Dataverse API to grab datasets and continue polishing the UI.

 

Until next week,

Alex

eyeData – Coding

What We Worked On

This week, we actually got started coding. Our friends at IQSS set us up with a server and website (http://eyedata.datascience.iq.harvard.edu/). We all set up our personal environments and are working off a git repo. We use Jenkins to integrate our changes nightly.

With our midterm presentation this week, we decided to split up the project so that each of us (Batsheva, Luis, and me) would develop one area, and we would integrate them for the presentation. Luis is working on the UI; Batsheva is processing the data; I am using d3.js to create visuals.

What Went Well

So far it has all been going quite well! We were able to lay out the architecture of the website clearly and divide up the work evenly. We are on track to have a very basic version this week.

What Was Challenging

Setting up the environment for Windows was challenging. I eventually just resorted to setting it up on a VM running Ubuntu.

For me personally, it has been challenging to navigate the various technologies that we are using. Most of what we are using, such as Django and d3.js, is new to me, so a lot of time is spent just learning the language and principles behind them. Luis and Batsheva have previous experience with their modules, but regardless it takes a nontrivial amount of time to work through. Once these fundamentals have been laid out, we’ll have a barebones version of the website that will use some datasets living in the git repo to generate some visuals!

What’s Up Next

Currently, we are all working on our individual parts in order to have something to present at the midterm presentations this Wednesday. Our goal is to have a basic mockup of the UI that we envision, a processed data set, and a visualization created from that dataset. If you’re in the area, feel free to stop by and check us out!

After this week, we’ll probably hook up our website with the Dataverse search API, polish the UI, and automate the creation of the d3.js graphs. Once we have the website’s functionality up and running, we’ll all transition to creating graphs for different types of data.

Until next week,

Alex

eyeData – Planning

Hello there, Alex from eyeData with an update on our progress.

 

Unfortunately, due to a miscommunication, our advisers weren’t in town and we were not able to meet with them.

 

However, we have still been hard at work. This week, we built our design doc, which Luis mentioned in our blog post last week. This design doc describes our overall goals and implementation strategies for our final product. The doc encompasses our discussion so far on UI and other aspects of the final product and was a good exercise in scoping out a project, defining milestones and deadlines, technical writing, etc. Furthermore, it was a good exercise in collaborating and dividing up work so that all of us can meaningfully contribute.

 

Without further ado, and since for the most part y’all don’t have access to our doc, here is a description of our vision of our basic product. We want to create a standalone website. Users will be able to search for datasets on Dataverse, using the Dataverse API, and then select whichever options they want. They will then be taken to a dashboard where they will be able to select which type of visualization they want. Our most basic goals are handling one dataset and basic graphs, such as bar graphs, scatter plots, time graphs, etc. But our broader vision is to allow users to choose multiple datasets and interact with their variables, as well as create more visually striking and meaningful graphs. Check out the GitHub page for d3.js and let us know your favorites!
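To give a flavor of the search step, here is a small sketch of how a dataset search request might be built. It is not from our design doc. It assumes a Dataverse installation that exposes a Search API at `/api/search` with `q`, `type`, and `per_page` query parameters (worth checking against the API guide for the installation you use). The host `demo.dataverse.org` and the helper name `dataset_search_url` are placeholders chosen for illustration.

```python
from urllib.parse import urlencode

# Assumption: placeholder host; swap in the Dataverse installation you actually use.
BASE_URL = "https://demo.dataverse.org"


def dataset_search_url(query, per_page=10):
    """Build a search URL that asks for datasets matching the query."""
    # Assumed parameters: q (search terms), type (restrict to datasets), per_page.
    params = {"q": query, "type": "dataset", "per_page": per_page}
    return f"{BASE_URL}/api/search?{urlencode(params)}"


print(dataset_search_url("election turnout"))
# https://demo.dataverse.org/api/search?q=election+turnout&type=dataset&per_page=10
```

The sketch only builds the URL and makes no network call; the site would then fetch it and list the results for the user to choose from.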

 

Next week, we will actually get our hands dirty coding and developing! We also need to discuss how the work will be split up. Our friend from last week, Raman, will help us set up a server and the necessary tools. We may need to work through some tutorials to familiarize ourselves with the languages we’re using.

 

Until next week!

Alex Wang (College ’17)

eyeData – Getting Started

Hi! My name is Alex and I’m a member of eyeData, a Digital Problem Solving Initiative project working on creating a data visualization tool to integrate with Dataverse, the largest collection of research data led by Mercè Crosas of the Institute for Quantitative Social Science and Vito D’Orazio. The rest of the team consists of four undergraduates at Harvard College and a more senior member assisting us all the way from the U.K.

To get started, we gathered to discuss the scope of the project and the direction that we wanted to take it in. It turns out that some rudimentary tools already exist. For example, Princeton University has a tool called Roper that can generate tables and graphs for the survey results it includes in its database. Similarly, Dataverse already has a feature called TwoRavens that leverages metadata attached to the data sets in order to create graphs with some user interactivity built in. Knowing that some of these features exist, we will take them into account when we meet again next week to discuss our task for the week.

Our task was for each individual to identify some interesting features or graphs that they would like to see implemented in the project. Since we are using d3.js, I’m sure many of us turned to the d3.js GitHub page for inspiration, although there are certainly many striking visualizations online to inspire us.

As this is only our first week and we haven’t dived into the programming, nothing has been challenging thus far. But we are all excited to dive into the code and learn! Many of us have prior experience with R, JavaScript, and d3.js, the main technologies we will be building with, but this project will require us to expand our knowledge of them, which I’m sure will be a challenging though rewarding experience.

Best,

Alex Wang