Using Big Data for Social Science Research

Course Description: 

12 sessions concentrated in a 2-week period, April 10-21. Each session lasts 100 minutes.

The course is an introduction to state-of-the-art methods for using Big Data in social science research. It is a hands-on course that requires students to bring their own research problems and ideas for independent research. The course will review three main topics that make Big Data research unique:

1. New and emerging data sources such as social media or government administrative data;
2. Innovative data collection techniques such as web scraping; and
3. Data analysis techniques typical of Big Data analysis such as machine learning.

Big Data means that data is being created at ever greater speed and frequency, covering virtually the full spectrum of social life in ever greater detail. Moreover, much of this data is increasingly readily available, making real-time data analysis feasible.

During the course, students will become acquainted with different concepts, methodological approaches, and empirical results revolving around the use of Big Data in the social sciences. As this domain of knowledge is already vast and rapidly evolving, the course can only provide the basic literacy skills needed to understand Big Data and its novel uses. Students will be encouraged to use the acquired skills in their own research throughout the course and to continue engaging with new methods afterwards.

Learning Outcomes: 

Students will be acquainted with the basic concepts and methods of Big Data and their use in social science research. They will gain first-hand experience in applying such methods to real-life research problems. The acquired knowledge will enable students to use Big Data methods in their individual research on various topics in political science, economics, and sociology.


* Students are required to attend classes regularly, to familiarize themselves with each session’s reading list, and to participate actively in course discussions, in particular by providing constructive feedback on other students’ presentations.

* Students will pick a data source and research question at the beginning of the course, which they will work on regularly and report on to the class. The methods and approaches learnt in each session must be applied to the selected source and research question.

* Students will have to write individual final papers and submit the database and code they produced throughout the course. The final paper will be short, no longer than 3000 words, describing and critically assessing the data source, data collection method, and analytical tools used in light of the selected research question and relevant prior literature. Great emphasis will be placed on the submitted database and annotated code. Final student project delivery is due 2 weeks after the last session.

  • Attendance and classroom participation 15%
  • In-class presentations 40%
  • Student project & final paper 45%
  • Final papers will be due on the 30th of April (a week after teaching ends).

Elementary proficiency in quantitative methods and familiarity with statistical software, in particular R. Enrolment in an MA or PhD programme.