Data-oriented Programming Paradigms

Submitted by webmaster on Fri, 10/22/2021 - 14:48
Course No: 
188995
Course Type: 
VU
Term: 
2021W
Weekly Hours: 
2.0
Lecturer: 
Gábor Recski
Wojciech Kusa
Allan Hanbury
Adam Kovacs
Language: 
English
Objective: 

This lecture covers the basic programming approaches in Data Science. The emphasis is on computational thinking: formulating problems and their solution spaces so that a computer can solve them. Methods for increasing the efficiency of the solutions are also presented. Use cases demonstrate the practical application of data science solutions.

Content: 

The following topics are covered in the lectures:

  • Introduction to Data-Oriented Programming Paradigms
  • Python
  • SciPy, NumPy, vectorisation, execution performance measurement
  • Data preparation, structuring, fusion with Pandas
  • Data Science solution approaches and case studies
  • Introduction to machine learning
  • Introduction to network analysis
Information: 

The link to the online lectures is on TUWEL.
 
Syllabus
All lectures take place on Tuesdays, 12:00 c.t. to 13:45.

  1. Kickoff session, data science process, community, solution examples, Python introduction, introduction to DOPP (5.10.2021)
  2. Data wrangling on the command line, Text stream processing (12.10.2021)
  3. SciPy, NumPy, vectorisation, visualisation, benchmarking (19.10.2021)
  4. Preprocessing, Pandas (9.11.2021)
  5. Data suitability, Data biases (16.11.2021)
  6. Intro to Machine Learning (23.11.2021)
  7. Network Analysis (30.11.2021)

Exercise-related sessions
Review meetings for exercise 3 (15 minutes for each group):

  • 14.12.2021, 9:00-16:00
  • 15.12.2021, 9:00-16:00

Project presentation: 25.1.2022, 9:00-18:00
 
The effort breakdown is:

  • Python tutorial: 4h
  • Lectures: 7 sessions @ 2h: 14h
  • Exercises:
      • EX1 (data wrangling): 8h
      • EX2 (pandas + sklearn): 12h
      • EX3 (project): 37h [includes review meeting (topic + questions + work plan)]
  • SUM: 75h

Notes: 
Examination: 

Three practical exercises. The third exercise requires a report, Jupyter Notebook, and presentation of the results.

Recommendation: