
Semester 2: One Semester Away from Graduating

Semester 2 is finished at last! Last semester, my GPA was 4.8/5 (although after transferring the basic courses that I had already taken in my bachelor's degree, it dropped quite significantly), and this semester I did a bit better: 4.92/5.

This semester, I took 6 courses: Data Mining, Parallel Numerical Method, Intelligent Information System, Methodology and Ethical Aspects of Research, Master Thesis 1, and Physical Education (Swimming).

Data Mining

This course is probably the most famous course among Erasmus students. In this course, I learned about Frequent Itemsets, Association Rules, Sequential Patterns, Classification (Naive Bayes, Lazy Classification with Contrast Patterns), Clustering, and Functional Dependencies. The course consists of lectures, labs, and a project. I had no problems at all with this course since I had learned most of the topics before (I used Apriori in my Bachelor's thesis to find topics in a set of papers), and I am very familiar with classification and clustering (I took AI and Machine Learning in my Bachelor's and Pattern Recognition last semester). So most things were obvious to me: I got 99% in the written exam and a 5 for the project.
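For readers who have not met frequent itemsets before, here is a minimal Apriori sketch in Python. It is only an illustration of the idea, not the code from the course or from my thesis; the basket data and the `min_support` threshold are made up for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset that appears
    in at least `min_support` transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical toy data, purely for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
result = apriori(baskets, min_support=3)
```

Association rules are then derived from these frequent itemsets by comparing the support of an itemset with the support of its subsets.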

The lecturer of this course is Prof. Marzena Kryszkiewicz. She is very detailed and organized in explaining the material, from giving an example of the algorithm (using simple dummy data) to proving the theorem. I basically didn't need to do any extra work to understand every topic in the course. From my experience, I think female lecturers tend to be more detailed and organized. That's probably why we need more women in STEM (Science, Technology, Engineering, and Mathematics). For the project, we were allowed to choose the lecturer/instructor and the project we wanted to undertake from a list given by the lecturer. I chose to implement Logistic Regression. Since I am also familiar with it, it didn't take much time to finish it completely, with unit tests and a demonstration in Jupyter Notebook (I'd recommend this to any Python programmer working on Machine Learning).
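To give an idea of how small the core of such a project is, here is a minimal logistic regression sketch trained with batch gradient descent. This is not my actual submission; the function names, the learning rate, the number of epochs, and the toy data are all assumptions made for this illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one sample: p(y=1|x) - y.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Label a sample 1 when its predicted probability is at least 0.5."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]


# Hypothetical one-feature data: small values are class 0, large values class 1.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
labels = predict([[0.0], [5.0]], w, b)
```

A real project would add regularization, a convergence check, and the unit tests mentioned above; the sketch only shows the gradient update itself.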

For the lab, we used R. I think probably half of the class failed this course because of the lab. Yes, the instructor is quite strict about what he wants, but he always emphasized that we need to explore the data (not just use the libraries), which most students probably didn't do.

Parallel Numerical Method

This course, as expected, is an extension of the numerical methods course. We were taught how to run algorithms in parallel. The course covers methods for: triangularization (a lower or upper triangular matrix is useful in many cases later on), matrix inversion (fast direct inverse, fast iterative inverse, divide and conquer), linear systems (odd-even reduction, the conjugate gradient method, and the classical iterative methods: Jacobi, Gauss-Seidel, and their overrelaxation variants), the Lagrange dual problem (including solving the primal, the dual, or one knowing the other's solution), linear programming (simplex, two-phase simplex), Dantzig-Wolfe decomposition, Benders decomposition, and finally Branch and Bound for linear programming with integer solutions. The course has 2 parts: lectures and a project.
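Of the methods listed above, Jacobi iteration is the easiest to see as a parallel algorithm: every component of the new iterate depends only on the previous iterate, so all components can be computed independently. Below is a minimal sequential sketch in Python (not the course's code). The matrix, the right-hand side, and the fixed iteration count are made up for the example, and the matrix is assumed to be strictly diagonally dominant so that the iteration converges.

```python
def jacobi(A, b, iterations=50):
    """Approximate the solution of A x = b with Jacobi iteration.

    Assumes A is square with non-zero diagonal entries; convergence is
    guaranteed when A is strictly diagonally dominant.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # Each x_new[i] reads only the previous iterate x, so these n
        # updates are independent and could be distributed across workers.
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x


# Hypothetical test system whose exact solution is x = [1, 2, 3].
A = [
    [4.0, 1.0, 0.0],
    [1.0, 4.0, 1.0],
    [0.0, 1.0, 4.0],
]
b = [6.0, 12.0, 14.0]
solution = jacobi(A, b)
```

Gauss-Seidel differs in that it uses the freshly updated components right away, which usually converges faster but makes the updates depend on one another.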

What I like about this course is that I managed to learn a new skill. I think I am able to solve any linear programming problem by hand now. Since in class the lecturer (Dr. Andrzej Stachurski) mostly explained only the concepts, I spent 2 hours/week practicing. Math is a skill that needs to be exercised. Without exercise, it's like learning how to swim just by watching YouTube videos. There were 2 projects and 1 written exam. I managed to get 50/50 for the projects and 48/50 for the exam. The projects were quite easy since they were explained nicely in the lectures. I easily got a 5 in this course.

Intelligent Information System

In this course, we were taught by the director of research at the Institute of CS, Prof. Mieczysław Muraszkiewicz. The course consists of lectures and a project. The lectures, in general, discuss the definition of Intelligence, Information Systems, and Knowledge Representations (along with some specific representations: Logics, Non-Standard Logics, Semantic Networks, Semantic Atoms, Frames and Scripts, and Genetic Algorithms). I know some of them in a more technical sense (I took Logic of Informatics in my Bachelor's and Evolutionary Algorithm last semester), so it was not really hard to follow these topics, especially when the discussion was in a more general sense.

For the project, we could choose to implement something proposed by the lecturer, propose our own project, or even just write an essay. Since I am exploring some methods of image classification for my master's thesis, I submitted an implementation of the Bag of Words concept for image classification. It turned out quite well (around 78% accuracy on Kaggle's Dogs vs. Cats data).

There is a written exam and an oral exam (based on the written test). I got 97/100 on the written exam, and in the oral exam Prof. Muraszkiewicz didn't ask me anything, since he said that I had got an almost full score on the exam. So I only had to explain the project to him and got a 5 right away. At the end of the oral exam, he asked me if I already had plans after graduation, which quite surprised me.

One side benefit that I got from this course is that I found the 'right' way to study for exams that need memorization. I have always been bad at subjects like history or social sciences since they need memorization. In the discussion of frames and scripts, it was mentioned that people have trouble recalling a story exactly and make changes so that it fits into their own schemata, even when the story is repeated many times. That observation inspired researchers to invent the concept of frames and scripts (and later the concept of OOP). What I did to prepare for 2 later exams was to list all the topics as if they were 'slots' in frames and try to recall what each topic is about in my own words. It worked wonders for the 2nd test of Methodology and Ethical Aspects of Research and for this course's exam. It also helped me see all the topics from a bird's-eye view and served as a checklist of whether I understood them all.

Methodology and Ethical Aspects of Research

Every Wednesday I woke up with some excitement: I would have this class (even though it starts at 8am) and later in the evening a swimming class. We started with discussions around the philosophy of science, later moved on to research methodology and the ethical issues that may come along with it, and ended with a discussion of Intellectual Property.

Do you watch The Big Bang Theory and remember the infinite persistence gyroscope that Howard, Sheldon, and Leonard invented?

The TBBT Gyroscope

They had dilemmas from the beginning: trying to patent it and finding out that the majority of the shares would be owned by the university (see the video here), being contacted by the U.S. Air Force (see the video here), and in the end having the U.S. Air Force take over the project completely (see the video here). This course discusses all the ethical issues and dilemmas that you will encounter as a researcher, similar to the case shown in The Big Bang Theory. The lecturer is Prof. Roman Morawski. Even though he is an expert in metrology, he has a lot of experience with this topic, which will keep you engaged for the whole class session. At the very end of the course, when he emailed the students that he had closed his activities regarding this course, I replied honestly that it had been worth attending every lecture. I just thought that he should know that the course is great :)

In this course, there are 2 exams and class tutorials. I didn't get a very good result on the first exam (19/35; the class average before the retake was only 16). But on the second test I got 42/45 and I ended up with a 4.5 (no one got a 5). I was so happy with this that I didn't consider retaking the first exam. I told my friend, with tongue in cheek, that retakes are only for losers (yes, I've never taken any retakes) :) That's a joke of course; I was simply happy with that grade, and it took me quite a lot of preparation to get 42 on the second exam. Some questions ask about ethical issues or dilemmas, and you need to have great arguments (not just good ones) to get good points. You need to research and prepare your arguments so that in the exam you just need to recall them. The exam is only 1 hour, so coming up with the arguments during the exam won't cut it, since you will only have time to write the answers.

The class tutorial is the part where students lead the discussion on a particular topic. My classmate and I presented the Trolley Problem. I got 20/20 for this part. My final score was 81 (exactly the lower boundary for a 4.5 :)). But, as I mentioned earlier, the lecturer is a metrology expert, so he added 4 points to the final score as the uncertainty of his evaluation. I would still have needed 6 more points to get a 5, though.

Master Thesis 1

For the thesis, I am supervised by Rajmund Kożuszek. We started the exploration with probabilistic topic models (using Latent Dirichlet Allocation). In the process, I became curious about what would happen if we directly classified the bag of words model built from the image descriptors. Then we explored this approach further: using different descriptor algorithms (SIFT, KAZE, AKAZE), the significance of the vocabulary size, and some ideas that have not been implemented yet, e.g. in the case of many-class classification it might be worth using TF-IDF (instead of just TF) for the quantization of the descriptors. We also explored Convolutional Neural Networks, some network models (VGG16, VGG19, ResNet), and using ImageNet pretrained models.
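The TF-IDF idea is not implemented in the thesis yet, so the following is only a sketch of what the weighting could look like, not a result. It assumes each image has already been turned into a list of visual-word ids (that is, its descriptors have been quantized against the vocabulary), and it uses the plain `log(N / df)` form of IDF; the function name and the toy data are hypothetical.

```python
import math
from collections import Counter


def tf_idf_histograms(images_as_words, vocab_size):
    """Turn lists of visual-word ids (one list per image) into
    TF-IDF weighted histograms of length `vocab_size`."""
    n_images = len(images_as_words)
    # Document frequency: number of images containing each visual word.
    df = Counter()
    for words in images_as_words:
        df.update(set(words))
    histograms = []
    for words in images_as_words:
        tf = Counter(words)
        total = len(words)
        hist = [0.0] * vocab_size
        for word, count in tf.items():
            # Words that occur in every image get idf = log(1) = 0,
            # so they no longer influence the classifier.
            idf = math.log(n_images / df[word])
            hist[word] = (count / total) * idf
        histograms.append(hist)
    return histograms


# Hypothetical quantized images over a 3-word vocabulary.
images = [[0, 0, 1], [0, 2], [0, 1, 1, 1]]
hists = tf_idf_histograms(images, vocab_size=3)
```

With plain TF, visual word 0 would dominate the first histogram; with TF-IDF it is zeroed out because it appears in every image, which is the effect that might help when there are many classes.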

I'd say that we made some interesting findings along the way. For example, we can easily distinguish between dogs and cats using an ImageNet pretrained model, since there are quite a lot of cat and dog breeds among the ImageNet 1K categories (so we can easily get 98% accuracy with just a little training effort). Classifying gender (using the CodaLab Smile and Gender Classification dataset) with this approach also works quite well (around 81% accuracy). With some training, what the network does is try to associate gender with those 1K categories (e.g. if the person wears a leather jacket, sunglasses, and a hat, which are all categories in ImageNet 1K, this person will more likely be classified as a man). But with, for example, smile classification (whether the person in the image is smiling or not), this approach won't get you very far (though the result is still decent, around 70% accuracy), since it is much harder to associate the 1K ImageNet categories with these classes (smiling or not).

A Convolutional Neural Network is nice for getting things done. But I don't see anything that I could improve (or at least attempt to modify) with this approach. So I'll probably explore the Bag of Words concept more and pull my hair out over this topic next semester.

Physical Education (Swimming)

There are many options to choose from if you take this course. But most people take gym because there's no test and you can go to the gym, sign in, and go home again :) At first I wanted to register for Badminton. But it seems that they are only looking for professional athletes. When I registered, they asked me (in Polish): "Are you good?". I was thinking, well, I am average in Indonesia, but judging by how Poles play (I played every Friday here), I think I am quite good. So I just said "Yes". They were still not sure and asked me: "Are you an athlete?" I said "No", and then they just said "Oh sorry, it's only for athletes". Very unfortunate; they should really have watched me play first! :)

I've run a half marathon and done some century rides (100 km of cycling). The next thing to tackle is quite obvious: swimming. I couldn't swim until a few years ago, when I saw the video The first 20 hours - how to learn anything. Go watch that video and ask yourself what pops into your mind first that you want to learn. For me, it was swimming. So after watching that video, I would watch YouTube videos about swimming and practice breaststroke in the morning before going to work. And yes, I can swim this style very well now, and I learned it in less than 20 hours. But breaststroke is the slowest and most tiring of all the strokes. It would be crazy to do a triathlon and swim in this style. Your legs will be tired and you still have to ride and run. That's why I registered for swimming.

One session is only 45 minutes, but it's tiring. The instructor asks you to do various drills and swim backstroke, front crawl, and breaststroke. We didn't really do much butterfly, though. I really love these swimming sessions. My front crawl is getting better and faster now (my breaststroke used to be much faster). I learned that kicking contributes only a little to propulsion in front crawl (in contrast to breaststroke), so I am getting myself used to kicking less now. In addition to this swimming class, I added one more session per week. I will stick to this (2 sessions/week) and won't skip any session for at least one year, and then see if I am ready to do a triathlon after that :)

The Aftermath

I've passed all the courses this semester, which leaves me with Master's Thesis 2 and 3 and one humanities course (English Culture) next semester. I hope I can do well next semester. Come to think of it, it is much, much easier to get good grades here at PW (Politechnika Warszawska) than at ITB (Institut Teknologi Bandung). There are many reasons for that, so perhaps I should write another post comparing the two systems. I think it's fair to compare the two since each is the best technical university in its own country.

Prof. Muraszkiewicz's question reminded me to think about what I want to do after graduation. Of course there's a desire to continue to a PhD. The main reason for this is that I really want to explore machine learning further and become a machine learning engineer. On the other hand, I need to think about my own financial situation. I need to work here and there on top of the scholarship that I have right now. Let's just see, perhaps something interesting will pop up :)
