Recently, Feb 2022

Feb 2022

I resurrected my blog. I took the idea for these regular posts from Tom MacWright. This is more interesting than posting content to different services like Letterboxd, Goodreads, Twitter, 500px, and so on.

I finished reading Surely You’re Joking, Mr. Feynman! I liked it and wrote about it.

I wanted to learn how to write better. Almost everyone in your life assumes that you know how to write, but nobody teaches you how. If you are not a native speaker of English, you learn to do the opposite of writing well. Your teachers say that you need to use fancy words, idioms, and collocations to make your writing sound better. This is not how I want to write. I want to write like Paul Graham. My favorite thing about his essays is that I can imagine him talking. So I started reading On Writing Well by William Zinsser, a popular book on writing. It focuses on the simple and precise use of English in non-fiction.

I started the introductory course on reinforcement learning by David Silver. It is ten weeks long. I might finish it sooner, but I prefer spending time on understanding the ideas better, so it might take longer. I also started reading Reinforcement Learning: An Introduction, a classic book on the subject that David’s course is partly based on.

Update 2022-05-23

Today, I implemented REINFORCE. It took me more than two months to understand how these 48 lines of code work. It's a policy-based algorithm, which means that it does not learn to predict future rewards and then choose the actions with the highest predicted rewards. Instead, REINFORCE learns the policy directly: given the current state, which action to take. It makes actions that led to high rewards more likely, moving toward an optimal policy. This policy is approximated using a neural network.
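To make the idea concrete, here is a minimal sketch of REINFORCE. It is not the 48-line implementation mentioned above. To stay dependency-free, it assumes a toy one-dimensional balancing task instead of the pole-balancing environment, and a small table of softmax logits instead of a neural network. All names (`run_episode`, `reinforce_update`, `train`, and the constants) are made up for this example.

```python
import math
import random

N_POSITIONS = 5       # assumed toy task: positions -2..2, stored as indices 0..4
MAX_STEPS = 30        # an episode is cut off after this many steps
GAMMA = 0.99          # discount factor
LEARNING_RATE = 0.1


def softmax(logits):
    """Turn a list of logits into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(logits, rng):
    """Sample one episode from the current policy.

    Action 0 moves left, action 1 moves right. The reward is 1 for every
    step that keeps the position within -2..2, and 0 for the step that leaves it.
    """
    position, trajectory = 0, []
    for _ in range(MAX_STEPS):
        state = position + 2
        probs = softmax(logits[state])
        action = 0 if rng.random() < probs[0] else 1
        position += -1 if action == 0 else 1
        alive = abs(position) <= 2
        trajectory.append((state, action, 1.0 if alive else 0.0))
        if not alive:
            break
    return trajectory


def discounted_returns(rewards):
    """Return G_t = r_t + GAMMA * r_{t+1} + ... for every step t."""
    g, returns = 0.0, []
    for reward in reversed(rewards):
        g = reward + GAMMA * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_update(logits, trajectory):
    """One REINFORCE step: logits += lr * sum_t G_t * grad log pi(a_t | s_t)."""
    returns = discounted_returns([r for _, _, r in trajectory])
    grads = [[0.0, 0.0] for _ in range(N_POSITIONS)]
    for (state, action, _), g in zip(trajectory, returns):
        probs = softmax(logits[state])
        for a in range(2):
            # For a softmax policy, d log pi(action) / d logit[a] = 1[a == action] - pi(a).
            grads[state][a] += g * ((1.0 if a == action else 0.0) - probs[a])
    for state in range(N_POSITIONS):
        for a in range(2):
            logits[state][a] += LEARNING_RATE * grads[state][a]


def average_return(logits, rng, episodes=200):
    """Average undiscounted return of the current policy."""
    total = 0.0
    for _ in range(episodes):
        total += sum(r for _, _, r in run_episode(logits, rng))
    return total / episodes


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    logits = [[0.0, 0.0] for _ in range(N_POSITIONS)]  # uniform random policy
    before = average_return(logits, rng)
    for _ in range(episodes):
        reinforce_update(logits, run_episode(logits, rng))
    after = average_return(logits, rng)
    return before, after, logits


if __name__ == "__main__":
    before, after, _ = train()
    print(f"average return before: {before:.1f}, after: {after:.1f}")
```

Note that nothing in the sketch estimates the value of a state or an action. The only learned quantity is the policy itself, and each update simply makes the actions that were followed by high returns more probable.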

The policy learns to balance the pole (starts at 20 sec)