I resurrected my blog. I took the idea for these regular posts from Tom MacWright. I find this more interesting than scattering content across different services like Letterboxd, Goodreads, Twitter, 500px, and so on.
I finished reading Surely You’re Joking, Mr. Feynman! I liked it and wrote about it.
I wanted to learn how to write better. Almost everyone in your life assumes that you know how to write, but nobody teaches you how. If you are not a native speaker of English, you are taught the opposite. Your teachers say that you need to use fancy words, idioms, and collocations to make your writing sound better. This is not how I want to write. I want to write like Paul Graham. My favorite thing about his essays is that I can imagine him talking. So I started reading On Writing Well by William Zinsser, a popular book on writing. It focuses on using English simply and precisely when writing non-fiction.
I started the introductory course on reinforcement learning by David Silver. It is ten weeks long. I might finish it sooner, but I prefer to spend time understanding the ideas properly, so it might take longer. I also started reading Reinforcement Learning: An Introduction, a classic book on the subject. David’s course is partly based on it.
Today, I implemented REINFORCE. It took me more than two months to understand how these 48 lines of code work. It's a policy-based algorithm, which means that it does not try to predict future rewards and then choose the actions that would produce the highest ones. Instead, REINFORCE learns the policy directly: a rule for choosing an action given the current state. This policy is approximated by a neural net.
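To make the idea concrete, here is a minimal sketch of REINFORCE. It is not the 48 lines mentioned above. To keep it dependency-free, it swaps the neural net for a plain table of action preferences passed through a softmax, and it runs on a made-up four-cell corridor where the agent starts in cell 0 and gets a reward of 1 for reaching cell 3. The environment, the function names, and the hyperparameters are all assumptions chosen for illustration.

```python
import math
import random

N_STATES = 4          # corridor cells 0..3; cell 3 is the goal (toy setup)
ACTIONS = (0, 1)      # 0 = step left, 1 = step right
GAMMA = 0.9           # discount factor (illustrative value)


def softmax(prefs):
    """Turn a list of action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, action):
    """Move in the corridor; the reward is 1.0 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def run_episode(theta, rng, max_steps=50):
    """Sample one episode by following the current stochastic policy."""
    state, trajectory = 0, []
    for _ in range(max_steps):
        probs = softmax(theta[state])
        action = rng.choices(ACTIONS, weights=probs)[0]
        nxt, reward, done = step(state, action)
        trajectory.append((state, action, reward))
        state = nxt
        if done:
            break
    return trajectory


def discounted_returns(rewards, gamma):
    """G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ..."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return returns[::-1]


def train(episodes=3000, alpha=0.1, seed=0):
    """REINFORCE with a tabular softmax policy (stand-in for a neural net)."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # preferences per state
    for _ in range(episodes):
        trajectory = run_episode(theta, rng)
        returns = discounted_returns([r for _, _, r in trajectory], GAMMA)
        for (state, action, _), g in zip(trajectory, returns):
            probs = softmax(theta[state])
            for a in ACTIONS:
                # Gradient of log pi(action | state) w.r.t. theta[state][a]
                grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
                # Push up actions that were followed by a high return.
                theta[state][a] += alpha * g * grad_log_pi
    return theta


if __name__ == "__main__":
    theta = train()
    for s in range(N_STATES - 1):
        print(s, [round(p, 3) for p in softmax(theta[s])])
```

Nothing here estimates the value of a state or an action. The only learned quantity is the policy itself, and each update makes the actions that were followed by a high return more likely. With a neural net, the table lookup would become a forward pass and the hand-written gradient would come from backpropagation.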