Letter #4 - In which we send a pull request with the three laws of robotics.
“You can only be young once. But you can always be immature.”
But don’t do it on purpose, otherwise it’s just weird.
You know I can’t change, I can’t change, I can’t change,
But I’m here in my mold, I am here in my mold.
And I’m a million different people from one day to the next
Back then, I don’t think I really appreciated how beautiful this song was, I just liked its sound. Such a pity about the whole Rolling Stones debacle, The Verve did deserve better than being left with zero royalties from the song. Here’s the original concert arrangement they sampled, if you wanna give it a listen.
The rule-of-thumb is that except in rare cases, domain-specific algorithms work faster and better than reinforcement learning. This isn’t a problem if you’re doing deep RL for deep RL’s sake, but I personally find it frustrating when I compare RL’s performance to, well, anything else. One reason I liked AlphaGo so much was because it was an unambiguous win for deep RL, and that doesn’t happen very often.
Reinforcement Learning is hard. Harder than Machine Learning. And it might not work for you. Or, if it does, it might trigger the robot apocalypse because you neglected to add an extra check to your reward function. Either way, you’re screwed. Might as well read this, too.
Not sure what you just read? Take a look at this post.
Enjoyed it? Subscribe to my tiny newsletter and get new issues straight to your inbox. Every week, on Tuesdays.
Yours truly, V.