I recently gave a tutorial on getting started with word embeddings in Python to a digital humanities group. The tutorial covers material from chapters 15 (vector semantics) and 16 (semantics with dense vectors) of Speech and Language Processing. The data set is ~30,000 Supreme Court opinions provided by CourtListener. The repository ships with a small sample of the data and instructions for getting more from CourtListener.

You can find the tutorial, instructions, and additional resources at: https://github.com/idc9/word_embed_tutorial