If it's November, it must be NaNoGenMo

Which means it's NaNoGenMo - National Novel Generation Month.

For the 3rd year in a row I'm participating, although this might be the first time I've actually gotten around to writing it up?

Python on Windows

I run Windows at home, because I've always worked in a Windows shop.
I don't love Windows, but it works.

Now, Python I don't generally do. I've done some Python for classes and for exploring other NaNoGenMo projects.
So I'm by no means an expert, or even fluent.

I was trying to get this repo to work: https://github.com/mewo2/vocab-mashup

Aaaaand, I had a hard time doing so.

First, I removed my installation of Strawberry Perl, because it had an older copy of gcc that was causing issues.
Then I installed MinGW via Chocolatey.

I also needed Microsoft Visual C++ Compiler for Python 2.7, which I discovered on Stack Overflow.

Standard installs of scipy, numpy and gensim didn't work, and I apparently needed to get Windows 64-bit binaries,
which I snagged from www.lfd.uci.edu/~gohlke/pythonlibs/
They were downloaded locally, and then installed via something like the following:

pip install D:\downloads\scipy-0.16.1-cp27-none-win_amd64.whl
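The filename of a wheel says which interpreter it was built for, which is why grabbing the right 64-bit binary matters. Here is a minimal sketch for checking that, assuming the usual wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). Both parse_wheel_name and interpreter_summary are hypothetical helpers of my own, not part of pip.

```python
import os
import struct
import sys


def parse_wheel_name(path):
    """Split a wheel filename into its tags (hypothetical helper, not a pip API)."""
    # Normalize Windows separators so this also works when run elsewhere.
    base = os.path.basename(path.replace("\\", "/"))
    if not base.endswith(".whl"):
        raise ValueError("not a wheel file: %s" % base)
    parts = base[:-len(".whl")].split("-")
    # 5 parts normally, 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel name: %s" % base)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def interpreter_summary():
    """Return a CPython-style tag and the pointer size of the running interpreter."""
    bits = struct.calcsize("P") * 8  # 32 or 64
    return "cp%d%d" % sys.version_info[:2], bits


if __name__ == "__main__":
    info = parse_wheel_name(r"D:\downloads\scipy-0.16.1-cp27-none-win_amd64.whl")
    print(info)
    print(interpreter_summary())
```

If the Python tag or the bitness printed for the interpreter doesn't line up with the tags in the filename, pip will most likely refuse the wheel.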

Now, I'm trying to figure out word2vec and the training data I need to run that application.
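As far as I can tell from the gensim docs, Word2Vec wants its training data as an iterable of tokenized sentences (lists of words), and it needs to be able to loop over that data more than once. Here is a rough sketch of preparing plain text files that way. The SentenceCorpus class, the one-sentence-per-line assumption, and the very crude tokenizer are all mine, not anything from the vocab-mashup repo.

```python
import io
import os
import re

# Very crude tokenizer: runs of lowercase letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


class SentenceCorpus(object):
    """Yield one list of tokens per non-empty line of every .txt file in a folder.

    Hypothetical helper. It is a class with __iter__ (not a one-shot generator)
    so that it can be iterated over repeatedly.
    """

    def __init__(self, dirname):
        self.dirname = dirname

    def __iter__(self):
        for fname in sorted(os.listdir(self.dirname)):
            if not fname.endswith(".txt"):
                continue
            path = os.path.join(self.dirname, fname)
            # Ignore undecodable bytes rather than crash on odd encodings.
            with io.open(path, encoding="utf-8", errors="ignore") as handle:
                for line in handle:
                    tokens = TOKEN_RE.findall(line.lower())
                    if tokens:
                        yield tokens
```

The resulting object could then be handed to gensim as the sentences to train on. Real sentence splitting would be better than treating each line as a sentence, but this is enough to get something running.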

word2vec link-dump

https://code.google.com/p/word2vec/
http://alexminnaar.com/word2vec-tutorial-part-i-the-skip-gram-model.html
https://radimrehurek.com/gensim/models/word2vec.html

Discussed in here

Project Gutenberg for fun and profit

I downloaded the entire DVD .iso from the torrent.
I used the Deluge client.
Mounted the iso using WinCDEmu.
Once installed, double-click on .iso to mount.
I used all standard defaults, and --poof-- there it was.
Of course, there's an HTML browser and all of the texts are zipped.

Here are some (unevaluated) notes on extracting everything from the zips.
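For what it's worth, the same job can probably be done from Python with just the standard library. This is a sketch I have not run against the actual DVD: extract_all_zips is a hypothetical helper, and it assumes the zips sit somewhere under the mounted drive and that the destination folder lives outside that tree.

```python
import os
import zipfile


def extract_all_zips(src_root, dest_root):
    """Walk src_root and extract every .zip into a mirrored folder under dest_root.

    Returns the list of zip paths that were extracted successfully.
    """
    extracted = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for fname in filenames:
            if not fname.lower().endswith(".zip"):
                continue
            zip_path = os.path.join(dirpath, fname)
            # Mirror the source folder layout under the destination.
            rel_dir = os.path.relpath(dirpath, src_root)
            target = os.path.normpath(os.path.join(dest_root, rel_dir))
            if not os.path.isdir(target):
                os.makedirs(target)
            try:
                with zipfile.ZipFile(zip_path) as archive:
                    archive.extractall(target)
            except zipfile.BadZipfile:
                # Skip corrupt archives instead of aborting the whole run.
                continue
            extracted.append(zip_path)
    return extracted
```

Usage would be something like extract_all_zips("E:\\", "D:\\gutenberg"), where both paths are made-up examples: the first is whatever drive letter the .iso was mounted as, the second is wherever there is room for the unzipped texts.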

TODO: more notes on Gutenberg parsing. There are a number of utilities I want to play with. I've only worked with manually downloaded files in the past. Which hasn't been bad.