If it's November, it must be NaNoGenMo
Which means it's NaNoGenMo: National Novel Generation Month.
For the 3rd year in a row I'm participating, although this might be the first time I've actually gotten around to writing it up?
Python on Windows
I run Windows at home, because I've always worked in a Windows shop.
I don't love Windows, but it works.
Now, Python I don't generally do. I've done some Python for classes and exploring other NaNoGenMo projects.
So, by no means an expert, much less even fluent.
I was trying to get this repo to work: https://github.com/mewo2/vocab-mashup
Aaaaand, I had a hard time doing so.
First, I removed my installation of Strawberry Perl, because it had an older copy of gcc that was causing issues.
Then I installed mingw via Chocolatey.
I also needed the Microsoft Visual C++ Compiler for Python 2.7, which I discovered via Stack Overflow.
Standard installs of scipy, numpy and gensim didn't work, and I apparently needed to get Windows 64-bit binaries.
Which I snagged from www.lfd.uci.edu/~gohlke/pythonlibs/
They were downloaded locally, and then installed via something like the following:
pip install D:\downloads\scipy-0.16.1-cp27-none-win_amd64.whl
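A quick sanity check after installing the wheels is to try importing each package and print its version. This is only a minimal sketch: `report_versions` is a made-up helper name, and it assumes each package exposes a `__version__` attribute, which not every module does.

```python
import importlib


def report_versions(names):
    """Map each module name to its version string, or None if it won't import."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Missing package, or a binary that failed to load.
            versions[name] = None
            continue
        # Not every module defines __version__, so fall back to "unknown".
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    found = report_versions(["numpy", "scipy", "gensim"])
    for name, version in sorted(found.items()):
        print("%-8s %s" % (name, version or "NOT INSTALLED"))
```

If any of the three comes back as NOT INSTALLED, the wheel for that package probably didn't match the interpreter (for example a 32-bit Python with a win_amd64 wheel).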
Now, I'm trying to figure out word2vec and the training data I need to run that application.
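On the training-data side, gensim's Word2Vec takes an iterable of tokenized sentences (each sentence a list of words). Here is a rough sketch of one way to build that from a folder of plain-text files. The class name `SentenceCorpus`, the regex tokenizer, and the choice to treat every line as a "sentence" are all assumptions made for illustration, not anything the vocab-mashup repo prescribes.

```python
import io
import os
import re

# Crude tokenizer: runs of lowercase letters and apostrophes.
WORD_RE = re.compile(r"[a-z']+")


class SentenceCorpus(object):
    """Iterate over the .txt files in a directory, yielding one token list per line.

    Using a class with __iter__ (rather than a one-shot generator) lets a
    trainer make several passes over the same data.
    """

    def __init__(self, dirname):
        self.dirname = dirname

    def __iter__(self):
        for fname in sorted(os.listdir(self.dirname)):
            if not fname.endswith(".txt"):
                continue
            path = os.path.join(self.dirname, fname)
            with io.open(path, encoding="utf-8", errors="ignore") as handle:
                for line in handle:
                    tokens = WORD_RE.findall(line.lower())
                    if tokens:  # skip blank or punctuation-only lines
                        yield tokens
```

An object like this could then be passed as the sentences argument to gensim's Word2Vec. The other constructor parameters vary between gensim versions, so check the docs for whichever one is installed.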
word2vec link-dump
https://code.google.com/p/word2vec/
http://alexminnaar.com/word2vec-tutorial-part-i-the-skip-gram-model.html
https://radimrehurek.com/gensim/models/word2vec.html
Discussed in here
Project Gutenberg for fun and profit
I downloaded the entire DVD .iso from the torrent.
I used the Deluge client.
Mounted the iso using WinCDEmu.
Once WinCDEmu is installed, double-click on the .iso to mount it.
I used all standard defaults, and --poof-- there it was.
Of course, there's an HTML browser and all of the texts are zipped.
Here are some (unevaluated) notes on extracting everything from the zips.
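For what it's worth, bulk extraction can also be sketched in a few lines of Python with the standard zipfile module. This is equally unevaluated against the real DVD: the function name `extract_all_zips` is hypothetical, and the sketch assumes the texts are ordinary .zip archives scattered through nested folders.

```python
import os
import zipfile


def extract_all_zips(source_root, dest_root):
    """Walk source_root and unpack every .zip into a matching folder under dest_root.

    Returns the number of archives extracted. Unreadable archives are
    skipped rather than aborting the whole run.
    """
    extracted = 0
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for fname in filenames:
            if not fname.lower().endswith(".zip"):
                continue
            zip_path = os.path.join(dirpath, fname)
            # Mirror the source layout so same-named files don't collide.
            rel_dir = os.path.relpath(dirpath, source_root)
            target = os.path.join(dest_root, rel_dir, os.path.splitext(fname)[0])
            try:
                with zipfile.ZipFile(zip_path) as archive:
                    archive.extractall(target)
            except zipfile.BadZipfile:
                print("skipping unreadable archive: %s" % zip_path)
                continue
            extracted += 1
    return extracted
```

Usage would look something like extract_all_zips("E:\\", "D:\\gutenberg"), where both paths are placeholders for wherever the iso got mounted and wherever there is disk space to spare.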
TODO: more notes on Gutenberg parsing. There are a number of utilities I want to play with. I've only worked with manually downloaded files in the past. Which hasn't been bad.