Two amazing programs are now available on GitHub, and both are significant developments in making open-source machine learning accessible to the "normal person".
This code implements multi-layer Recurrent Neural Networks (RNN, LSTM, and GRU) for training and sampling from character-level language models. The model learns to predict the probability of the next character in a sequence. In other words, the input is a single text file, and the model learns to generate text like it.
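To make the idea of next-character prediction concrete, here is a minimal sketch in plain Python. It uses a simple bigram count model rather than the LSTM the repo actually trains, and the tiny corpus is made up purely for illustration; the point is only to show what "probability of the next character" means.

```python
from collections import Counter, defaultdict

# Toy corpus; char-rnn would instead train an LSTM on a large text file.
text = "hello world, hello there"

# Count how often each character follows each preceding character.
follow = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    follow[prev][nxt] += 1

def next_char_probs(prev):
    """Probability distribution over the next character, given the previous one."""
    counts = follow[prev]
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

probs = next_char_probs("l")
```

Sampling repeatedly from such a distribution generates new text in the style of the input; char-rnn does the same thing, but conditions on a learned hidden state instead of just the previous character.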
The context of this code base is described in detail in my blog post. The project page has a few pointers to some datasets.
If you are new to Torch/Lua/Neural Nets, it might be helpful to know that this code is really just a slightly more fancy version of this 100-line gist that I wrote in Python/numpy. The code in this repo additionally allows for multiple layers, uses an LSTM instead of an RNN, has more supporting code for model checkpointing, and is of course much more efficient.
This code was originally based on Oxford University's Machine Learning class Practical 6, which is in turn based on the Learning to Execute code from Wojciech Zaremba. Chunks of it were also developed in collaboration with my labmate Justin Johnson.
ConceptNet aims to give computers access to common-sense knowledge, the kind of information that ordinary people know but usually leave unstated.
This Python package contains a toolset for loading new datasets into ConceptNet 5, and it serves the HTML and JSON Web APIs for it. You don't need it to simply access ConceptNet 5; see http://conceptnet5.media.mit.edu for more information.
If you're interested in using ConceptNet, please join the conceptnet-users Google group: http://groups.google.com/group/conceptnet-users?hl=en
Further documentation is available on the Wiki: https://github.com/commonsense/conceptnet5/wiki
ConceptNet is a multilingual knowledge base, representing words and phrases that people use and the common-sense relationships between them. The knowledge in ConceptNet is collected from a variety of resources, including crowd-sourced resources (such as Wiktionary and Open Mind Common Sense), games with a purpose (such as Verbosity and nadya.jp), and expert-created resources (such as WordNet and JMDict).
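The description above can be pictured as a graph of edges, each linking a start concept to an end concept via a relation. The sketch below is a hypothetical miniature, not ConceptNet's actual data model or API; the relation names and concepts are illustrative only.

```python
# Hypothetical miniature of ConceptNet-style knowledge: each edge is a
# (relation, start concept, end concept) triple. Names are made up for
# illustration and do not come from the real ConceptNet data.
edges = [
    ("IsA", "dog", "animal"),
    ("IsA", "cat", "animal"),
    ("UsedFor", "umbrella", "staying dry"),
    ("PartOf", "wheel", "car"),
]

def related(concept):
    """Return (relation, other concept) pairs for edges touching a concept."""
    out = []
    for rel, start, end in edges:
        if start == concept:
            out.append((rel, end))
        elif end == concept:
            out.append((rel, start))
    return out
```

Querying `related("animal")` in this toy graph would surface both `dog` and `cat`, which is the kind of common-sense association ConceptNet makes available to programs at much larger scale and across many languages.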