Useful links and tools.


Data Mining and Machine Learning


Andrew Moore’s statistics and data mining tutorials

Gaussian Processes for Machine Learning (book)

Caltech Online Course: Learning from Data

Web Services and standalone applications: DAME, Weka, Orange, VOStat

Books:

“Data Mining: Concepts and Techniques”, J. Han & M. Kamber, Morgan Kaufmann

“Neural Networks for Pattern Recognition”, C.M. Bishop, Oxford University Press

“Introduction to Information Retrieval”, Christopher D. Manning, Prabhakar Raghavan & Hinrich Schütze, 2008, Cambridge University Press


Programming Languages and DM Libraries

R: Official Manuals, Advanced R programming, The R Inferno, RStudio

MATLAB: Official Documentation, SOM Toolbox, Bayes Net Toolbox, Netlab

Other libraries: FANN, LibSVM


Software Architectures and Information Retrieval

Apache Tika (Content Detection and Analysis)

Joshua Decoder (Statistical Machine Translation)

Apache OODT (Big Data Processing)


Visualization Resources


Tools: D3.js, plot.ly, TOPCAT, Processing


Modern Visualization Thinking:

https://vimeo.com/channels/544709

http://ieeevis.org/

http://infosthetics.com/

http://www.edwardtufte.com/tufte/


Historical Visualization Thinking:

http://en.wikipedia.org/wiki/%C3%89tienne-Jules_Marey

http://en.wikipedia.org/wiki/Charles_Joseph_Minard