Recent Articles

Balancing the training dataset to a reported positive-negative class ratio, in the unseen dataset

A Python gist for balancing (re-sampling) a training dataset to match a reported positive-negative class ratio in the unseen dataset. When we know the unseen set’s pos-neg class ratio (or “guess” it from the leaderboard), we should try balancing the training dataset to reflect it. I wrote a Python gist for it, using […]
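
The gist itself is not reproduced in this excerpt, so the following is only a minimal sketch of the idea, using the standard library alone. It assumes binary labels (1 for positive, 0 for negative) and rebalances by downsampling whichever class is over-represented; the helper name `rebalance` and its parameters are hypothetical, and the original gist may well use a different approach or library.

```python
import random


def rebalance(samples, labels, target_pos_ratio, seed=0):
    """Downsample one class so positives make up `target_pos_ratio` of the result.

    Hypothetical helper for illustration: labels are assumed to be 1 (positive)
    or 0 (negative), and 0 < target_pos_ratio < 1.
    """
    if not 0 < target_pos_ratio < 1:
        raise ValueError("target_pos_ratio must be strictly between 0 and 1")

    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    # Number of negatives needed per positive to hit the target ratio.
    neg_per_pos = (1 - target_pos_ratio) / target_pos_ratio

    # Keep all of one class and downsample the other, whichever is feasible.
    if len(pos) * neg_per_pos <= len(neg):
        n_pos, n_neg = len(pos), round(len(pos) * neg_per_pos)
    else:
        n_pos, n_neg = round(len(neg) / neg_per_pos), len(neg)

    rng = random.Random(seed)  # fixed seed keeps the re-sampling reproducible
    keep = rng.sample(pos, n_pos) + rng.sample(neg, n_neg)
    rng.shuffle(keep)
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Usage: 10% positives in training, 20% reported for the unseen set.
labels = [1] * 100 + [0] * 900
samples = list(range(1000))
balanced_X, balanced_y = rebalance(samples, labels, target_pos_ratio=0.2)
```

Downsampling throws data away; when the training set is small, oversampling the minority class (sampling with replacement) is the usual alternative, at the cost of duplicated rows.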

Dolphin Community Detection; using Louvain and Edge-Betweenness-Centrality algorithms

I wanted to share my final data-science assignment from the “Coursera Illinois Data Visualization” course. Hi-res image. Live demo (see the “Result” tab). An undirected social network of frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand. The original data is from https://networkdata.ics.uci.edu/data.php?id=6. My goal was to improve the original network plot from UCI. You could […]

Dockerizing python with ssdeep dependency

The official Python Docker container does exactly what it should. It automatically copies the requirements.txt file and your current directory into /usr/src/app. It should then automatically pip install the dependencies from requirements.txt before running the actual Dockerfile commands. But sometimes Python libraries require additional system-level setup of their own (one might consider it a pip bug). For example, working with ssdeep, a […]

Installing Elastic 2.0 on a clean Ubuntu 14.04 (single node)

Tested on a clean Ubuntu 14.04 image:
#java 8 (includes some interactive prompts on your side)
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
#test it:
java -version
#should be: java version "1.8.XX"
#fixing apt-get, installation and post service config
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb http://packages.elastic.co/elasticsearch/2.x/debian […]

REST API security, by Stormpath

Excellent. Note how the author correctly explains the true meaning of the HTTP header name and status codes, which were poorly named ages ago (the “Authorization” header, the 401 code, the 403 code).
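
To make the naming confusion concrete, here is a small sketch of the common reading of these names: the “Authorization” header actually carries authentication credentials, 401 “Unauthorized” really signals an unauthenticated request, and 403 “Forbidden” is the real authorization failure. The token table, the permission table and the `check_access` function are all hypothetical, and the linked article may phrase the distinction differently.

```python
from http import HTTPStatus

# Hypothetical in-memory data, for illustration only.
TOKENS = {"secret-token-alice": "alice", "secret-token-bob": "bob"}
PERMISSIONS = {"alice": {"/reports"}, "bob": set()}


def check_access(headers, path):
    """Return the HTTP status a server would plausibly send for this request."""
    # Despite its name, the "Authorization" header carries *authentication*
    # credentials: it tells the server who the caller claims to be.
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    user = TOKENS.get(token) if scheme == "Bearer" else None

    if user is None:
        # 401 is named "Unauthorized" but really means "unauthenticated".
        return HTTPStatus.UNAUTHORIZED
    if path not in PERMISSIONS[user]:
        # 403 "Forbidden" is the actual authorization failure:
        # the caller is known but not allowed to access the resource.
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK


anonymous = check_access({}, "/reports")
bob = check_access({"Authorization": "Bearer secret-token-bob"}, "/reports")
alice = check_access({"Authorization": "Bearer secret-token-alice"}, "/reports")
```

Here `anonymous` is 401 (no credentials), `bob` is 403 (known caller, no permission) and `alice` is 200.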

Python design patterns #5: observer (aka: pub-sub)

The observer is probably the GoF pattern with the most impact on networking. It is a messaging pattern in itself. It’s quite simple, though. The subscribers, interested in a certain topic, subscribe to a publisher for updates regarding that topic. The publisher, triggered by our topic-aware system, then publishes the topic updates to the right subscribers. Use case Every […]
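
The subscribe-and-publish flow described above can be sketched in a few lines. This is a minimal illustration rather than the article’s own implementation: the `Publisher` class, its method names and the callback signature are assumptions made for the example.

```python
from collections import defaultdict


class Publisher:
    """Keeps a list of subscriber callbacks per topic (hypothetical sketch)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        # A subscriber registers its interest in a single topic.
        self._subscribers[topic].append(callback)

    def unsubscribe(self, topic, callback):
        # Raises ValueError if the callback was never subscribed to the topic.
        self._subscribers[topic].remove(callback)

    def publish(self, topic, update):
        # Notify only the subscribers of this topic. Iterating over a copy
        # lets a callback unsubscribe itself while being notified.
        for callback in list(self._subscribers.get(topic, ())):
            callback(topic, update)


# Usage: two subscribers on different topics.
received = []
publisher = Publisher()


def weather_fan(topic, update):
    received.append(("weather_fan", topic, update))


def sports_fan(topic, update):
    received.append(("sports_fan", topic, update))


publisher.subscribe("weather", weather_fan)
publisher.subscribe("sports", sports_fan)
publisher.publish("weather", "sunny")
publisher.publish("sports", "2:1")
publisher.unsubscribe("sports", sports_fan)
publisher.publish("sports", "3:1")  # nobody is listening any more
```

Plain callables are used as subscribers to keep the sketch short; a class-based version would give each subscriber an `update` method instead.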