Posts

Showing posts from 2015

Choosing Statistical Model


Anomaly Detection

In this post I'd like to share references and articles I came across while learning anomaly detection techniques: blogs, papers, patents, Wikipedia pages, etc.

Popular anomaly detection techniques: density-based techniques (k-nearest neighbor, local outlier factor, and many more variations of this concept).

Knorr, E. M.; Ng, R. T.; Tucakov, V. (2000). "Distance-based outliers: Algorithms and applications". The VLDB Journal 8 (3–4): 237. doi:10.1007/s007780050006.
Ramaswamy, S.; Rastogi, R.; Shim, K. (2000). "Efficient algorithms for mining outliers from large data sets". Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD '00). p. 427. doi:10.1145/342009.335437. ISBN 1581132174.
Angiulli, F.; Pizzuti, C. (2002). "Fast Outlier Detection in High Dimensional Spaces". Principles of Data Mining and Knowledge Discovery. Lecture Notes in Computer Sci…
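As a companion to the distance-based references above, here is a minimal, self-contained sketch of the k-th-nearest-neighbour distance idea (in the spirit of Ramaswamy et al.): score each point by the distance to its k-th nearest neighbour and treat the highest-scoring points as outliers. The class name, the brute-force O(n^2) search, and the sample data are illustrative assumptions, not code from the original post.

import java.util.Arrays;

// Sketch of distance-based outlier scoring: the larger a point's
// distance to its k-th nearest neighbour, the more anomalous it is.
public class KnnOutlierSketch {

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Distance from points[idx] to its k-th nearest neighbour (brute force).
    static double kthNeighbourDistance(double[][] points, int idx, int k) {
        double[] dists = new double[points.length - 1];
        int j = 0;
        for (int i = 0; i < points.length; i++) {
            if (i != idx) {
                dists[j++] = euclidean(points[idx], points[i]);
            }
        }
        Arrays.sort(dists);
        return dists[Math.min(k, dists.length) - 1];
    }

    public static void main(String[] args) {
        // Illustrative data: a tight cluster plus one obvious outlier.
        double[][] points = {
            {1.0, 1.0}, {1.1, 0.9}, {0.9, 1.2}, {1.2, 1.1},
            {5.0, 5.0}
        };
        int k = 2;
        for (int i = 0; i < points.length; i++) {
            double score = kthNeighbourDistance(points, i, k);
            System.out.printf("point %d -> k-NN distance %.3f%n", i, score);
        }
        // Points with the largest k-NN distance are flagged as outliers.
    }
}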

18 mistakes that kill Startups


Creating Service for starting/stopping Tomcat Unix

Create the init script in /etc/init.d/tomcat7 with the contents below (your script should work too, but I think this one adheres more closely to the standards). This way Tomcat will start only after the network interfaces have been configured.

Init script contents:

#!/bin/bash
### BEGIN INIT INFO
# Provides:          tomcat7
# Required-Start:    $network
# Required-Stop:     $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start/Stop Tomcat server
### END INIT INFO

PATH=/sbin:/bin:/usr/sbin:/usr/bin

start() {
  sh /usr/share/tomcat7/bin/startup.sh
}

stop() {
  sh /usr/share/tomcat7/bin/shutdown.sh
}

case $1 in
  start|stop) $1 ;;
  restart) stop; start ;;
  *) echo "Run as $0 <start|stop|restart>"; exit 1 ;;
esac

Change its permissions and add the correct symlinks automatically:

chmod 755 /etc/init.d/tomcat7
update-rc.d tomcat7 defaults

And from now on it will be started and shut down automatically upon entering the appropriate runlevels.

Change Java Version on a Debian Machine

1. Install OpenJDK 1.7:

$ sudo apt-get install openjdk-7-jdk openjdk-7-jre

2. Update the Java alternatives:

$ sudo update-alternatives --config java
There are 2 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                             Priority   Status
------------------------------------------------------------
* 0            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      auto mode
  1            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      manual mode
  2            /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java   1051      manual mode

Press enter to keep the current choice[*], or type selection number: 2
update-alternatives: using /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java to provide /usr/bin/java (java) in manual mode

$ java -version
java version "1.7.0_55"
OpenJDK Runtime Environment (IcedTea 2.4.7) (7u55-2.4.7-1~deb7u1)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

Spark Codes

Clustering

import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

val data = sc.textFile("data.csv")
val parsedData = data.map(s => Vectors.dense(s.split(',').map(_.toDouble)))

val numClusters = 6
val numIterations = 300
val clusters = KMeans.train(parsedData, numClusters, numIterations)

// Evaluate the clustering by computing the Within Set Sum of Squared Errors
val WSSSE = clusters.computeCost(parsedData)
println("Within Set Sum of Squared Errors = " + WSSSE)

// Assign each point to a cluster and save the labels
val labeledVectors = clusters.predict(parsedData)
labeledVectors.saveAsTextFile("labels")   // saveAsTextFile needs an output path

val centers = clusters.clusterCenters

===================================================================

scala> textFile.count() // Number of items in this RDD
res0: Long = 126

scala> textFile.first() // First item in this RDD
res1: String = # Apache Spark

scala> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
linesWithSpark: spark.RDD[String] = spark.FilteredRDD@7dd4af09

scala> te…

Building Search Engine like Google

Interesting Reads: http://www.rose-hulman.edu/~bryan/googleFinalVersionFixed.pdf
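The linked paper covers the linear algebra behind Google's PageRank. As a rough companion, here is a minimal power-iteration sketch over a tiny hard-coded link graph; the graph, damping factor, and iteration count are illustrative assumptions, not taken from the paper.

// Minimal PageRank power-iteration sketch over a tiny, hard-coded link graph.
public class PageRankSketch {

    public static void main(String[] args) {
        // adjacency: links[i] lists the pages that page i links to
        int[][] links = { {1, 2}, {2}, {0}, {0, 2} };
        int n = links.length;
        double damping = 0.85;
        double[] rank = new double[n];
        java.util.Arrays.fill(rank, 1.0 / n);

        for (int iter = 0; iter < 50; iter++) {
            double[] next = new double[n];
            // teleportation term: (1 - d) / n for every page
            java.util.Arrays.fill(next, (1.0 - damping) / n);
            for (int i = 0; i < n; i++) {
                double share = damping * rank[i] / links[i].length;
                for (int target : links[i]) {
                    next[target] += share;   // page i passes rank to the pages it links to
                }
            }
            rank = next;
        }
        for (int i = 0; i < n; i++) {
            System.out.printf("page %d: %.4f%n", i, rank[i]);
        }
    }
}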

Machine Intelligence / Learning Landscape


Raspberry Pi

Static IP for WiFi

sudo vi /etc/network/interfaces

Add a network ID string (this is different from the WiFi SSID):

auto lo
iface lo inet loopback

iface eth0 inet dhcp

allow-hotplug wlan0
iface wlan0 inet manual
wpa-roam /etc/wpa_supplicant/wpa_supplicant.conf
iface default inet dhcp

iface home_static inet static
    address 192.168.0.53
    netmask 255.255.255.0
    gateway 192.168.0.1

Perl

Part 1: Recap of data structures, complex data structures, anonymous CDS - hashes
Part 2: Regular expressions, subroutines & files
Part 3: Modules, CPAN/DBI/CGI/N-w
-------------------------------------------------------
Perl identifies the data structure by its prefix symbol:

Scalar - $  - a value
List   - ()
Array  - @
Hashes - %

define a lexical scalar    -  my $a;
default value of a scalar  -  undef
check for undef            -  if (defined($a)) { }
get input from keyboard    -  $a = <STDIN>;    # \n stops
                              @arr = <STDIN>;  # EOF stops
output to console          -  print STDOUT "Hello";
                              print "Hello";
errors                     -  print STDERR "message";
uppercase of scalar        -  $name = uc($name);
lowercase of scalar        -  $name = lc($name);
reverse of scalar          -  $name = reverse($name);
length of scalar           -  $len = length($name);
part of a string           -  $res = substr($str, index, len);
other fns                  -  index($str…

Unix and Hadoop

SCP to a separate port:
scp -P 5050 asd.tar.gz user@192.168.1.15:/home/user

Tomcat webapps location:
/usr/share/tomcat/webapps

Get IP address:
ifconfig eth0 | awk '/inet /{print $2}'

Get Tomcat logs:
tail -f /usr/share/apache-tomcat-7.0.30/logs/catalina.out
tail -f -n 10

Running Hadoop jobs:
set mapred.job.queue.name=dev
hadoop jar acs.jar -D mapreduce.reduce.speculative=false -D mapreduce.map.speculative=false -D mapred.job.queue.name=dev -D mapreduce.task.io.sort.factor=256 -D file.pattern=.*20110110.* -D mapred.reduce.slowstart.completed.maps=0.9 -D mapred.reduce.tasks=10 /Input /Output
hadoop jar test.jar -D mapred.job.queue.name=dev -D mapred.textoutputformat.separator=,

Glassfish webapps folder location:
/usr/share/glassfish4/glassfish/domains/domain1/applications

Killing MySQL processes:
ps -ef | grep mysql | awk -F" " '{system("kill -9 "$2)}'

Starting MySQL:
/etc/init.d/mysqld start

Mo…

Free Dashboards

http://webdesign.tutsplus.com/tutorials/build-a-dynamic-dashboard-with-chartjs--webdesign-14363
http://keen.github.io/dashboards/examples/
http://usebootstrap.com/theme/sb-admin

Handling CSV

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class ReadCVS {

  public static void main(String[] args) {
    ReadCVS obj = new ReadCVS();
    obj.run();
  }

  public void run() {
    String csvFile = "DailyData.csv";
    BufferedReader br = null;
    String line = "";
    String cvsSplitBy = ",";
    try {
      br = new BufferedReader(new FileReader(csvFile));
      while ((line = br.readLine()) != null) {
        // use comma as separator
        String[] splits = line.split(cvsSplitBy);
        System.out.println(splits[4] + splits[5]);
      }
    } catch (FileNotFoundException e) {
      e.printStackTrace();
    } catch (IOException e) {
      e.printStackTrace();
    } finally {
      if (br != null) {
        try {
          br.close();
        } catch (IOException e) {
          e.printStackTrace();
        }
      }
    }
    System.out.println("Done");
  }
}

Hadoop Part 1: Hello World

Hadoop Hello World: The Word Count Code

The word count code is the simplest program to get you started with the MapReduce framework. The task a word count program performs is as follows: given several text files, count the number of times each word appears in the entire set.

It primarily consists of 3 parts:

Driver: the driver portion of the code contains the configuration details for the Hadoop job, for example the input path, the output path, the number of reducers, the mapper class name, the reducer class name, etc.
Mapper: the role of the mapper in word count is to emit <word, 1> for each word appearing in the document.
Reducer: the role of the reducer in word count is to sum the list of 1's prepared by the shuffle and sort phase, <word, [1,1,1,1,1,1]>, and emit <word, 6>.

It's easier to create an Eclipse Java project and add the relevant Hadoop jar files for the code below.

package com.kush; import java.io.IOException; import java.util.*; import org.apache.had…
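Since the preview cuts off the original listing, here is a minimal word count sketch along the lines described above, using the standard org.apache.hadoop.mapreduce API; the class names and the command-line input/output arguments are illustrative and may differ from the post's original com.kush code.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emit <word, 1> for every word in the input line
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sum the 1's grouped per word and emit <word, count>
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: job configuration; input and output paths come from the command line
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}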