Showing posts from 2015
Anomaly Detection
In this post I'd like to share references and articles that I came across while learning anomaly detection techniques: blogs, papers, patents, Wikipedia entries, etc.

Popular Anomaly Detection Techniques:

Density-based techniques (k-nearest neighbor, local outlier factor, and many more variations of this concept):
- Knorr, E. M.; Ng, R. T.; Tucakov, V. (2000). "Distance-based outliers: Algorithms and applications". The VLDB Journal 8 (3–4): 237. doi:10.1007/s007780050006.
- Ramaswamy, S.; Rastogi, R.; Shim, K. (2000). "Efficient algorithms for mining outliers from large data sets". Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD '00). p. 427. doi:10.1145/342009.335437. ISBN 1581132174.
- Angiulli, F.; Pizzuti, C. (2002). "Fast Outlier Detection in High Dimensional Spaces". Principles of Data Mining and Knowledge Discovery. Lecture Notes in Computer Sci...
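To make the distance-based family above concrete, here is a minimal brute-force Java sketch of the idea behind Ramaswamy et al.'s scoring: each point is scored by the distance to its k-th nearest neighbor. The class name and the O(n^2) loop are mine for illustration; this is not the optimized algorithm from the cited papers.

import java.util.Arrays;

public class KnnOutlierSketch {

    // Score each point by the distance to its k-th nearest neighbor
    // (brute force, O(n^2); fine for small illustrative data sets).
    static double[] knnScores(double[][] points, int k) {
        int n = points.length;
        double[] scores = new double[n];
        for (int i = 0; i < n; i++) {
            double[] dists = new double[n - 1];
            int idx = 0;
            for (int j = 0; j < n; j++) {
                if (j == i) continue;
                dists[idx++] = euclidean(points[i], points[j]);
            }
            Arrays.sort(dists);
            scores[i] = dists[k - 1]; // k-th nearest neighbor distance
        }
        return scores;
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Three points form a tight cluster; the fourth sits far away
        double[][] data = { {1.0, 1.0}, {1.1, 0.9}, {0.9, 1.2}, {10.0, 10.0} };
        System.out.println(Arrays.toString(knnScores(data, 2)));
        // The last score is by far the largest, flagging (10, 10) as the outlier
    }
}

Points with the largest scores sit far from their neighbors and are the candidate outliers.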
Creating a Service for Starting/Stopping Tomcat on Unix
Create the init script in /etc/init.d/tomcat7 with the contents below (your script should work too, but I think this one adheres more closely to the standards). This way Tomcat will start only after the network interfaces have been configured.

Init script contents:

#!/bin/bash
### BEGIN INIT INFO
# Provides:          tomcat7
# Required-Start:    $network
# Required-Stop:     $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start/Stop Tomcat server
### END INIT INFO

PATH=/sbin:/bin:/usr/sbin:/usr/bin

start() {
    sh /usr/share/tomcat7/bin/startup.sh
}

stop() {
    sh /usr/share/tomcat7/bin/shutdown.sh
}

case $1 in
    start|stop) $1;;
    restart) stop; start;;
    *) echo "Run as $0 <start|stop|restart>"; exit 1;;
esac

Change its permissions and add the correct symlinks automatically:

chmod 755 /etc/init.d/tomcat7
update-rc.d tomcat7 defaults

And from now on it will be automatically started and shut down upon entering the appropriate r...
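Once registered, the script can be driven like any other init script, matching the start|stop|restart usage it prints; for example (assuming a Debian-style init setup):

sudo service tomcat7 start
sudo service tomcat7 restart
sudo /etc/init.d/tomcat7 stop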
Change Java Version on a Debian Machine
1. Install OpenJDK 1.7:

$ sudo apt-get install openjdk-7-jdk openjdk-7-jre

2. Update the Java alternative path:

$ sudo update-alternatives --config java
There are 2 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                             Priority   Status
------------------------------------------------------------
* 0            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061       auto mode
  1            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061       manual mode
  2            /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java   1051       manual mode

Press enter to keep the current choice[*], or type selection number: 2
updat...
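To confirm the switch took effect (my addition; the exact output varies with the installed package version):

$ java -version

It should now report a 1.7 runtime.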
Spark Codes
Clustering

import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

val data = sc.textFile("data.csv")
val parsedData = data.map(s => Vectors.dense(s.split(',').map(_.toDouble)))

val numClusters = 6
val numIterations = 300
val clusters = KMeans.train(parsedData, numClusters, numIterations)

// Within Set Sum of Squared Errors: lower means tighter clusters;
// comparing it across several values of numClusters (elbow method) helps pick k
val WSSSE = clusters.computeCost(parsedData)
println("Within Set Sum of Squared Errors = " + WSSSE)

// Assign each vector to its nearest cluster and persist the labels
// (saveAsTextFile requires an output path; "labels" here is a placeholder)
val labeledVectors = clusters.predict(parsedData)
labeledVectors.saveAsTextFile("labels")

val centers = clusters.clusterCenters

===================================================================

scala> textFile.count() // Number of items in this RDD
res0: Long = 126

scala> textFile.first() // First item in this RDD
res1: String = # Apache Spark

scala> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
linesWithSpark: spark.RDD[String] = spark.FilteredRDD@7dd4af09

scala> te...
Raspberry Pi
Static IP for Wifi

sudo vi /etc/network/interfaces

Add a wifi string ID (different from the wifi SSID; see the wpa_supplicant example after the listing):

auto lo
iface lo inet loopback

iface eth0 inet dhcp

allow-hotplug wlan0
iface wlan0 inet manual
wpa-roam /etc/wpa_supplicant/wpa_supplicant.conf
iface default inet dhcp

iface home_static inet static
    address 192.168.0.53
    netmask 255.255.255.0
    gateway 192.168.0.1
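The "wifi string ID" above refers, as far as I can tell, to the id_str field in /etc/wpa_supplicant/wpa_supplicant.conf. A hypothetical network block (the SSID and passphrase are placeholders) might look like:

network={
    ssid="YourNetworkSSID"
    psk="YourPassphrase"
    id_str="home_static"
}

Because id_str matches the home_static stanza in /etc/network/interfaces, wpa-roam brings wlan0 up with the static address when it joins that network.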
Perl
Part 1
Recap of Data Structures
Complex Data Structures
Anon CDS - Hashes

Part 2
Regular Expressions
Subroutines & Files

Part 3
Modules
CPAN/DBI/CGI/N-w

-------------------------------------------------------

Perl identifies the DS by the prefix symbol:
Scalar - $ - a value
List   - ()
Array  - @
Hashes - %

Define a lexical scalar   - my $a;
Default value of a scalar - undef
How to check for undef    - if (defined($a)) { }

Get input from keyboard:
$a = <STDIN>;   # \n stops
@arr = <STDIN>; # EOF stops

Output to console:
print STDOUT "Hello";
print "Hello";

Errors - print S...
Unix and Hadoop
SCP to a separate PORT
scp -P 5050 asd.tar.gz user@192.168.1.15:/home/user

Tomcat Webapps Location
/usr/share/tomcat/webapps

Get IP address
ifconfig eth0 | awk '/inet /{print $2}'

Get Tomcat logs
tail -f /usr/share/apache-tomcat-7.0.30/logs/catalina.out
tail -f -n 10

Running Hadoop Jobs
set mapred.job.queue.name=dev
hadoop jar acs.jar -D mapreduce.reduce.speculative=false -D mapreduce.map.speculative=false -D mapred.job.queue.name=dev -D mapreduce.task.io.sort.factor=256 -D file.pattern=.*20110110.* -D mapred.reduce.slowstart.completed.maps=0.9 -D mapred.reduce.tasks=10 /Input /Output
hadoop jar test.jar -D mapred.job.queue.name=dev -D mapred.textoutputformat.separator=,

Glassfish Webapps folder location
/usr/share/glassfish4/glassfish/domains/domain1/applications

Killing Mysql Processes
ps -ef | grep mysql | awk -F" " '{system("kill -9 "$2)}'

Starting MySQL
/etc/init.d/mysqld start

Mo...
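One caveat with the kill pipeline above (my note, not from the original post): the grep usually matches its own process as well, so awk tries to kill a PID that is already gone. A bracketed pattern or pkill sidesteps that:

ps -ef | grep '[m]ysql' | awk '{system("kill -9 "$2)}'
pkill -9 mysqld    # assuming procps is available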
Handling CSV
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class ReadCVS {

    public static void main(String[] args) {
        ReadCVS obj = new ReadCVS();
        obj.run();
    }

    public void run() {
        String csvFile = "DailyData.csv";
        BufferedReader br = null;
        String line = "";
        String cvsSplitBy = ",";

        try {
            br = new BufferedReader(new FileReader(csvFile));
            while ((line = br.readLine()) != null) {
                // use comma as separator
                String[] splits = line.split(cvsSplitBy);
                System.out.println(splits[4] + splits[5]);
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // close the reader even if parsing failed
            if (br != null) {
                try {
                    br.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
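One caveat worth adding (my note, not from the original post): a plain split(",") breaks quoted fields that contain commas, so it is only safe for simple CSV files. A quick illustration:

public class QuotedCsvPitfall {
    public static void main(String[] args) {
        // A quoted field containing a comma gets split into two tokens
        String line = "\"New York, NY\",10";
        String[] parts = line.split(",");
        System.out.println(parts.length); // prints 3, not the expected 2
    }
}

For data like this, a dedicated CSV parser is the safer choice.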
Hadoop Part 1: Hello World
Hadoop Hello World: The Word Count Code:

The word count code is the simplest program to get you started with the Map Reduce framework. The task that a wordcount program performs is as follows: given several text files, find the number of times each word appears in the entire set.

It primarily consists of 3 parts:

Driver: the driver portion of the code contains the configuration details for the Hadoop job, for example the input path, the output path, the number of reducers, the mapper class name, the reducer class name, etc.

Mapper: the role of the mapper in word count is to emit <word, 1> for each word appearing in the document.

Reducer: the role of the reducer in word count is to sum the list of 1's prepared by the shuffle and sort phase, <word, [1,1,1,1,1,1]>, and emit <word, 6>.

It's easier to create an Eclipse Java project and add the relevant Hadoop jar files for the code below.

package com.kush;

import java.io.IOException;
import java.uti...
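Since the listing above is cut off, here is a minimal sketch of the three parts just described, written against the standard org.apache.hadoop.mapreduce API; the class names and argument handling are illustrative and the original post's code may differ in detail.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountSketch {

    // Mapper: emit <word, 1> for each word in the input split
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sum the 1's grouped per word by the shuffle and sort phase
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // Driver: job configuration (classes, input path, output path)
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountSketch.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Run it with the input and output directories as arguments, mirroring the hadoop jar invocations in the Unix and Hadoop post above, e.g. hadoop jar wordcount.jar WordCountSketch /Input /Output.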