Tuesday, April 5, 2016

Ubuntu - VMware Player: Ubuntu guest keeps going back to the login screen on login


Scenario: Any Ubuntu VM keeps going back to the login screen even after valid credentials are entered.

Add the line

mks.gl.allowBlacklistedDrivers = "TRUE"
to the VM's VMX file.
http://askubuntu.com/questions/443474/how-can-i-enable-3d-acceleration-for-ubuntu-as-a-vmware-guest?lq=1
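
If you would rather not edit the file by hand, something like the following should do it. This is only a sketch: the function name and the example path are made up for illustration, and it is best run with the VM powered off.

```shell
#!/bin/sh
# Append the setting to a .vmx file unless that exact line is already present.
add_vmx_setting() {
    vmx_file=$1
    setting='mks.gl.allowBlacklistedDrivers = "TRUE"'
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq "$setting" "$vmx_file"; then
        printf '%s\n' "$setting" >> "$vmx_file"
    fi
}

# Example (hypothetical path):
# add_vmx_setting "$HOME/vmware/Ubuntu/Ubuntu.vmx"
```

Running it twice is harmless, since the line is only appended when it is missing.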

Friday, April 1, 2016

Image PDF to text PDF using OCR

Use case: I have a multi-page PDF that was compiled by scanning a physical document. I want to read it on my Kindle and make notes.

The problem: Since the PDF is made up of images, the Kindle renders the pages as images, so the text on the pages is not selectable.

Solution:
Use OCR.

Ref:
http://ubuntuforums.org/showthread.php?t=880471

Steps:

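  1. Run pdftoppm to render the pages as ppm images. At 600 dpi each ppm is about 100MB, so ideally the script should process one page at a time and delete the ppm afterwards.
  2. Convert ppm to tif: tesseract accepts tif
  3. Use tesseract for OCR: generates a txt per page
  4. Append all txt to the output file
  5. Create a pdf out of the txt (not covered by the script below).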


Script:
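#!/bin/sh
# Usage: pass the scanned PDF as the argument.
mkdir tmp
cp "$@" tmp
cd tmp || exit 1
# tmp holds only the copied PDF, so * expands to it.
# Renders pages 1 to 10 at 600 dpi as ocrbook-*.ppm
pdftoppm -f 1 -l 10 -r 600 * ocrbook
for i in *.ppm; do
    base=`basename "$i" .ppm`
    convert "$i" "$base.tif"
    # tesseract appends .txt to the output base itself
    tesseract "$base.tif" "$base" -l eng
    cat "$base.txt" >> pdf-ocr-output.txt
    echo "[pagebreak]" >> pdf-ocr-output.txt
done
mv pdf-ocr-output.txt ..
rm *
cd ..
rmdir tmp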



Sunday, August 31, 2014

android studio The project is using an unsupported version of the Android Gradle plug-in 0.10

Seen in Android Studio with projects generated using libGDX.


vim build.gradle
Locate the line
buildscript {...
    classpath 'com.android.tools.build:gradle:0.10+'
}

Change it to

classpath 'com.android.tools.build:gradle:0.12+'


That's it.
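
The same edit can be scripted with sed if you have several generated projects to fix. A rough sketch, assuming the classpath line looks exactly like the one above (the function name is mine, not part of any tool):

```shell
#!/bin/sh
# Bump the Android Gradle plug-in version in the given build.gradle.
# sed keeps a copy of the original as build.gradle.bak.
bump_gradle_plugin() {
    gradle_file=$1
    sed -i.bak 's/com\.android\.tools\.build:gradle:0\.10+/com.android.tools.build:gradle:0.12+/' "$gradle_file"
}

# Example:
# bump_gradle_plugin build.gradle
```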

Tuesday, August 12, 2014

Oracle JDK : how to create deb

http://www.santasoft.it/en/blog/20120910_flash-install-oracle-java-7-jdk-debian-way :

sudo apt-get install java-package

Download JDK tar.gz from Oracle's website:
http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html

fakeroot make-jpkg jdk-7u60-linux-x64.tar.gz

A deb file is created in the same directory. Install it using
sudo dpkg -i oracle-j2sdk1.7_1.7.0+update60_amd64.deb

Then pick the default java with
sudo update-alternatives --config java

Other references:
https://wiki.debian.org/JavaPackage

oracle-jdk-7 403 forbidden or The request or reply is too large


While trying to install Cloudera Manager I kept getting 403 Forbidden when it tried to install oracle-jdk-7.
When I tried downloading the file manually I kept getting "The request or reply is too large."

1. One potential solution:
http://askubuntu.com/questions/453367/download-failed-oracle-jdk-7-is-not-installed

After getting error 403, use ls to see what the installer left in its cache:
ls /var/cache/oracle-jdk7-installer/

Download the JDK tar.gz from somewhere else on the web (google the archive filename) and compare its md5 checksum against the one given on Oracle's website.
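
The checksum comparison can be done with md5sum. A small sketch (verify_md5 is just a helper name I picked, and the expected checksum is whatever Oracle's page lists for your archive):

```shell
#!/bin/sh
# Succeeds only if the file's MD5 matches the expected hex digest.
verify_md5() {
    file=$1
    # md5sum prints lowercase hex, so normalise the expected value too
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Example (replace the placeholder with the checksum from Oracle's site):
# verify_md5 ~/Downloads/jdk-7u55-linux-x64.tar.gz "<md5 from oracle>" && echo "checksum OK"
```

Only copy the archive into the installer cache if the check succeeds.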

sudo cp ~/Downloads/jdk-7u55-linux-x64.tar.gz /var/cache/oracle-jdk7-installer/

sudo apt-get install oracle-java7-installer

2. Found out why it was happening:
The local network went through a squid proxy that had a 100MB limit, and the JDK archive is 130+ MB.
Moved to another network and it started installing :P

Friday, June 20, 2014

MySQL select filter by partition - partition by date

MySQL 5.6 supports the PARTITION clause in the query itself.
MySQL 5.5 relies on something called partition pruning.

>show create table mytable
`id` int(11) NOT NULL AUTO_INCREMENT,
`created_at` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
...

PARTITION BY LIST (DAYOFMONTH(created_at))
(PARTITION day1 VALUES IN (1) ENGINE = InnoDB,
 PARTITION day2 VALUES IN (2) ENGINE = InnoDB,
...
PARTITION day31 VALUES IN (31) ENGINE = InnoDB)


explain partitions select count(*) from mytable where created_at = '2014-06-19'
explain partitions select count(*) from mytable where created_at between '2014-06-19' and '2014-06-19'
Uses partition day19
Returns only those rows that have created_at exactly equal to '2014-06-19 00:00:00'

explain partitions select count(*) from mytable where date(created_at) = '2014-06-19'
Correctly returns all rows of 19th June
But uses all partitions day1 to day31. Apparently this happens if you use any function on the LHS of the where clause.


explain partitions select count(*) from mytable where created_at >= '2014-05-13' and created_at <= '2014-05-14';
Uses all partitions. Fails to prune partitions. Returns all rows of 13th May.

Obviously I am screwed. I am relying on indices for any performance improvement.
Apparently designing partitions on date has this drawback when your actual cell value is more granular than your partition definition. Here my actual values are timestamps accurate to seconds while my partition is defined on day of month (1-31).

The partitioning was set up for faster purging of data by truncating day-wise partitions. However, getting any performance improvement out of it on select queries seems impossible on 5.5.
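
On 5.6 the PARTITION clause mentioned at the top should get around this by naming the partition in the query. I have not run this against the table above, so treat it as a sketch: it assumes MySQL 5.6+, GNU date, and the day1..day31 partition names from the table definition. The function name is made up.

```shell
#!/bin/sh
# Print a count query for one calendar day that names its partition explicitly.
day_count_query() {
    day=$1                                        # e.g. 2014-06-19
    dom=$(date -u -d "$day" +%d | sed 's/^0//')   # day of month, no leading zero
    next=$(date -u -d "$day +1 day" +%F)          # following calendar day
    # Half-open range: covers the whole day down to the second. The WHERE is
    # still needed because the partition also holds that day of other months.
    printf "SELECT COUNT(*) FROM mytable PARTITION (day%s) WHERE created_at >= '%s' AND created_at < '%s';\n" "$dom" "$day" "$next"
}

# Example:
# day_count_query 2014-06-19 | mysql mydb    # mydb is a placeholder
```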

Thursday, June 13, 2013

linux screen set session title and window after creating a session

http://stackoverflow.com/questions/3202111/set-a-name-for-screens-with-the-screen-command

Session title:
"Ctrl+A" -> ":"
type in "sessionname ABCD"
Visible when you do screen -ls

Window title:
"Ctrl+A" -> "Shift+A"
type in a title
Visible when you do "Ctrl+A" -> "Shift+'" (the double quote key)
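
Both can also be set from outside the session with screen -X, which sends a command to a running session. A sketch with helper names of my own choosing:

```shell
#!/bin/sh
# Rename a running session: $1 = current session name, $2 = new name.
rename_screen_session() {
    screen -S "$1" -X sessionname "$2"
}

# Set a window title: $1 = session name, $2 = window number, $3 = title.
set_screen_window_title() {
    screen -S "$1" -p "$2" -X title "$3"
}

# Example:
# rename_screen_session 1234.pts-0.myhost ABCD
# set_screen_window_title ABCD 0 build
```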