Archive for the ‘Linux’ Category

Detecting a video’s dimensions using PHP and FFmpeg

Posted on March 16th, 2009 in Linux, PHP | 3 Comments »

Here is a method that I pulled out of one of my video conversion classes.

All you need to do is set the location of the ffmpeg binary ($this->ffmpeg).

It will then parse ffmpeg’s output and return the video dimensions. More video methods to follow :)


/**
* Get the dimensions of a video file
*
* @param string $video path to the video file
* @return array|false array(width,height), or false if nothing could be parsed
* @author Jamie Scott
*/
function __get_video_dimensions($video = false) {

// nothing to do if the file is missing
if (! $video || ! file_exists ( $video )) {
return false;
}

// ffmpeg writes the stream info to stderr, so redirect it to stdout;
// escapeshellarg() protects against spaces and shell characters in the path
$command = $this->ffmpeg . ' -i ' . escapeshellarg ( $video ) . ' -vstats 2>&1';
$output = shell_exec ( $command );

// look for three or four digits, an "x", then three or four digits (e.g. 640x480)
if (preg_match ( '/([0-9]{3,4})x([0-9]{3,4})/', (string) $output, $regs )) {
return array ('width' => $regs [1], 'height' => $regs [2] );
}

return false;

}
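
The matching step can be tried without ffmpeg installed. The sketch below is a standalone variant of the parsing part of the method above. The function name parse_video_dimensions() and the sample line are made up for illustration, and real ffmpeg output differs between versions, so treat the pattern as a starting point rather than a guarantee.

```php
<?php
/**
 * Pull the first WIDTHxHEIGHT pair out of ffmpeg's text output.
 * Hypothetical helper, not part of the original class.
 *
 * @param string $output text captured from "ffmpeg -i file 2>&1"
 * @return array|false array('width' => int, 'height' => int) or false
 */
function parse_video_dimensions($output) {
    // three or four digits on either side of the "x"; the \b anchors are an
    // addition so that a longer run of digits is not matched partway through
    if (preg_match('/\b([0-9]{3,4})x([0-9]{3,4})\b/', (string) $output, $m)) {
        return array('width' => (int) $m[1], 'height' => (int) $m[2]);
    }
    return false;
}

// Illustrative input only; not copied from a real ffmpeg run.
$sample = 'Stream #0.0: Video: mpeg4, yuv420p, 640x480, 25.00 tb(r)';
$dims = parse_video_dimensions($sample);
echo $dims ? $dims['width'] . ' x ' . $dims['height'] . "\n" : "no match\n";
```

Keeping the parsing separate from the shell call makes it easy to test the pattern against saved output from whichever ffmpeg build is on the server.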


Converting a PDF file to text. Indexing a PDF file using poppler

Posted on March 10th, 2009 in Linux | 4 Comments »

This is another problem that I came across, as unoconv doesn’t let you convert a PDF to anything.

I found some more cool tools for handling PDF files.

The package is called poppler. It can be installed on Linux, and there is a MacPorts port for OS X.

Poppler is a PDF rendering library based on Xpdf.

To install on Linux, run the command:

yum install poppler-utils

On the Mac, install MacPorts, then run:

sudo port install poppler

Now you have the following binaries at your disposal.

pdftotext, pdftohtml

You can generate a text version of a PDF by running:

pdftotext -enc UTF-8 -eol unix input.pdf output.txt

I’m writing a PHP script that will talk to these binaries, and I will upload it soon.
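
In the meantime, here is a minimal sketch of how PHP might call pdftotext. Both function names are hypothetical, and only the pdftotext options from the command above are used. The paths go through escapeshellarg() so that odd file names cannot break the command line.

```php
<?php
/**
 * Build the pdftotext command line for one file (hypothetical helper).
 *
 * @param string $pdf    path of the PDF to read
 * @param string $txt    path of the text file to write
 * @param string $binary name or full path of the pdftotext binary
 * @return string
 */
function build_pdftotext_command($pdf, $txt, $binary = 'pdftotext') {
    return $binary . ' -enc UTF-8 -eol unix '
        . escapeshellarg($pdf) . ' ' . escapeshellarg($txt);
}

/**
 * Run the conversion (hypothetical helper).
 * Returns true when pdftotext exits with code 0, false otherwise.
 */
function pdf_to_text_file($pdf, $txt) {
    // skip the shell call entirely if there is nothing to convert
    if (! is_file($pdf)) {
        return false;
    }
    exec(build_pdftotext_command($pdf, $txt) . ' 2>&1', $lines, $exit_code);
    return $exit_code === 0;
}

echo build_pdftotext_command('input.pdf', 'output.txt') . "\n";
```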

Here is the man page for the binary:

NAME
pdftotext – Portable Document Format (PDF) to text converter (version 3.00)

SYNOPSIS
pdftotext [options] [PDF-file [text-file]]

DESCRIPTION
Pdftotext converts Portable Document Format (PDF) files to plain text.

Pdftotext reads the PDF file, PDF-file, and writes a text file, text-file. If text-file is not specified, pdftotext
converts file.pdf to file.txt. If text-file is '-', the text is sent to stdout.

CONFIGURATION FILE
Pdftotext reads a configuration file at startup. It first tries to find the user’s private config file, ~/.xpdfrc. If
that doesn’t exist, it looks for a system-wide config file, /etc/xpdf/xpdfrc. See the xpdfrc(5) man page for details.

OPTIONS
Many of the following options can be set with configuration file commands. These are listed in square brackets with the
description of the corresponding command line option.

-f number
Specifies the first page to convert.

-l number
Specifies the last page to convert.

-layout
Maintain (as best as possible) the original physical layout of the text. The default is to 'undo' physical layout
(columns, hyphenation, etc.) and output the text in reading order.

-raw
Keep the text in content stream order. This is a hack which often "undoes" column formatting, etc. Use of raw
mode is no longer recommended.

-htmlmeta
Generate a simple HTML file, including the meta information. This simply wraps the text in <pre> and </pre> and
prepends the meta headers.

-enc encoding-name
Sets the encoding to use for text output. The encoding-name must be defined with the unicodeMap command (see
xpdfrc(5)). The encoding name is case-sensitive. This defaults to "Latin1" (which is a built-in encoding).
[config file: textEncoding]

-eol unix | dos | mac
Sets the end-of-line convention to use for text output. [config file: textEOL]

-nopgbrk
Don’t insert page breaks (form feed characters) between pages. [config file: textPageBreaks]

-opw password
Specify the owner password for the PDF file. Providing this will bypass all security restrictions.

-upw password
Specify the user password for the PDF file.

-q
Don’t print any messages or errors. [config file: errQuiet]

-cfg config-file
Read config-file in place of ~/.xpdfrc or the system-wide config file.

-v
Print copyright and version information.

-h
Print usage information. (-help and --help are equivalent.)

BUGS
Some PDF files contain fonts whose encodings have been mangled beyond recognition. There is no way (short of OCR) to
extract text from these files.

EXIT CODES
The Xpdf tools use the following exit codes:

0 No error.

1 Error opening a PDF file.

2 Error opening an output file.

3 Error related to PDF permissions.

99 Other error.

AUTHOR
The pdftotext software and documentation are copyright 1996-2004 Glyph & Cog, LLC.

SEE ALSO
xpdf(1), pdftops(1), pdfinfo(1), pdffonts(1), pdftoppm(1), pdfimages(1), xpdfrc(5)
http://www.foolabs.com/xpdf/
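
The exit codes listed in the man page are handy when pdftotext is driven from a script. Here is a small sketch, assuming the code has already been captured (for example through the third argument of PHP's exec()). The function name is made up for illustration.

```php
<?php
/**
 * Translate a pdftotext exit code into the message from the man page.
 * Hypothetical helper; the messages are copied from the EXIT CODES section.
 *
 * @param int $code exit code returned by pdftotext
 * @return string
 */
function pdftotext_exit_message($code) {
    $messages = array(
        0  => 'No error.',
        1  => 'Error opening a PDF file.',
        2  => 'Error opening an output file.',
        3  => 'Error related to PDF permissions.',
        99 => 'Other error.',
    );
    // anything not documented in the man page is reported as unknown
    return isset($messages[$code]) ? $messages[$code] : 'Unknown exit code ' . $code . '.';
}

echo pdftotext_exit_message(3) . "\n";
```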


I’ve been asked to rewrite hostingspeeds.com v3

Posted on March 6th, 2009 in Linux | No Comments »

Taken from hostingspeeds.com

‘February 1, 2009 – “Thanks for your email. That’s great news.” winning coder Jamie Scott and owner of smudge-it.co.uk replied today when he found out he won. Jamie will code hostingspeeds.com into a full featured web 2.0 hosting speed measuring service for webmasters and web hosts with plugins for Wordpress, Joomla and other great blog and portal scripts ppl use these days.

Jamie Scott has won because of his programming skills, his experience with web hosting business and his understanding of what web 2.0 is all about. I thank all of you that offered to code v3.’ :)

I have already started rewriting the graphing code, so that the data is stored in RRD databases and there will be pretty graphs using RRDtool.

Example speed graph

The system will be fully scalable, using a worker node system to poll the information, and will incorporate a complex advertising and server monitoring system.


Converting a Doc to PDF, txt or HTML using PHP and Linux

Posted on March 6th, 2009 in Cakephp, Linux, PHP | 5 Comments »

This has been an issue that has bothered me for a while. I finally found a solution that worked and doesn’t kill your server in the process.

I give you two words: OpenOffice. Or is that one?

This is what I’m running for this test:

OS: CentOS release 5.2 (Final)
PHP: PHP 5.2.8
Openoffice 1.2.3

Firstly I installed several programs using yum. You will need to use DAG’s repo:

rpm -Uhv http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS//rpmforge-release-0.3.6-1.el5.rf.x86_64.rpm

yum install unoconv openoffice.org-headless openoffice.org-writer

unoconv is a handy tool that can be run as a daemon and talk to the OpenOffice binary from the command line.

In order to run the commands via Apache, you need to change the apache user’s home directory and make it writable.

mkdir /home/apache
chown apache:apache /home/apache
usermod -d /home/apache apache
chmod 755 /home/apache

Now the apache user can create the hidden .openoffice.org2.0 directory.

With the setup out of the way, we need to start the OpenOffice daemon.

I did this as root, but you could start it as apache.

unoconv --listener &

This basically creates the following daemon:

soffice.bin -nologo -nodefault -accept="socket,host=localhost,port=2002;urp;StarOffice.ComponentContext"

You can now send requests to port 2002 using unoconv

/usr/bin/unoconv --server localhost --port 2002 --stdout -f pdf input.doc

This will output the PDF file to the stdout.

Here is a CakePHP component that I wrote to talk to unoconv. Please note this is very alpha and has only had a small amount of testing, but it works :) If you want to use it, you must create these directories in your Cake install:

TMP . 'filegenerator/'
ROOT . '/uploads/generatedpdfs/'
ROOT . '/uploads/docfiles/'

It can be used via a form upload:


$this->Filegenerator = new FilegeneratorComponent ($this->params["form"]['uploaddocfile']);
// if the filegenerator did all its magic ok then process
if($this->Filegenerator){

// returns the text version of the doc
$text = $this->Filegenerator->convertDocToTxt();
// returns the html version of the doc
$html = $this->Filegenerator->convertDocToHtml();
// saves the generated pdf under uploads/generatedpdfs/$doc_id/ and returns its file name
$pdf = $this->Filegenerator->convertDocToPdf($doc_id);

}
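
One caveat with the check above: PHP discards whatever a constructor returns, so new always yields an object and the if is always true. Below is a sketch of one way round it. The class UploadCheck and its public isReady() accessor are hypothetical and not part of the component that follows.

```php
<?php
/**
 * Hypothetical stand-in for a component that validates its input
 * in the constructor.
 */
class UploadCheck {
    // set to true only when the upload info looks usable
    private $pass = false;

    function __construct($fileinfo) {
        // a "return false" here would be ignored by new,
        // so the result is recorded on the object instead
        $this->pass = is_array($fileinfo)
            && isset($fileinfo['tmp_name'])
            && file_exists($fileinfo['tmp_name']);
    }

    // lets the caller ask whether construction succeeded
    function isReady() {
        return $this->pass;
    }
}

// __FILE__ is used here only as a path that normally exists
$check = new UploadCheck(array('tmp_name' => __FILE__));
echo $check->isReady() ? "ready\n" : "not ready\n";
```

Throwing an exception from the constructor would be another option. The accessor is shown because it stays close to the $pass flag the component already keeps.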

The component, called filegenerator.php:


<?php
/**
* Class Used to convert files.
*@author jamiescott.net
*/
class FilegeneratorComponent extends Object {

// allowed upload MIME types and their file extensions
private $allowable_files = array ('application/msword' => 'doc' );
// variable set if the constructor loaded correctly.
private $pass = false;
// store the file info passed to the constructor
private $fileinfo;

/**
* Check the uploaded file and create a temporary working folder.
*
* @param array $fileinfo
* Expected :
* (
[name] => test.doc
[type] => application/msword
[tmp_name] => /Applications/MAMP/tmp/php/php09PYNO
[error] => 0
[size] => 79360
)
*
*
* @return unknown
*/
function __construct($fileinfo) {

// folder to process all the files etc
define ( 'TMP_FOLDER', TMP . 'filegenerator/' . $this->generatefoldername () . '/' );

// where unoconv is installed
define ( 'UNOCONV_PATH', '/usr/bin/unoconv' );
// where to store pdf files
define ( 'PDFSTORE', ROOT . '/uploads/generatedpdfs/' );
// where to store doc files
define ( 'DOCSTORE', ROOT . '/uploads/docfiles/' );
// apache home dir
define ( 'APACHEHOME', '/home/apache' );
// set some shell environment vars
putenv ( "HOME=".APACHEHOME );
putenv ( "PWD=".APACHEHOME );

// check the file info is passed, the tmp file is there, the correct file type is set
// and the tmp folder could be created
if (is_array ( $fileinfo ) && file_exists ( $fileinfo ['tmp_name'] ) && in_array ( $fileinfo ['type'], array_keys ( $this->allowable_files ) ) && $this->createtmp ()) {

// keep the file info for the convert methods
$this->fileinfo = $fileinfo;
// the constructor ran ok
$this->pass = true;

}
// otherwise $this->pass stays false and the convert methods do nothing;
// a constructor cannot return a value to the caller

}

/**
* takes the file set in the constructor and turns it into a pdf
* stores it in /uploads/generatedpdfs and returns the filename
*
* @return filename if pdf was generated
*/
function convertDocToPdf($foldername=false) {

if ($this->pass) {

// generate a random name
$output_pdf_name = $this->generatefoldername () . '.pdf';

// move it to the tmp folder for processing
if (! copy ( $this->fileinfo ['tmp_name'], TMP_FOLDER . 'input.doc' ))
die ( 'Error copying the doc file' );

$command = UNOCONV_PATH;
$args = ' --server localhost --port 2002 --stdout -f pdf ' . TMP_FOLDER . 'input.doc';

$run = $command . $args;

//echo $run; die;
$pdf = shell_exec ( $run );
$end_of_line = strpos ( $pdf, "\n" );
$start_of_file = substr ( $pdf, 0, $end_of_line );

if (stripos ( $start_of_file, '%PDF' ) === false)
die ( 'Error Generating the PDF file' );

if(!file_exists(PDFSTORE.$foldername)){
mkdir(PDFSTORE.$foldername);
}

// file saved
if(!$this->_createandsave($pdf, PDFSTORE.'/'.$foldername.'/', $output_pdf_name)){
die('Error Saving The PDF');
}

return $output_pdf_name;

}

}

/**
* Return a text version of the Doc
*
* @return unknown
*/
function convertDocToTxt() {

if ($this->pass) {

// move it to the tmp folder for processing
if (! copy ( $this->fileinfo ['tmp_name'], TMP_FOLDER . 'input.doc' ))
die ( 'Error copying the doc file' );

$command = UNOCONV_PATH;
$args = ' --server localhost --port 2002 --stdout -f txt ' . TMP_FOLDER . 'input.doc';

$run = $command . $args;

//echo $run; die;
$txt = shell_exec ( $run );

// fewer than 10 characters of output probably means an error
if (strlen($txt) < 10)
die ( 'Error Generating the TXT' );

// return the text version of the doc
return $txt;

}

}

/**
* Convert the doc to HTML and return the HTML
*
* @return unknown
*/
function convertDocToHtml() {

if ($this->pass) {

// move it to the tmp folder for processing
if (! copy ( $this->fileinfo ['tmp_name'], TMP_FOLDER . 'input.doc' ))
die ( 'Error copying the doc file' );

$command = UNOCONV_PATH;
$args = ' --server localhost --port 2002 --stdout -f html ' . TMP_FOLDER . 'input.doc';

$run = $command . $args;

//echo $run; die;
$html= shell_exec ( $run );
$end_of_line = strpos ( $html, "\n" );
$start_of_file = substr ( $html, 0, $end_of_line );

if (stripos ( $start_of_file, 'HTML' ) === false)
die ( 'Error Generating the HTML' );

// return the html version of the doc
return $html;

}

}
/**
* Create file and store data
*
* @param unknown_type $data
* @param unknown_type $location
* @return unknown
*/
function _createandsave($data, $location, $file) {

if (is_writable ( $location )) {

// Open the target file for writing. Mode 'w' truncates an existing
// file, so $data replaces whatever was there before.
if (! $handle = fopen ( $location.$file, 'w' )) {
trigger_error("Cannot open file ($location$file)");
return false;
}

// Write $data to the opened file.
if (fwrite ( $handle, $data ) === FALSE) {
trigger_error("Cannot write to file ($location$file)");
return false;
}

fclose ( $handle );
return true;

} else {
trigger_error("The folder $location is not writable");
return false;
}

}

function __destruct() {

// remove the tmp folder

if (file_exists ( TMP_FOLDER ) && strlen ( TMP_FOLDER ) > 4)
$this->removetmp ();

}

/**
* Create the tmp directory to hold and process the files
*
* @return unknown
*/
function createtmp() {

if (is_writable ( TMP )) {

if (mkdir ( TMP_FOLDER ))
return true;

} else {

return false;
}

return false;

}

/**
* Delete the tmp dir
*
* @return unknown
*/
function removetmp() {

if (strlen ( TMP_FOLDER ) > 3 && file_exists ( TMP_FOLDER )) {

if ($this->recursive_remove_directory ( TMP_FOLDER ))
return true;

}

return false;
}

/**
* Return a random string for the folder name
*
* @return unknown
*/
function generatefoldername() {

return md5 ( microtime () );

}

/**
* Recursively delete a directory or empty it
*
* @param unknown_type $directory
* @param unknown_type $empty
* @return unknown
*/
function recursive_remove_directory($directory, $empty = FALSE) {
// if the path has a slash at the end we remove it here
if (substr ( $directory, - 1 ) == '/') {
$directory = substr ( $directory, 0, - 1 );
}

// if the path is not valid or is not a directory ...
if (! file_exists ( $directory ) || ! is_dir ( $directory )) {
// ... we return false and exit the function
return FALSE;

// ... if the path is not readable
} elseif (! is_readable ( $directory )) {
// ... we return false and exit the function
return FALSE;

// ... else if the path is readable
} else {

// we open the directory
$handle = opendir ( $directory );

// and scan through the items inside
while ( FALSE !== ($item = readdir ( $handle )) ) {
// if the filepointer is not the current directory
// or the parent directory
if ($item != '.' && $item != '..') {
// we build the new path to delete
$path = $directory . '/' . $item;

// if the new path is a directory
if (is_dir ( $path )) {
// we call this method again with the new path
$this->recursive_remove_directory ( $path );

// if the new path is a file
} else {
// we remove the file
unlink ( $path );
}
}
}
// close the directory
closedir ( $handle );

// if the option to empty is not set to true
if ($empty == FALSE) {
// try to delete the now empty directory
if (! rmdir ( $directory )) {
// return false if not possible
return FALSE;
}
}
// return success
return TRUE;
}
}
}
?>


Installing PHP 5.2.5, Suhosin, PHP-Eaccelerator on Centos 4 with YUM

Posted on November 17th, 2007 in Linux, PHP | 15 Comments »

This is no longer supported, as I’m now using CentOS 5.

You will find similar, but updated, RPMs at http://blog.famillecollet.com/

I’ve created a repo containing the new PHP 5.2.5 rpms and other extensions.

My yum repo currently only supports i386 architectures and centos / el4.

To upgrade to php 5.2.5 please run:

wget http://www.smudge-it.co.uk/pub/yum/RPM-GPG-KEY-smudge
rpm --import RPM-GPG-KEY-smudge

Add my repository by creating the file
vi /etc/yum.repos.d/smudge.repo

[smudgeit]
name=Smudge IT RPMS for Centos 4 - $basearch
baseurl=http://www.smudge-it.co.uk/pub/yum/centos/4/$basearch/
enabled=1
gpgcheck=1

Then run yum update php.

I have the following files in my repo:

pcre-6.6-1.1.i386.rpm php-gd-5.2.5-1.i386.rpm php-pdo-5.2.5-1.i386.rpm
pcre-devel-6.6-1.1.i386.rpm php-imap-5.2.5-1.i386.rpm php-pear-1.6.1-2.noarch.rpm
php-5.2.5-1.i386.rpm php-ldap-5.2.5-1.i386.rpm php-pgsql-5.2.5-1.i386.rpm
php-bcmath-5.2.5-1.i386.rpm php-mbstring-5.2.5-1.i386.rpm php-snmp-5.2.5-1.i386.rpm
php-cli-5.2.5-1.i386.rpm php-mcrypt-5.2.5-1.i386.rpm php-soap-5.2.5-1.i386.rpm
php-common-5.2.5-1.i386.rpm php-mhash-5.2.5-1.i386.rpm php-suhosin-0.9.20-1.i386.rpm
php-dba-5.2.5-1.i386.rpm php-mssql-5.2.5-1.i386.rpm php-tidy-5.2.5-1.i386.rpm
php-devel-5.2.5-1.i386.rpm php-mysql-5.2.5-1.i386.rpm php-xml-5.2.5-1.i386.rpm
php-eaccelerator-5.2.5_0.9.5.1-1.i386.rpm php-ncurses-5.2.5-1.i386.rpm php-xmlrpc-5.2.5-1.i386.rpm
php-embedded-5.2.5-1.i386.rpm php-odbc-5.2.5-1.i386.rpm

You can install other extensions such as:

yum install php-eaccelerator
yum install php-suhosin

Edit /etc/php.d/suhosin.ini so that it contains:

extension=suhosin.so
suhosin.session.encrypt = Off

Now you have a more secure php installation.
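
To confirm that the extension was picked up, a quick check can be run from PHP. This is only a sketch: extension_status() is a made-up helper around PHP's extension_loaded().

```php
<?php
/**
 * Report whether a PHP extension is loaded (hypothetical helper).
 *
 * @param string $name extension name, e.g. 'suhosin'
 * @return string
 */
function extension_status($name) {
    return $name . ': ' . (extension_loaded($name) ? 'loaded' : 'not loaded');
}

echo extension_status('suhosin') . "\n";
```

Run it through the web server as well as the command line, since the two can load different ini files.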

*** Updated sqlite2 packages. If you were having issues installing php-pdo, then run

yum clean all

and then

yum install sqlite2


Email alerts from Dell Poweredge using omreport

Posted on November 14th, 2007 in Dell Poweredge, Linux | No Comments »

Dell PowerEdge servers come with tools to monitor the hardware and driver updates. Dell calls the package OpenManage Server Administrator, and this is how you can get it running (if using yum and CentOS):

wget -q -O - http://linux.dell.com/repo/hardware/bootstrap.cgi | bash

yum install srvadmin-all

srvadmin-services.sh start

If all goes fine, you should be able to view a web interface via port 1311.

https://localhost:1311

I then had to log in with my root user name and password. I was pleased with the number of features in OpenManage.

To set up the alerts, click on Alerts Management, then on the individual sensor. I chose Execute application, so that the software can email me a complete list of the errors detected.

Here is the script I use to email me:

#!/bin/bash

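# snapshot of the processes running when the alert fired
ps -ef > /tmp/psout.txt 2>&1

# current contents of the OpenManage alert log
omreport system alertlog > /tmp/alertmsg.txt 2>&1

# mail the alert log followed by the process list
cat /tmp/alertmsg.txt /tmp/psout.txt | mail -s "Server Alert" root > /tmp/mailout.txt 2>&1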

Obviously, substitute root with your email address. Now, when the server detects a problem, you will receive a printout of the processes running at the time of the incident and a copy of the alert log.


Minimal services on Redhat / Centos

Posted on November 14th, 2007 in Linux | 2 Comments »

On completing a minimal install of Centos, you often find that there are quite a few unwanted services started by default.

I like to disable the following with these commands:

/sbin/chkconfig xfs off
/sbin/chkconfig isdn off
/sbin/chkconfig gpm off
/sbin/chkconfig pcmcia off
/sbin/chkconfig sendmail off
/sbin/chkconfig cups off
/sbin/chkconfig portmap off
/sbin/chkconfig nfslock off
/sbin/chkconfig netfs off
/sbin/chkconfig rpcgssd off
/sbin/chkconfig rpcidmapd off
/sbin/chkconfig autofs off
/sbin/chkconfig lm_sensors off
